Defensive System Integration

A big part of my current job is getting different systems to work together, sometimes in ways the original authors never intended. Examples include getting an SSO server to share account data with a CRM platform, or giving an "enterprise" system a reasonable user interface (enterprise software is always ugly by default). One important consideration is how much I trust the system I am integrating: it's more work to be paranoid, but sometimes the software is out to get you.

I tend to trust popular open source libraries, such as Apache projects like Lucene, Hadoop or Cassandra. Level of activity is an important indicator of a high-quality open source project. I also tend to trust self-contained libraries more than external services, since many network and availability failure modes just don't apply when code runs in the same runtime as my own application logic.

Conversely, I distrust closed vendor systems, as well as open systems where the bulk of the code comes from the vendor that supports the system. There are some exceptions, for example vendor-managed relational databases like Oracle DB or heavily used libraries like the AWS SDK. If the vendor's interface is changing rapidly, I am more cautious in my approach.

Based on my level of trust, I then apply some design rules to protect the system I am building:

  1. I always decouple the supplied interface from my domain logic. I typically build my own class to encapsulate the objects and operations in the supplied interface. No other code in my system is allowed to call the vendor interfaces directly. Then if the vendor's interface changes, or if I need to use it differently in the future, there is one place to make the change without impacting the rest of my application logic. This technique also allows me to create a consistent domain model even when supplied libraries use very different paradigms.
  2. When working with any third-party system, I try to minimize changes and patches to the supplied code. This often means building a more conventional application that wraps or interfaces with the external system. Sometimes patching the vendor's code is unavoidable, but I prefer to push vendors to fix their own bugs and to run their systems in an unaltered condition.
  3. If I really distrust a service, I plan for it to fail. A service can fail in different ways, such as simply not responding, taking a long time to produce results or returning invalid results. Undesirable behavior is often intermittent and unpredictable. Judicious use of timeouts is a good first step when responsiveness is a concern. Automated tests and in-application monitoring can help catch invalid results. At the extreme, my system can take corrective action to heal the failed component, for example by restarting it. There is no practical way to guarantee good behavior from a vendor product, but I try to manage the fallout when things go wrong.
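
The first rule can be sketched roughly as follows. This is a minimal illustration, not code from a real integration: `VendorCrmClient`, its `fetch_contact` method and the field names it returns are all hypothetical stand-ins for whatever a vendor SDK might expose, and `Contact` and `ContactDirectory` are names invented for the example.

```python
from dataclasses import dataclass


class VendorCrmClient:
    """Hypothetical stand-in for a third-party CRM SDK (an assumption)."""

    def fetch_contact(self, contact_id):
        # The imagined vendor returns loose records with its own field names.
        return {"ID": contact_id, "EMAIL_ADDR": "Pat@Example.com ", "FULL_NM": "Pat Lee"}


@dataclass(frozen=True)
class Contact:
    """Domain object owned by the application, independent of the vendor."""

    contact_id: str
    email: str
    name: str


class ContactDirectory:
    """The only class allowed to touch the vendor interface."""

    def __init__(self, client):
        self._client = client

    def find(self, contact_id):
        raw = self._client.fetch_contact(contact_id)
        # Translate vendor field names and quirks into the domain model here,
        # so a vendor change means editing this one method.
        return Contact(
            contact_id=str(raw["ID"]),
            email=raw["EMAIL_ADDR"].strip().lower(),
            name=raw["FULL_NM"],
        )


directory = ContactDirectory(VendorCrmClient())
contact = directory.find("42")
print(contact)
```

The rest of the application would depend only on `Contact` and `ContactDirectory`, so swapping the vendor or absorbing an interface change stays confined to the wrapper.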

Defensive design for third-party products allows software developers to create a consistent domain model, to prevent future changes from cascading through the system and to let systems fail in a controlled way, perhaps even healing themselves.
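
As one possible illustration of controlled failure, the sketch below combines a timeout, a validity check, a single restart attempt and a fallback value. Everything in it is an assumption made for the example: `FlakyService` simulates an untrusted service that hangs until it is restarted, and `guarded_lookup`, `is_valid` and the timeout values are hypothetical names and numbers, not part of any real product.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class FlakyService:
    """Hypothetical untrusted service: hangs until it has been restarted."""

    def __init__(self):
        self.healthy = False
        self.restarts = 0

    def lookup(self, key):
        if not self.healthy:
            time.sleep(0.5)  # simulate a call that takes far too long
            return None
        return {"key": key, "value": 7}

    def restart(self):
        self.restarts += 1
        self.healthy = True


def is_valid(result, key):
    # Never assume the service returns well-formed data.
    return isinstance(result, dict) and result.get("key") == key and "value" in result


def guarded_lookup(service, key, timeout_s=0.1, fallback=None):
    # Two workers, so a hung first call cannot block the retry.
    pool = ThreadPoolExecutor(max_workers=2)
    try:
        for attempt in range(2):
            future = pool.submit(service.lookup, key)
            try:
                result = future.result(timeout=timeout_s)
            except FutureTimeout:
                # We stop waiting; the hung call keeps running in its thread.
                result = None
            except Exception:
                result = None  # treat vendor errors like any other failure
            if is_valid(result, key):
                return result
            if attempt == 0:
                service.restart()  # corrective action before the one retry
        return fallback  # controlled failure instead of a hang or a crash
    finally:
        pool.shutdown(wait=False)
```

Note that the timeout here only bounds how long the caller waits. Actually cancelling or killing a stuck call depends on what the real service or client library supports.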
