In the past I have written about what a microservice architecture looks like and some of the tradeoffs that come with it. One of the biggest gains this architecture brings is the ability for many people to work on different projects at the same time. Inevitably the services you write will be consumed by others, or you may need to consume someone else’s API to get your work done.
This is easy right?
Let us assume you need to write an API for another team. You sit down, write out the spec on a ticket, and off you go. The resource will have the following fields:

- Id
- Name
- Description
- Created
- Deleted
- Version
Sit down, write tests, then code (yes TDD!), and it’s done. Pull request up. Approved. Deployed.
Meanwhile the "consumer" is writing their client code based on what you put on the ticket. They have everything stubbed out in their tests and are eagerly awaiting the "producer" deploying their code. The API ticket is marked as deployed. Let’s see how it runs.
400 - Bad Request. It doesn’t work.
Let’s take a look at the API spec again:
- Id - Was this a number? String? UUID?
- Name - What will this look like? All caps?
- Description - Can this be null? What about length?
- Created - Formatted string or unix time?
- Deleted - Same
- Version - Can this be null?
Looking at it now, there were questions that should have been asked at the onset, when the spec was written, which could have prevented an error from occurring. But looking at this from a higher level, you can see that the issue here was a lack of communication and detail on the part of the "producer" and bad assumptions by the "consumer". Both parties had done their development and written passing tests, and yet there was still an error when it went live. Could this have been communicated through something other than a ticket or a document?
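For illustration, here is what pinning those decisions down might look like: a toy Python sketch, not any real contract library, where each ambiguous field gets an explicit answer. The chosen types (UUID serialized as a string, unix time, non-null version) are assumptions made for this example, not the original spec.

```python
import uuid
from datetime import datetime, timezone

# A hypothetical written-down contract for the resource: every ambiguity
# from the ticket is now an explicit decision (the types are assumptions
# for illustration).
CONTRACT = {
    "id": str,                          # decided: UUID, serialized as a string
    "name": str,                        # free-form, no casing guarantee
    "description": (str, type(None)),   # decided: nullable
    "created": int,                     # decided: unix time, not a formatted string
    "deleted": (int, type(None)),       # same format as created, nullable
    "version": int,                     # decided: never null
}

def validate(payload: dict) -> list:
    """Return a list of contract violations for a payload."""
    errors = []
    for field, expected in CONTRACT.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            errors.append(f"{field}: wrong type {type(payload[field]).__name__}")
    return errors

good = {
    "id": str(uuid.uuid4()),
    "name": "WIDGET",
    "description": None,
    "created": int(datetime.now(timezone.utc).timestamp()),
    "deleted": None,
    "version": 1,
}
# The consumer assumed a formatted string for "created" -- the 400 from earlier.
bad = dict(good, created="2024-01-01T00:00:00Z")

print(validate(good))  # []
print(validate(bad))   # ['created: wrong type str']
```

Had both teams been testing against something like this instead of prose on a ticket, the type mismatch would have surfaced before either side deployed.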
Let’s assume there is an existing API that a team supports. A ticket comes across that says, “Remove Version Field From Resource”. A team member grabs it and it is done in a matter of minutes.
Pull Request up. Approved. Deployed. Easy. Lunch.
After lunch the Reporting Slack channel is blowing up. “Why are the APIs failing?” “Why do we see 400s?”
Then someone sees that a ticket was deployed around removing a field. A whole afternoon's worth of work was lost putting out a fire. Guess the ticket was not so easy after all?
Once an endpoint is made public there always needs to be an assumption that someone is using it, and therefore backwards compatibility needs to be maintained to ensure we don’t cause failures for other teams. If a resource truly needs to change for a big reason, other teams will then need to understand that the endpoint is going to be deprecated and plan to change accordingly.
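To make the "Remove Version Field" incident concrete, here is a minimal, purely illustrative sketch (not a real tool) that diffs the old and new response shapes before a deploy — removing or retyping a field is a breaking change, while adding one is not:

```python
# Old response shape, mirroring the fields from the earlier example.
old = {"id": str, "name": str, "description": str,
       "created": int, "deleted": int, "version": int}

# The "easy" ticket: drop the version field.
new = {k: v for k, v in old.items() if k != "version"}

# Fields present before but gone now -- consumers may still rely on these.
breaking = sorted(set(old) - set(new))
print(breaking)  # ['version']
```

A check like this in CI would have turned a ruined afternoon into a failed build.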
Versioning is a common strategy to make sure these kinds of issues don’t occur but maintaining and supporting so many endpoints is prone to problems as your code base grows. So is there an easy way to tell if you are going to break someone else's code before you deploy?
Consumer Driven Contracts
So in the previous two examples we saw common issues that occur when developing microservices. They fall into two categories: the first is the lack of a contract, and the second is a violation of the contract. By contract we simply mean an agreed upon payload and endpoint between services.
“But you said this was the issue in the first example!” you might be thinking, and you’re right. Notes on a ticket don’t really constitute a full contract (more of an agreement). In real life, if you break a contract there is some sort of penalty associated with the violation. For a contract to be strong it needs to be detailed and address edge cases as well as the basic use cases.
So how do you penalize a developer for breaking a contract? Escalate it to management? Publicly shame them? Make them buy you a beer? All of those options seem very… personal and wouldn’t build for good team relationships (except maybe the beer). What about a test?
No, I’m not talking about the type of test from the first example, where two developers go off into a corner and come back with two different types of resources. There are testing libraries that allow two developers to sit down and create a shared contract, then go off and build their systems with confidence that they are working against the same definition.
This provides an opportunity for Test Driven Development. These libraries automatically generate tests that run against the producer to ensure it is fulfilling the contract; meanwhile, tests can be written against a generated stub to ensure the “consumer” is using the API correctly. When the two are done they should work together seamlessly, because they have been developing against the same contract.
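To make the mechanics concrete, here is a hand-rolled toy sketch of the idea — a stand-in for what libraries like Spring Cloud Contract generate, not their actual API. One shared contract drives both the consumer’s stub and the producer’s verification test:

```python
# Toy sketch of the shared-contract mechanics (not a real library's API):
# one contract drives both the consumer's stub and the producer's check.
CONTRACT = {
    "request": {"method": "GET", "path": "/widgets/1"},
    "response": {"status": 200,
                 "body": {"id": "1", "name": "widget", "version": 2}},
}

def consumer_stub(method, path):
    """What the consumer's tests call instead of the real service."""
    req = CONTRACT["request"]
    if (method, path) == (req["method"], req["path"]):
        return CONTRACT["response"]["status"], CONTRACT["response"]["body"]
    return 404, {}

def verify_producer(handler):
    """The generated producer-side test: does the handler honour the contract?"""
    req, expected = CONTRACT["request"], CONTRACT["response"]
    status, body = handler(req["method"], req["path"])
    assert status == expected["status"], f"got status {status}"
    missing = set(expected["body"]) - set(body)
    assert not missing, f"missing fields: {missing}"

# A producer handler that honours the contract passes verification...
def handler(method, path):
    return 200, {"id": "1", "name": "widget", "version": 2}

verify_producer(handler)

# ...and one that drops a field fails loudly at test time, not in production.
def broken(method, path):
    return 200, {"id": "1", "name": "widget"}

try:
    verify_producer(broken)
except AssertionError as e:
    print("contract broken:", e)  # contract broken: missing fields: {'version'}
```

Both sides are now testing against the same source of truth, so the 400 from the first example gets caught long before anything is deployed.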
Furthermore, we now have a built-in smoke test on the producer code: if the endpoint changes in any way we will see a test failure, because the contract will have been broken. This broken test, if left unchecked, can be the penalty mentioned earlier, in the form of the very public shame of breaking the build.
What about AMQP?
Ah, you thought I’d forgotten about Event Driven Architectures, didn’t you? How are you to write a contract around an AMQP message? These contract-testing libraries have you covered. You can write a similar contract that points to a specific exchange and payload design. This allows the same sorts of checks to happen on the producer at test time, and provides a mock for the consumer to use.
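The shape of such a message contract can be sketched the same way as the HTTP one — the exchange name, routing key, and payload schema below are invented for illustration and are not any particular library’s format:

```python
# Hedged sketch: a message contract is the HTTP idea pointed at a broker.
# Exchange, routing key, and payload shape are assumptions for illustration.
MESSAGE_CONTRACT = {
    "exchange": "widgets",
    "routing_key": "widget.deleted",
    "payload": {"id": str, "deleted": int},  # deleted: unix time, as before
}

def check_message(exchange, routing_key, payload):
    """Producer-side check: does a published message match the contract?"""
    c = MESSAGE_CONTRACT
    if (exchange, routing_key) != (c["exchange"], c["routing_key"]):
        return False
    return all(
        field in payload and isinstance(payload[field], expected)
        for field, expected in c["payload"].items()
    )

# The producer's tests assert its published messages conform...
assert check_message("widgets", "widget.deleted",
                     {"id": "42", "deleted": 1700000000})

# ...and the consumer can build a mock message straight from the contract.
mock = {field: t() for field, t in MESSAGE_CONTRACT["payload"].items()}
print(mock)  # {'id': '', 'deleted': 0}
```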
Some libraries allow for edge cases to be tested on the client end, which can help a developer build a more robust system. Spring Cloud Contract is built on a library called WireMock, which can simulate some common HTTP errors. This is especially useful to consumers who may want to test bad URI routing or 403 errors against a system without actually needing another API to be present.
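The error-simulation idea can be hand-rolled in a few lines (WireMock’s real API looks nothing like this; every name here is invented for illustration): the consumer’s stub can be told to fail, so the error-handling code is testable with no real producer anywhere in sight.

```python
# Invented sketch of error simulation: the stub can be forced to return an
# error status so the consumer's error handling is testable in isolation.
class Stub:
    def __init__(self):
        self.forced_status = None

    def force(self, status):
        self.forced_status = status

    def get(self, path):
        if self.forced_status is not None:
            return self.forced_status, {}
        return 200, {"id": "1"}

def fetch_widget(client):
    """Consumer code under test, with its 403 handling."""
    status, body = client.get("/widgets/1")
    if status == 403:
        raise PermissionError("not allowed to read widgets")
    return body

stub = Stub()
assert fetch_widget(stub) == {"id": "1"}

stub.force(403)
try:
    fetch_widget(stub)
except PermissionError:
    print("consumer handles 403 without a real API present")
```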
Sure, this seems great, but why can we not just use integration tests? I mean we are integrating with another system after all so would an integration test make sense? Yes and no.
There is a large overhead cost to integration tests. These tests can be flaky, long-running, and prone to passing only when the moon is in the right phase, so we like to keep them to a minimum. Taking a gander at Ye Olde Testing Pyramid, we can see that integration tests should be the smallest layer while unit tests are the largest.
In the middle you’ll find consumer driven contract tests. This is where the meat of the business logic lives, and any sort of input should be mocked. The business logic does not care where the data comes from; it acts like a machine that returns outputs in an expected way.
Consumer driven contract tests provide a realistic mock for the consumer by mocking the interactions with the producer’s API. They have the added benefit of acting as a form of unit test around the “producer” to ensure the outputs are correct.
In the end, integration tests are not unnecessary, but they should be kept to a minimum. What is important is finding a common way to make sure the work is accurately documented, so both parties can agree on what they are working on together.
Communication is key in this line of work, and we should be looking to improve communication lines between teams wherever necessary. Sitting down and writing a contract together can prevent a large amount of ticket thrash, upset managers, and frustrated developers.