
    Journey to Loosely Coupled Services and Data Consistency

    The content referenced here, such as the pain points and the base architecture, is purely imaginary. It only serves to illustrate the solutions presented, which are a mixture of past experience and currently applicable architecture and patterns, so that we can reach a conclusion on Event-driven Microservices Architecture with the Saga pattern.

    In this article you will read about

    • Problem #1: Growing number of users leads to increased complexity
    • Problem #2: Scaling up the application is expensive
    • Problem #3: Isolating failures
    • Benefit of Event Driven Architecture
    • What is Saga Pattern?
    • Moving further with Saga

    It is always a good idea to understand the problem before coming up with a solution. We should not jump into implementing a popular architectural approach when we have no specific problem for it to solve.

    Imagine you are maintaining a monolithic system that works and delivers exactly as expected. Do you need to change it? Certainly not, unless there are pain points you want to solve. Now imagine you have the following pain points:

    “The number of users is growing rapidly, which leads to frequent slowness and downtime of the whole system. It is too expensive to create multiple instances of this big application to mitigate the issue. It is also difficult to do operational work in the system during working hours, because it affects the user experience of our customers.”

    What are the problems here?

    Problem #1: Growing number of users leads to increased complexity

    We might have the following current state of the system.

    Figure 1

    At first glance, the bottleneck is the database. Looking further, let’s say the analysis shows that there are a lot of transactions on the database; the solution is then to apply a caching strategy in front of the database.

    There are different ways to cache the data. For example, if the system uses Hibernate, it already uses the first-level cache by default, but that cache is session-specific. What we can do here is enable the second-level cache and the query cache. Several libraries support the second-level cache in Hibernate. Let’s take the Ehcache library as an example. This is how it works.

    Figure 2

    Each instance of the application has its own second-level cache area, and these caches communicate with each other to propagate data updates. Below is a simple description of what they communicate:

    • When a record is requested from the database in one application instance, that instance puts the record in its cache and informs the other caches to put the record as well
    • When a record is requested and is already available in the cache, the cached copy is used instead of querying the database
    • When a record is updated or deleted, the instance handling the request updates its own cache and then informs the other caches to do the same
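
    The three rules above can be sketched in a few lines. This is only a minimal in-memory illustration in Python, not how Ehcache or Hibernate implement replication: the names Database, ReplicatedCache and connect are hypothetical, and the peers are updated through direct method calls instead of over the network.

```python
class Database:
    """Stand-in for the real database; counts how often it is read."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def load(self, key):
        self.reads += 1
        return self.rows.get(key)

    def save(self, key, value):
        self.rows[key] = value

    def delete(self, key):
        self.rows.pop(key, None)


class ReplicatedCache:
    """Hypothetical per-instance cache that mirrors its changes to its peers."""

    def __init__(self, db):
        self.db = db
        self.entries = {}
        self.peers = []

    def get(self, key):
        if key in self.entries:
            return self.entries[key]          # cache hit: no database read
        value = self.db.load(key)
        if value is not None:
            self._put_everywhere(key, value)  # cache miss: fill all caches
        return value

    def update(self, key, value):
        self.db.save(key, value)
        self._put_everywhere(key, value)

    def delete(self, key):
        self.db.delete(key)
        self.entries.pop(key, None)
        for peer in self.peers:
            peer.entries.pop(key, None)

    def _put_everywhere(self, key, value):
        self.entries[key] = value
        for peer in self.peers:
            peer.entries[key] = value


def connect(*caches):
    """Make every cache a peer of every other cache."""
    for cache in caches:
        cache.peers = [other for other in caches if other is not cache]
```

    With two connected caches, a record read through the first instance is served to the second instance from its own cache, so the database is read only once.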

    Great! We managed to minimize the database transactions, but it doesn’t end there. More users mean more requirements, which mean more modifications to the system to cater to the clients’ needs. That adds complexity to the system.

    Problem #2: Scaling up the application is expensive

    Creating a new instance of the whole application is not ideal, because not all of its modules need to be scaled out. At this point, we need to look deeper into the application. Let’s say the application consists of the following modules or features:

    • Homepage
    • Basket
    • Backoffice

    We would probably also like to separate the frontend (UI) from the backend (services). Suppose, for example, that most of the processes that make the system slow or cause downtime are in the Homepage module. Below is how the system might look once we modularize it:

    Figure 3

    Okay, now you can scale only a specific module or feature, but the architecture looks more complex and we are actually building a distributed monolith. A distributed monolith combines the disadvantages of both microservices and monolithic architectures, such as:

    • Higher chance of communication failure between different services
    • Difficult to manage as the services get larger
    • Tight coupling between modules

    Looking at the current state, you might want to go back to a single monolithic application.

    Problem #3: Isolating failures

    Customers and system operations share the same system, so the operations team needs to schedule or minimize data processing, as it might slow the system down or, at worst, make it unresponsive. Let’s say there is a need to register a large number of users (a bulk registration) in the system. Based on the last state of the system, the point of failure is the Backoffice service. The Homepage and Basket modules then become unresponsive, because they need to call Backoffice for further processing while the Backoffice service is busy processing a bulk of requests.

    There are several root causes of failure here:

    • Database will be unresponsive
    • Updating information to another service will not be possible
    • Retrieving information from another service will not be possible

    To solve the unresponsive database, you might give each module a dedicated database. But that still leaves the problems of updating and retrieving information from other services. This is where we need the Event Driven Architecture.

    Benefit of Event Driven Architecture

    It allows the system to have loosely coupled services, which use messages/events to communicate with each other. There are three key components in this architecture:

    • Producer → produces an event to the router when there are changes happening in a service
    • Router → receives the event and routes it to the appropriate consumer
    • Consumer → consumes the event being routed and updates its information
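
    To make the three roles concrete, here is a minimal in-process sketch. It assumes a hypothetical "CustomerRegistered" event and hypothetical class names; a real router would be a message broker delivering events asynchronously, while this one simply calls the subscribed handlers directly.

```python
from collections import defaultdict


class Router:
    """Receives events and routes them to the consumers subscribed to them."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._subscribers[event_type]:
            handler(payload)


class BackofficeService:
    """Producer: emits an event when something changes in the service."""

    def __init__(self, router):
        self.router = router

    def register(self, customer_id, name):
        event = {"customer_id": customer_id, "name": name}
        self.router.publish("CustomerRegistered", event)


class BasketService:
    """Consumer: keeps its own copy of the data it needs up to date."""

    def __init__(self, router):
        self.customers = {}
        router.subscribe("CustomerRegistered", self.on_customer_registered)

    def on_customer_registered(self, event):
        self.customers[event["customer_id"]] = event["name"]


router = Router()
basket = BasketService(router)
backoffice = BackofficeService(router)
backoffice.register(42, "Alice")
```

    The producer only knows the router and the event name, never the consumer, which is what keeps the services loosely coupled.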

    Let’s improve the system by applying this architectural approach:

    Figure 4

    We have now designed an Event-driven Microservice Architecture. But wait, it looks like we will face data consistency issues among the different services. This is where we need the Saga pattern.

    What is Saga Pattern?

    The Saga pattern is an approach to managing data consistency across loosely coupled services. Each service has its own local transactions, and a saga is a sequence of such local transactions. It is usually beneficial for long-running transactions.

    There are two types of saga coordination mechanisms: choreography-based and orchestration-based.

    What are the differences between the two? Below is a simple comparison.

    Saga Coordination mechanism: Choreography-based and Orchestration-based

    More of Saga Explanation…

    If you want to know more about the Saga pattern, you can check Chris Richardson's talk "Using sagas to maintain data consistency in a microservice architecture". Also, check out his trainings here: http://www.chrisrichardson.net/training.html

    Moving on, there are key points to take note of when using the Saga pattern, especially the orchestration-based one:

    Saga Orchestrator

    • It is a persistent object that invokes the participants and manages the state of a saga

    Saga Participants

    • The services involved in a saga orchestration

    Messaging Channels

    • It should have a dedicated request/command channel (point-to-point)
    • It should have a dedicated response/reply channel (point-to-point)
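
    As a rough sketch of the two channels, Python's standard queue.Queue can stand in for a point-to-point queue, since an item taken by one receiver is never seen by another. The channel names, message fields and the "CreateAccount" command below are assumptions made for illustration only.

```python
import queue

# Hypothetical dedicated channels between one orchestrator and one participant.
command_channel = queue.Queue()  # orchestrator -> participant
reply_channel = queue.Queue()    # participant -> orchestrator


def send_command(saga_id, command, payload):
    """Orchestrator side: put a command on the dedicated command channel."""
    command_channel.put({"saga_id": saga_id, "command": command, "payload": payload})


def handle_next_command():
    """Participant side: consume one command and answer on the reply channel."""
    message = command_channel.get_nowait()
    # ... the participant's local transaction would run here ...
    reply_channel.put({
        "saga_id": message["saga_id"],  # lets the orchestrator match the reply
        "reply_to": message["command"],
        "status": "SUCCESS",
    })


send_command("saga-1", "CreateAccount", {"customer_id": 42})
handle_next_command()
reply = reply_channel.get_nowait()
```

    Carrying the saga identifier in both directions is what allows the orchestrator to find the saga a reply belongs to.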

    Messages

    • Each message must be atomic
    • Take note to identify the following types of messages:
      • messages with compensating transactions
      • pivot messages that cannot be undone
      • retriable messages whose command is safe to re-execute
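
    One possible way to treat the three message types differently is sketched below. It is an assumption-laden illustration, not a complete saga engine: the names StepKind, Step and run_saga are hypothetical, steps run in-process, and a failure of a compensatable or pivot step undoes the already completed steps in reverse order, while a retriable step is simply re-executed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class StepKind(Enum):
    COMPENSATABLE = "compensatable"  # can be undone by a compensating transaction
    PIVOT = "pivot"                  # cannot be undone once it has succeeded
    RETRIABLE = "retriable"          # safe to re-execute until it succeeds


@dataclass
class Step:
    name: str
    kind: StepKind
    action: Callable[[], None]
    compensation: Optional[Callable[[], None]] = None


def _retry(step, max_retries):
    """Re-execute a retriable step; re-raise if the last attempt also fails."""
    for attempt in range(1, max_retries + 1):
        try:
            step.action()
            return
        except Exception:
            if attempt == max_retries:
                raise


def run_saga(steps, max_retries=3):
    """Return True if the saga completed, False if it was rolled back."""
    completed = []
    for step in steps:
        if step.kind is StepKind.RETRIABLE:
            _retry(step, max_retries)
        else:
            try:
                step.action()
            except Exception:
                # Undo the finished steps in reverse order.
                for finished in reversed(completed):
                    if finished.compensation is not None:
                        finished.compensation()
                return False
        completed.append(step)
    return True
```

    In this sketch a retriable step that still fails after max_retries raises instead of triggering compensation, because the pivot before it can no longer be undone.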

    Saga Reliability

    • Use a database table as a message outbox
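
    The outbox idea can be sketched with Python's built-in sqlite3 module. The table layout, the channel name "homepage.commands" and the function names are assumptions; the point is only that the business row and the outgoing message are written in the same local transaction, and a separate relay sends whatever is still unsent.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "channel TEXT NOT NULL, body TEXT NOT NULL, sent INTEGER NOT NULL DEFAULT 0)"
)


def create_customer(customer_id, name):
    """Store the customer and the outgoing command in one local transaction."""
    with conn:  # commits both inserts together, or rolls both back on error
        conn.execute("INSERT INTO customer (id, name) VALUES (?, ?)", (customer_id, name))
        body = json.dumps({"command": "CreateAccount", "customer_id": customer_id})
        conn.execute(
            "INSERT INTO outbox (channel, body) VALUES (?, ?)",
            ("homepage.commands", body),
        )


def relay_outbox(send):
    """Send every unsent message in order and mark it as sent afterwards."""
    rows = conn.execute(
        "SELECT id, channel, body FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, channel, body in rows:
        send(channel, json.loads(body))
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

    Because a message is marked as sent only after send returns, a crash in between would deliver it again on the next relay run, so the receiving side should tolerate duplicates.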

    Looking back…

    It reminds me of working on a system for a financial services company, where the system communicated with different banks and payment networks (MasterCard, Visa, etc.).

    For example, when a customer pays at a terminal of a different bank that accepts the payment network of the customer’s bank card, the transaction goes to this system, which is connected to the different banks and payment networks. The whole transaction flow must complete within 3 seconds!

    Figure 6

    What I noticed was that the system had the following:

    • A standardized message format
    • Every request message has an equivalent response message
    • Request and response messages have their own separate channels for each bank or payment network the system connects to
    • There is an equivalent reversal message that serves as a compensating transaction for transactions that can be undone
    • It persists the messages for reliability
    • The message router is responsible for routing each message to and from the correct bank or payment network

    Let’s go back to our problem and try a simple use case. Let’s say a customer would like to register in the system. In the registration process, the customer accesses the registration page and submits the information (Backoffice service), and then expects to be redirected to the homepage (Homepage service) as a logged-in user whose session is already associated with a basket (Basket service). Below is the sequence diagram of the transaction flow (with 3 retries on the UI for checking the state of the saga before redirecting to the homepage).

    Figure 5

    In the example above, there are 2 Saga Orchestrators:

    • CustomerSagaOrchestrator (in orange) → responsible for making sure that the information needed for customer registration is set up across the system

    • SessionSagaOrchestrator (in blue) → responsible for making sure that the session key is propagated across the system, especially to the publicly available services

    The CustomerSagaOrchestrator belongs to the Backoffice service. When it receives a registration request, it triggers the CustomerSaga. It then (1:CustomerSagaOrchestrator) creates the Customer data, which is a local transaction, and sends a command to the Homepage service to request an Account creation. Since the command is sent asynchronously via a point-to-point queue to the Homepage service, the Backoffice service can respond to the UI right away. The UI then just has to wait for the saga to complete by polling the Backoffice service for the customer creation status.
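
    The polling on the UI side could look roughly like the sketch below. The function name wait_for_saga, the status value "COMPLETED" and the delay are assumptions; get_status stands for whatever call the UI makes to the Backoffice service, and the default of 3 retries follows the use case described above.

```python
import time


def wait_for_saga(get_status, retries=3, delay_seconds=0.5):
    """Poll the saga status up to `retries` times; True once it is completed."""
    for attempt in range(retries):
        if get_status() == "COMPLETED":
            return True
        if attempt < retries - 1:
            time.sleep(delay_seconds)  # wait before asking again
    return False
```

    If the function returns False, the UI has used up its retries and has to decide what to show the user instead of redirecting to the homepage.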

    The Homepage service, on the other hand, receives the command for an Account creation. It (2:CustomerSagaOrchestrator) creates an Account and then asynchronously sends a reply event back to the Backoffice service to inform it that the Account was successfully created.

    Back in the Backoffice service, in the background, it receives the reply event that the Account creation was successful and can now proceed to the next step, where it sends another command to the Homepage service, this time requesting a Session creation.

    Asynchronously, the Homepage service receives this command to create a Session. Since the Session creation is associated with another saga, it triggers the SessionSaga and then (3:CustomerSagaOrchestrator, 1:SessionSagaOrchestrator) creates the Session data, which is a local transaction of the SessionSaga. While the other saga is running, the CustomerSaga just has to wait for the SessionSaga to complete.

    To continue the SessionSaga, the Homepage service sends a command to the Basket service requesting a Basket creation. Still asynchronously, the Basket service receives this command and (2:SessionSagaOrchestrator) creates the Basket data. It then sends a reply event back to the Homepage service to inform it that the Basket was successfully created.

    Back in the Homepage service, it receives the success reply event from the Basket service and proceeds to the next step, where it (3:SessionSagaOrchestrator) updates the status of the Session to indicate that its creation has completed. As the Session creation is complete and there is an associated saga waiting for it, the Homepage service then sends a success reply event back to the Backoffice service indicating that the Session creation was successful.

    On the Backoffice side, it receives the reply event and can now proceed to the last step of its saga, which is to (4:CustomerSagaOrchestrator) update the Customer creation status to completed.
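
    The whole flow described above can be replayed with a small simulation. It is a simplified sketch under several assumptions: the message type names (CreateAccount, AccountCreated and so on) are hypothetical, each service has one in-memory queue instead of separate command and reply channels, and only the success path is covered.

```python
from collections import deque

queues = {"backoffice": deque(), "homepage": deque(), "basket": deque()}
db = {"customer": {}, "account": {}, "session": {}, "basket": {}}
trace = []


def send(to, message):
    queues[to].append(message)


def backoffice_register(cid):
    db["customer"][cid] = "PENDING"  # local transaction of the CustomerSaga
    trace.append("1:CustomerSaga create Customer")
    send("homepage", {"type": "CreateAccount", "customer_id": cid})


def backoffice_handle(msg):
    cid = msg["customer_id"]
    if msg["type"] == "AccountCreated":
        send("homepage", {"type": "CreateSession", "customer_id": cid})
    elif msg["type"] == "SessionCreated":
        db["customer"][cid] = "COMPLETED"
        trace.append("4:CustomerSaga complete Customer")


def homepage_handle(msg):
    cid = msg["customer_id"]
    if msg["type"] == "CreateAccount":
        db["account"][cid] = "CREATED"
        trace.append("2:CustomerSaga create Account")
        send("backoffice", {"type": "AccountCreated", "customer_id": cid})
    elif msg["type"] == "CreateSession":
        db["session"][cid] = "PENDING"  # this step starts the SessionSaga
        trace.append("3:CustomerSaga, 1:SessionSaga create Session")
        send("basket", {"type": "CreateBasket", "customer_id": cid})
    elif msg["type"] == "BasketCreated":
        db["session"][cid] = "COMPLETED"
        trace.append("3:SessionSaga complete Session")
        send("backoffice", {"type": "SessionCreated", "customer_id": cid})


def basket_handle(msg):
    cid = msg["customer_id"]
    if msg["type"] == "CreateBasket":
        db["basket"][cid] = "CREATED"
        trace.append("2:SessionSaga create Basket")
        send("homepage", {"type": "BasketCreated", "customer_id": cid})


handlers = {"backoffice": backoffice_handle, "homepage": homepage_handle, "basket": basket_handle}


def drain():
    """Deliver queued messages until every queue is empty."""
    progressed = True
    while progressed:
        progressed = False
        for service, pending in queues.items():
            if pending:
                handlers[service](pending.popleft())
                progressed = True


backoffice_register(42)
status_before = db["customer"][42]  # what the UI sees on its first poll
drain()
status_after = db["customer"][42]
```

    Right after the registration call the customer status is still pending, which is why the UI has to poll; once all messages have been delivered, both sagas have finished and the status is completed.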

    In summary, each saga has its own steps and destinations, as presented below.

    Figure 7

    Moving Further with Saga

    Taking our very detailed sequence diagram of the customer registration process above, we can simplify it by focusing only on the affected data.

    Figure 8


    In the simplified diagram above, it is pretty clear that we can extract the Saga Orchestrator into its own dedicated service. It exposes an API that receives an “execute” command to start the saga, with the exception that the first step has already been done by the source. It then follows every step of the saga, sending commands to the participants one at a time. The last step sends a reply back to the originator of the “execute” command.

    Let’s update the diagram by converting each piece of data above into a dedicated API and applying the necessary commands and events to the communication.

    Figure 9
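
    A dedicated orchestrator of this kind might be sketched as follows. Everything here is an assumption for illustration: the class and method names, the participant and command names in the saga definition, and the in-memory dictionary, which stands for state that a real orchestrator would persist.

```python
class SagaOrchestrator:
    """Hypothetical orchestrator service driven by an "execute" command."""

    def __init__(self, definitions, send_command, reply_to_source):
        self.definitions = definitions        # saga name -> ordered (participant, command) steps
        self.send_command = send_command
        self.reply_to_source = reply_to_source
        self.state = {}                       # saga id -> progress (persisted in a real system)

    def execute(self, saga_id, saga_name):
        """Start a saga; the source has already done the first local transaction."""
        self.state[saga_id] = {"name": saga_name, "next_step": 0}
        self._advance(saga_id)

    def on_reply(self, saga_id):
        """A participant reported success, so move on to the next step."""
        self.state[saga_id]["next_step"] += 1
        self._advance(saga_id)

    def _advance(self, saga_id):
        saga = self.state[saga_id]
        steps = self.definitions[saga["name"]]
        if saga["next_step"] < len(steps):
            participant, command = steps[saga["next_step"]]
            self.send_command(participant, command, saga_id)  # one command at a time
        else:
            del self.state[saga_id]
            self.reply_to_source(saga_id, "COMPLETED")        # answer the originator


commands, replies = [], []
orchestrator = SagaOrchestrator(
    {"CustomerSaga": [("account-api", "CreateAccount"), ("session-api", "CreateSession")]},
    send_command=lambda participant, command, saga_id: commands.append((participant, command, saga_id)),
    reply_to_source=lambda saga_id, status: replies.append((saga_id, status)),
)
orchestrator.execute("saga-1", "CustomerSaga")
orchestrator.on_reply("saga-1")  # Account created
orchestrator.on_reply("saga-1")  # Session created
```

    Since the steps live in the saga definition, adding a step for a new participant only changes the orchestrator's definition, not the source or the existing participants.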

    In the diagram above, we have clearly identified the communication between the different APIs, grouped by domains and saga orchestrators, to ensure data consistency across the different services. We have also identified which APIs we need to expose for the Saga Orchestrator API.

    The benefit of extracting the Saga Orchestrator as a separate API is that the flow becomes easier to maintain and understand. For example, if we need to add more steps to a saga, and the new saga participant has an existing API and enough information is already available inside the Saga Orchestrator API, then we only have to manage the change inside the Saga Orchestrator, without affecting the source and the participants.

    Our Final Architecture

    Considering the Saga Orchestrator APIs that were discovered above, we can now update the high-level architecture view of our system. Below is the result.

    Figure 10

    With this kind of architecture, with modularized and loosely coupled APIs, we are able to move forward by improving the organization and processes of the development team. If you follow an Agile development process using Scrum, the Scrum Teams become loosely coupled as well, each responsible for a certain module or group of modules.

    From a Monolithic Architecture to an Event-driven Microservice Architecture, we managed to design loosely coupled services and also maintain data consistency among them. It was not an easy journey, but at least we were able to conclude that the best solution for the problem is switching to a Microservices Architecture.

    I hope you find this useful. Please note that the Saga Orchestration sample presented here is mainly focused on Retriable Transactions. There is more to consider in the orchestration-based Saga pattern, such as dealing with Compensating and Pivot Transactions, which we can tackle next time.

    Who writes here?


    I'm working as a Software Developer at Mercateo. My team maintains the legacy system on the procurement side and at the same time works on a new project to improve that system's user interface. I am passionate about EDA topics. Mercateo supports training to expand my knowledge in this area, and this is what I like most about working here.

    Feona May Samson