Wednesday, 13 March 2013

Event-sourced architecture, travelling Java -> Scala (Part 1: Intro)


I've been interested in the event-sourced approach for the previous three years. Still, I had the feeling that a regular CRUD application isn't the best candidate for a redesign into an event-sourced architecture. Last week a friend shared with me the source code of an interesting commercial project: the engine for an online game. It follows best practices and even implements the dreams of modern Java coders. Of course there is some legacy code, but project management pushes the team to follow processes based on BDD and TDD, which keeps the technical debt within controlled borders. The code is in good shape (well refactored) and covered with specs and tests. The project started with Java code only, but by now it has mostly been migrated to Scala. Of course the language is just one of the tools, and it's more important to know the whole technology stack, which is: Spray -> Akka -> JPA -> MySQL. At the moment the team is actively looking for ways to improve performance. When I reviewed the business side of the application, it turned out to be an ideal candidate for the event-sourced approach: keep the whole game world as immutable state in memory, and track all actions as a log.

Spray and Akka are the best friends here; there is nothing against them on the way to an event-sourced approach. Mutable data, however, is what has to change.
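To make the target concrete, here is a minimal sketch of the idea: the world is an immutable value, and the current state is just a fold over an append-only log of events. All names here (the `World`, the events) are illustrative, not code from the real project:

```scala
// Minimal event-sourcing sketch: immutable state plus an append-only event log.
// All names are hypothetical, for illustration only.
sealed trait Event
case class PlayerJoined(name: String) extends Event
case class ScoreAdded(name: String, points: Int) extends Event

// The whole "world" is an immutable value; applying an event yields a new world.
case class World(scores: Map[String, Int] = Map.empty) {
  def apply(e: Event): World = e match {
    case PlayerJoined(n)  => copy(scores = scores + (n -> 0))
    case ScoreAdded(n, p) => copy(scores = scores.updated(n, scores.getOrElse(n, 0) + p))
  }
}

object EventSourcingDemo extends App {
  // The log is the source of truth; current state is derived by replaying it.
  val log   = List(PlayerJoined("alice"), ScoreAdded("alice", 10), ScoreAdded("alice", 5))
  val world = log.foldLeft(World())(_ apply _)
  println(world.scores("alice")) // 15
}
```

Note that replaying the log always reproduces the same state, which is exactly what makes the in-memory world safe to rebuild after a restart.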

But let's start from the beginning. We are all experts in Request-Driven architecture, where each request to the system gives birth to a response. It fits most businesses very well when operations are presented as a request-response pair, where the response can be an error, for example (even validation logic is often implemented via errors). The synchronous approach is easy to understand: one process inside the system calls another one and waits for the response, and if it was itself called, it blocks its own client as well. Over time developers found some best practices, such as organising the calls into a straight line using the three-layer pattern. For example, in a J2EE application the client request passes through the layers MVC -> Service Facade -> Service -> DAO. While a dedicated thread wraps the request-response session from start to end, it's a great idea to use the Dynamic Scope Pattern: "Thread Local Value".

The user gives life to a thread on the server.
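The "Thread Local Value" pattern can be sketched in Scala on top of `java.lang.ThreadLocal`: each request-handling thread carries its own context, so layers deep in the call chain can read it without passing it through every signature. This is a hypothetical request-context example, not code from the project:

```scala
// Dynamic Scope Pattern via ThreadLocal: each request-handling thread sees its
// own value, so the DAO layer can read the context set at the MVC layer.
object RequestContext {
  private val current = new ThreadLocal[Option[String]] {
    override def initialValue(): Option[String] = None
  }

  def withUser[A](user: String)(body: => A): A = {
    current.set(Some(user))
    try body
    finally current.remove() // always clean up: server threads are pooled and reused
  }

  def user: Option[String] = current.get()
}

object ThreadLocalDemo extends App {
  def serviceLayer(): String = daoLayer() // no context parameter anywhere
  def daoLayer(): String     = RequestContext.user.getOrElse("anonymous")

  println(RequestContext.withUser("alice") { serviceLayer() }) // alice
  println(serviceLayer())                                      // anonymous
}
```

The `finally current.remove()` part is the one piece people forget: without it, a pooled thread leaks the previous request's context into the next one.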

This approach became the best practice, and even allowed templates to be generated during the implementation phase: frameworks like RoR, Grails, Roo etc. can generate a lot. If the architecture rules are followed well, then between the Resource Layer (DB) and the Request there is a chain of calls that operates on mutable data and saves the state into persistent storage. It's easy to build a concurrent system up to the persistence layer; most approaches delegate concurrency resolution to the DB. Actually, when we talk about shared memory in a well-designed system, it is true only for the Resource layer (the DB in most cases). The other layers are stateless and must work with their own instance (copy) of the data. Of course the data can be changed in a few places, but there should be only one synchronisation point. Changes are synchronised with the help of transactions and transaction isolation levels. In this model the data is mutable at the database level, and if there are concurrent queries/commands to the same data, the contention is usually resolved via blocking (transaction isolation). It works well for systems under low load. As the performance requirements increase, we find that the system spends a huge amount of time on the "transaction ceremony": even with no concurrent calls, a working system spends resources on lock-release procedures. As a workaround for systems under low and medium load there is the optimistic locking approach. We still have the same architecture, but a different point of concurrency checking:
The server uses a low transaction isolation level with optimistic locking.
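A minimal sketch of version-based optimistic locking, in the spirit of JPA's `@Version` field but against a hypothetical in-memory store (the names `Account` and `OptimisticStore` are made up for the example):

```scala
// Version-based optimistic locking: an update is accepted only if the caller
// read the latest version; otherwise the caller must re-read and retry (or merge).
case class Account(id: Long, balance: Int, version: Long)

class OptimisticStore {
  private var rows = Map.empty[Long, Account]

  def save(a: Account): Unit   = synchronized { rows += a.id -> a }
  def load(id: Long): Account  = synchronized { rows(id) }

  // Returns true if the update was applied, false on a version conflict.
  def update(updated: Account): Boolean = synchronized {
    rows.get(updated.id) match {
      case Some(current) if current.version == updated.version =>
        rows += updated.id -> updated.copy(version = updated.version + 1)
        true
      case _ => false // someone else committed first: the read was stale
    }
  }
}
```

Usage: two clients read version 0; the first update wins and bumps the version to 1, the second is rejected and has to re-read — exactly the merge burden described above.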

This doesn't fit heavily concurrent systems if we apply it without automated conflict resolution (imagine a world where your distributed source code repository weren't able to automatically merge non-conflicting files): the user is required to merge data several times, and in case of a race condition may spend a lot of time retrying to apply the changes. Of course, if the merge can be done automatically, it improves the situation. We have almost arrived at Software Transactional Memory.
Actually the STM approach isn't so easy to introduce on a project, and it isn't even always possible to provide such a solution. Like most modern concurrency models it comes from functional programming and is based on immutable data. I would recommend reading the great article by Rich Hickey; it should be a great starting point for changing one's mind about state and identity.
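The retry semantics at the heart of STM can be illustrated with a toy atomic-update loop over immutable state (real STM libraries such as ScalaSTM do far more — composable multi-reference transactions, blocking retries — this only shows the snapshot/commit/retry idea):

```scala
import java.util.concurrent.atomic.AtomicReference

// Toy illustration of the STM idea: a "transaction" computes a new immutable
// value from a snapshot and commits only if nobody changed it meanwhile;
// on a conflict it simply retries instead of holding a lock.
class Cell[A](initial: A) {
  private val ref = new AtomicReference[A](initial)

  @annotation.tailrec
  final def atomically(f: A => A): A = {
    val snapshot = ref.get()
    val next     = f(snapshot)
    if (ref.compareAndSet(snapshot, next)) next
    else atomically(f) // another thread committed first: retry with a fresh snapshot
  }

  def get: A = ref.get()
}

object StmDemo extends App {
  val counter = new Cell(0)
  val threads = (1 to 4).map(_ => new Thread(() => (1 to 1000).foreach(_ => counter.atomically(_ + 1))))
  threads.foreach(_.start())
  threads.foreach(_.join())
  println(counter.get) // 4000, with no explicit locks
}
```

Note that `f` may run several times under contention, which is why STM transactions must be side-effect free — the immutable-data requirement mentioned above is not optional.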
The second picture covers the best practices in custom web application development. It's great, but it's not highly scalable if we have to support hundreds of thousands of connections per second. Let's look into the evolution of client-server HTTP applications in the next post.
