Tuesday 11 December 2018

Hylomorphism in 1 minute

A hylomorphism is a catamorphism composed with an anamorphism.

It's presented as 

Hylomorphism = catamorphism(anamorphism(x))
 or
Hylomorphism = fold(unfold(x))

Basically, it constructs (unfolds) a complex type (like a tree or a list) and then destructs (folds) it back into a representative value.

For example, to get the factorial of N we can:
a) unfold N into the list (n), (n-1), ..., 1
b) fold the list from the previous step with the product function.

And a possible Scala example implementing a function that computes the factorial of N, with the help of the previously introduced list catamorphism and anamorphism, is:
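Since the original gist isn't embedded here, below is a minimal stand-in sketch: a hand-rolled unfold and fold for List, combined into a factorial.

// a minimal sketch (not the original gist): an unfold that builds the list
// and a fold that collapses it, combined into a hylomorphism
def unfold[A, B](seed: B)(f: B => Option[(A, B)]): List[A] = f(seed) match {
  case Some((a, next)) => a :: unfold(next)(f)
  case None            => Nil
}

def fold[A, B](list: List[A])(zero: B)(combine: (A, B) => B): B = list match {
  case head :: tail => combine(head, fold(tail)(zero)(combine))
  case Nil          => zero
}

// hylomorphism = fold(unfold(n)): unfold n into List(n, n-1, ..., 1), then fold with product
def factorial(n: Int): BigInt =
  fold(unfold(n)(i => if (i > 0) Some((i, i - 1)) else None))(BigInt(1))(_ * _)

factorial(5) // 120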

Anamorphism in 1 minute


An anamorphism (from the Greek ἀνά "upwards" and μορφή "form, shape") over a type T is a function of type U → T that destructs the U in some way into a number of components. On the components that are of type U it applies recursion to convert them to Ts. After that, it combines the recursive results with the other components into a T by means of the constructor functions of T.

For example, if we want to build List(N, N-1, …, 1) from N, we would use an anamorphism from Int to List[Int]. It's the opposite operation to fold: unfold.
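A minimal sketch (not from the original post) of such an unfold:

// a sketch of an anamorphism (unfold) from Int to List[Int]: it builds List(n, n-1, ..., 1)
def unfoldToList(n: Int): List[Int] =
  if (n > 0) n :: unfoldToList(n - 1) else Nil

unfoldToList(5) // List(5, 4, 3, 2, 1)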


Catamorphism in 2 minutes


A catamorphism (from κατά "downwards" and μορφή "form, shape") on a type T is a function of type T → U that destructs an object of type T according to the structure of T, calls itself recursively on any components of T that are also of type T, and combines these recursive results with the remaining components of T into a U.

And it's just the official name for fold/reduce on higher-kinded types.

For example, getting the product of a list of integers is a catamorphism on the list.


In programming, a catamorphism can be used to fold complex structures (lists, trees, sets) into their representation in a different type. As an example, the list catamorphism can be described as:
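A rough sketch, since the original embedded snippet is missing:

// a sketch of the list catamorphism: essentially foldRight, parameterised by
// the value for Nil and by how to combine the head with the recursive result
def cata[A, B](list: List[A])(onNil: B)(onCons: (A, B) => B): B = list match {
  case head :: tail => onCons(head, cata(tail)(onNil)(onCons))
  case Nil          => onNil
}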


And if we want to fold the list into the product of its elements we would use:
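For example (reusing cata from the sketch above):

// folding the list into the product of its elements
def product(xs: List[Int]): Int = cata(xs)(1)(_ * _)

product(List(5, 4, 3, 2, 1)) // 120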


Tuesday 4 December 2018

Isomorphism in 3 minutes



An Isomorphism (from the Ancient Greek: ἴσος isos "equal", and μορφή morphe "form" or "shape") is a homomorphism or morphism (i.e. a mathematical mapping) that can be reversed by an inverse morphism. Two mathematical objects are isomorphic if an isomorphism exists between them.
An isomorphism between types means that we can convert T → U and then U → T losslessly.

For example, an Int-to-String conversion is a sort of isomorphism, but not every possible String value can be converted back to an Int: when we try to convert “Assa” into an Int we get an exception. And if we use an identity element, for example 0, for any non-numeric String, then Int2String won't be an isomorphism anymore.


A real isomorphic mapping from String to Int can be done via a co-algebra (the list's co-algebra):


You've probably generated tons of unit tests to prove the JSON → String and String → JSON isomorphism.

A singleton type is always isomorphic to itself: for example, the type Unit has only one member of its set, Unit or (), and it's always isomorphic to itself. The Scala compiler has a special flag, -Ywarn-value-discard, that warns when a method's return value is discarded and may break the isomorphism rule (it makes sense only for side-effect-free calls).



Curried and uncurried functions are isomorphic to each other:
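A small sketch of the round trip via the standard library's curried and Function.uncurried:

// curried and uncurried forms carry the same information: the round trip is lossless
val add: (Int, Int) => Int = _ + _
val addCurried: Int => Int => Int = add.curried
val addBack: (Int, Int) => Int = Function.uncurried(addCurried)

add(2, 3) == addBack(2, 3) // true for any arguments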

Function1 in Scala doesn't have a curried method, and that is why currying isn't an isomorphism for all Scala functions - but it could be, if curried on Function1 returned Function1[T, Function1[T1, U]].


Isomorphism as applied to web development means being able to render pages on both the server and the client side. It's used in the context of the NodeJS ecosystem because of the ability to reuse libraries/frameworks on the backend and the frontend.



Friday 2 November 2018

JavaScript: the legacy and pain in tc39

"You Might Not Need TypeScript." © Is a bad statement.
JavaScript is evolving and expanding with the features but.

Let's look into the latest language feature being implemented in the Chrome browser: Class Fields.
There is a link to Technical Committee 39, which develops the specification itself. Given that the GitHub discussion started 4+ years ago, we should expect a mature and important language feature.

The ECMAScript proposal “Class Fields” is about field declarations for classes; for example, to declare a public instance property we can use:
Easy, isn't it? I would expect it to be sugar for:
Actually it's not: the code behaves unpredictably when used with class inheritance. If we define a Parent class with some property and extend it in a Child class, overriding the property, it behaves as expected:
But let's refactor the parent class and try to keep the expected backward compatibility:
Oops, it doesn't work as expected. That's because Object.defineProperty is used instead of a normal set operation. This becomes the major difference: property addition through set creates properties which show up during property enumeration, whose values may be changed and which may be deleted, while Object.defineProperty defines them lazily, which leads to cases where they can be shadowed by set operations. Here are more details: [[Set]] vs [[DefineOwnProperty]].

Update: Babel 7.1.0 changes the default behaviour from [[Set]].

The legacy doesn't allow the language to grow in a fair and proper way. I have a feeling that JavaScript is moving in the same direction as Perl did, but JavaScript is protected from being abandoned one day because it's the language of the web.


Tuesday 30 October 2018

#2018 Trends



It's almost November but it feels like winter. I've tried to create a cheat sheet of the trendy buzzwords/streams I had to face or work with this year. I did it for myself, to read and refresh books/blogs during the upcoming vacation, but maybe somebody will find it useful:
  • Java from Oracle isn't free anymore
  • Java as a language is lagging behind all the others; Lombok and Spring Roo aren't in-demand solutions. I still bet on Scala
  • Kotlin got popular; 1.3 has finally been released
  • Rust 1.30 has been released
  • DevSecOps over DevOps, it's mandatory to invest into security
  • Prometheus over Graphite
  • Microservices is still the buzzword
  • Docker vs Podman, the CRI-O project, OCID (Open Container Initiative Daemon) - Ops nowadays is something that changes quickly
  • IBM got Red Hat; are there any other independent open-source whales?
  • Clouds are becoming private for intermediate-size companies
  • Ansible over Chef and Puppet
  • Python is great again
  • The AI buzzword over data analysis (the k-nearest neighbors classifier is accepted as a magic spell)
  • Data Scientists and Data Engineers are in demand; getting on top of Hadoop, Spark, Kudu, Impala, Kafka, Ignite, Geode etc. is a separate profession
  • Engineering Culture over Agile to address Conway's law
  • Vue and React are gaining popularity
  • Kafka as the glue for everything. It got a competitor, nats.io streaming, but for streaming it's still unstable (by the way, implemented in Go)
  • Yaml is for configuration
  • GraphQL is preferred to REST
  • Cassandra is preferred to MongoDB
  • GraalVM: it's too early to start looking into it
  • JVM: there is still a demand for green threads

Tuesday 9 October 2018

Scala Cheat sheet: Context Bound of a multi-parameter kind

Because I rarely use this functionality, every time I have to figure out again how to use a multi-parameter type as a context bound.

For example, there is a class extending Function1:

abstract class Mapping[In, Out] extends (In => Out)

If we want to use Mapping[In, String] as a context bound for another class, we should use a type lambda (type-level lambda):

abstract class StringRequestResponse[In: ({type M[x] = Mapping[x, String]})#M]

In case the return type should be parametrised as well:

abstract class RequestResponse[In: ({type M[x] = Mapping[x, Out]})#M, Out]
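A small hypothetical usage sketch (not from the original post): the context bound desugars into an implicit constructor parameter, so a Mapping instance has to be in scope when the subclass is defined. With the kind-projector compiler plugin the type lambda can also be written more readably.

// hypothetical Mapping instance; the context bound is satisfied by this implicit
implicit val intToString: Mapping[Int, String] = new Mapping[Int, String] {
  def apply(in: Int): String = in.toString
}

class IntRequestResponse extends StringRequestResponse[Int] // compiles: evidence is found

// with the kind-projector plugin the same bound can read as:
// abstract class StringRequestResponse[In: Mapping[?, String]]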

Revise Kotlin 1.3 from Scala experience perspective



Thanks to Google, Kotlin is producing more and more informational noise. It's interesting to make an assessment, spending less than an hour, to find out where the language is now and to compare it with Scala experience. I'm not going to review it feature by feature, but will try to form an impression in an offbeat way: choosing the most interesting/unfamiliar feature in 1.3 and trying to assess the language's direction of development by comparing it with pseudocode in Scala.

Let's look into the upcoming release 1.3. Kotlin Contracts looks interesting and is probably the biggest KEEP change in the release. It looks like something language-specific (I haven't seen it in other languages) that improves the language itself and may shed some light on Kotlin's "way of thinking".

Let's run through the proposal and try to understand what it improves and why.
The first example/motivator is:

Well, it's extremely hard to see why we should write code like that. There is some background on how Kotlin implements calls of closure blocks (there was a similar SIP-21 - Spores - in Scala, but it didn't gain popularity); skipping that, for this particular example it feels more natural to use a functional approach:
I feel like Kotlin tries to make this code valid:
Kotlin doesn't require us to define a val and initialise it immediately, but I don't feel it's nicer or more readable. There is an extra price - the method run has a

contract {
    callsInPlace(block, InvocationKind.EXACTLY_ONCE)
}

The next example is:


As a developer with a Scala background, I find it hard to understand the problem domain - but keeping in mind that Kotlin provides safety against null-reference problems, this is some sugar for the cases when safety has already been checked against the reference and it follows certain rules (not null in the example). It looks fine, and it's an imperative alternative to the monadic approach (more on this later). But if this were production code, I would prefer to avoid throwing exceptions and to separate the execution of the side effect (println). As in the previous example, there is a small extra complexity introduced via:

The next example is:

Looks like pattern-matching customisation - I had an intuitive feeling that it's a way to handle union types, but it isn't. As someone without commercial Kotlin experience, I find all the examples too artificial to evaluate the syntax sugar coming with contracts. For example, the last example looks much better if pattern matching is applied:

Still, it could be covered via the Option/Either monads or, if there are more possibilities, it's better to look into Coproduct solutions. All the other examples follow the same paradigm; if this were real product source code, I would prefer to use best practices from lambda and category patterns - especially since handling nullable (empty) values is as important as validation/parallel validation or applying side effects in a proper way. There is probably an implicit advantage to Kotlin's sugar - allocating less memory - and that would be great, but as we will see, as soon as we need a feature like ?.let the extra memory allocation is inevitable.



Summary
The overall impression is quite positive. Kotlin is inventing an alternative to lambda + category solutions for specific problems. The nullable reference is a really weird solution that appears to be the centre of all the problems/improvements in the given examples. Fortunately, I haven't met a NullPointerException in Scala for quite some time - but I should admit that one of the most used monads is Option. Another positive impression: it's possible to write code in Kotlin from the first minute.

Bonus: Weird nullable types
The null safety looks too noisy, but let's play around in this area and compare with Scala. Imagine we need to read (from a property file) three optional variables and build an optional URL object based on them. A pretty easy approach is checking all 3 variables against null and creating the object; Scala allows us to write something like:
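A sketch with hypothetical property names (the original gist isn't embedded here):

// each lookup returns Option[String]; the for-comprehension yields Some(Url) only if all three are present
def prop(name: String): Option[String] = sys.props.get(name)

final case class Url(host: String, port: String, path: String)

val url: Option[Url] = for {
  host <- prop("service.host")
  port <- prop("service.port")
  path <- prop("service.path")
} yield Url(host, port, path)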

Or, without the for-comprehension, using Applicative's mapN:
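A sketch of the same thing with cats, reusing prop and Url from the snippet above:

import cats.implicits._

// Applicative's mapN over the tuple of Options
val urlViaMapN: Option[Url] =
  (prop("service.host"), prop("service.port"), prop("service.path")).mapN(Url.apply)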

Let's find out whether we can do something similar in Kotlin. Kotlin doesn't support for-comprehensions out of the box. Starting with a brute-force solution for 2 params:

Let's use the ?.let method:

Looks fine for the case with two params - but it wouldn't be for more. Let's try to play with Monad's flatMap and unit:

Looks like Maybe/Option Monad:

Looks better - but it won't work for three or more params. Can we use mapN?

And the usage is:

It works for the Pair type only - for more params we need a custom implementation per arity, but the idea is clear: Kotlin can support functional solutions with categories. I'm pretty sure there are a dozen good functional programming libraries in Kotlin, like Arrow.


Thursday 4 October 2018

Protobuf getting metadata at runtime

Sometimes it's important to get the list of fields of a Protobuf message.

A good example is Kafka with a Protobuf serialiser, when the Java code generator isn't used but a desc file is generated anyway. It's worth investing in pact testing of the produced messages - checking that the number of sent fields is within the range defined in the proto schema.
Unfortunately Protobuf for the JVM doesn't support reading metadata from proto files directly, but if the schema is compiled into a desc file it's possible to read it via the JVM API - which isn't intuitive:
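A rough sketch of what that can look like, assuming the schema was compiled with protoc --descriptor_set_out and has no imports (so the dependency array is empty); the file name is hypothetical:

import java.io.FileInputStream
import com.google.protobuf.DescriptorProtos.FileDescriptorSet
import com.google.protobuf.Descriptors.FileDescriptor
import scala.collection.JavaConverters._

// parse the compiled descriptor set and list every message with its field names
val descriptorSet = FileDescriptorSet.parseFrom(new FileInputStream("messages.desc"))

val fieldsPerMessage: Map[String, Seq[String]] =
  descriptorSet.getFileList.asScala.flatMap { fileProto =>
    val fileDescriptor = FileDescriptor.buildFrom(fileProto, Array.empty[FileDescriptor])
    fileDescriptor.getMessageTypes.asScala.map { message =>
      message.getFullName -> message.getFields.asScala.map(_.getName).toSeq
    }
  }.toMap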

Friday 21 September 2018

Down-streaming with an Event Sourcing application

How do we apply side effects in an Event Sourcing application with the desired/expected consistency level?
It's common and logical to put side effects at the Sink/end of the persistent stream. Usually side effects are dispatched in event-persisted callback handlers. These handlers can also spawn new sub-streams to react to applied events.

There are different tactics for integrating down-streams into the business flow, and this is a serious architectural question that influences all of the application's design layers.

To review all the possibilities I'm going to use an example of a typical event-sourcing application. Let's assume it implements some business utility. After each event is dispatched we send a notification to Business Intelligence services for analysis and reporting. Imagine the day when you find out that the Business Intelligence operations are fast enough to produce real-time feedback which can increase the profitability of our business by up to X%. We are using BI as an example and placeholder, but in general it can be any downstream service reacting to our business events. Down-stream services are the ones that consume the upstream service.
Just to rephrase it and agree on terminology: I will use BI in all the examples, but it could be any down-stream service.
Given the terminology 'upstream' and 'downstream', it may help to make an analogy with a river: if you drop a message (data) into the river, it flows from upstream (initiator) to downstream (receiver).
Here is the typical event-sourcing flow from request to response, including other side effects:


Figure 1. Typical event sourcing flow

After the stream emits a command, it should:
  1. Validate the input against the current state and business rules, then generate mutation events.
  2. Persist the event(s)/state.
  3. Respond to the event producer if needed.
  4. Execute other important side effects.
  5. Send events/messages/state to BI (down-stream).
Here is pseudo-code that implements this behaviour using akka-persistence:
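The original snippet isn't embedded here; below is a rough stand-in sketch with akka-persistence classic, where the command/event/state types are hypothetical placeholders:

import akka.actor.ActorRef
import akka.persistence.PersistentActor

// hypothetical protocol for the example business flow
final case class Command(value: Int)
final case class Event(value: Int)
final case class State(sum: Int) { def applyEvent(e: Event): State = copy(sum = sum + e.value) }
final case class Accepted(event: Event)
final case class Rejected(reason: String)

class BusinessActor(biGateway: ActorRef) extends PersistentActor {
  override val persistenceId: String = "business-entity-1"

  private var state = State(0)

  override def receiveCommand: Receive = {
    case cmd: Command =>
      val replyTo = sender()
      if (cmd.value < 0) replyTo ! Rejected("negative value") // 1. validate against business rules
      else persist(Event(cmd.value)) { event =>               // 2. persist the event
        state = state.applyEvent(event)
        replyTo ! Accepted(event)                             // 3. respond to the producer
        // 4. other important side effects would go here
        biGateway ! event                                     // 5. fire-and-forget to the BI downstream
      }
  }

  override def receiveRecover: Receive = {
    case event: Event => state = state.applyEvent(event)
  }
}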


It's quite logical to apply all side effects after the mutation has already been persisted (as we operate with up-stream and down-stream patterns). In step 3 we send the response to the client. Our actor is abstracted from delivering this message; in this particular case we rely on akka's "at most once" guarantee. If the client expects a different one, then we should build a business protocol to provide extra guarantees - for example, handle timeouts etc.

Here is a sequence diagram that helps us find the flow's operations with different consistency levels:

Figure 2. Consistency per stream flow

The legend for Figure 2 is:
  • A to B is under your API interface's consistency - for example, it could be a WebSocket protocol with some custom retry policy. We are going to ignore this step of the flow in the examples; assume it has tolerable consistency and that our architecture design is responsible only for the flow from B to E.
  • C - during persistence we rely on our persistence journal's consistency - for example, Cassandra's.
  • C to D - after the journal has persisted the event, the flow returns to the app: "at least once".
  • E - the downstream - it can start a new sub-stream with the A to D steps or rely on some other delivery tool, for example Kafka.
As we can see, the weakest consistency is between C and D, and it determines the consistency towards the BI down-stream. Actually, our example implements the first solution with the lowest consistency level, but it has its own pros as well.
Figure 3. BI side-effects with at most once guarantee

Solution #1: At-most-once, i.e. no guaranteed delivery 

If we send events to BI after the persistence step:
  • The response time from the client's perspective doesn't depend on down-streaming performance
  • BI delivery failures don't influence our business flow
  • BI can miss events (at-most-once delivery guarantee)
  • Low-latency delivery to the BI downstream
We can miss downstream events - for example, if we use a Kafka producer and rely on Kafka's persistence, when the producer.flush operation fails we might lose the events.

Solution #2: Possible redundant events

Figure 4. Allow redundant events 

If the fact of receiving the events has priority over dispatching them, it's possible to change the guarantee to "at least once", allowing duplicates (in case of client retries) and inconsistency with the state (not all events are persisted, but all are reported).
This solution provides:
  • Better down-stream delivery but worse response latency (persist waits for the downstream flush to complete).
  • An at-least-once guarantee for the downstream - but possible inconsistency with the state.
  • It can be combined with solution #1 when different events have different guarantees.


Solution #3: Connector to persistence layer

Figure 5. Connector to persistence layer.

Sometimes down-streaming the data becomes critical for the business flow and more reliable guarantees are wanted. It's possible to use tools that subscribe to the persistence layer and convert it into a stream.

There are plenty of market-ready solutions. For example, if Cassandra is used as the persistence for your journal and BI is a Kafka consumer, then you can look at different Kafka-to-Cassandra connectors. In case Akka is standing for event-sourcing in your design, look into Persistence Query.

The difference from solution #1 is:
  • A better delivery guarantee to BI (it can be tuned to at-least-once or exactly-once)
  • A decline in latency
The solution is fine except for the cases when the latency of BI events is critical.
You can try to implement a custom solution - for example, for Cassandra it's possible to read the commit log and send the entries into Kafka. It will allow you to tune the latency between Cassandra and Kafka at the lowest possible level, but it will always be a tradeoff: the quicker you make Cassandra flush the data to the journal, the slower your persistence will be, but the faster the stream to BI.

Solution #4 Integrate into persist step

When down-streaming the events is a critical part of your flow and not being able to proceed with it means an unrecoverable failure for the application, it's logical to amalgamate the persist action with down-streaming.


Figure 6. Integrate into persist step


It means we either persist and send the events downstream or fail. In the case of akka-persistence we can use a journal that persists the events and sends them to Kafka. This combination can be met with DuplicatingJournal and StreamToKafkaJournal.

This solution:
  • Fails the flow if either the persist or the event send fails (of course, each of them can apply its own retry policies etc.)
  • Has a delivery guarantee equal to that of the chosen downstreaming tool (for Kafka it can be "at least once" or "exactly once")
  • Response and other side effects might have worse latency than in solution #1, because persistTime = max(journalPersistDuration, eventSendFlushDuration)
  • Doesn't implement atomicity guarantees, but explicitly fails in case of inconsistency.

Solution #5 Custom protocol

Let's try to solve the case when down-streaming consistency and latency are equally critical, but we can accept eventual consistency in case of failures and quasi-real-time for ninety-nine cases out of a hundred.
This means we want to continue business flow execution even if the down-stream is failing, but we rely on the down-stream's eventual consistency: it should recover and continue from the moment it failed as soon as it has recovered.

Figure 7. Custom protocol

This is the most expensive solution, as it requires the implementation of a manual delivery protocol to the down-stream. Unfortunately, Akka's At-Least-Once Delivery is insufficient for our case.
As an example, we will use a finite-state machine for the BI consumer and couple its logic to the business flow.

Let's look at the details:
  1. A command is emitted.
  2. After validation, each command is checked for whether it is an initialization marker for the business flow. Examples of initialization commands are: start of a transaction, a user creates a shopping cart, an online game round starts, etc. Initialization events should be delivered with the most desirable guarantee, but they should not make a huge impact on the main flow's latency, because they are only a small percentage of all the events.
  3. If the event is initial, it has an at-least-once delivery guarantee and the business flow awaits the successful completion of the down-stream flush. If the event isn't initial, it is sent to the down-stream in "fire and forget" mode.
  4. Events are persisted.
  5. Side effects are applied, and one of them is sending the message to BI - fire and forget, as in solution #1.
  6. The response is sent to the client.
  7. The most complicated part is BI - it implements a finite state machine that validates the messages being received.

As an example of this solution, we can implement an FSM protocol for a service that adds together integer values and supports the commands:
  • InitializeTransaction
  • Operand(Int)
  • GetState
The rough algorithm is:
  1. The business flow receives the "InitializeTransaction" command.
  2. Because we marked this command as initial, it should be sent to BI with at-least-once consistency.
  3. BI awaits, with a timeout, the next event - EventPersisted(TransactionInitialized); in case of a timeout it should explicitly fetch the state of the transaction from the business flow.
  4. After the event produced by the command is persisted, the business flow should send the event to BI in fire-and-forget style (no delivery guarantee).
  5. After each event BI sets a timeout for expecting the next one or the final state - this guarantees eventual consistency between the business service and the BI down-stream.
  6. BI iterates over steps 3 to 5 until it gets the final marker: transaction completed.


Solution #6 Kafka streams

Use Kafka Streams. Kafka Streams became a self-sufficient platform for building event-sourcing applications and doesn't require hybridizing with other frameworks like akka-persistence.


Summary:

If you don't have to rely on strict delivery guarantees for down-streams, use a mix of Solution #1: At-most-once and Solution #2: Possible redundant events.
If delivery latency isn't critical but consistency is, use Solution #3: Connector to persistence layer. It should fit most business cases and is supported out of the box by most event-sourcing frameworks.
Solution #4: Integrate into persist step emulates a transaction without rollback possibility and is suitable only for special business cases.
When down-streaming latency is critical, Solution #5: Custom protocol or Solution #6: Kafka streams are the ways to go. Implementation and support of a custom protocol is the most expensive; additionally, it validates your main business flow and can theoretically provide good monitoring feedback. Using Kafka Streams has its own limitations and isn't suitable for all business domains.

Monday 17 September 2018

ScalaCache - conditional caching

Sometimes it's important to avoid caching some subset of the return values. It's easy to implement using the memoization method memoizeF: when using an M[_] like Future, the Failed case won't be cached, so it's possible to convert any value into a Failed one with a special marker exception that contains the value itself and in the meantime blocks it from being cached.

Sometimes it's preferable to avoid caching some subset of possible values without deviating into the failed case of the higher-kinded wrapper. If this condition can be delineated by a predicate and you don't want to play with the implicit mode: Mode[F], you can mix a small trait into your cache. The example below is based on Caffeine and doesn't cache negative integer values:
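The original gist isn't embedded here; below is a rough stand-in sketch built directly on Caffeine rather than on ScalaCache's internal traits: a memoizing wrapper guarded by a predicate, so values that fail the predicate (negative integers) are computed but never stored.

import com.github.benmanes.caffeine.cache.Caffeine

// hypothetical wrapper, not ScalaCache's own API: cache only values matching the predicate
final class ConditionalCache[K <: AnyRef, V <: AnyRef](shouldCache: V => Boolean) {
  private val underlying = Caffeine.newBuilder().build[K, V]()

  def caching(key: K)(compute: => V): V =
    Option(underlying.getIfPresent(key)).getOrElse {
      val value = compute
      if (shouldCache(value)) underlying.put(key, value) // excluded values are never cached
      value
    }
}

// negative integers won't be cached
val cache = new ConditionalCache[String, Integer](_.intValue >= 0)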

Wednesday 5 September 2018

javax.ws.rs.container.ContainerRequestFilter detracts @Suspended

If you are still stuck with the Java API for RESTful Web Services but are building a non-blocking API with the help of:

    @Suspended asyncResponse: AsyncResponse

Keep in mind that javax.ws.rs.container.ContainerRequestFilter doesn't support this non-blocking nature, and if it's part of the invocation chain it will block the calling thread and defeat the purpose of @Suspended.

The best workaround is to migrate to functional-style code and use the loan pattern if possible.
The code can look like:
    @GET
    @Produces("text/html")
    def handleRequest(@Suspended res: AsyncResponse): Unit = {
      authenticateBasicAsync(realm = "secure site") { userName =>
        res.resume(s"The user is '$userName'")
      }
    }

This code will be easier to migrate to akka-http or other async HTTP libraries in the future.

Tuesday 4 September 2018

Cheat sheet: Team assessment

Here is a summary of my experience on what every team should avoid or at least minimize. It's agnostic to the methodology (or its absence) on the project - just a set of common-sense advice.

It helps to assess the team during on-boarding or to check it periodically - for example, at retrospective meetings.

Team should avoid:
  • Unclear goals
  • No agreed methodology for achieving the goal.
  • Absence of task priorities.
  • Inability to say a justified "No" to extra load.
  • A large number of tasks that require high concentration.
  • The prospects of the tasks aren't visible (the ultimate goals of the tasks are unclear).
  • Lack of motivation.
  • Noise in the working space.
  • Insufficient coordination between the colleagues.
  • Insufficient delegation of work.
  • Insufficiently organized storage of information (and knowledge sharing).
  • Too many meetings.




Thursday 23 August 2018

Fix known problems before writing new code.



It's hard to predict the price of keeping the build in a "red" state - especially in a micro-services environment. In a chaotic world, the risk of unpredictable, devastating effects due to the inability to deploy/test the latest source code changes grows exponentially over time.

The same goes for bugs, whether related to internal processes or to the product itself. If your project isn't heavily using Business Intelligence and operating with hundreds of live metrics to measure bug impact, assume that defects have priority over the features in your backlog. In the worst-case scenario, teams introduce ceremonies to work around known bugs, because those bugs aren't the priority.

Bugs aren't always reflected in your KPIs or metrics, even if you think your BI model is close to real feedback - Customer Satisfaction is something that has a huge lag in measurement.

I've found that a team using its own product gives higher priority to quality over adding new features.

From the business perspective, in the long term the "eliminate known defects before working on new features" rule has more benefits. It allows keeping the project under control at a lower price. As long as a team accepts known problems as risks, they usually forget that the combined probability of those risks follows the rule of multiplying the individual probabilities, and that it pushes the project into an unmanageable state in case of a sudden "shadow" failure.

From the technical perspective it's easier to follow this rule, since continuous integration already requires:
 Fix Broken Builds Immediately.
The team just needs to add one extra rule:
 Fix known problems before writing new code

Friday 17 August 2018

Speedup tests execution in ScalaTest

Sometimes the duration of automated tests written with the ScalaTest library can easily be reduced severalfold.

Here are a few steps I found to make them faster.


  1. Maven: use the latest version of Maven itself (3.5.x) and scala-maven-plugin 3.4.x. Be careful with other plugins that don't support parallel execution; if you see this message in the build log, carefully review the plugins list:

[WARNING] *****************************************************************
[WARNING] * Your build is requesting parallel execution, but project *
[WARNING] * contains the following plugin(s) that have goals not marked *
[WARNING] * as @threadSafe to support parallel building. *


  2. Switch on parallel mode and disable JVM forking in both Maven and sbt (see the sbt sketch after this list). From the ScalaTest documentation on ParallelTestExecution:
ScalaTest's normal approach for running suites of tests in parallel is to run different suites in parallel, but the tests of any one suite sequentially. This approach should provide sufficient distribution of the work load in most cases, but some suites may encapsulate multiple long-running tests. Such suites may dominate the execution time of the run. If so, mixing in this trait into just those suites will allow their long-running tests to run in parallel with each other, thereby helping to reduce the total time required to run an entire run.
  3. Analyse the shared state. Switching the tests into parallel mode sometimes introduces race conditions if test methods share mutable state in class fields (for example, a var counter reused across test methods).
The best solution is the usage of shared fixtures or test scopes. If you don't want to refactor your tests in a big-bang approach, it's easier to extend the Suites/Specs with org.scalatest.OneInstancePerTest.
Trait that facilitates a style of testing in which each test is run in its own instance of the suite class to isolate each test from the side effects of the other tests in the suite. If you mix this trait into a Suite, you can initialize shared reassignable fixture variables as well as shared mutable fixture objects in the constructor of the class. Because each test will run in its own instance of the class, each test will get a fresh copy of the instance variables. This is the approach to test isolation taken, for example, by the JUnit framework.
  4. Reduce the number of heavy resources by re-sharing them. Sometimes there can be a great performance boost when some test state is shared between the suites. For example, an embedded DB instance: if it's started before each test and shut down afterwards, it can take more time than the test execution itself.
If you applied the forking policy from the second recommendation (at most one forked JVM), then all your tests run in the same JVM and it's possible to share some class instances.
There are some cons to this approach - the tests aren't independent anymore. Every test should be implemented keeping in mind that resources are mutated in parallel.
For example, unit tests that count the final number of rows in a table should be updated to count the rows matching a condition. In the meantime, before[Each/All] should not clear the whole state of the shared resource but only the parts related to the current test case. Here is an example of how to share an embedded Cassandra instance between the Specs:
  5. Migrate to async. When testing async services/functionality, instead of blocking the test to check the results, use the *Async implementations of the Specs: AsyncFlatSpec, AsyncFunSpec etc. More details are here.

  6. Review your before/after[Each/All] methods. Sometimes their execution has a major influence on the whole test run. If you close resources, it might not be necessary, because the short-running JVM process is spawned only for the test run. If you delete all the data to clean up the state before the tests, then think about how the tests can be isolated - via UUIDs or unique table names etc.; this is a mandatory step for recommendation #3.

  7. Tune the JVM params. The test goal in Maven is a short-lived Java process, and these JVM args reflect that fact:
-XX:+TieredCompilation -XX:TieredStopAtLevel=1
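For recommendation #2, a minimal sbt sketch (the Maven equivalent lives in the scalatest-maven-plugin / surefire configuration):

// run suites in parallel inside the sbt JVM, without forking a separate JVM
parallelExecution in Test := true
fork in Test := false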

Tuesday 7 August 2018

Type level programming for Streams

Programming at this level is often considered an obscure art with little practical value, but it need not be so.
Conrad Parker (about Type-Level programming in Haskell)
 
There are a lot of domains where events come in a stream-like way. The best example is services based on the HTTP protocol.

Languages like Java, with an imperative programming legacy, provide a sort of abstraction over the streaming nature of HTTP services, where the stream is converted into a branch-like structure and each leaf handles the invariants of the possible inputs. It became popular to design systems in an RMI-like way.

A regular handler for an HTTP or RPC service looks like:

It means that all the magic of routing HTTP requests to methods is provided by the frameworks.

Of course there is hidden complexity behind this - Filters, Handlers etc.

While Reactive and Functional Programming gain popularity, streaming DSLs gain popularity in solving flow-handling complexity.

Such flows are self-explanatory and easier to maintain/change in the future.

Most primitive domain problems can easily be solved via a stream of events from Input to Sink, like:
  In ~> Step1 ~> Step2 ~> StepN ~> Sink.

Even if your DSL isn't sexy enough to look like a math diagram, you can use tools to visualise the flow in a nice way (for example: travesty).

Of course there is always extra complexity like filtering, conditioning, throttling, backpressure etc. Stream-like data handling looks natural and mostly befits functional programming, where every flow step is a function that accepts an In type and returns an Out type:

  type Step1 = In1 => Out1

Piping the step functions looks like a natural solution. Let's assume concrete input types: ShipmentsRequest, TrackRequest, PricingRequest and Ping:

The simplest way to handle those types is to operate with the Any type:
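A sketch with hypothetical request/response types (the original snippet isn't embedded here):

// hypothetical request/response types used in the sketches below
final case class ShipmentsRequest(id: String); final case class ShipmentsResponse(id: String)
final case class TrackRequest(id: String);     final case class TrackResponse(id: String)
final case class PricingRequest(id: String);   final case class PricingResponse(id: String)
case object Ping;                              case object Pong

// every step accepts and returns Any, so any two steps can be piped
type Step = Any => Any

val step1: Step = {
  case ShipmentsRequest(id) => ShipmentsResponse(id)
  case TrackRequest(id)     => TrackResponse(id)
  case PricingRequest(id)   => PricingResponse(id)
  // Ping is forgotten here and the compiler stays silent: a MatchError at runtime
}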

Using Any makes it possible to pipe (compose) the functions. Unfortunately, the compiler doesn't help us with any extra checks to make the piping safe.
We can follow best practices and operate with a marker trait:
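A sketch of the same inputs behind a sealed marker trait:

// the same input types, now extending a sealed marker trait
sealed trait In
final case class ShipmentsRequest(id: String) extends In
final case class TrackRequest(id: String)     extends In
final case class PricingRequest(id: String)   extends In
case object Ping                              extends In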


Now the compiler helps us to find out if we have forgotten to handle the Ping message:
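Reusing the sealed trait from the previous sketch, forgetting Ping now produces a warning:

// "match may not be exhaustive. It would fail on the following input: Ping"
val step1: In => String = {
  case ShipmentsRequest(id) => s"shipments $id"
  case TrackRequest(id)     => s"track $id"
  case PricingRequest(id)   => s"pricing $id"
}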

It's still just a compiler warning, and it's not always possible to escalate it to an error if you are not allowed to change the compiler arguments. And it's not always the case that all incoming messages can extend some base trait.

There is another way to represent the input types - via a union type. For example, Haskell supports declarations like this:


In Haskell this has evolved into the DataKinds language extension. Scala plans to support this at the language level as well, in the Dotty compiler, as Union Types.
There is a fork of Scala - Typelevel Scala - that aims to bring it earlier than Dotty. We will use the Shapeless library for a type-level programming solution.
In Scala we can define our input type as a chain of nested Either types:
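A sketch, with the same hypothetical request types as above:

// the union of possible inputs encoded as nested Either types
type In1 = Either[ShipmentsRequest, Either[TrackRequest, Either[PricingRequest, Ping.type]]]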


The Shapeless library implements some syntax sugar for that solution; here is the same code expressed via Shapeless:
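A sketch, assuming shapeless 2.x is on the classpath:

import shapeless._

// the same union expressed as a shapeless Coproduct
type In1 = ShipmentsRequest :+: TrackRequest :+: PricingRequest :+: Ping.type :+: CNil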

Then Step1 can be defined as a Poly1 function defined for all the union (coproduct) types from In1:
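A sketch, assuming the In1 coproduct and the request/response case classes from the previous sketches are in scope:

import shapeless._

object step1 extends Poly1 {
  implicit def atShipments = at[ShipmentsRequest](r => ShipmentsResponse(r.id))
  implicit def atTrack     = at[TrackRequest](r => TrackResponse(r.id))
  implicit def atPricing   = at[PricingRequest](r => PricingResponse(r.id))
  implicit def atPing      = at[Ping.type](_ => Pong)
}

// mapping the coproduct applies the matching case; dropping one of them fails to compile
type Out1 = ShipmentsResponse :+: TrackResponse :+: PricingResponse :+: Pong.type :+: CNil
def handle(in: In1): Out1 = in.map(step1)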


As a result, the type of that step is ShipmentsResponse or TrackResponse or PricingResponse or Pong. The handling of all the cases is checked at compile time.

The full example for akka streams can look like:

Tuesday 31 July 2018

scala.concurrent.Future leaks on timeouts.

Almost any business operation should have a reasonable deadline - if it's not completed in time, it loses its value.
Imagine a situation where we are building a service that should respond within 500 milliseconds - if the result isn't ready, we should return an error.

Just to reduce the number of source code lines in the example, we will use Await.result, which allows us to set a timeout for the waiting.
If we define our heavy operation as:
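A stand-in for the missing snippet:

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// the work keeps running even after the caller has given up waiting
def heavyOperation(): String = {
  Thread.sleep(1000)
  println("I'm executed anyway")
  "result"
}

val response =
  try Await.result(Future(heavyOperation()), 500.millis)
  catch { case e: java.util.concurrent.TimeoutException => s"error: $e" }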
We find out that although the service produces an error on timeout - and we notify the users about it - we actually haven't cancelled the heavy task itself. It keeps running even though the result isn't needed anymore. Another simple example, with akka-streams:
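A sketch of that akka-streams variant (Akka 2.5-style materializer; heavyOperation reused from the snippet above):

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.duration._
import scala.util.{Failure, Success}

implicit val system = ActorSystem("timeouts")
implicit val materializer = ActorMaterializer()
import system.dispatcher

// the stream fails with a TimeoutException, but heavyOperation still runs to completion
Source.single(())
  .map(_ => heavyOperation())
  .completionTimeout(100.millis)
  .runWith(Sink.head)
  .onComplete {
    case Failure(e)     => println(s"error: $e")
    case Success(value) => println(value)
  }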
Its output is:
error: java.util.concurrent.TimeoutException: The stream has not been completed in 100 milliseconds.
I'm executed anyway

There is no way to get rid of the resource leak other than changing the way heavyOperation works. For example, if it's a query to Cassandra, it should be regulated via timeouts, a retry policy and other settings that limit the maximum execution time to the expected duration.
Sometimes it's not that easy to predict and limit the execution time and thus avoid the resource leak. Akka streams proposes a back-pressure solution, which actually isn't easy to apply in all cases.
There is another way: cancel the jobs explicitly. For example, the Monix library implements Cancelable. Of course, a cancelable is something that you have to manage explicitly - the way you would like the job to be cancelled - but the handling of timeouts is an easy-to-use operation:
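A sketch with Monix (3.x API assumed; heavyOperation from the earlier snippet):

import monix.eval.Task
import monix.execution.Scheduler.Implicits.global
import scala.concurrent.duration._

// .timeout raises a TimeoutException and signals cancellation to the source task
val guarded: Task[String] = Task(heavyOperation()).timeout(100.millis)

guarded.runToFuture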


If we want to stop the thread execution in our synthetic example, we can use a Task with cancelable behaviour - of course, real cancellation is sometimes hard to implement correctly.
How do you handle timeouts for long-running tasks?

Friday 27 July 2018

Testing Promise with Jasmine

There is a good library that provides syntax sugar for Promise validation with Jasmine: jasmine-promise-matchers. It becomes hard to test complex result objects; the example proposes using jasmine.objectContaining from jasmine-matchers, but the easier way is just to map the promise into the tested value and use toBeResolvedWith on a projection of the promise.

Tuesday 10 July 2018

Ceremony Δ


There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.

The Zen of Python


...Scala's syntax is a bit more flexible than many people are used to..
underscore.io blog

When a programming language gets new features, it has to pay some price in complexity - and an extra one in the ceremonies aspect. I call this the Ceremony Delta; the language's learning curve depends on it, because developers have to study the implicit conventions.


A dynamically growing ecosystem isn't able to follow the bottom row closely for a long distance. Naturally it tends to jump away. It takes a lot of power to keep it on "the shortest way is the right one". Of course, the language authors are allowed to review concepts and release a "brand new" language version.




Scala, as a language supporting both the FP and OOP paradigms, has always had a hard time keeping the balance.
Unfortunately, this brings the requirement to invest into best practices.

It already has a big and growing scope of implicit ceremonies.

Sometimes the compiler helps us to find the problems:

eitherResult match { // [warn] match may not be exhaustive.  
    case Right(value) => assert(value == expected)
}

The example matching only one case is actually the shortest one - especially with the context that it's part of a unit test, where failing on Left cases is acceptable.

This code still compiles but doesn't make sense:

val intList = List(1, 2)
intList.filter(_ == "4") 
// comparing values of types Int and String using `==' will always yield false


A closure is the shortest way to filter the list. As a solution, there are different libraries that implement a type-safe === method.

Here are examples of "code smells" that are easy for experienced developers to detect - but the compiler stays silent:

// Declaring a public variable that is expected to be injected,
// which is why it's initialized with null
@Inject var service: Service = _

Some(null) // ridiculous declaration

Future.failed[T](new Exception) foreach doSideEffects // using foreach for side effects - but it ignores errors

Thread.sleep(1000) // Let the world wait for 1 second

val result = for {
  part1 <- callService1 // Future[T]
  part2 <- callService2 // Future[T]
} yield (part1, part2)
// part2 starts execution only after part1 has completed
// a Tuple is used to combine the results

Using null on the JVM is sometimes the shortest way; making side effects on a Future in a foreach callback is the easiest approach - but not the right one.

Sometimes it's not that easy to find the problem even for experienced developers, and it's definitely outside of the compiler's responsibility:

// despite the method returning a Future, it hides a call to blocking code
def asyncCallService(params: Parameter): Future[T] = {
  val extraParam = callSomeBlockingCode() // using a blocking API
  callService3(extraParam, params) // returns Future[T]
}

To reduce the Ceremony Δ there are many domain-specific libraries addressing the boilerplate spots. Most of the bad-usage examples for scala.concurrent.Future are solved in ScalaZ Task, cats-effect, Monix and similar libraries, which still increases the learning curve for the language but keeps the Ceremony delta low. The example of making side effects in map/flatMap can't be solved with libraries and probably requires language support. It remains a best-practice recommendation to use pure functions in map/flatMap in the cats-effect library - that is a ceremony we have to follow.



Monday 9 July 2018

triple equals conflict scalactic and cats

The ScalaTest library depends on org.scalactic.Equalizer.
When the cats library is used via import cats.implicits._, both provide a === function.

Of course, the compiler won't be happy with multiple implicit conversions:

Note that implicit conversions are not applicable because they are ambiguous:
 both method catsSyntaxEq in trait EqSyntax of type [A](a: A)(implicit evidence$1: cats.Eq[A])cats.syntax.EqOps[A]
 and method convertToEqualizer in trait TripleEquals of type [T](left: T)SomeUnitTest.this.Equalizer[T]
 are possible conversion functions from Int to ?

As a workaround I recommend disabling the cats implementation - it's a bit worse than the scalactic one:
import cats.implicits.{ catsSyntaxEq => _, _ } // import all implicits except triple '='