Thursday 23 August 2018

Fix known problems before writing new code.



It's hard to predict the cost of keeping the build in a "red" state - especially in a microservices environment. In such a chaotic world, the risk of unpredictable, devastating effects caused by the inability to deploy or test the latest source code changes grows exponentially over time.

The same goes for bugs, whether they relate to internal processes or to the product itself. Unless your project makes heavy use of Business Intelligence and operates with hundreds of live metrics to measure bug impact, assume that defects take priority over the features in your backlog. In the worst-case scenario, teams introduce ceremonies to manage workarounds for known bugs, because fixing those bugs is never the priority.

Bugs aren't always reflected in your KPIs or metrics, even if you think your BI model is close to real feedback - Customer Satisfaction is measured with a huge lag.

I have found that a team using its own product gives higher priority to quality than to adding new features.

From the business perspective, in the long term the "eliminate known defects before working on new features" rule brings more benefits: it keeps the project under control at a lower cost. When a team accepts known problems as risks, they tend to forget that accepted risks compound - the chance that none of them fires is the product of the individual probabilities, so it shrinks with every new risk - and the project can slide into an unmanageable state when these "shadow" problems surface abruptly.

From the technical perspective this rule is easy to follow, since continuous integration already requires:
 Fix Broken Builds Immediately.
The team just needs to add one extra rule:
 Fix known problems before writing new code.

Friday 17 August 2018

Speeding up test execution in ScalaTest

Sometimes the duration of automated tests written with the ScalaTest library can easily be reduced several times over.

Here are a few steps I found that make tests faster with little effort.


  1. Maven: use the latest version of Maven itself (3.5.x) and of scala-maven-plugin (3.4.x). Be careful with other plugins that don't support parallel execution; if you see this message in the build log, carefully review the plugin list:

[WARNING] *****************************************************************
[WARNING] * Your build is requesting parallel execution, but project *
[WARNING] * contains the following plugin(s) that have goals not marked *
[WARNING] * as @threadSafe to support parallel building. *


  2. Switch on parallel mode and disable JVM forking, both in Maven and in sbt. ScalaTest's normal approach for running suites of tests in parallel is to run different suites in parallel, but the tests of any one suite sequentially. This approach should provide sufficient distribution of the work load in most cases, but some suites may encapsulate multiple long-running tests. Such suites may dominate the execution time of the run. If so, mixing the ParallelTestExecution trait into just those suites will allow their long-running tests to run in parallel with each other, thereby helping to reduce the total time required to run an entire run (see the sketch below).
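For sbt, a minimal sketch of the relevant settings could look like this (assuming an otherwise default build; on the Maven side the same is done through the corresponding plugin configuration):

  // build.sbt - run test suites in parallel and keep them inside the sbt JVM
  parallelExecution in Test := true
  fork in Test := false

And a suite that contains several long-running tests can opt into intra-suite parallelism by mixing in ParallelTestExecution (the suite and test names below are illustrative):

  import org.scalatest.{FlatSpec, Matchers, ParallelTestExecution}

  // Only this suite's long-running tests run in parallel with each other;
  // other suites keep ScalaTest's default test-by-test sequential behaviour.
  class SlowIntegrationSpec extends FlatSpec with Matchers with ParallelTestExecution {

    "scenario A" should "finish" in {
      Thread.sleep(2000) // stands in for a genuinely slow test body
      succeed
    }

    "scenario B" should "finish" in {
      Thread.sleep(2000)
      succeed
    }
  }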
  3. Analyse the shared state. Switching tests into parallel mode sometimes introduces race conditions, typically when test methods share mutable state through class fields. The best solution is to use shared fixtures or test scopes, but if you don't want to refactor your tests in a big-bang fashion, it's easier to extend the Suites/Specs with org.scalatest.OneInstancePerTest (see the example below).
Trait that facilitates a style of testing in which each test is run in its own instance of the suite class to isolate each test from the side effects of the other tests in the suite. If you mix this trait into a Suite, you can initialize shared reassignable fixture variables as well as shared mutable fixture objects in the constructor of the class. Because each test will run in its own instance of the class, each test will get a fresh copy of the instance variables. This is the approach to test isolation taken, for example, by the JUnit framework.
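A minimal sketch of such a suite: the mutable builder field would be a race in parallel mode, but with OneInstancePerTest mixed in every test gets its own fresh copy (the field and values are illustrative):

  import org.scalatest.{FlatSpec, Matchers, OneInstancePerTest}

  // Each test runs in its own instance of BufferSpec, so the mutable
  // `builder` field is never shared between tests.
  class BufferSpec extends FlatSpec with Matchers with OneInstancePerTest {

    val builder = new StringBuilder("ScalaTest is ")

    "the builder" should "be modifiable in one test" in {
      builder.append("easy!")
      builder.toString shouldBe "ScalaTest is easy!"
    }

    it should "start from a clean copy in another test" in {
      builder.append("fast!")
      builder.toString shouldBe "ScalaTest is fast!"
    }
  }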
  4. Reduce the number of heavy resources by sharing them. Sometimes there is a great performance boost when some test state is shared between suites. For example an embedded DB instance: if it's started before each test and shut down afterwards, it can take more time than the test execution itself.
If you applied the forking policy from the second recommendation, all your tests run in the same JVM and it's possible to share some class instances.
This approach has its downsides - tests aren't independent anymore. Every test has to be written keeping in mind that the shared resources are mutated in parallel.
For example, a unit test that counts the total number of rows in a table should be updated to count only the rows matching a condition. Likewise, before[Each/All] should not clear the whole state of the shared resource, only the parts related to the current test case. Here is an example of how to share an embedded Cassandra instance between Specs:
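A minimal sketch of the idea; the EmbeddedCassandra wrapper below is a hypothetical stand-in for whatever embedded-database helper is actually used:

  import org.scalatest.{FlatSpec, Suite}

  // Hypothetical handle for the embedded database, defined only to keep the sketch self-contained.
  final class EmbeddedCassandra { def stop(): Unit = () }
  object EmbeddedCassandra { def start(): EmbeddedCassandra = new EmbeddedCassandra }

  // Started lazily, exactly once per test JVM, on first access from any suite.
  object SharedCassandra {
    lazy val instance: EmbeddedCassandra = {
      val db = EmbeddedCassandra.start()
      sys.addShutdownHook(db.stop()) // the short-lived test JVM cleans up on exit
      db
    }
  }

  // Mixed into every spec that needs Cassandra instead of starting a fresh instance per suite.
  trait SharedCassandraSupport { this: Suite =>
    def cassandra: EmbeddedCassandra = SharedCassandra.instance
  }

  class ShipmentsRepositorySpec extends FlatSpec with SharedCassandraSupport {
    "the repository" should "persist a shipment" in {
      // work only with rows created by this test (e.g. keyed by a UUID),
      // so other suites sharing the same instance are not affected
      succeed
    }
  }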
  5. Migrate to async. When testing async services/functionality, instead of blocking the test to check the results, use the Async implementations of the Specs: AsyncFlatSpec, AsyncFunSpec, etc. More details are in the ScalaTest documentation.
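A minimal sketch with AsyncFlatSpec (the price function is a hypothetical async service call):

  import scala.concurrent.Future
  import org.scalatest.AsyncFlatSpec

  class PricingServiceSpec extends AsyncFlatSpec {

    // hypothetical async service call
    def price(weightKg: Int): Future[BigDecimal] =
      Future.successful(BigDecimal(weightKg) * 2)

    // no Await / blocking: the assertion is mapped over the Future and
    // ScalaTest completes the test when the Future completes
    "price" should "be computed asynchronously" in {
      price(10).map(result => assert(result == BigDecimal(20)))
    }
  }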

  6. Review your before/after[Each/All] methods. Sometimes their execution has a major impact on the whole test run. If you close resources there, it may not be necessary at all, because the short-lived JVM process is spawned only for the test run. If you delete all the data to clean up the state before the tests, think about how to isolate the tests instead - via UUIDs, unique table names, etc. (as sketched below); this is a mandatory step for recommendation #3.
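A minimal sketch of per-test isolation through a UUID prefix instead of a global clean-up (the row-key scheme is illustrative):

  import java.util.UUID
  import org.scalatest.{BeforeAndAfterEach, FlatSpec}

  class IsolatedRowsSpec extends FlatSpec with BeforeAndAfterEach {

    // a fresh, test-local id instead of wiping the shared state in beforeEach
    private var testRunId: String = _

    override def beforeEach(): Unit = {
      testRunId = UUID.randomUUID().toString
    }

    "a test" should "only touch rows tagged with its own id" in {
      val rowKey = s"$testRunId-shipment-1"
      // insert and query only rows prefixed with testRunId,
      // leaving data created by other (possibly parallel) tests untouched
      assert(rowKey.startsWith(testRunId))
    }
  }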

  7. Tune the JVM params. The test goal in Maven is a short-lived Java process, and these JVM args reflect that fact:
-XX:+TieredCompilation -XX:TieredStopAtLevel=1

Tuesday 7 August 2018

Type level programming for Streams

Programming at this level is often considered an obscure art with little practical value, but it need not be so.
Conrad Parker (about Type-Level programming in Haskell)
 
There are a lot of domains where events arrive in a stream-like way. The best example is services based on the HTTP protocol.

Languages like Java, with their imperative programming legacy, provide a sort of abstraction over the stream nature of HTTP services - the stream is converted into a branch-like structure, and each leaf handles one variant of the possible inputs. It has become popular to design systems in an RMI-like way.

A regular handler for an HTTP or RPC service looks like this:
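A minimal sketch of that style (the request/response types and method names are illustrative):

  // Hypothetical request/response payloads, defined just to make the sketch self-contained
  final case class ShipmentsRequest(orderId: String)
  final case class ShipmentsResponse(items: List[String])
  final case class TrackRequest(shipmentId: String)
  final case class TrackResponse(status: String)

  // One dedicated method per request type; the framework (Spring MVC, JAX-RS,
  // gRPC, ...) does the "magic" of routing each incoming HTTP/RPC call to it.
  trait ShipmentsApi {
    def shipments(request: ShipmentsRequest): ShipmentsResponse
    def track(request: TrackRequest): TrackResponse
  }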

All the magic of routing HTTP requests to methods is provided by the frameworks.

Of course there is hidden complexity behind this - Filters, Handlers, etc.

As Reactive and Functional Programming gained popularity, streaming DSLs became a popular way to deal with flow-handling complexity.

Such flows are largely self-explanatory and easier to maintain and change in the future.

Most simple domain problems can easily be solved as a stream of events from Input to Sink, like:
  In ~> Step1 ~> Step2 ~> StepN ~> Sink. 

Even if your DSL isn't sexy enough to look like a math diagram, you can use tools to visualise the flow in a nice way (for example: travesty).

Of course there is always extra complexity - filtering, conditioning, throttling, backpressure, etc. Still, stream-like data handling looks natural and fits functional programming well, where every flow step is a function that accepts an In type and returns an Out type:

  type Step1 = In1 => Out1

Piping the step functions together looks like a natural solution. Let's assume concrete input types: ShipmentsRequest, TrackRequest, PricingRequest and Ping:
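A sketch of the message types assumed in the rest of the post (the fields are illustrative):

  final case class ShipmentsRequest(orderId: String)
  final case class TrackRequest(shipmentId: String)
  final case class PricingRequest(weightKg: Int)
  case object Ping

  final case class ShipmentsResponse(items: List[String])
  final case class TrackResponse(status: String)
  final case class PricingResponse(price: BigDecimal)
  case object Pong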

The simplest way to handle those types is to operate with the Any type:
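For example, a single step that sorts the messages out at runtime (a sketch, reusing the types above):

  // Every step accepts Any and discovers the real type only at runtime.
  val step1: Any => Any = {
    case ShipmentsRequest(orderId) => ShipmentsResponse(List(s"item-for-$orderId"))
    case TrackRequest(id)          => TrackResponse(s"in transit: $id")
    case PricingRequest(weight)    => PricingResponse(BigDecimal(weight) * 2)
    case Ping                      => Pong
    case other                     => sys.error(s"unexpected message: $other") // caught only at runtime
  }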

Using Any makes it possible to pipe (compose) the functions. Unfortunately, the compiler doesn't give us any extra checks to make that piping safe.
Following common best practice, we can operate on a marker trait instead:
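A sketch of the same messages, now extending sealed marker traits:

  sealed trait In
  final case class ShipmentsRequest(orderId: String) extends In
  final case class TrackRequest(shipmentId: String)  extends In
  final case class PricingRequest(weightKg: Int)     extends In
  case object Ping                                   extends In

  sealed trait Out
  final case class ShipmentsResponse(items: List[String]) extends Out
  final case class TrackResponse(status: String)          extends Out
  final case class PricingResponse(price: BigDecimal)     extends Out
  case object Pong                                        extends Out

  val step1: In => Out = {
    case ShipmentsRequest(orderId) => ShipmentsResponse(List(s"item-for-$orderId"))
    case TrackRequest(id)          => TrackResponse(s"in transit: $id")
    case PricingRequest(weight)    => PricingResponse(BigDecimal(weight) * 2)
    case Ping                      => Pong
  }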


Now the compiler helps us find out if we have forgotten to handle the Ping message:
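For instance, if the Ping case is left out, scalac reports a non-exhaustive match warning (output approximate):

  val incompleteStep: In => Out = {
    case ShipmentsRequest(orderId) => ShipmentsResponse(List(s"item-for-$orderId"))
    case TrackRequest(id)          => TrackResponse(s"in transit: $id")
    case PricingRequest(weight)    => PricingResponse(BigDecimal(weight) * 2)
    // Ping is missing - because In is sealed, the compiler emits something like:
    //   warning: match may not be exhaustive.
    //   It would fail on the following input: Ping
  }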

It's still just a compiler warning, and it's not always possible to escalate it to an error if you are not allowed to change the compiler arguments. It's also not always the case that all incoming messages can extend some base trait.

There is another way to represent input types - via a union type. Haskell, for example, supports declarations where a single type is the union of several alternatives.


In Haskell this has evolved into the DataKinds language extension. Scala plans to support this at the language level as well, in the Dotty compiler, as Union Types.
There is also a fork of Scala - Typelevel Scala - that aims to bring it earlier than Dotty. Here we will use the Shapeless library as the type-level programming solution.
In Scala we can define our input type as a chain of nested Either types:
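A sketch, reusing the message types from above:

  // "ShipmentsRequest or TrackRequest or PricingRequest or Ping", spelled as nested Eithers
  type In1 = Either[ShipmentsRequest, Either[TrackRequest, Either[PricingRequest, Ping.type]]]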


The Shapeless library implements some syntactic sugar for that solution; here is the same code expressed via Shapeless:
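The same union as a Shapeless Coproduct (a sketch against shapeless 2.3.x):

  import shapeless.{:+:, CNil}

  type In1 = ShipmentsRequest :+: TrackRequest :+: PricingRequest :+: Ping.type :+: CNil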

Then Step1 can be defined as a Poly1 function defined for all members of the In1 coproduct:
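A sketch of such a step; the response values are illustrative:

  import shapeless._

  object step1 extends Poly1 {
    implicit val shipments = at[ShipmentsRequest](r => ShipmentsResponse(List(s"item-for-${r.orderId}")))
    implicit val track     = at[TrackRequest](r => TrackResponse(s"in transit: ${r.shipmentId}"))
    implicit val pricing   = at[PricingRequest](r => PricingResponse(BigDecimal(r.weightKg) * 2))
    implicit val ping      = at[Ping.type](_ => Pong)
  }

  // mapping the Poly1 over the coproduct picks the matching case;
  // removing any of the implicits above makes this line fail to compile
  val out = Coproduct[In1](TrackRequest("s-42")).map(step1)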


As a result, the type of that step is ShipmentsResponse or TrackResponse or PricingResponse or Pong, and the handling of all the cases is checked at compilation time.

A full example for Akka Streams can look like this:
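A minimal sketch for Akka Streams 2.5.x, assuming the In1 coproduct and the step1 Poly1 defined above (all names are illustrative):

  import akka.actor.ActorSystem
  import akka.stream.ActorMaterializer
  import akka.stream.scaladsl.{Flow, Sink, Source}
  import shapeless._

  object TypedFlowApp extends App {
    implicit val system: ActorSystem = ActorSystem("typed-flow")
    implicit val materializer: ActorMaterializer = ActorMaterializer()

    // every element of the stream is "one of" the known request types
    val in = Source(List[In1](
      Coproduct[In1](ShipmentsRequest("order-1")),
      Coproduct[In1](PricingRequest(5)),
      Coproduct[In1](Ping)
    ))

    // the step is total over In1: forgetting a case in step1 would be a
    // compile error here, not a runtime surprise
    val step = Flow[In1].map(_.map(step1))

    in.via(step).runWith(Sink.foreach(println))
  }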