Spring IO is back!

Marked in red on the calendar of every JWorks consultant: the yearly edition of Spring I/O. This year, we weren’t going to wait for the explicit approval of our manager and we ordered 27 early bird tickets as soon as we could and booked our flights to sunny Barcelona! It promised to be a special edition, since everything was gonna be bigger and better: the venue, the speaker roster, the food, the atmosphere.

The Palau de Congressos de Barcelona is a much bigger venue than the one we’re used to from previous years. This is why the organizer Sergi Almar was able to accommodate 1000 attendees this year (twice as many as the year before!), which shows how much interest there is in the Spring community. It was a really good location: it has ample space to grow in the coming years and the catering was also of good quality. Next year we’ll most likely get the same venue, but the event will probably overlap with the nearby Barcelona International Motor Show, which takes place every two years. Free test drives during a conference? Yes, please!


Spring IO 2018 Photo Collage


JWorks at Spring I/O 2018


We’ll talk about some of the presentations this year, but it is definitely not a complete list. There were so many interesting talks that we’re actually going to need quite some time to rewatch them all on YouTube!

Let us know if we missed anything by filing an issue or contacting us at our general JWorks email. We will probably still update this blogpost with other talks. Links to videos and resources might still be added as they become available on the Spring I/O YouTube channel.

Day 1: Talks & Workshops

Implementing DDD with the Spring ecosystem by Michael Plöd


After the keynote session, one of the first talks was given by Michael Plöd. He talked about implementing Domain-Driven Design (DDD) using the Spring ecosystem, leveraging various Spring technologies such as Spring Boot, Spring Data and Spring Cloud.

Michael attributed the inspiration for and ideas around DDD to the following two books:

  • Domain-Driven Design: Tackling Complexity in the Heart of Software, by Eric Evans.
  • Implementing Domain-Driven Design, by Vaughn Vernon.

Domain-Driven Design is currently a very popular way of implementing and looking at microservices. However, he immediately made an important disclaimer:

Everyone should be aware that DDD is not a silver bullet to be used in all projects

One should not force DDD on problems that aren’t suited for it.

Another important thing to remember is to model your microservices along business capabilities. If your microservices are highly coupled on a business level, all that fancy technology in Spring Boot won’t help you. We will use Strategic Design to find a solution that takes into account business capabilities.

Bounded Context

Every sophisticated business (sub)domain consists of several Bounded Contexts. We can, for example, create linguistic boundaries using Bounded Contexts if the solution has two types of “accounts”: a BankAccount and a UserAccount. Each Bounded Context contains a domain model and is also a boundary for the meaning of a given model. We don’t nest Bounded Contexts.

Inside of a Bounded Context, it’s important to not repeat yourself. On the other hand, between several Bounded Contexts, repeating yourself is allowed for the sake of decoupling.

Tactical Design

Systems, and this applies to both monolithic and microservice architectures, should be evolvable. DDD offers a set of patterns, the internal building blocks of the Tactical Design part of DDD, that help us in this regard.


Michael then talked us through each of these concepts:

  • Aggregates:
    • Entities
    • Value Objects
  • Factories
  • Repositories
  • Services

Entities

Entities represent the core business objects (not data objects) of a Bounded Context’s model. Each of these has a constant identity which should not be your primary database key but rather a business key. Each Entity also has its own lifecycle.

Value Objects

Value Objects derive their identity from a combination of various attributes. As an example, Michael brought up the presenter remote he was holding: it costs 80 euros, so its price could be identified by the value 80 and the currency Euros. We do not care about which ‘80 euros’ it is. Value Objects do not have their own lifecycle: they inherit it from the Entities that reference them.
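To make the ‘80 euros’ example a bit more concrete, here is a minimal sketch of such a Value Object in plain Java. The class name Money and its methods are our own assumptions for illustration; they are not taken from Michael’s sample project.

```java
import java.math.BigDecimal;
import java.util.Currency;
import java.util.Objects;

// Hypothetical Value Object: its identity is nothing more than amount + currency.
final class Money {
    private final BigDecimal amount;
    private final Currency currency;

    Money(BigDecimal amount, Currency currency) {
        this.amount = Objects.requireNonNull(amount);
        this.currency = Objects.requireNonNull(currency);
    }

    static Money euros(long amount) {
        return new Money(BigDecimal.valueOf(amount), Currency.getInstance("EUR"));
    }

    // Value Objects are immutable: "changing" one yields a new instance.
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("Currency mismatch");
        }
        return new Money(amount.add(other.amount), currency);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Money)) return false;
        Money other = (Money) o;
        // compareTo ignores scale, so 80 and 80.00 count as the same amount
        return amount.compareTo(other.amount) == 0 && currency.equals(other.currency);
    }

    @Override
    public int hashCode() {
        // stripTrailingZeros keeps hashCode consistent with the scale-insensitive equals
        return Objects.hash(amount.stripTrailingZeros(), currency);
    }

    @Override
    public String toString() {
        return amount + " " + currency.getCurrencyCode();
    }
}
```

Two separately created Money.euros(80) instances are equal, which is exactly the “we do not care about which 80 euros” property.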

It’s also important to note that, for example, a Customer can be an Entity in one Bounded Context but a Value Object in a totally different Bounded Context.

Take note that your DDD Entity is not your JPA Entity: the JPA Entity is a data entity while the DDD Entity is a business entity. Don’t mix these types.

Aggregates

Aggregates group Entities and Value Objects together. The Root Entity is the entry point in terms of access to the object graph and for the lifecycle. For example, you aren’t allowed to access a loan directly: you always have to go through the loan application form, which is the Root Entity.

Best Practices for architecting Aggregates
Small

Prefer small Aggregates that usually only contain an Entity and some Value Objects. Don’t build big reference graphs between Aggregates.

Reference by identity

Do not implement direct references to other Root Entities. Prefer referencing them through identity Value Objects.

One transaction per Aggregate

Aggregates should be updated in separate transactions which leads to eventual consistency.

Consistency Boundaries

Take a look at which parts of your model must be updated in an atomically consistent manner.

Best practices for implementing Aggregates

Code can be found on the author’s DDD-with-Spring GitHub project where he implemented a credit loan application consisting of three Spring Boot applications.

Visibility

Don’t just make everything private and expose everything with public getters and setters. This is, in Michael’s words, the “shortcut from hell” because you aren’t doing information hiding and are exposing everything to the outside world.

References

How do we hook these up?

  • Referencing them from one Value Object to the other.
  • Create intermediary Value Objects to bind them together.

Michael prefers Aggregates that do not reference each other directly. They are hooked together with a few shared Value Objects, which leads to more decoupling. There are 4 Aggregates in the application and we add a Value Object, personID, to hook AgencyResult and Applicant together. The ApplicationNumber object brings Applicant, Financial Situation and ScoringResult together.

Keep your Aggregates Spring free.

Aggregates should be plain old Java.

Packages

When working with Aggregates, place each Aggregate in its own package and work with package level visibility in terms of information hiding.

Creation of Aggregates: there are two options
  • Use the Root Entity directly.
  • Explicitly create an aggregate concept around your Entities and Value Objects.

Make an educated decision of your own.

Builder pattern.

The Builder pattern works very well with Aggregates as a substitute for the DDD factory. All Aggregates have Builders in Michael’s project.
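The points above (keep Aggregates Spring free, reference by identity, package-level visibility, Builders instead of factories) can be sketched together in a few lines of plain Java. This is our own simplified illustration: the field lastName and the validation rule are assumptions, and the real classes in the sample project look different.

```java
import java.util.Objects;

// Identity Value Object that other Aggregates can hold instead of an object reference.
final class ApplicationNumber {
    private final String value;

    ApplicationNumber(String value) {
        this.value = Objects.requireNonNull(value);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ApplicationNumber && value.equals(((ApplicationNumber) o).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

// Plain-Java Aggregate: no Spring imports, no setters, only created via its Builder.
final class Applicant {
    private final ApplicationNumber applicationNumber; // reference by identity
    private final String lastName;                      // hypothetical attribute

    private Applicant(Builder builder) {
        this.applicationNumber = builder.applicationNumber;
        this.lastName = builder.lastName;
    }

    ApplicationNumber applicationNumber() { return applicationNumber; }

    String lastName() { return lastName; }

    static Builder builder(ApplicationNumber applicationNumber) {
        return new Builder(applicationNumber);
    }

    static final class Builder {
        private final ApplicationNumber applicationNumber;
        private String lastName;

        private Builder(ApplicationNumber applicationNumber) {
            this.applicationNumber = Objects.requireNonNull(applicationNumber);
        }

        Builder lastName(String lastName) {
            this.lastName = lastName;
            return this;
        }

        // The Builder takes over the role of the DDD factory: it only hands out valid Aggregates.
        Applicant build() {
            if (lastName == null || lastName.trim().isEmpty()) {
                throw new IllegalStateException("An Applicant needs a last name");
            }
            return new Applicant(this);
        }
    }
}
```

Everything except what callers really need stays package-private, so classes outside the Aggregate’s package cannot reach into its internals.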

Use the annotations @Aggregate and @AggregateBuilder.

Why? To have a code review system in place that checks whether Aggregates are publicly visible and other non-Aggregate classes are package-protected. Michael recommends ArchUnit, a unit testing tool for software architectures, to verify the visibility of classes and other architectural rules.

Application Services

The ScoringApplicationService class holds a service that orchestrates between a lot of Aggregates.

Repositories

In Spring Data, one uses Spring Data JPA repositories with JPA Entities. But remember these JPA entities shouldn’t be your DDD Entities.

Architectures

The hexagonal onion architecture is not your only option and is not suitable for everything.

DDD Architecture

CRUD

For a context that doesn’t revolve around business Entities or Aggregates, a CRUD architecture, for example with Spring Data REST, may be suitable.

Query Driven contexts

All the logic resides within queries.

Domain Events

For communication between Bounded Contexts there are two possible approaches:

  • Orchestration.
  • Choreography.

Orchestration is about synchronous calls going somewhere.

Choreography is about events: domain events, event sourcing and event storming. It turns the call flow around. For example, the credit application emits a Credit Application Submitted Event and the scoring component reacts to that Event. You model the information about your activity as a flow of discrete events.
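To show what “turning the call flow around” means in code, here is a small sketch of choreography inside a single JVM. The EventBus below is a hand-written stand-in (an assumption on our side, not Spring’s Application Events API), and all class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical domain event, named after the example in the talk.
final class CreditApplicationSubmittedEvent {
    final String applicationNumber;

    CreditApplicationSubmittedEvent(String applicationNumber) {
        this.applicationNumber = applicationNumber;
    }
}

// Minimal synchronous in-JVM event bus, used here instead of a real eventing infrastructure.
final class EventBus {
    private final Map<Class<?>, List<Consumer<Object>>> listeners = new HashMap<>();

    <T> void subscribe(Class<T> type, Consumer<T> listener) {
        listeners.computeIfAbsent(type, key -> new ArrayList<>())
                 .add(event -> listener.accept(type.cast(event)));
    }

    void publish(Object event) {
        List<Consumer<Object>> interested =
                listeners.getOrDefault(event.getClass(), Collections.<Consumer<Object>>emptyList());
        for (Consumer<Object> listener : interested) {
            listener.accept(event);
        }
    }
}

// Scoring context: reacts to the event and is never called directly.
final class ScoringComponent {
    final List<String> scoredApplications = new ArrayList<>();

    ScoringComponent(EventBus bus) {
        bus.subscribe(CreditApplicationSubmittedEvent.class,
                event -> scoredApplications.add(event.applicationNumber));
    }
}

// Credit application context: only knows the bus, not the scoring component.
final class CreditApplicationService {
    private final EventBus bus;

    CreditApplicationService(EventBus bus) {
        this.bus = bus;
    }

    void submit(String applicationNumber) {
        bus.publish(new CreditApplicationSubmittedEvent(applicationNumber));
    }
}
```

With orchestration, CreditApplicationService would hold a reference to ScoringComponent and call it; here the dependency only points at the event.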

Options for Event Payload


  • Full Payload: put everything we filled out in the credit application in there and work with it.
  • REST URL: use RESTful URLs to REST resources for the event; not the Spring Data REST repository.
  • Empty.
  • Mix.

Infrastructure


Apache Kafka or message brokers are not the only options for infrastructure. You can also work with:

  • Brokers (for example Kafka or RabbitMQ): use Spring Cloud Stream.
  • HTTP Feeds (for example Atom): use Spring MVC with Quartz and Rome libraries.
  • Internal Event Bus: use Spring Application Events. Eventing happens within the same JVM, so no external system is used.

The credit application offers an HTTP feed using Atom that provides new credit agency ratings. Feed polling happens by a combination of REST with Atom: using Spring MVC and the Rome library (to create Atom feeds).

At the end of the talk, Michael referenced the ddd-by-examples GitHub project as a great resource.

Michael is currently writing a book, Hands-on Domain-Driven Design by example, for which you can get notified upon release by signing up on Leanpub.

The slides of this talk can be found on Speaker Deck.

Migrating legacy enterprise Java applications to Spring Boot by Mark Heckler


Mark explained how easy it can be to migrate an existing legacy Enterprise Java application to a modern, state-of-the-art Spring Boot app. Many people think that migrating these kinds of applications is impossible or very hard without rewriting the whole thing, but Mark gave us some very good pointers on how to do it quickly and efficiently:

  • Generate a new skeleton project from start.spring.io
  • Use schema.sql and data.sql scripts to migrate and test your database
  • Use Kotlin to vastly simplify your code by using data classes to simplify access to members and constructors, and by moving the constructor definition onto the same line as the class definition
  • With Spring Data, there is no more need to use PersistenceContext or EntityManager
  • With Spring MVC and @RestController, there is no more need to declare @Produces or @Consumes

Benefits

  • Less code: your code vastly diminishes using Kotlin and data classes
  • The Spring Boot + Kotlin combination greatly reduces the amount of boilerplate code
  • Business logic and Service Layer of the old application remain the same and are better encapsulated
  • Code becomes easier to maintain, easier to test
  • Spring Boot offers more and better deployment options

JWorks consultants have done these kinds of migrations at multiple clients, with great success rates. Non-technical people like functional analysts, product owners and business experts are continuously amazed at the speed with which we are able to do this. Technical people that have been doing JEE development for years ask us how they can learn Spring Boot.

Spring Security 5 Workshop by Andreas Falk


The target of this workshop was to learn how to make an initially unsecured (reactive) web application more and more secure, step by step. It was a very well prepared workshop and I really enjoyed the interactivity with Andreas: he answered questions on the fly and helped us understand some of the finer details and changes in Spring Security 5, especially when using it with Spring Boot.

I don’t want to diminish the excellent Spring Security workshop from Andreas Falk by copying anything from him. He deserves all the credit for his amazing work so I’m just gonna link to it here:

https://andifalk.github.io/spring-security-5-workshop/

Thank you Andreas for this great resource!

Spring Framework 5 - Hidden Gems by Juergen Hoeller


Since almost every feature was backported to 4.3, most of them are already known to the general public. However, there are 7 areas of refinement within 5.0 that aren’t widely known.

Commons Logging Bridge

The Spring team came up with a new dependency called spring-jcl, which is actually a reimplementation of a logging bridge. It is a required dependency and is here to help streamline the logging functionality. The main difference with this way of working is that you don’t need to go through a dependency hell where you would manually add exclusions to ignore certain logging dependencies. Just add the logging library to your classpath and everything will switch to the logging implementation of your choice. It now has first class support for Log4j 2 (version 1 has reached its end of life), SLF4J and JUL.

Build-Time Components Indexer

The file system traversal for classpath scanning of all packages within the specified base packages using either <context:component-scan> or @ComponentScan might be slow on startup. This is especially true if your application only runs for a short period of time or if I/O is very expensive. Think short-running batch processes and functions, or applications being started and stopped on Google App Engine every 2 minutes. The common solution was to narrow your base packages, or even to fully enumerate your component classes so you would skip scanning altogether. Starting with 5.0 there is a new build-time annotation processor that will generate a META-INF/spring.components file per JAR containing all the component classes, which in turn will be used automatically at runtime for compatible component-scan declarations.

Nullability

The new version contains comprehensive nullability declarations across the codebase. Fields, method parameters and method return values are still by default non-null, but now there are individual @Nullable declarations for actually nullable return values for example. For Java this means that we have nullability validation in IntelliJ IDEA and Eclipse. This allows the Spring Team to find subtle bugs or gaps within the framework’s codebase. It will also allow us, as developers, to validate our interactions with the Spring APIs. When you’re writing code in Kotlin it will give you straightforward assignments to non-null variables because the Kotlin compiler will only allow assignments for APIs with clear nullability.

Data Class Binding

Spring’s data binding can now work with immutable classes. No need for setters anymore since it can work with named constructor arguments! The property names are matched against the constructor parameter names. You can declare these explicitly with @ConstructorProperties, or they are simply inferred from the class bytecode (if you compile with -parameters or with debug information, i.e. -g). This is a perfect match for Kotlin data classes and Lombok-generated classes, where constructors and accessors are generated at compile time.
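As a small illustration of the explicit variant, the sketch below declares an immutable class with @ConstructorProperties (a standard JDK annotation from java.beans) and reads the declared names back via reflection. The classes CustomerForm and ConstructorNames are hypothetical, and the lookup only hints at the kind of metadata a binder can use; it is not Spring’s actual binding code.

```java
import java.beans.ConstructorProperties;
import java.lang.reflect.Constructor;

// Hypothetical immutable form object: no setters, only a constructor.
final class CustomerForm {
    private final String name;
    private final int age;

    // Explicitly names the constructor arguments, independent of compiler flags.
    @ConstructorProperties({"name", "age"})
    CustomerForm(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String getName() { return name; }

    int getAge() { return age; }
}

final class ConstructorNames {
    // Returns the property names declared on the single constructor of the given type.
    static String[] of(Class<?> type) {
        Constructor<?> constructor = type.getDeclaredConstructors()[0];
        ConstructorProperties properties = constructor.getAnnotation(ConstructorProperties.class);
        return properties.value();
    }
}
```

With these names available, request parameters called name and age can be matched to constructor arguments without ever needing a setter.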

Programmatic Lookup via ObjectProvider

The ObjectProvider is a variant of ObjectFactory, which is designed specifically for injection points, allowing for programmatic optionality and lenient not-unique handling. This class had the following original methods: @Nullable getIfAvailable() and @Nullable getIfUnique(). With the new version of Spring these methods have been overloaded with java.util.function callbacks which empowers the developer to return a default value instead of returning null.

Refined Resource Interaction

Spring’s Resource abstraction in core.io has been overhauled to expose the NIO.2 API at application level, e.g. Resource.readableChannel() or WritableResource.writableChannel(). It also uses the NIO.2 API internally wherever possible, e.g. in FileSystemResource.getInputStream()/getOutputStream() or FileCopyUtils.copy(File, File).

Asynchronous Execution

Spring 5.0 comes with a couple of interface changes that will help you with asynchronous execution:

  • The ListenableFuture now has a completable() method which exposes the instance as a JDK CompletableFuture.
  • The TaskScheduler interface has new methods as an alternative to Date and long arguments: scheduleAtFixedRate(Runnable, Instant, Duration) and scheduleWithFixedDelay(Runnable, Instant, Duration).
  • The new ScheduledTaskHolder interface for monitoring the current tasks, eg. ScheduledTaskRegistrar.getScheduledTasks() and ScheduledAnnotationBeanPostProcessor.getScheduledTasks().

Google Cloud Native with Spring Boot by Ray Tsang


On the one hand, this workshop lowered the entry threshold for newbies. On the other hand, it provided insight into what services Google Cloud has to offer: Google Spanner, the Pub/Sub messaging system, CloudSQL and Runtime Config were all addressed.

It’s quite cool to see how the team at Google managed to create decent Spring Boot Starters for all these services. They basically remove all the boilerplate code for you and offer you easy connectivity to all its cloud services. The sensible auto-configuration pre-fills most of the settings required to use:

  • PubSub as messaging middleware
  • CloudSQL as a managed relational database
  • Runtime Config as the backing store for your application configuration
  • Google Spanner as a horizontally scalable, strongly consistent, relational database

During the workshop, we created a guestbook application which consisted of a frontend and some backend microservices. The workshop builds this up neatly by adding features to the application step by step. Each step introduces you to another Google Cloud service. Those of you who want to do the workshop yourselves, check out the link below.

Google PubSub

What stayed with me is Google’s Pub/Sub message-oriented middleware. A publisher that creates the messages sends them to a topic. Consumers can subscribe to this topic to obtain the messages. Publishers and subscribers are decoupled. Neither of them is required to know the other one. Subscribers will either pull messages or get messages pushed from the topic. PubSub messages will be delivered at least once, but can be processed multiple times by different subscribers. Unprocessed PubSub messages are only kept for 7 days.

Ray told us a fun story about how he wanted to really explore the capabilities of PubSub and see how many messages it could handle. They created this website called https://pi.delivery/ which calculates the digits of pi. It’s really interesting to read how hard they were able to stress PubSub (hint: BILLIONS? TRILLIONS!)

Flight of the Flux by Simon Baslé


In this talk Simon went deeper into the inner workings of Project Reactor.

The session started off giving a brief recap of reactive programming and reactive streams before delving deeper into the machinery behind Reactor 3.

Assembly time vs Execution time

When programming with Reactor 3 (and other functional reactive libraries like RxJava) the programming model is quite different compared to the classic imperative style. Basically, your whole chain is lazy: you describe a sequence of operations and no actual processing happens until someone subscribes.

To give an example:

Assembly:

    this.myFlux = Flux.just("foo").map(String::length);

As stated above, all that really happens when this code is called is creating a chain of operators (under the hood this phase is also used for things like operator fusion, see below).

Execution:

    this.myFlux.subscribe(System.out::println);

When the subscribe method is called the actual chain is executed and the length of the string is printed in the console.

For those familiar with Java 8, it’s basically the same as with Java 8 streams: nothing really happens until a terminal operation is used (collect, reduce, count, …).
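The Java 8 streams analogy is easy to try out with the JDK alone. The sketch below (our own example, using only java.util.stream and no Reactor) records when things happen: the mapping lambda only runs once the terminal operation is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LazyStreamDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // "Assembly": only describes the pipeline, the lambda does not run yet.
        Stream<Integer> lengths = Stream.of("foo", "quux")
                .map(s -> {
                    log.add("mapping " + s);
                    return s.length();
                });
        log.add("assembled");

        // "Execution": the terminal operation pulls the elements through the pipeline.
        List<Integer> result = lengths.collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The log starts with "assembled", followed by the two "mapping" entries and finally "result [3, 4]", which shows that describing the pipeline did not execute it.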

One of the drawbacks of this is error handling. Since the error doesn’t happen until the subscription, it’s harder to see where the error actually happens. To alleviate this, Reactor provides a feature called assembly tracing which can be enabled with the checkpoint() operator.

Nothing happens until you subscribe.

Cold and hot observables

The previous statement holds for most observables typically encountered in a project: HTTP calls, data lookups, etc. These are cold observables: no data is actually being produced until someone subscribes, and when multiple subscribers subscribe to a cold observable, each of these subscriptions will trigger the whole chain from the start. Sometimes, however, an observable is a constantly emitting event stream. Such a hot observable keeps producing data regardless of its subscribers and will only give a subscriber the elements emitted after its subscription time.
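The difference can be modelled in a few lines of plain Java. Note that this is only a toy model of the two behaviours (the classes ColdSource and HotSource are our own invention), not how Reactor implements Flux.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Cold: every subscription replays the whole sequence from the start.
final class ColdSource {
    private final List<String> items;

    ColdSource(List<String> items) {
        this.items = new ArrayList<>(items);
    }

    void subscribe(Consumer<String> subscriber) {
        items.forEach(subscriber);
    }
}

// Hot: emits regardless of who is listening; late subscribers miss earlier items.
final class HotSource {
    private final List<Consumer<String>> subscribers = new ArrayList<>();

    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    void emit(String item) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(item);
        }
    }
}
```

Two subscribers of the same ColdSource both receive every item, while a subscriber that joins a HotSource after the first emit only sees what is emitted from then on.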

Scheduling

Reactor is concurrency agnostic which means it doesn’t impose a concurrency model while it does give the developer the tools to change Reactor’s executor behavior.

This is done using the scheduler abstraction: a scheduler defines the execution context, which can be the same thread, another thread or a thread pool. The special operators publishOn() and subscribeOn() allow you to change the execution context of the current chain.

The publishOn() changes the execution context of the downstream operators.

For example:

    flux.op1().op2().publishOn(scheduler1).op3().subscribe((result) -> doSomethingWith(result));

op1 and op2 will execute in the original execution context (usually the thread in which the subscribe is called). op3 and the action defined in the subscribe method itself will execute in scheduler1’s execution context.

The subscribeOn() changes the execution context of the subscription, meaning the context in which the execution of the chain starts.

For example:

    flux.op1().op2().subscribeOn(scheduler1).op3().subscribe((result) -> doSomethingWith(result));

The whole chain will execute in scheduler1’s execution context.

Work stealing

Although the previous topics surface easily and are simpler to demonstrate, this was perhaps the most abstract one of the session, along with operator fusion. When using schedulers supporting parallel execution, Reactor uses so-called ‘work stealing’ algorithms to balance the load on the different threads. If a thread is idle, it can take over execution of tasks that were originally scheduled to be executed by a different thread. Under the hood, this is achieved by using a shared queue for the tasks and a drain loop.

Operator fusion

One of the big advantages of having a chain of tasks defined and split up in multiple steps is that it allows the engine to identify possible optimizations in the chain. Since each individual operator also has an overhead (eg. queue for work stealing), it’s sometimes more efficient to combine some operators and execute them as one.

For example:

    map(a).map(b).map(c) => map(abc)

Reactor tries to achieve this by using a negotiation process between the operators.
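The idea behind map(a).map(b).map(c) => map(abc) can be illustrated with plain function composition from the JDK. This sketch only shows why the fused form gives the same result with fewer stages; it says nothing about Reactor’s actual negotiation protocol, and the functions A, B and C are arbitrary examples of ours.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FusionDemo {
    static final Function<Integer, Integer> A = x -> x + 1;
    static final Function<Integer, Integer> B = x -> x * 2;
    static final Function<Integer, Integer> C = x -> x - 3;

    // Three separate map stages, each with its own per-stage overhead.
    static List<Integer> unfused(Stream<Integer> in) {
        return in.map(A).map(B).map(C).collect(Collectors.toList());
    }

    // One stage that applies the composed function a, then b, then c.
    static List<Integer> fused(Stream<Integer> in) {
        return in.map(A.andThen(B).andThen(C)).collect(Collectors.toList());
    }
}
```

Both variants produce the same list for the same input, which is what makes this kind of rewrite safe for an engine to apply behind the scenes.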

Conclusion

This talk gave us more insight into the more advanced parts of Reactor, arming us with knowledge to tackle potential problems in a reactive environment and helping us understand Reactor’s deeper mechanisms.

For more information, Simon uploaded his presentation on Speaker Deck.

(Spring) Kafka - One more arsenal in a distributed toolbox by Nakul Mishra


Nakul started by describing Apache Kafka, a very potent messaging system that can very easily act as a conduit between your applications, as long as you stay away from recreating ESB anti-patterns with Kafka. Kafka is more than a message queue, combining speed, scalability and stronger ordering guarantees than traditional messaging systems. In order to benefit from this ordering, it is important to choose a correct partition key. Kafka puts more emphasis on smart consumers, meaning a more client centric approach vs the broker centric approach used by Message Oriented Middleware. By designing for retention and scale, Kafka gives consumers (clients) the time to process the messages they want to process whenever they want to.

It is also possible to use Kafka as a database by having it process a stream of data in real time using KSQL, the results of which can very easily be pushed to external systems (HDFS, S3, JDBC).

To be scalable, Kafka is kept simple at its core: all data is stored as a partitioned log. This means that writes are append-only and reads are a single seek-and-scan, allowing the underlying filesystem to very easily handle the storing and caching of messages. Also, when reading, data is copied directly from the disk buffer into the network buffer, bypassing the JVM, which is ideal for saturating your network.

Spring Kafka integrates Kafka with Spring, giving you all the benefits of the Spring ecosystem. It has also supported Kafka Streams for a few months now. Testing is made easier by the @EmbeddedKafka annotation and the KafkaTestUtils class:
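To get a feel for the partitioned log and for why the partition key matters for ordering, here is a toy in-memory model. It is deliberately simplified and is not Kafka’s implementation: the class PartitionedLog is hypothetical, and it uses String.hashCode() to pick a partition where Kafka’s default partitioner uses a different hash.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of a partitioned, append-only log.
final class PartitionedLog {
    private final List<List<String>> partitions = new ArrayList<>();

    PartitionedLog(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // The same key always maps to the same partition, so per-key ordering is preserved.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Append-only write; returns the offset of the record within its partition.
    long append(String key, String value) {
        List<String> partition = partitions.get(partitionFor(key));
        partition.add(value);
        return partition.size() - 1;
    }

    // Sequential read starting at an offset that the consumer keeps track of itself.
    List<String> read(int partition, long fromOffset) {
        List<String> log = partitions.get(partition);
        int start = (int) Math.min(fromOffset, log.size());
        return new ArrayList<>(log.subList(start, log.size()));
    }
}
```

Because the consumer, and not the log, remembers the offset, re-reading older records is just another read from an earlier offset. That is the “smart consumer” idea in miniature.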

    @EmbeddedKafka(partitions = 1,
            topics = {
                    KafkaStreamsTests.STREAMING_TOPIC1,
                    KafkaStreamsTests.STREAMING_TOPIC2 })

Spring Kafka also has a starter available for Spring Boot which makes it very easy to get started with Kafka and start playing around. As always, just go to start.spring.io and get the Kafka dependency.

The slides of this presentation can be found on SlideShare.

Breaking down monoliths into system of systems by Oliver Gierke


The goal of this workshop is not to provide a clear architecture of the perfect application, but rather to make you think and let you reflect on your existing applications.

There is a shorter version of this talk here while this one consumed a full two hours. This gave us the possibility to have a more in-depth look at the code that Oliver prepared and look into potential problems, which would be much harder were it a one hour session.

The workshop basically is a summary of observations about the workings of monoliths and microservices. It all tends to boil down to the correct definition of bounded contexts within applications, how you can divide your application in logical modules and how these can communicate with each other.

First we will observe what happens when a monolith is transformed into a microlith, AKA a distributed monolith. Subsequently we will improve the design of the monolith with these bounded contexts and end up with a modulith. This modulith is still a single application, a monolith, but with different bounded contexts each having clearly defined borders, allowing us to more easily divide the work over various teams.

From a modulith, one can go to a system of systems, a true microservice architecture. In a system of systems there are two ways you can implement the communication, either via messaging or via REST.

Refactoring to a system of systems (from Oliver Gierke’s workshop)

The sample code of this workshop can be found on GitHub.

Monolith

The monolith is reasonably ordered and the bounded contexts have been split in various packages.

When building your Java application, you should make optimal use of the package options provided by Java as mentioned in this blogpost of 2013. It stated:

  • Make your code package protected whenever it does not need to be accessed from the outside. A good starting point is to make your repositories no longer public.
  • Whenever there is leakage over the bounded contexts, for example when LineItems contains Products, try to use IDs and not the actual objects of another bounded context, because whenever you update an object used within another bounded context, you will also leak into that context.

It is also noted that badly structured applications tend to be built from the bottom up, from DB to the top. This means that your design is going to be way too data-centric instead of focusing on the real business interactions you are supposed to handle.

Try to avoid methods which update two bounded contexts simultaneously, as these methods have a tendency to draw more and more code in, and they tend to grow like a cancer, killing your application from the inside out.

To summarize:

  • Move bounded contexts into packages
  • Inter-context interaction is processed locally and results in either success or an exception (method calls in the JVM are very efficient and executed exactly once)
  • Avoid referencing two domain classes over bounded contexts, it’s convenient but results in problems
  • When you leak into other bounded contexts, there is a great risk of creating circular dependencies
  • When there are no clear boundaries, adding a new feature often requires you to touch other parts of the system
  • A monolith is easy to refactor
  • By its nature, it has strong consistency but this is also a disadvantage as transactions become more brittle when they fail because of related business functionality

The monolith example code can be found here.

Microlith

Creating a microlith means splitting up your systems into various smaller systems. This doesn’t mean that suddenly all your problems have been solved.

If you do not correctly define your bounded contexts in order to minimize your communication, chances are that you have made a microlith with the following problems:

  • You are no longer able to use local transaction consistency
  • Local method invocation is transformed into RPC-style HTTP calls
  • You have translated the transactions of your monolith into a distributed system, needing HTTP to update each other
  • Remote calls are executed while serving user requests, and this across multiple services
  • Running and testing requires the other services to be available
  • There is a strong focus on API contracts, which tend to be very CRUD-looking with a lack of business abstraction and hypermedia
  • Detecting breaking API changes is prioritized over making evolvable APIs
  • One tends to add more technology in order to solve issues: bulkheads, retries, circuit breakers, asynchronous calls, more monitoring systems, etc

It tends to minimize the risks of a rollback, but it does not really solve any issue: it just distributes your problems.

The first rule of distributed systems is: don’t distribute your system until you have an observable reason to.

If you did not define your bounded contexts properly, it is very difficult for you to observe how to distribute your system.

Example code of the microlith can be found here.

Modulith

We will start using events inside our modulith, as well as more domain specific methods, like an add() on an Order. This makes everything more abstract, making your domain objects much more than glorified getters and setters.

We don’t do CQRS or event sourcing, but we just use eventing as a way to signal events over bounded contexts.

These events make it relatively easy to split up the work: they can serve as either input or output for different services within the application. Your units of work will have clear boundaries, making testing and design easier.

The differences with a monolith are:

  • Focus of domain logic has moved to aggregates
  • Integration between bounded contexts is event based
  • The dependency between bounded contexts is inverted
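A rough sketch of what such an aggregate could look like is shown below, in plain Java. The class names and rules are our own assumptions and not the code from Oliver’s repository. In a Spring Data application, registered events could be exposed for publication via AbstractAggregateRoot or @DomainEvents; here they are simply collected in a list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Identifier of a Product that lives in another bounded context (hypothetical).
final class ProductId {
    final String value;

    ProductId(String value) {
        this.value = value;
    }
}

final class LineItem {
    final ProductId productId; // an ID, not the Product object itself
    final int quantity;

    LineItem(ProductId productId, int quantity) {
        this.productId = productId;
        this.quantity = quantity;
    }
}

// Event that other bounded contexts can react to.
final class OrderCompleted {
    final String orderId;

    OrderCompleted(String orderId) {
        this.orderId = orderId;
    }
}

final class Order {
    private final String id;
    private final List<LineItem> lineItems = new ArrayList<>();
    private final List<Object> domainEvents = new ArrayList<>();
    private boolean completed;

    Order(String id) {
        this.id = id;
    }

    // Domain-specific method instead of a setter for the list of line items.
    Order add(ProductId productId, int quantity) {
        if (completed) {
            throw new IllegalStateException("Order already completed");
        }
        if (quantity <= 0) {
            throw new IllegalArgumentException("Quantity must be positive");
        }
        lineItems.add(new LineItem(productId, quantity));
        return this;
    }

    // State change and event registration happen together inside the aggregate.
    Order complete() {
        if (lineItems.isEmpty()) {
            throw new IllegalStateException("Cannot complete an empty order");
        }
        completed = true;
        domainEvents.add(new OrderCompleted(id));
        return this;
    }

    List<Object> domainEvents() {
        return Collections.unmodifiableList(domainEvents);
    }

    int itemCount() {
        return lineItems.size();
    }
}
```

The Order never calls into the inventory or mailing context; those contexts depend on the OrderCompleted event instead, which is the inverted dependency mentioned in the list above.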

Side Step: Application Events with Spring Data

This is a very powerful mechanism to publish events in a Spring application.

Whenever you need to send data to another bounded context, you trigger events. This has the advantage that your business services no longer need to know about each other: they just need to trigger an event which gets picked up by the services which are interested in this event.

Transactional semantics are still retained because the eventing is synchronous, by default. This also applies to JEE eventing.

The @TransactionalEventListener annotation allows you to bind the handling of an event to a phase of the transaction (by default after the commit), so for example, you can send out an email only when an Order has truly been completed.

Side Step: Error Scenarios

When a synchronous event listener fails, the exception is handled by the surrounding transaction, which rolls back, so no worries.

But when an asynchronous event listener fails, the transaction does not get rolled back and you will need to deal with retries.

You can make use of an Event Publication Registry when you use transactional event listeners. These listeners are decorated so that a log entry is written before the commit, since the system needs to know which listeners the events still need to be sent to. When an event has been processed, its log entry is cleared. If it doesn’t get cleared, the system can keep retrying, so you don’t lose events.
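The log-then-clear mechanism can be sketched in a few lines of plain Java. This is an in-memory illustration under our own assumptions, not the implementation shown in the workshop: the class `PublicationRegistrySketch` and its methods are hypothetical, and a real registry would persist its entries in the same transaction as the business change.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical, in-memory stand-in for an event publication registry.
final class PublicationRegistrySketch {
    // Publications that were logged but not yet completed, keyed by id.
    private final Map<Integer, String> pending = new LinkedHashMap<>();
    private int nextId = 0;

    // Step 1: log the publication (in a real system: before the commit).
    int register(String event) {
        int id = nextId++;
        pending.put(id, event);
        return id;
    }

    // Step 2: try to deliver; only clear the log entry on success.
    void deliver(int id, Consumer<String> listener) {
        String event = pending.get(id);
        if (event == null) {
            return; // already completed
        }
        try {
            listener.accept(event);
            pending.remove(id);
        } catch (RuntimeException e) {
            // The entry stays in the log so it can be retried later.
        }
    }

    // Step 3: retry everything that is still incomplete.
    void retryPending(Consumer<String> listener) {
        for (Integer id : new ArrayList<>(pending.keySet())) {
            deliver(id, listener);
        }
    }

    int pendingCount() {
        return pending.size();
    }
}
```

A failing listener leaves its entry in place, so a later call to `retryPending` gets another chance to deliver the event.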

Example code of the modulith can be found here.

System of Systems

Messaging

Whenever you make use of a message broker such as Apache Kafka or RabbitMQ, you introduce a potential single point of failure. These brokers know about all the messages of all the systems and decide how long these messages will be retained.

Coupling still exists, although it is not explicit: the message format decides which version of a service can process these messages, just as with REST. Especially if you keep your events for a long time, which is possible with Kafka, you might need to think about transforming existing events.

But these messaging systems tend to be designed for scale.

Pro tip: make use of JsonPath annotations for the message payload in order to make it more robust.

Example code of a system of systems with messaging can be found here.

REST Messaging

If you use REST you will have to deal with caching, pagination and conditional requests.

Messages do not tend to be stored for long periods of time and most communication tends to be synchronous. You do have to take care not to lose events, as your application will immediately know whether the message was processed correctly, incorrectly or timed out.

REST Polling

When using a polling mechanism, your producers do not send out messages to your consumers, but the consumers will poll the producers for new events to process.

This means that:

  • You do not need any additional infrastructure such as Apache Kafka or another message bus
  • Event publication is part of the local transaction
  • The publishing system (producer) controls the lifecycle of the events and can transform these if necessary
  • The events never leave the publishing system
  • There might be a bigger consistency gap, depending on how frequently the consumers poll
  • It does not scale that well
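The polling flow described above can be sketched without any HTTP plumbing. In this plain-Java illustration, `PollingProducer` and `PollingConsumer` are hypothetical names, and `eventsSince(offset)` stands in for what a resource along the lines of GET /events?since={offset} might return; neither the names nor the URL come from the workshop code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical producer: owns its events and exposes them for polling.
final class PollingProducer {
    private final List<String> events = new ArrayList<>();

    // Called as part of the producer's own unit of work.
    void publish(String event) {
        events.add(event);
    }

    // Returns a copy of all events from the given offset onwards.
    List<String> eventsSince(int offset) {
        int from = Math.min(Math.max(offset, 0), events.size());
        return new ArrayList<>(events.subList(from, events.size()));
    }
}

// Hypothetical consumer: remembers how far it has read.
final class PollingConsumer {
    private int offset = 0;
    private final List<String> processed = new ArrayList<>();

    // One polling round; in practice this would run on a schedule.
    int poll(PollingProducer producer) {
        List<String> fresh = producer.eventsSince(offset);
        processed.addAll(fresh);
        offset += fresh.size();
        return fresh.size();
    }

    List<String> processed() {
        return processed;
    }
}
```

The consistency gap from the list above is visible here: an event published right after a polling round is only seen on the next call to `poll`.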

The example code with REST can be found here.

Conclusion

This was a great workshop which makes you think about the design decisions you have made for your applications. If you ever get the opportunity to participate in one of these workshops, do not hesitate to join as they are much more valuable than regular talks that can be viewed online as well.

Observability with Spring based distributed systems by Tommy Ludwig

Tommy Ludwig

Introduction

Tommy’s talk introduced three main pillars of observability: logging, metrics, and tracing.

Tommy explained that observability is achieved through a set of tools and practices that aim to turn data points and contexts into insights. Observability is something you should care about, as it helps you provide a great experience for the users of your system and builds confidence in production, where failure will happen. You ought to give yourself the tools you need to be a good owner and to detect these failures as early as possible. Mean time to recovery is key here. He also quoted Werner Vogels, the CTO of Amazon: “You build it, you run it”, adding that you also need to monitor it.

Within a Spring Boot project, we have access to Actuator and it is awesome. It comes with a lot of goodies out of the box. There is also Spring Boot Admin that makes it easy to access and use each instance’s Actuator endpoints.

Distributed systems are hard to observe by design, as a request spans multiple processes. You therefore need to stitch these together in order to fully make sense of it. There are also more points of failure, and adding multiple instances of the same service, for scaling reasons, will only increase the monitoring complexity.

Tommy named three sides to observability:

  • Logging
  • Metrics
  • Tracing

Logging

Logs are request-scoped, arbitrary messages that you want to be able to find again later. They are formatted to give you context via things such as logging levels and the timestamp. The issue with logs is that they do not scale, concurrent requests intermingle logs, and searching through them can be cumbersome. In order to tackle these issues you can make use of centralized logging while also adding a query capability to retrieve a collection of matching logs.

Within Spring Boot we can configure the logging via Spring Environment and via Actuator at runtime. Spring Cloud Sleuth is useful to add a trace ID for request correlation.

Metrics

Metrics aggregate time series data and have a bounded size. You can slice these based on dimensions, tags and labels. The main goal of metrics is to visualize and identify trends and deviations, and to raise alerts based on metric queries. Some examples of metrics are: response time, the response’s body size and memory consumed. In order to properly measure all this, you need to set up a metrics backend to which all applications publish their metrics data.

In Spring Boot 2, Micrometer is introduced as its native metrics library. Micrometer supports many metrics backends such as Atlas, Datadog, Prometheus, SignalFX and Wavefront. A lot of the instrumentation is auto-configured by Spring Boot and custom metrics are added easily. These are configurable via properties and common tags such as the application name, the instance, region, zone, and more.

Tracing

Local tracing happens via the Actuator /httptrace endpoint and displays the latency data. With distributed tracing you can go across process boundaries, which is useful as metrics lack request context and logs have a local context but limited distributed info. You define yourself which fraction of the requests gets traced, as you don’t want to trace everything, especially under high load. This sampling rate is configurable at runtime, which is especially handy for debugging errors in production. Zipkin with its UI helps you to see the timing information visually and is a good tracing backend for Spring applications.

Using Spring Cloud Sleuth, the tracing instrumentation via Zipkin’s Brave is auto-configured. Via properties you can configure things such as the sampling probability and whether certain endpoints should be skipped. It is also compatible with the OpenTracing standard that is being developed under the wings of the CNCF.
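To illustrate what a sampling probability means in practice, here is a small plain-Java sketch of a probability-based sampler whose rate can be changed while the application is running. `ProbabilitySamplerSketch` is a hypothetical class written for this post; it is not the sampler that Sleuth or Brave actually ship.

```java
import java.util.function.DoubleSupplier;

// Hypothetical probability-based sampler; not Sleuth's or Brave's own class.
final class ProbabilitySamplerSketch {
    private volatile double probability;
    private final DoubleSupplier random;

    // The random source should return values in [0, 1), e.g. Math::random.
    ProbabilitySamplerSketch(double probability, DoubleSupplier random) {
        this.random = random;
        setProbability(probability);
    }

    // Can be called at runtime, e.g. to trace everything while debugging.
    void setProbability(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0, 1]");
        }
        this.probability = probability;
    }

    // Decides per request whether its trace should be recorded.
    boolean shouldTrace() {
        return random.getAsDouble() < probability;
    }
}
```

With `new ProbabilitySamplerSketch(0.1, Math::random)` roughly one in ten requests would be traced, and raising the probability to 1.0 during an incident would trace every request.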

Correlation everywhere

Having set up all of these, you now have correlated logging, metrics and tracing across your system, and you can find the data from each based on identifiers.

Observability cycle

If an issue occurs, we can take the following steps to troubleshoot and bandage the situation:

  • The issue should have been reported via an alert or report
  • We check the metrics of our system
  • If needed, we check the tracing data
  • If needed, we check the logs
  • Based on the gathered information we can triage the issue and make adjustments to prevent a recurrence

Key takeaways

System wide observability is crucial in distributed architectures. The tools to help you with this exist and Spring makes it easy to integrate them in your system as the most common cases are covered out-of-the-box or are easily configurable. Use the right tool for the job and synergise across the different tools.

Day 2: Talks & Workshops

Machine learning exposed: The fundamentals by James Weaver

James Weaver

Machine Learning is a hot topic in tech land with all kinds of applications like predicting property prices, forecasting weather, self-driving cars, plant classification and so on. James gave a brief overview of the fundamentals of Machine Learning and its applications.

But how can we define Machine Learning? Andrew Ng, co-founder of Coursera and Adjunct Professor at Stanford University, defined it in his introduction course “Welcome To Machine Learning” [1] as “the science of getting computers to learn, without being explicitly programmed”. An example that Andrew gave was a cleaning robot that can tidy your house. Instead of explicitly programming how it should clean, you can for instance let the robot watch you while you demonstrate the tasks, so that it learns from them.

James then gave examples of the different categories of machine learning.

Categories of Machine Learning

Supervised learning

This was the category for which James gave the most examples during his talk. Supervised learning is where you train your model with a dataset which contains the initial data and its correct answers. In general, the more training data you have, the more accurate your predictions will be.

Regression example

An example he showed us was the prediction of housing prices using regression.

House price prediction (from Andrew Ng’s Machine Learning course)

In this example, the dataset consists of instances with a square footage (input) and price (output). With a regression, we can predict a continuous valued price.
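As a rough illustration of such a regression, the sketch below fits a straight line through (square footage, price) pairs using ordinary least squares. The class name `HousePriceRegression` is hypothetical, a single input variable is our own simplification, and the code was not shown in the talk.

```java
// Ordinary least squares with one input: price ≈ intercept + slope * squareFeet.
final class HousePriceRegression {
    final double slope;
    final double intercept;

    HousePriceRegression(double[] squareFeet, double[] price) {
        if (squareFeet.length != price.length || squareFeet.length < 2) {
            throw new IllegalArgumentException("need at least two (input, output) pairs");
        }
        double meanX = mean(squareFeet);
        double meanY = mean(price);
        double covariance = 0.0;
        double variance = 0.0;
        for (int i = 0; i < squareFeet.length; i++) {
            double dx = squareFeet[i] - meanX;
            covariance += dx * (price[i] - meanY);
            variance += dx * dx;
        }
        if (variance == 0.0) {
            throw new IllegalArgumentException("inputs must not all be identical");
        }
        slope = covariance / variance;
        intercept = meanY - slope * meanX;
    }

    // Predicts a continuous-valued price for an unseen square footage.
    double predict(double squareFeet) {
        return intercept + slope * squareFeet;
    }

    private static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }
}
```

Training corresponds to the constructor, which learns `slope` and `intercept` from the labelled examples; `predict` then applies them to new inputs.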

Classification example

Iris flower classification using machine learning, an example of Supervised Learning using classification. Source: Nicoguaro's Wikipedia media gallery (CC BY 4.0)

Another example of Supervised Learning is determining the species of an Iris flower. The algorithm tries to determine the species of the flower with the Sepal and Petal size as input.

Unsupervised Learning

For Unsupervised Learning on the other hand, you don’t give the right answers with your dataset. Your learning algorithm will try to find a structure in the given data.

One way to find such a structure is clustering. This means the data is grouped into clusters of data points that more or less belong together. Market segment discovery and social media analysis are examples of Unsupervised Learning.

Reinforcement Learning

With Reinforcement Learning, you give your algorithm rewards when it does something well. This type of learning is very popular in game playing. AlphaGo from Google DeepMind, for example, was trained using Reinforcement Learning.

Neural networks

Neural network example

The second part of his talk was about Neural Networks. (Artificial) Neural Networks are computing systems that are inspired by biological neural networks. They are made up of highly interconnected processing elements or ‘nodes’ that can process information. A Neural Network consists of different layers: an input layer, one or more hidden layers and an output layer.
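How data flows through those layers can be sketched in plain Java. The snippet below is our own minimal illustration with one hidden layer and a sigmoid activation, both of which are assumptions; `TinyNetworkSketch` is a hypothetical name and the code is unrelated to the deeplearning4j demo from the talk.

```java
// Hypothetical feed-forward pass: input layer -> hidden layer -> output layer.
final class TinyNetworkSketch {

    // Sigmoid activation squashes any value into the range (0, 1).
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Computes one layer; weights[j][i] connects input i to node j.
    static double[] layer(double[] inputs, double[][] weights, double[] biases) {
        double[] outputs = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = biases[j];
            for (int i = 0; i < inputs.length; i++) {
                sum += weights[j][i] * inputs[i];
            }
            outputs[j] = sigmoid(sum);
        }
        return outputs;
    }

    // Feeds the inputs through the hidden layer and then the output layer.
    static double[] forward(double[] inputs,
                            double[][] hiddenWeights, double[] hiddenBiases,
                            double[][] outputWeights, double[] outputBiases) {
        double[] hidden = layer(inputs, hiddenWeights, hiddenBiases);
        return layer(hidden, outputWeights, outputBiases);
    }
}
```

For the Iris example the input array would hold the four sepal and petal measurements and the output array one score per species; training, which adjusts the weights and biases, is left out of this sketch.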

We can visually demonstrate how Neural Networks work with the help of deeplearning4j. You can clone and try out his example on https://github.com/JavaFXpert/visual-neural-net-server.

Let’s use the flower classification example with our neural network.

Iris flower classification Neural Network Example

1: Welcome to Machine Learning (Andrew Ng)

Testing every level of your Spring Microservices application (Workshop) by Jeroen Sterken & Kristof Van Sever

Introduction

This workshop focused on testing the different levels of a microservices application. It was split up into two parts:

  • Testing within a single microservice
  • Testing the relationships between microservices with Spring Cloud Contract

Testing a single microservice with Cucumber and JUnit

The presentation started off with some of the new features that JUnit 5 has to offer. Since JUnit 5 requires Java 8 or higher, it allows you to use lambdas in assertions, as well as grouped assertions with the assertAll() method. It’s also possible to run a test multiple times with different parameters, by annotating it with @ParameterizedTest and supplying the arguments with @ValueSource.

Behaviour driven testing with Cucumber

Unit tests alone are not enough of course: you also need to test how different components work together. Usually it’s the developer who writes such tests, but with Cucumber it’s also possible for non-technical people to write them.

How does this work exactly?

Cucumber achieves this by using Gherkin, a plain-text language that reads like English. The different scenarios for a certain feature are described in .feature files.


Feature: A new empty basket can be created and filled with Tapas

Scenario: Client creates a new Basket, and verifies it's empty
    When the user creates a new Basket
    Then the total number of items in the Basket with id 1 equals 0

The above is a simple example of how to describe a feature and a scenario. Feature, Scenario, When and Then are Gherkin keywords, and each When or Then line describes a step. It’s also possible to substitute Scenario with Scenario Outline in case you need to test the same scenario with different values. You can put parameters inside angle brackets (<>), which are substituted with values that you define in an Examples data table.

The next step is to annotate your methods with the Cucumber step annotations (e.g. @Given, @When); these annotated methods are the step definitions. The text that you pass to the annotation should match the text in your .feature file so that Cucumber can glue the two together. In this case, the step When the user creates a new Basket from the example above matches:

@When("^the user creates a new Basket$")

The annotated methods should execute what you described in the feature files, so in this case the method looks something like this:

public void theUserCreatesANewBasket() {
    userBasketManagement.createNewBasket();
}

To try it out for yourself, go to the Workshop repo. There’s a solution branch in case you’re stuck or wish to compare your code.

Spring Cloud Contract

One of the challenges of testing chained microservices is making sure that a microservice stub reflects the actual service at all times.

One way this can be achieved is by using Spring Cloud Contract. Spring Cloud Contract enables Consumer Driven Contract development, where one service (consumer) defines its expectations of another service (producer) through a contract.

The first step is for the consumer to write the test for the new feature, following the TDD approach.

Next, add the Spring Cloud Starter Contract Verifier dependency and Maven plugin to your producer. Create a base test class in the test package that loads the Spring Context. Make sure to annotate it with @AutoConfigureMessageVerifier.

We should also add the contract to our resources on the producer-side:


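import org.springframework.cloud.contract.spec.Contract

// Contract for GET /tapas: the producer must answer with this JSON list
Contract.make {
    description "should return a list of all tapas"

    request {
        method GET()
        url "/tapas"
    }

    response {
        status 200
        headers {
            contentType applicationJson()
        }
        body(
            [
                [
                    id: 0,
                    name: "All i oli",
                    price: 1.5
                ],
                [
                    id: 1,
                    name: "Banderillas",
                    price: 3
                ]
            ]
        )
    }
}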

The above is an example of how a contract is defined, written in Groovy (although YAML is a possibility as well). It simply specifies that a GET request to /tapas should return the provided body as application/json.

Now it’s time to create the stub. Since you’ve already added the dependency and plugin to your producer, simply run your build for the plugin to generate the stubs. The built stub artifact will be stored in your local Maven repository. The plugin will also create a test class that extends the base test class we created earlier, containing the necessary setup to run your tests.

The next step is to add the Spring Cloud Contract Stub Runner dependency to the consumer. By annotating your test class with @AutoConfigureStubRunner and providing the groupId and artifactId of the stub, along with the port on which it will run, your test class is configured to use the producer’s generated stub.

Interested in trying out this Spring Cloud Contract workshop? You can find the GitHub repo here.

Got triggered?

All talks were recorded by the Spring IO team. You can view them on YouTube.


Dieter is a Principal Java Consultant at Ordina, passionate about all Java- and JavaScript related technologies. Aside from his day-to-day occupation as a consultant, he helps fellow developers as a Competence Leader for the Cloud & PaaS Competence Center by giving workshops, talks and courses about the newest technologies. In his spare time, Dieter enjoys playing soccer, running, (online) gaming and fiddling around with all kinds of fancy new software.

Yannick is a senior Java consultant and competence lead of the JVM languages competence center at Ordina Belgium. He’s very much interested in everything Java and Spring related as well as reactive programming, Kotlin, Lightbend technologies, software architectures, and coaching and enabling other colleagues.

Tom is a Senior Developer at Ordina Belgium, passionate about all software related to data. As competence leader Big & Fast Data he guides his fellow developers through dark data swamps by giving workshops and presentations. Tom is passionate about learning new technologies and frameworks.

Johan is a Java Developer at Ordina Belgium. He has been passionate about technology and science since he was a kid and is always up for a challenge.

Jens is a Java Developer at Ordina Belgium with a keen interest in frontend development as well. He’s always up for learning new technologies and is passionate about writing quality tests.

Tim is a Java Developer at Ordina Belgium. His main focus is on back-end development. He is passionate about Microservices, Domain driven design and refactoring.

Yen is a Java Developer at Ordina Belgium who wants to further develop his back-end skills by using the latest technologies within challenging projects.