Running Eclipse Vert.x applications with Eclipse OpenJ9

This how-to provides some tips for running Vert.x applications with OpenJ9, an alternative Java Virtual Machine built on top of OpenJDK that is gentle on memory usage.

Vert.x is a resource-efficient toolkit for building all kinds of modern distributed applications, and OpenJ9 is a resource-efficient runtime that is well-suited for virtualized and containerized deployments.

What you will build and run

  • You will build a simple microservice that computes the sum of two numbers through an HTTP JSON endpoint.

  • We will look at the options for improving startup time with OpenJ9.

  • We will measure the resident set size memory footprint on OpenJ9 under a workload.

  • You will build a Docker image for the microservice and OpenJ9.

  • We will discuss how to improve the startup time of Docker containers and how to tune OpenJ9 in that environment.

What you need

  • A text editor or IDE

  • Java 21

  • OpenJ9

  • Maven or Gradle

  • Docker

  • Locust to generate some workload

Note

Eclipse Foundation projects are not permitted to distribute, market or promote JDK binaries unless they have passed a Java SE Technology Compatibility Kit licensed from Oracle, to which the Eclipse OpenJ9 project does not currently have access. You can either build your own Eclipse OpenJ9 binary, or download an IBM Semeru runtime.

Create a project

The code of this project contains Maven and Gradle build files that are functionally equivalent.

With Gradle

Here is the content of the build.gradle.kts file that you should be using:
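The file itself is not reproduced in this page, so here is a hedged sketch of what it could look like (the Shadow plugin and Vert.x versions below are illustrative assumptions, not taken from the how-to):

```kotlin
plugins {
  java
  application
  // Shadow bundles all dependencies into a single "-all" JAR (version is an assumption)
  id("com.github.johnrengelman.shadow") version "8.1.1"
}

repositories {
  mavenCentral()
}

dependencies {
  // vertx-web brings in vertx-core transitively
  implementation("io.vertx:vertx-web:5.0.0")
}

java {
  toolchain {
    languageVersion.set(JavaLanguageVersion.of(21))
  }
}

application {
  // Used by both `gradlew run` and the shadow JAR manifest
  mainClass.set("io.vertx.howtos.openj9.Main")
}
```

The openj9-howto-all.jar file name used later assumes the project is named openj9-howto in settings.gradle.kts.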

With Maven
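The Maven build is likewise not reproduced here; a hedged, minimal pom.xml sketch (coordinates and versions are illustrative assumptions) could look like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>io.vertx.howtos</groupId>
  <artifactId>openj9-howto</artifactId>
  <version>1.0-SNAPSHOT</version>

  <properties>
    <maven.compiler.release>21</maven.compiler.release>
  </properties>

  <dependencies>
    <dependency>
      <groupId>io.vertx</groupId>
      <artifactId>vertx-web</artifactId>
      <version>5.0.0</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <!-- maven-shade-plugin plays the same role as Shadow in the Gradle build -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.6.0</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <mainClass>io.vertx.howtos.openj9.Main</mainClass>
                </transformer>
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```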

Writing the service

The service exposes an HTTP server and fits within a single Java class:

package io.vertx.howtos.openj9;

import io.vertx.core.Future;
import io.vertx.core.VerticleBase;
import io.vertx.core.Vertx;
import io.vertx.core.json.JsonObject;
import io.vertx.ext.web.Router;
import io.vertx.ext.web.RoutingContext;
import io.vertx.ext.web.handler.BodyHandler;

import static java.util.concurrent.TimeUnit.MILLISECONDS;
import static java.util.concurrent.TimeUnit.NANOSECONDS;

public class Main extends VerticleBase {

  @Override
  public Future<?> start() {
    Router router = Router.router(vertx);
    router.post().handler(BodyHandler.create());
    router.post("/sum").handler(this::sum);

    return vertx.createHttpServer()
      .requestHandler(router)
      .listen(8080);
  }

  private void sum(RoutingContext context) {
    JsonObject input = context.body().asJsonObject();

    Integer a = input.getInteger("a", 0);
    Integer b = input.getInteger("b", 0);

    JsonObject response = new JsonObject().put("sum", a + b);

    context.response()
      .putHeader("Content-Type", "application/json")
      .end(response.encode());
  }

  public static void main(String[] args) {
    long startTime = System.nanoTime();
    Vertx vertx = Vertx.vertx();
    vertx.deployVerticle(new Main()).await();
    long duration = MILLISECONDS.convert(System.nanoTime() - startTime, NANOSECONDS);
    System.out.println("Started in " + duration + "ms");
  }
}

We can run the service:

$ ./gradlew run

and then test it with HTTPie:

$ http :8080/sum a:=1 b:=2
HTTP/1.1 200 OK
Content-Type: application/json
content-length: 9

{
    "sum": 3
}

$

We can also build a JAR archive with all dependencies bundled, then execute it:

$ ./gradlew shadowJar
$ java -jar build/libs/openj9-howto-all.jar

Improving startup time

The microservice reports the startup time by measuring the time between entering the main method and the moment the HTTP server has started listening.

We can do a few runs of java -jar build/libs/openj9-howto-all.jar and pick the best time. On my machine the best I got was 311ms.

OpenJ9 offers both an ahead-of-time compiler and a shared classes cache for improving startup time as well as reducing memory consumption. The first run is typically costly, but all subsequent runs benefit from the caches, which are also regularly updated.

The relevant OpenJ9 flags are the following:

  • -Xshareclasses: enable class sharing

  • -Xshareclasses:name=NAME: a name for the cache, typically one per application

  • -Xshareclasses:cacheDir=DIR: a folder for storing the cache files

Let us do a few runs of:

$ java -Xshareclasses -Xshareclasses:name=sum -Xshareclasses:cacheDir=_cache -jar build/libs/openj9-howto-all.jar

On my machine the first run takes 457ms, which is "much" more than 311ms! However, the next runs all land near 130ms, with a best of 112ms, which is a very good start time for a JVM application.

Memory usage

Let us now measure the memory usage of the microservice with OpenJ9 and compare it with OpenJDK (HotSpot).

Warning

This is not a rigorous benchmark. You have been warned 😉

Generate some workload

We are using Locust to generate some workload. The locustfile.py file contains the code to simulate users that perform sums of random numbers:

from locust import *
import random
import json

class Client(HttpUser):
    wait_time = between(0.5, 1)
    host = "http://localhost:8080"

    @task
    def sum(self):
        data = json.dumps({"a": random.randint(1, 100), "b": random.randint(1, 100)})
        self.client.post("/sum", data=data, name="Sum", headers={"content-type": "application/json"})

We can then run locust, and connect to http://localhost:8089 to start a test. Let us simulate 100 users with a hatch rate of 10 new users per second. This gives us about 130 requests per second.

Measuring RSS

The Quarkus team has a good guide on measuring RSS. On Linux you can use either ps or pmap to measure RSS, while on macOS ps will do. I am using macOS, so once I have the process id of a running application I can get its RSS as follows:

$ ps x -o pid,rss,command -p 66820
    PID   RSS COMMAND
  66820 124032 java -jar build/libs/openj9-howto-all.jar
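If you do not want to hunt for the process id by hand, the JVM can also report it itself. Here is a small sketch (the class name is mine; the /proc read works on Linux only, while on macOS you would still use ps as shown above):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RssProbe {

  // The process id of the current JVM, handy as the -p argument to ps or pmap.
  static long pid() {
    return ProcessHandle.current().pid();
  }

  // On Linux, the resident set size in kB can be read from /proc.
  // (macOS has no /proc, so stick with `ps x -o pid,rss,command` there.)
  static long rssKb() throws IOException {
    for (String line : Files.readAllLines(Path.of("/proc/self/status"))) {
      if (line.startsWith("VmRSS:")) {
        // The line looks like "VmRSS:   123456 kB"; keep only the digits.
        return Long.parseLong(line.replaceAll("\\D", ""));
      }
    }
    throw new IOException("VmRSS not found in /proc/self/status");
  }

  public static void main(String[] args) throws IOException {
    System.out.println("pid = " + pid() + ", rss = " + (rssKb() / 1024) + " MB");
  }
}
```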

For all measurements we start Locust and let it warm up the microservice. After a minute we reset the stats and restart a test, then record the RSS and the 99th percentile latency. We will try to run the application with no tuning, and then by limiting the maximum heap size (see the -Xmx flag).

With OpenJDK 21 and no tuning:

  • RSS: ~143 MB

  • 99th percentile latency: 1ms

With Semeru 21 and no tuning:

  • RSS: ~90 MB

  • 99th percentile latency: 1ms

OpenJ9 is clearly very efficient with respect to memory consumption, without compromising the latency.

Tip

As usual, take these numbers with a grain of salt and perform your own measurements on your own services, with a workload that is representative of your usage.

Building and running a Docker image

We have seen how gentle OpenJ9 is on memory, even without tuning. Let us now package the microservice as a Docker image.

Here is the Dockerfile you can use:

FROM ibm-semeru-runtimes:open-21-jdk
RUN mkdir -p /app/_cache
COPY build/libs/openj9-howto-all.jar /app/app.jar
VOLUME /app/_cache
EXPOSE 8080
CMD ["java", "-Xvirtualized", "-Xshareclasses", "-Xshareclasses:name=sum", "-Xshareclasses:cacheDir=/app/_cache", "-jar", "/app/app.jar"]

You can note:

  • -Xvirtualized is a flag for virtualized / containerized environments: it tells OpenJ9 to reduce CPU consumption when idle

  • /app/_cache is a volume that will have to be mounted for containers to share the OpenJ9 classes cache.

The image can be built as in:

$ docker build . -t openj9-app

We can then create containers from the image:

$ docker run -it --rm -v /tmp/_cache:/app/_cache -p 8080:8080 openj9-app

Again the first container is slower to start, while the next ones benefit from the cache.
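To run several such containers side by side while sharing one cache, the volume can also be declared in a Compose file; here is a hedged sketch (the file layout, service names and volume name are mine, not from the how-to):

```yaml
services:
  sum-1:
    image: openj9-app
    ports:
      - "8080:8080"
    volumes:
      - class-cache:/app/_cache
  sum-2:
    image: openj9-app
    ports:
      - "8081:8080"
    volumes:
      - class-cache:/app/_cache

volumes:
  # Named volume shared by both containers, so the second one
  # starts faster thanks to the cache populated by the first.
  class-cache:
```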

Tip

On some platforms, SELinux may not grant containers permission to access the class data cache directory. You can use the z volume flag to fix such issues, as in:

docker (…​) -v /tmp/_cache:/app/_cache:z (…​)

Summary

  • We wrote a microservice with Vert.x.

  • We ran this microservice on OpenJ9.

  • We improved startup time using class data sharing.

  • We put the microservice under some workload, then checked that the memory footprint remained low with OpenJ9 compared to OpenJDK with HotSpot.

  • We built a Docker image with OpenJ9, class data sharing for fast container boot time and diminished CPU usage when idle.


Last published: 2024-12-21 00:38:08 +0000.