How to make it simple and straightforward to launch Java processes in Linux / Docker

As a DevOps engineer, I often automate the installation and configuration of various IT systems in all kinds of environments, from containers to the cloud. I have worked with many systems built on the Java stack, from small ones (like Tomcat) to large-scale ones (Hadoop, Cassandra, etc.).


Moreover, almost every such system, even the simplest, for some reason came with a complex, one-of-a-kind launch mechanism. At a minimum these were multi-line shell scripts, as in Tomcat, or even whole frameworks, as in Hadoop. My current "patient" in this series, the one that inspired this article, is the Nexus OSS 3 artifact repository, whose launch script runs to ~400 lines of code.


The opacity, redundancy, and complexity of startup scripts cause problems even when you install a single component manually on a local system. Now imagine that a set of such components and services has to be packaged into Docker containers, wrapped in yet another layer of abstraction for more or less adequate orchestration, deployed to a Kubernetes cluster, and the whole process implemented as a CI/CD pipeline...


In short, let's use the aforementioned Nexus 3 as an example to figure out how to get back from the labyrinth of shell scripts to something closer to java -jar <program.jar>, given the convenient modern DevOps tools at our disposal.


Where does this complexity come from?


In a nutshell: in ancient times, when saying "UNIX" did not prompt the follow-up question "you mean Linux?", and Systemd, Docker, and the like did not exist, processes were managed with portable shell scripts (init scripts) and PID files. Init scripts set up the necessary environment, which differed from one UNIX system to another, and, depending on their arguments, started the process or restarted/stopped it using the ID stored in the PID file. The approach is simple and straightforward, but these scripts broke in every nonstandard situation and required manual intervention, did not allow running several copies of a process... but that is beside the point.
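To make the mechanism concrete, here is a minimal sketch of such a PID-file script. It is illustrative only: the PID file path and the `sleep` command standing in for a real daemon are assumptions, and real init scripts of that era were far longer.

```shell
#!/bin/bash
# Sketch of a classic PID-file init script. The daemon command and the
# PID file location are placeholders, not taken from any real project.
PIDFILE="${PIDFILE:-/tmp/mydaemon.pid}"
DAEMON_CMD=(sleep 300)   # stand-in for the real service command

start() {
    # Launch in the background and remember the process ID in a file.
    "${DAEMON_CMD[@]}" >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
}

stop() {
    # Kill whatever PID the file names. If the process died earlier and the
    # PID was reused, this hits the wrong process: one of the classic flaws.
    [ -f "$PIDFILE" ] && kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE"
}

status() {
    # kill -0 only checks that a process with this PID exists.
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

case "${1:-}" in
    start)   start ;;
    stop)    stop ;;
    restart) stop; start ;;
    status)  status && echo running || echo stopped ;;
esac
```

Note that nothing here supervises the process: if it crashes, the stale PID file stays behind until someone cleans it up by hand.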


So, if you look closely at the above-mentioned startup scripts in Java projects, you can see the obvious signs of this prehistoric approach in them, including even references to SunOS, HP-UX and other UNIX systems. As a rule, such scripts do something like the following:



The mentioned Nexus 3 startup script is a suitable example of such a script.


In effect, all this scripting logic tries to stand in for the system administrator, who would otherwise install and configure everything by hand for a specific system from start to finish. But in general it is impossible to account for every requirement of every kind of system. So the result is the opposite: a headache both for developers, who have to maintain these scripts, and for system engineers, who later have to make sense of them. From my point of view, it is much easier for a system engineer to learn the JVM parameters once and configure it properly than to untangle the intricacies of yet another startup script every time a new system is installed.


What to do?


Simplify! KISS and YAGNI are at our service. All the more so because it is 2018, which means that:



So let's go through the functionality of the startup scripts again with the points above in mind, without trying to do the system engineer's job, and strip out everything "extra".



As a result, we just need to assemble and run a Java command of the form java <opts> -jar <program.jar> under the chosen process manager (Systemd, Docker, etc.). All parameters and options (<opts>) are left to the system engineer, who tunes them for the specific environment. If the list of options <opts> gets long, you can return to the idea of a startup script, but this time one that is as compact and declarative as possible, containing no programming logic.


Example


As an example, let's see how to simplify the Nexus 3 startup script.


The easiest option is to stay out of the jungle of this script: just run it under real conditions (./nexus start) and look at the result. For example, you can find the full argument list of the running application in the process table (via ps -ef), or run the script in debug mode (bash -x ./nexus start) to watch the whole execution and see the launch command at the very end.
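On Linux the same argument list can also be read from /proc, which avoids the truncation that some ps output formats apply. The helper below is a small sketch, not part of Nexus; the function name show_args and the pgrep pattern in the comment are made up for illustration.

```shell
#!/bin/bash
# Print the arguments of a running process, one per line.
# /proc/<pid>/cmdline is Linux-specific; arguments are NUL-separated there.
show_args() {
    local pid="$1"
    tr '\0' '\n' < "/proc/${pid}/cmdline"
}

# Hypothetical usage: find the JVM started by ./nexus and dump its arguments.
# pid="$(pgrep -f NexusMain | head -n 1)" && show_args "$pid"
```

Having one argument per line makes the long JVM command much easier to read and to compare between runs.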


I ended up with the following Java command.
 /usr/java/jdk1.8.0_171-amd64/bin/java -server -Dinstall4j.jvmDir=/usr/java/jdk1.8.0_171-amd64 -Dexe4j.moduleName=/home/nexus/nexus-3.12.1-01/bin/nexus -XX:+UnlockDiagnosticVMOptions -Dinstall4j.launcherId=245 -Dinstall4j.swt=false -Di4jv=0 -Di4jv=0 -Di4jv=0 -Di4jv=0 -Di4jv=0 -Xms1200M -Xmx1200M -XX:MaxDirectMemorySize=2G -XX:+UnlockDiagnosticVMOptions -XX:+UnsyncloadClass -XX:+LogVMOutput -XX:LogFile=../sonatype-work/nexus3/log/jvm.log -XX:-OmitStackTraceInFastThrow -Djava.net.preferIPv4Stack=true -Dkaraf.home=. -Dkaraf.base=. -Dkaraf.etc=etc/karaf -Djava.util.logging.config.file=etc/karaf/java.util.logging.properties -Dkaraf.data=../sonatype-work/nexus3 -Djava.io.tmpdir=../sonatype-work/nexus3/tmp -Dkaraf.startLocalConsole=false -Di4j.vpt=true -classpath /home/nexus/nexus-3.12.1-01/.install4j/i4jruntime.jar:/home/nexus/nexus-3.12.1-01/lib/boot/nexus-main.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.apache.karaf.main-4.0.9.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.osgi.core-6.0.0.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.apache.karaf.diagnostic.boot-4.0.9.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.apache.karaf.jaas.boot-4.0.9.jar com.install4j.runtime.launcher.UnixLauncher start 9d17dc87 '' '' org.sonatype.nexus.karaf.NexusMain 

First, apply a couple of simple tricks to it:



The result is already more digestible:

    JAVA_OPTS=(
        '-server'
        '-Dexe4j.moduleName=/home/nexus/nexus-3.12.1-01/bin/nexus'
        '-Di4j.vpt=true'
        '-Di4jv=0'
        '-Dinstall4j.jvmDir=/usr/java/jdk1.8.0_171-amd64'
        '-Dinstall4j.launcherId=245'
        '-Dinstall4j.swt=false'
        '-Djava.io.tmpdir=../sonatype-work/nexus3/tmp'
        '-Djava.net.preferIPv4Stack=true'
        '-Djava.util.logging.config.file=etc/karaf/java.util.logging.properties'
        '-Dkaraf.base=.'
        '-Dkaraf.data=../sonatype-work/nexus3'
        '-Dkaraf.etc=etc/karaf'
        '-Dkaraf.home=.'
        '-Dkaraf.startLocalConsole=false'
        '-XX:+LogVMOutput'
        '-XX:+UnlockDiagnosticVMOptions'
        '-XX:+UnlockDiagnosticVMOptions'
        '-XX:+UnsyncloadClass'
        '-XX:-OmitStackTraceInFastThrow'
        '-XX:LogFile=../sonatype-work/nexus3/log/jvm.log'
        '-XX:MaxDirectMemorySize=2G'
        '-Xms1200M'
        '-Xmx1200M'
        '-classpath /home/nexus/nexus-3.12.1-01/.install4j/i4jruntime.jar:/home/nexus/nexus-3.12.1-01/lib/boot/nexus-main.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.apache.karaf.main-4.0.9.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.osgi.core-6.0.0.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.apache.karaf.diagnostic.boot-4.0.9.jar:/home/nexus/nexus-3.12.1-01/lib/boot/'
    )

    java ${JAVA_OPTS[*]} com.install4j.runtime.launcher.UnixLauncher start 9d17dc87 '' '' org.sonatype.nexus.karaf.NexusMain
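One possible way to turn a raw command line into a sorted, duplicate-free option list is sketched below. This is an assumption about how such a cleanup could be done, not necessarily the method used for this article; the function name opts_to_array is made up, and the sketch assumes that no option contains whitespace.

```shell
#!/bin/bash
# Split a raw java command line into options, drop duplicates, and print
# them as a bash array literal. Assumes no option contains whitespace.
# Options whose value is a separate word (such as -classpath <path>) lose
# that value here and have to be added back by hand.
opts_to_array() {
    echo 'JAVA_OPTS=('
    tr -s ' ' '\n' | grep '^-' | LC_ALL=C sort -u | sed "s/.*/    '&'/"
    echo ')'
}

# Example: feed it the option part of the command seen in `ps -ef`.
echo '-server -Di4jv=0 -Di4jv=0 -Xms1200M -Xmx1200M' | opts_to_array
```

Sorting is purely cosmetic, but it makes repeated options such as -Di4jv=0 easy to spot and lets `sort -u` remove them.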

Now we can dig deeper.


Install4j is a graphical Java installer. It appears to be used for the initial installation of the system. We don't need it on a server, so we remove it.


Let's agree on where the Nexus components and data live on the file system:



Creating directories and links is a job for configuration management systems (the whole thing takes 5-10 lines in Ansible), so we will leave this task to the system engineers.


Let our script change the working directory to /opt/nexus at startup; then we can make the paths to the Nexus components relative.


Options of the form -Dkaraf.* are settings for Apache Karaf, the OSGi container in which Nexus is evidently packaged. Let's set karaf.home, karaf.base, karaf.etc, and karaf.data according to the component layout, using relative paths where possible.


Since the CLASSPATH is just a list of jar files that all sit in the same lib/boot/ directory, we replace the whole list with lib/boot/* (we also have to turn off the shell's wildcard expansion with set -o noglob, so that the * reaches the JVM intact and the JVM expands it itself).
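The effect of noglob is easy to check in isolation. The sketch below uses a throwaway directory with two empty files named a.jar and b.jar (both made up), and printf stands in for java so that the arguments the program would receive become visible.

```shell
#!/bin/bash
# Show what the shell does to an unquoted "lib/boot/*" with and without
# noglob. printf stands in for java so the arguments become visible.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/lib/boot"
touch "$demo_dir/lib/boot/a.jar" "$demo_dir/lib/boot/b.jar"
cd "$demo_dir" || exit 1

OPTS=('-cp lib/boot/*')

expanded="$(printf '[%s]' ${OPTS[*]})"   # globbing on: the shell lists the jars
set -o noglob
literal="$(printf '[%s]' ${OPTS[*]})"    # globbing off: the '*' is passed through
set +o noglob

echo "without noglob: $expanded"
echo "with noglob:    $literal"
```

Without noglob the program receives `[-cp][lib/boot/a.jar][lib/boot/b.jar]`, so java would treat only the first jar as the classpath and the second as the main class name. With noglob it receives `[-cp][lib/boot/*]`. An alternative design is to store `-cp` and `lib/boot/*` as two separate array elements and expand the array as `"${OPTS[@]}"`; quoted expansion is neither split nor globbed, so noglob becomes unnecessary.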


Let's change java to exec java, so that our script does not start java as a child process (which the process manager simply does not see) but "replaces" itself with java (see the description of exec).
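The difference can be observed by comparing process IDs. In this sketch `sh -c 'echo $$'` stands in for java, and run_wrapper is a made-up helper that plays the role of the startup script.

```shell
#!/bin/bash
# Print the PID of a wrapper shell and then the PID of the program it starts.
# $1 is the prefix placed before the program: "" (plain call) or "exec".
# The trailing "exit 0" keeps bash from optimizing the plain call into an
# implicit exec, which it may do for the last command of a -c string.
run_wrapper() {
    bash -c 'echo $$; '"$1"' sh -c "echo \$\$"; exit 0'
}

echo "plain call:"; run_wrapper ""      # two different PIDs
echo "with exec:";  run_wrapper exec    # the same PID twice
```

With exec, the PID that the process manager started is the PID of the program itself, so signals sent on stop reach java directly instead of an intermediate shell.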


Let's see what we got:


    #!/bin/bash

    JAVA_OPTS=(
        '-Xms1200M'
        '-Xmx1200M'
        '-XX:+UnlockDiagnosticVMOptions'
        '-XX:+LogVMOutput'
        '-XX:+UnsyncloadClass'
        '-XX:LogFile=/data/nexus/log/jvm.log'
        '-XX:MaxDirectMemorySize=2G'
        '-XX:-OmitStackTraceInFastThrow'
        '-Djava.io.tmpdir=/data/nexus/tmp'
        '-Djava.net.preferIPv4Stack=true'
        '-Djava.util.logging.config.file=etc/karaf/java.util.logging.properties'
        '-Dkaraf.home=.'
        '-Dkaraf.base=.'
        '-Dkaraf.etc=etc/karaf'
        '-Dkaraf.data=/data/nexus/data'
        '-Dkaraf.startLocalConsole=false'
        '-server'
        '-cp lib/boot/*'
    )

    set -o noglob

    cd /opt/nexus \
     && exec java ${JAVA_OPTS[*]} org.sonatype.nexus.karaf.NexusMain

A total of 27 lines instead of >400: transparent, understandable, declarative, with no superfluous logic. If necessary, this script can easily be turned into an Ansible / Puppet / Chef template, adding only the logic that a specific situation actually needs.
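As a lighter alternative to a configuration-management template, a single value can be made adjustable through an environment variable with a default. This is a different technique from templating, shown here only as a sketch: the variable name NEXUS_HEAP and the function build_opts are made up, and only three of the options are listed for brevity.

```shell
#!/bin/bash
# Build the option list with the heap size taken from an environment
# variable. NEXUS_HEAP is a made-up name; 1200M mirrors the -Xms/-Xmx
# value used in the script above.
build_opts() {
    local heap="${NEXUS_HEAP:-1200M}"
    JAVA_OPTS=(
        "-Xms${heap}"
        "-Xmx${heap}"
        '-XX:MaxDirectMemorySize=2G'
    )
}

build_opts
printf '%s\n' "${JAVA_OPTS[@]}"
```

The script stays declarative: there is still no branching, only a default that the environment (a Docker `-e` flag or a Systemd `Environment=` line) can override.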


You can use this script as the ENTRYPOINT in a Dockerfile or call it from a Systemd unit file, adjusting ulimits and other system parameters there as well, for example:


    [Unit]
    Description=Nexus
    After=network.target

    [Service]
    Type=simple
    LimitNOFILE=1048576
    ExecStart=/opt/nexus/bin/nexus
    User=nexus
    Restart=on-abort

    [Install]
    WantedBy=multi-user.target

Conclusion


What conclusions can be drawn from this article? In principle, it all comes down to a few points:


  1. Every system has its own purpose; there is no need to hammer nails with a microscope.
  2. Simplicity rules (KISS, YAGNI): implement only what this particular situation needs.
  3. And most importantly, it's great that there are IT specialists with different profiles. Let's work together and make our IT systems simpler, clearer, and better! :)

Thanks for your attention! I would welcome feedback and constructive discussion in the comments.

Source: https://habr.com/ru/post/415893/

