String concatenation, or Patch bytecode

I recently read an article about optimizing the performance of Java code, in particular string concatenation. It left one question open: why does the code below run slower when it uses StringBuilder than when it uses plain addition? After all, += is compiled into calls to StringBuilder.append().

I immediately wanted to get to the bottom of it.

    // ~20 000 000
    public String stringAppend() {
        String s = "foo";
        s += ", bar";
        s += ", baz";
        s += ", qux";
        s += ", bar";
        s += ", bar";
        s += ", bar";
        s += ", bar";
        s += ", bar";
        s += ", bar";
        s += ", baz";
        s += ", qux";
        s += ", baz";
        s += ", qux";
        s += ", baz";
        s += ", qux";
        s += ", baz";
        s += ", qux";
        s += ", baz";
        s += ", qux";
        s += ", baz";
        s += ", qux";
        return s;
    }

    // ~7 000 000
    public String stringAppendBuilder() {
        StringBuilder sb = new StringBuilder();
        sb.append("foo");
        sb.append(", bar");
        sb.append(", bar");
        sb.append(", baz");
        sb.append(", qux");
        sb.append(", baz");
        sb.append(", qux");
        sb.append(", baz");
        sb.append(", qux");
        sb.append(", baz");
        sb.append(", qux");
        sb.append(", baz");
        sb.append(", qux");
        sb.append(", baz");
        sb.append(", qux");
        sb.append(", baz");
        sb.append(", qux");
        sb.append(", baz");
        sb.append(", qux");
        sb.append(", baz");
        sb.append(", qux");
        sb.append(", baz");
        sb.append(", qux");
        return sb.toString();
    }

At the time, all my reasoning boiled down to "this is inexplicable magic inside the JVM", and I gave up trying to understand what was going on. However, during a later discussion about how the speed of string handling differs between platforms, my friend yegorf1 and I decided to figure out why and how exactly this magic happens.

Oracle Java SE

upd: the tests were run on Java 8
The obvious approach is to compile the sources into bytecode and then look at its contents. So we did. The comments suggested that the speedup comes from an optimization: constant strings should obviously be glued together at compile time. It turned out that this is not the case. Here is part of the bytecode, disassembled with javap:

  public java.lang.String stringAppend();
    Code:
       0: ldc           #2    // String foo
       2: astore_1
       3: new           #3    // class java/lang/StringBuilder
       6: dup
       7: invokespecial #4    // Method java/lang/StringBuilder."<init>":()V
      10: aload_1
      11: invokevirtual #5    // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      14: ldc           #6    // String , bar
      16: invokevirtual #5    // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
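For contrast, javac does glue literals together when the whole expression is a compile-time constant, which is not the situation in the listing above, where s is an ordinary local variable. A minimal sketch of the difference (the class and method names here are ours, not from the original benchmark); it relies on the fact that constant expressions are interned, while run-time concatenation creates a new String object:

```java
class ConstantFoldingSketch {
    // "foo" + ", bar" is a compile-time constant expression,
    // so javac stores the single literal "foo, bar" in the class file.
    static String folded() {
        return "foo" + ", bar";
    }

    // s is a plain (non-final) local variable, so this concatenation
    // is performed at run time and yields a fresh String object.
    static String appended() {
        String s = "foo";
        s += ", bar";
        return s;
    }

    public static void main(String[] args) {
        String literal = "foo, bar";
        System.out.println(folded() == literal);        // true: same interned constant
        System.out.println(appended() == literal);      // false: new object built at run time
        System.out.println(appended().equals(literal)); // true: same contents
    }
}
```

The reference comparison with == is used here only to expose what the compiler did; it is not a recommended way to compare strings.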

Note that no optimization has been applied to stringAppend. Strange, isn't it? All right, let's look at the bytecode of the second function.

  public java.lang.String stringAppendBuilder();
    Code:
       0: new           #3    // class java/lang/StringBuilder
       3: dup
       4: invokespecial #4    // Method java/lang/StringBuilder."<init>":()V
       7: astore_1
       8: aload_1
       9: ldc           #2    // String foo
      11: invokevirtual #5    // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      14: pop
      15: aload_1
      16: ldc           #6    // String , bar
      18: invokevirtual #5    // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;

No optimizations here either. Moreover, look at the instructions at offsets 8, 14, and 15. Something odd happens there: a reference to the StringBuilder object is loaded onto the stack, then the reference returned by append() is popped off the stack, and the very same reference is loaded again. The simplest fix comes to mind:

  public java.lang.String stringAppendBuilder();
    Code:
       0: new           #41   // class java/lang/StringBuilder
       3: dup
       4: invokespecial #4    // Method java/lang/StringBuilder."<init>":()V
       7: astore_1
       8: aload_1
       9: ldc           #2    // String foo
      11: invokevirtual #5    // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      14: ldc           #6    // String , bar
      16: invokevirtual #5    // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
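To see why dropping the pop/aload_1 pair is safe, here is a toy model of the operand stack. It is only a sketch under loud assumptions: the instruction names are plain strings, "append" stands in for the invokevirtual call to StringBuilder.append, and local1 plays the role of local variable slot 1. It says nothing about how a real JVM executes or optimizes this code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackSketch {
    // Toy interpreter (hypothetical, for illustration only).
    // Returns the contents of the builder held in "local slot 1".
    static String run(String... code) {
        StringBuilder local1 = new StringBuilder();
        Deque<Object> stack = new ArrayDeque<>();
        for (String insn : code) {
            if (insn.equals("aload_1")) {
                stack.push(local1);                    // push the builder reference
            } else if (insn.startsWith("ldc ")) {
                stack.push(insn.substring(4));         // push a string constant
            } else if (insn.equals("append")) {
                String arg = (String) stack.pop();
                StringBuilder receiver = (StringBuilder) stack.pop();
                stack.push(receiver.append(arg));      // append returns the same builder
            } else if (insn.equals("pop")) {
                stack.pop();                           // discard the returned reference
            } else {
                throw new IllegalArgumentException("unknown instruction: " + insn);
            }
        }
        return local1.toString();
    }

    public static void main(String[] args) {
        // Sequence as emitted by javac: pop the result, then reload the same reference.
        String original = run("aload_1", "ldc foo", "append", "pop",
                              "aload_1", "ldc , bar", "append");
        // Patched sequence: reuse the reference that append left on the stack.
        String patched = run("aload_1", "ldc foo", "append",
                             "ldc , bar", "append");
        System.out.println(original.equals(patched)); // true
        System.out.println(patched);                  // foo, bar
    }
}
```

Both sequences leave the builder with the same contents, because append returns the very reference that pop discards and aload_1 pushes back.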

With the redundant instructions removed, we get code that runs 1.5 times faster than stringAppend, whose bytecode already has this shape: its append calls follow one another with no pop/aload_1 pair in between. So the culprit behind the "magic" is an incomplete bytecode compiler that fails to perform fairly simple optimizations.
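The same instruction shape can also be obtained without a bytecode editor: when the append calls are chained in the source, each call consumes the reference returned by the previous one, so javac has no reason to emit pop and aload_1 between them. A minimal sketch with a shortened string (the class and method names are ours, and we did not measure whether it reproduces the 1.5x figure):

```java
class ChainedAppendSketch {
    // Chained calls: the builder reference stays on the operand stack
    // between appends instead of being popped and reloaded.
    static String stringAppendChained() {
        return new StringBuilder()
                .append("foo")
                .append(", bar")
                .append(", baz")
                .append(", qux")
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(stringAppendChained()); // foo, bar, baz, qux
    }
}
```

Running javap -c on the compiled class is a quick way to check the emitted instructions on a particular compiler version.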

Android ART

upd: the code was built for SDK 28 with the release build tools
So the problem turned out to lie in how the Java compiler generates bytecode for the stack-based JVM. That reminded us of ART, which is part of the Android Open Source Project. This virtual machine, or rather its compiler from bytecode to native code, was written while Oracle's lawsuit was under way, which gives us every reason to expect significant differences from Oracle's implementation. In addition, because of the specifics of ARM processors, this virtual machine is register-based rather than stack-based.

Let's take a look at Smali (one of the bytecode representations under ART):

 # virtual methods
 .method public stringAppend()Ljava/lang/String;
     .registers 4

     .prologue
     .line 6
     const-string/jumbo v0, "foo"

     .line 7
     .local v0, "s":Ljava/lang/String;
     new-instance v1, Ljava/lang/StringBuilder;
     invoke-direct {v1}, Ljava/lang/StringBuilder;-><init>()V
     invoke-virtual {v1, v0}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
     move-result-object v1
     const-string/jumbo v2, ", bar"
     invoke-virtual {v1, v2}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
     move-result-object v1
     //...

 .method public stringAppendBuilder()Ljava/lang/String;
     .registers 3

     .prologue
     .line 13
     new-instance v0, Ljava/lang/StringBuilder;
     invoke-direct {v0}, Ljava/lang/StringBuilder;-><init>()V

     .line 14
     .local v0, "sb":Ljava/lang/StringBuilder;
     const-string/jumbo v1, "foo"
     invoke-virtual {v0, v1}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;

     .line 15
     const-string/jumbo v1, ", bar"
     invoke-virtual {v0, v1}, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
     //...

In this variant of stringAppendBuilder the stack problem is gone: the machine is register-based, so it cannot arise in principle. That does not, however, rule out some truly magical things:

 move-result-object v1

This line in stringAppend does nothing: the reference to the StringBuilder object we need is already in register v1. It is logical to assume that stringAppend is the one that will run slower. Measurements confirm it: the result matches that of the "patched" version of the program on the stack-based JVM, with StringBuilder almost one and a half times faster.
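To repeat such a comparison, a crude harness along the following lines may be enough for a first look. It is a sketch under assumptions: the class name, the time helper, and the iteration counts are ours, the strings are shortened, and System.nanoTime with a hand-written warm-up loop is far less reliable than a proper benchmarking framework, so any numbers it prints should be treated as rough.

```java
import java.util.function.Supplier;

class ConcatTimingSketch {
    // Written to so that the results of the measured calls are not discarded.
    static volatile int sink;

    static String stringAppend() {
        String s = "foo";
        s += ", bar";
        s += ", baz";
        s += ", qux";
        return s;
    }

    static String stringAppendBuilder() {
        StringBuilder sb = new StringBuilder();
        sb.append("foo");
        sb.append(", bar");
        sb.append(", baz");
        sb.append(", qux");
        return sb.toString();
    }

    // Returns the elapsed wall-clock time in nanoseconds for the given number of calls.
    static long time(Supplier<String> body, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = body.get().length();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int iterations = 5_000_000; // arbitrary choice
        // Warm-up runs so that the JIT compiler has a chance to kick in.
        time(ConcatTimingSketch::stringAppend, iterations);
        time(ConcatTimingSketch::stringAppendBuilder, iterations);
        System.out.println("+=            : " + time(ConcatTimingSketch::stringAppend, iterations) + " ns");
        System.out.println("StringBuilder : " + time(ConcatTimingSketch::stringAppendBuilder, iterations) + " ns");
    }
}
```

The output depends on the runtime, its version, and the hardware, which is exactly the effect the article describes.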

Source: https://habr.com/ru/post/416479/
