Instrumenting the Trade* Benchmarks in DaCapo
As part of a small side project, I recently instrumented the DaCapo benchmark to record dynamic call graph information. In addition to inserting logging code, my instrumentation added a handful of static fields and a static initializer method. As instrumentation goes this is pretty straightforward, but I ran into the following monstrosity when trying to run both the Tradesoap and Tradebeans benchmarks:
Exception initializing client: org.apache.geronimo.kernel.config.LifecycleException: load of org.apache.geronimo.daytrader/daytrader-dacapo/2.2-SNAPSHOT/car failedorg.apache.geronimo.kernel.config.LifecycleException: load of org.apache.geronimo.daytrader/daytrader-dacapo/2.2-SNAPSHOT/car failed
at org.apache.geronimo.kernel.config.SimpleConfigurationManager.loadConfiguration(SimpleConfigurationManager.java:328)
at org.apache.geronimo.kernel.config.SimpleConfigurationManager.loadConfiguration(SimpleConfigurationManager.java:281)
at org.apache.geronimo.kernel.config.SimpleConfigurationManager.loadConfiguration(SimpleConfigurationManager.java:256)
at org.apache.geronimo.kernel.config.KernelConfigurationManager.loadConfiguration(KernelConfigurationManager.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.geronimo.gbean.runtime.ReflectionMethodInvoker.invoke(ReflectionMethodInvoker.java:34)
at org.apache.geronimo.gbean.runtime.GBeanOperation.invoke(GBeanOperation.java:124)
at org.apache.geronimo.gbean.runtime.GBeanInstance.invoke(GBeanInstance.java:832)
at org.apache.geronimo.gbean.runtime.RawInvoker.invoke(RawInvoker.java:57)
at org.apache.geronimo.kernel.basic.RawOperationInvoker.invoke(RawOperationInvoker.java:35)
at org.apache.geronimo.kernel.basic.ProxyMethodInterceptor.intercept(ProxyMethodInterceptor.java:96)
at org.apache.geronimo.kernel.config.EditableConfigurationManager$$EnhancerByCGLIB$$50f44d0d.loadConfiguration(<generated>)
at org.dacapo.daytrader.DaCapoClientRunner.initialize(DaCapoClientRunner.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.dacapo.daytrader.Launcher.initialize(Launcher.java:77)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.dacapo.harness.Tradesoap.prepare(Tradesoap.java:45)
at org.dacapo.harness.Benchmark.run(Benchmark.java:163)
at org.dacapo.harness.TestHarness.runBenchmark(TestHarness.java:199)
at org.dacapo.harness.TestHarness.main(TestHarness.java:152)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at Harness.main(Unknown Source)
Caused by: org.apache.geronimo.kernel.config.InvalidConfigException: Error starting configuration gbean org.apache.geronimo.configs/system-database/2.1.4/car
at org.apache.geronimo.kernel.config.KernelConfigurationManager.load(KernelConfigurationManager.java:181)
at org.apache.geronimo.kernel.config.SimpleConfigurationManager.loadConfiguration(SimpleConfigurationManager.java:313)
... 33 more
Caused by: org.apache.geronimo.gbean.InvalidConfigurationException: Configuration gbean failed to start org.apache.geronimo.configs/system-database/2.1.4/car
reason: Unable to deserialize GBeanState in classloader: [org.apache.geronimo.kernel.config.MultiParentClassLoader id=org.apache.geronimo.configs/system-database/2.1.4/car]
at org.apache.geronimo.kernel.config.KernelConfigurationManager.load(KernelConfigurationManager.java:164)
... 34 more
Oof, enterprise Java at its finest. Googling around indicated that others had hit this same problem, but I was unable to find a publicly documented solution. This blog post outlines the issue I hit, how to avoid it, how to reproduce the error, and my debugging process.
The Issue
The Tradesoap and Tradebeans benchmarks use Apache Geronimo, which deserializes Java objects during startup. (I’m vague on exactly where and why this deserialization takes place; if anyone more familiar with DaCapo or Geronimo would care to fill me in, feel free to reach out.)
As part of the deserialization process, Java checks that the serial version stored in the serialized object of class C matches the current serial version of C. These serial versions provide unique, compact descriptions of classes’ public interface and object layout. Thus, if the stored and declared versions do not match, the JVM concludes that either: a) there has been an incompatible change in the public API/object representation/type hierarchy of the deserialized class, or b) the object was serialized using a completely different class that happens to have the name C. In either case, the deserialization process fails with an InvalidClassException.
You can explicitly declare serial versions by adding a public static final long serialVersionUID field to a serializable class. (Eclipse and IDEA will happily compute a suitable SUID and perform this modification for you.) However, if no such field exists, Java will compute the serial version automatically. Recall that a serial version is a compact description of a class’ public interface and object layout. Java’s algorithm (which you can read here) hashes the names and signatures of non-private methods, the names and types of (most) fields, along with other information such as implemented interfaces.
Java will compute this default serial version during serialization and deserialization. In each instance, the serial version is computed using the class definitions at the time of serialization/deserialization. This is where we ran into trouble. The objects being deserialized by Geronimo were serialized using classes that did not have any instrumentation, which is reflected in the serial version stored in the serialized object. However, when running the Tradesoap and Tradebeans benchmarks with runtime instrumentation, the serial versions do reflect the fields or methods added by the instrumentation, leading to a serial version mismatch, InvalidClassException, and the deserialization failure we ultimately see.
The Solution
There are two ways to avoid incompatible serial versions. One is to only add members that do not affect the serial UID computation. In other words, you may only add:
- private static or instance methods
- private static fields
- private transient instance fields
One potential pitfall of this approach is that the existence of a static initializer does impact the serial UID, and so you cannot add one.
If the above solution isn’t suitable, you can precompute explicit serial UIDs based off
pre-instrumentation class definitions. The ASM library
includes a SerialUIDVersionAdder
which computes a serial UID using the same algorithm as the JVM and then inserts it into the class definition.
You must apply this transformation before adding any fields or methods, otherwise the computed serial UIDs will
not match those stored in serialized objects. For example, suppose you want to instrument a class whose definition
is stored in the byte array classBuffer
. You can use the SerialUIDVersionAdder
as follows:
You can also use the utility method found in this Gist.
Reproducing the Error
I have created a small, self-contained project which demonstrates the above issue and fix. You can find the Github repository here. The README details how to both reproduce the error and demonstrates the effectiveness of the fix.
The Debugging Process
To close, I want to briefly outlined how I debugged the above
issue. You’ll notice that the above stracktrace does not mention an
InvalidClassException
. Instead, Geronimo has “helpfully” suppressed this exception
and thrown an uninformative generic exception instead.
If you use the debug
jvm, you can add
the +XX:TraceExceptions
flag which instructs the JVM to log every
exception, even those suppressed as in Geronimo. (H/T: Jon
Bell, who passed along the TraceExceptions trick
and put me on the serialization trail. Check out his invaluable
jvm.wtf guide.) Be warned: the JVM throws a lot of
exceptions even during normal execution, so you should redirect standard output to
avoid clobbering your scrollback.
The output produced by this flag isn’t terribly friendly, but you can extract the names of all exceptions thrown using the following bash snippet:
where exceptions.log
is the file containing the output produced by using
the XX:TraceExceptions
flag.
Using the above snippet we can find the exceptions thrown when
running with instrumentation as follows.
In the following, assume that exceptions_good.log
contains the exception trace when running the DaCapo benchmark without
instrumentation, and exceptions_bad.log
the trace when running with
instrumentation.
Running the above on the output produced by the Tradesoap and Tradebeans benchmarks gives:
+a 'java/io/InvalidClassException'
+a 'org/apache/geronimo/gbean/InvalidConfigurationException'
+a 'org/apache/geronimo/kernel/config/InvalidConfigException'
+a 'org/apache/geronimo/kernel/config/LifecycleException'
which gives us our clue about serialization.