Developing a JVM Agent for bytecode instrumentation with Javassist – Part 1

May 15, 2013

Hi, this is the first part of a blog post entitled “Developing a JVM Agent for bytecode instrumentation with Javassist”.

In this first part I’ll walk you through the basics: what a JVM agent is and what it is for, what bytecode instrumentation is, which APIs are available for developing an agent, and how to use them to create an agent for bytecode instrumentation.

In the next parts I’ll talk about Javassist (the bytecode manipulation library we’ll use) and how to use it, looking at examples and its API. Finally, by combining Javassist with the JVM agent API, we’ll have a fully working agent capable of doing amazing things with bytecode instrumentation.

All source code and compiled JARs are available on my GitHub repository.

What are JVM/Java Agents?

A JVM agent is a program that runs in the same process as the JVM.
An agent can receive events from the JVM, query it for information about the JVM itself or the application running in it, and control or alter that application.

Agents are commonly used by tools in order to assist the development and monitoring of applications running in the JVM.

 

What is bytecode instrumentation?

Bytecode instrumentation, in a nutshell, is altering the compiler-generated bytecode of a class. With it you can do all sorts of nifty things, such as measuring how long a method takes to execute or altering its execution flow. The possibilities are endless when you can add or change the bytecode of an application, and since we are dealing with bytecode you can do all of this without needing the application’s source at all.
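To make the timing idea concrete, here is a source-level sketch of what such instrumentation conceptually does. A real agent would make the equivalent change directly in bytecode, without touching any source; the names used here (TimingSketch, sum, sumTimed) are made up for illustration only.

```java
// Source-level illustration only: an agent would apply this change in bytecode.
class TimingSketch {

    // The method as the developer wrote it.
    static long sum(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Roughly how the same call behaves once timing code has been woven in:
    // same result, plus a measurement taken around the original logic.
    static long sumTimed(int n) {
        long start = System.nanoTime();
        try {
            return sum(n);
        } finally {
            long elapsedNanos = System.nanoTime() - start;
            System.out.println("sum(" + n + ") took " + elapsedNanos + " ns");
        }
    }

    public static void main(String[] args) {
        System.out.println(sumTimed(1_000_000));
    }
}
```

The point of the try/finally is that the measurement is reported even if the original method throws, which is usually what you want from a profiling hook.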

Bytecode instrumentation is not the only thing JVM agents can do. They can also do other interesting things, like iterating over all reachable objects in the JVM or getting notified when resources needed by the JVM (such as memory or threads) are exhausted. Still, instrumentation is one of the most useful and interesting features, as it gives you a lot of flexibility and power over the application.

 

Available APIs when creating JVM Agents

Java 1.5 introduced JVM TI (Java Virtual Machine Tool Interface) as the main API for monitoring tools, replacing JVMPI (Java Virtual Machine Profiler Interface) and JVMDI (Java Virtual Machine Debug Interface), which were considered at the time quite complex, unstable, unreliable and poorly documented.

JVM TI was defined by JSR-163. It is a native interface of the JVM, so you have to write the agent’s code in C/C++ and use JNI (Java Native Interface). JSR-163 was a great success, and JVM TI is commonly used by all sorts of vendors for profiling and monitoring tools. But JVM TI was not the only interface defined by JSR-163 for the release of Java 1.5. It also defined, among other things, the Java programming language in-process instrumentation API (java.lang.instrument), which, as the name suggests, lets you instrument bytecode on the JVM by working directly with a Java API.

Not only is it usually simpler to work with a Java interface than with a native one, it is also much easier to work with bytecode using one of the various existing Java tools dedicated to it (such as ASM or Javassist).

One thing should be noted: this Java API is not JVM TI ported to Java. While JVM TI also allows instrumentation, the java.lang.instrument API allows only instrumentation. It is widely used for this, but it does not in any way replace the other features of JVM TI.

Since our goal here is bytecode instrumentation, it is much better to stick with the Java API, because dealing with bytecode within Java itself is much easier, in part due to the wide availability of well-known open source libraries (such as Javassist, ASM and BCEL) that already tackle this problem.

 

The basics of the java.lang.instrument API

So let’s get down to the basics of writing an instrumentation agent in a few quick steps.

Step #1 – The Agent Premain Class

If we want to load the agent during JVM start-up, the main agent class has to implement something similar to the public static void main method: the premain method. It has two possible signatures:

public static void premain(String agentArgs, Instrumentation inst);

public static void premain(String agentArgs);

The JVM tries them in the order above: it first looks for the two-argument version and falls back to the one-argument version only if the first is not implemented. There is also the agentmain method, which gets called if an agent is loaded after the JVM has started, but that approach won’t be covered in this post.

In our case today, we’ll be using only the first method signature:

public static void premain(String agentArgs, Instrumentation instrumentation);

That’s because we need the Instrumentation argument that the JVM passes to the agent.

The Instrumentation object is a sort of instrumentation service that the JVM exposes to the agent so that it can, among other things, register one or more ClassFileTransformer instances, which let us rewrite a class’s bytecode.

So that’s what we have to do in our premain agent class: register our ClassFileTransformer. It would look something like this:

package my.agent;

import java.lang.instrument.Instrumentation;

public class MyAgent {

    //agent start-up hook
    public static void premain(String args, Instrumentation inst) throws Exception {
        System.out.println("Loading Agent..");
        inst.addTransformer(new MyTransfomer());
    }

}

 

Pretty straightforward, right?

OK, moving on then.

Step #2 – The ClassFileTransformer

As I’ve mentioned, ClassFileTransformer (which our transformer class implements) allows us to transform a class’s bytecode before the class gets defined by the JVM.

ClassFileTransformer is an interface that requires us to implement the following method:

byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer) throws IllegalClassFormatException;

Once we register our own ClassFileTransformer by doing:

inst.addTransformer(new MyTransfomer())

The JVM will take care of calling the transform method of our transformer whenever a class is about to be defined (or redefined).

Here is a dummy implementation of the transformer that just prints out the name of every class that gets loaded into the JVM from that point on.

package my.agent;

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;

/**
 * Dummy sample ClassFileTransformer
 * @author rafael oltra
 *
 */

public class MyTransfomer implements ClassFileTransformer {

	@Override
	public byte[] transform(ClassLoader loader, String className,
			Class<?> klass, ProtectionDomain domain, byte[] klassFileBuffer)
			throws IllegalClassFormatException {
		System.out.println(className + " is about to get loaded by the ClassLoader");

		return null;
	}

}

Right now the only thing our ClassFileTransformer does is print out every class that gets loaded by the ClassLoader. Don’t fret! This is just a way to show the agent API; we’ll get it doing far more interesting things soon.

We’ll come back to this method once we go over how to use Javassist to do bytecode instrumentation, but one thing you should know now is that the return value of this method should be the altered class bytecode as a byte array, or null if no instrumentation took place.
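As a small illustration of the "return null when nothing changes" contract, here is a sketch of a transformer that only pays attention to classes under a given package prefix and still leaves every class untouched. The class name PrefixCountingTransformer and its matchCount method are hypothetical, not part of the sample agent; note that class names reach the transformer with '/' separators (as in the output shown later), and the sketch defensively tolerates a missing name.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical transformer: counts classes under a prefix, modifies nothing.
class PrefixCountingTransformer implements ClassFileTransformer {

    private final String prefix; // internal form, e.g. "dummy/"
    private final AtomicInteger matches = new AtomicInteger();

    PrefixCountingTransformer(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) {
        // Names use '/' instead of '.'; guard against a missing name.
        if (className != null && className.startsWith(prefix)) {
            matches.incrementAndGet();
        }
        // null tells the JVM "no changes, keep the original bytes".
        return null;
    }

    int matchCount() {
        return matches.get();
    }
}
```

Filtering early like this is a common way to keep a transformer cheap, since transform is invoked for every class the JVM defines after registration.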

Step #3 – MANIFEST.MF

So we have the premain class, which registers our dummy ClassFileTransformer implementation.

What we need now is a valid MANIFEST.MF file for our agent. Not only do we need to tell the JVM which JAR contains the agent (which we do when we run it), we also need to tell it, among other things, which class in that JAR contains the premain method. That is what the Premain-Class entry is for. Agent-Class plays the same role for agents loaded after JVM start-up through agentmain, and the two Can-... entries declare that the agent may redefine and retransform classes. For our agent the following MANIFEST file should be enough:

Manifest-Version: 1.0
Agent-Class: my.agent.MyAgent
Can-Redefine-Classes: true
Can-Retransform-Classes: true
Premain-Class: my.agent.MyAgent

Make sure to put the MANIFEST.MF file in the META-INF folder of the JAR!

Step #4 – Building the agent

Our agent in its current state does not have any external dependencies, so you can build it any way you like as long as the MANIFEST.MF file ends up in the META-INF folder.
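If you prefer to generate the manifest from code rather than write it by hand, the JDK's java.util.jar classes can do it. The following is only a sketch under that assumption; the class name AgentManifestSketch and its build method are made up for this example, and the entries mirror the MANIFEST.MF shown above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper that builds the agent manifest programmatically.
class AgentManifestSketch {

    // Builds the same entries as the hand-written MANIFEST.MF.
    static Manifest build(String agentClass) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Without a version entry the other attributes are not written out.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Premain-Class", agentClass);
        main.putValue("Agent-Class", agentClass);
        main.putValue("Can-Redefine-Classes", "true");
        main.putValue("Can-Retransform-Classes", "true");
        return manifest;
    }

    public static void main(String[] args) throws IOException {
        // Print the manifest text so it can be inspected.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        build("my.agent.MyAgent").write(out);
        System.out.print(out.toString("UTF-8"));
    }
}
```

A Manifest built this way can be handed to the JarOutputStream(OutputStream, Manifest) constructor, which writes it as META-INF/MANIFEST.MF in the resulting JAR.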

If you don’t feel like building it, you can download the JAR file with everything already set up from my GitHub page.

Final Step! – Running the agent within the JVM – How to and output

To make the agent actually run inside the JVM you need to pass the -javaagent argument to the JVM. You run your Java application the usual way, adding the -javaagent argument with the location of the agent JAR, for instance:

java -javaagent:PATH_TO_AGENT.jar -jar myapplication.jar
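The -javaagent switch also accepts an options string after the JAR path, in the form -javaagent:jarpath=options, and the JVM hands that string to premain as agentArgs. The format of the options is entirely up to the agent. The sketch below assumes a comma-separated key=value convention; that convention and the AgentArgs class are illustrative assumptions, and the sample agent from this post ignores its arguments.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for turning the agentArgs string into options.
class AgentArgs {

    // Parses "key=value,flag" style options; a bare key is treated as "true".
    static Map<String, String> parse(String agentArgs) {
        Map<String, String> options = new LinkedHashMap<>();
        // No options string was given on the command line.
        if (agentArgs == null || agentArgs.trim().isEmpty()) {
            return options;
        }
        for (String pair : agentArgs.split(",")) {
            // Split on the first '=' only, so values may contain '='.
            String[] keyValue = pair.split("=", 2);
            String key = keyValue[0].trim();
            if (key.isEmpty()) {
                continue;
            }
            options.put(key, keyValue.length == 2 ? keyValue[1].trim() : "true");
        }
        return options;
    }
}
```

With a helper like this, a premain method could call AgentArgs.parse(agentArgs) and, for example, read a prefix option when launched as java -javaagent:PATH_TO_AGENT.jar=prefix=dummy/,verbose -jar myapplication.jar.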

Since in this example we are printing out every class that gets loaded, I suggest loading the agent in a simple application so you won’t get flooded with class names in your console window.

If you don’t want to run this agent in your own application (or it’s just too much trouble), I’ve built a very simple demo application for this post that makes some basic use of the JDK (it simply tries to open a connection to http://www.google.com). It is also available from my GitHub repository as dummy-app.jar.

Printing out every class that gets loaded might slow your application down at start-up if a lot of classes are loaded (for example inside an application server, which definitely loads a lot of stuff, especially on start-up). Since this is just a sample agent, there is not much use in running it with a real application.

To run the agent with this sample application you just have to place both jars in the same folder and run it like this:

java -javaagent:sample-agent-pt1.jar -jar dummy-app.jar

Then you should see a few dozen lines like these printed to your console (the output might vary slightly depending on the version and vendor of your JVM):

Loading Agent..
dummy/DummyMain is about to get loaded by the ClassLoader
Starting DummyApp
This is so we'll demonstrate our agent printing out the classes that get loaded by the ClassLoader
Trying to hit http://www.google.com:80
sun/net/www/protocol/http/Handler is about to get loaded by the ClassLoader
sun/net/www/protocol/http/HttpURLConnection is about to get loaded by the ClassLoader
java/net/HttpURLConnection is about to get loaded by the ClassLoader
java/util/logging/Logger is about to get loaded by the ClassLoader
java/util/logging/Handler is about to get loaded by the ClassLoader
java/util/logging/Level is about to get loaded by the ClassLoader
java/util/logging/LogManager is about to get loaded by the ClassLoader
java/util/logging/LogManager$1 is about to get loaded by the ClassLoader
java/beans/PropertyChangeSupport is about to get loaded by the ClassLoader
java/util/logging/LogManager$LogNode is about to get loaded by the ClassLoader
...

One interesting thing to notice is that the main class of dummy-app.jar is DummyMain, and it’s the first class our transformer reports. The rest of the classes get loaded once we try to do this:

URL url = new URL("http://www.google.com");
url.openConnection().connect();

Next part!

Okay, this concludes the first part of this tutorial on “Developing a JVM Agent for bytecode instrumentation with Javassist”!

As I mentioned before, all source code and ready-to-use JARs are available at my GitHub repository.

In the next post we’ll discuss how to use Javassist and improve the agent so that it does more interesting stuff, like running on an application server (such as Tomcat, JBoss or WebLogic) to gather info from Servlets and EJBs.

If you have any comments, suggestions or corrections please feel free to leave a comment or mail me, thanks!


posted in JVM Bytecode Instrumentation by Rafael Oltra

6 Comments to "Developing a JVM Agent for bytecode instrumentation with Javassist – Part 1"

  1. Alex wrote:

    Hi,
    with this agent and API, is it possible to change a class at runtime in a JVM in an application server?

  2. Rafael Oltra wrote:

    Yes Alex, by instrumenting the class bytecode. I’m finishing the second part of this post, where I’ll discuss how to do it using Javassist.

  3. Ramon wrote:

    Hi.
    are classes defined in the agent’s JAR visible to the client application?

  4. Rafael Oltra wrote:

    Hi Ramon, the classes defined by the agent are loaded by the system class loader. If the application class loader inherits from the system class loader, then they are visible.

  5. Ram wrote:

    Hi Rafael,
    Very informative and well written post.
    I am trying to learn these things and hence found your post very useful. Thank you. Looking forward to next part.

    However, I do have two queries. I would be glad if you could help me with these:

    1. Is it possible to build a java agent which can inspect/control all classes running currently in the JVM?

    2. Is it possible to build a java object as a composite of objects(sub objects), so that these sub-objects can evolve dynamically updating their behavior on the fly while the main java object is still running?

  6. Binh Nguyen Thanh wrote:

    Thanks, nice post
