Jenkins : Making your plugin behave in distributed Jenkins

If you are just getting started with writing a Jenkins plugin, you do not need to worry about this yet. Come back to this page once you have something working.

Distribution Architecture of Jenkins

Jenkins uses a mechanism similar to distributed agents to perform distributed computing. That is, the thread that's running on the master can send closures to remote machines, then get the result back when that closure finishes computation.

For example, the following code, taken from hudson.maven.ProcessCache, shows one such closure implementation.

private static class GetSystemProperties extends MasterToSlaveCallable<Properties,RuntimeException> {
    public Properties call() {
        return System.getProperties();
    }
    private static final long serialVersionUID = 1L;
}

Closures implement hudson.remoting.Callable, which is parameterized on both the return type and the exception type that it can throw.

You can dispatch this closure to a slave by calling Channel.call like this:

Properties systemProperties = channel.call(new GetSystemProperties());

Java serialization is used to send the closure to execute and to receive the return value. That is, the following things happen when the above statement is executed:

  1. (local) The closure is serialized (and thus, everything that the closure references is serialized).
  2. (local → remote) The serialized byte image is sent to the remote JVM.
  3. (remote) The closure is deserialized.
  4. (remote) The closure is executed.
  5. (remote) The closure's return value (upon normal completion) or the exception (upon abnormal completion) is serialized.
  6. (remote → local) The serialized byte image is sent back to the local JVM.
  7. (local) The return value or exception is deserialized, and Channel.call returns the value or throws the exception.
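
The steps above can be imitated inside a single JVM with nothing but java.io, which may help when reasoning about what has to be serializable. This is only a sketch: SimpleCallable, simulateCall and the helper methods are made-up names rather than part of the remoting API, there is no real channel, and the exception path of step 5 is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustration only: both "sides" live in one JVM, and none of these names
// belong to the Jenkins remoting API.
class RoundTripSketch {

    /** Hypothetical stand-in for hudson.remoting.Callable. */
    interface SimpleCallable<V> extends Serializable {
        V call();
    }

    /** Same idea as GetSystemProperties above, written against the stand-in interface. */
    static final class GetSystemProperties implements SimpleCallable<Properties> {
        private static final long serialVersionUID = 1L;

        @Override
        public Properties call() {
            return System.getProperties();
        }
    }

    /** Turns an object graph into a byte image with Java serialization. */
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds an object graph from a byte image. */
    static Object deserialize(byte[] image) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(image))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    @SuppressWarnings("unchecked")
    static <V> V simulateCall(SimpleCallable<V> closure) {
        byte[] request = serialize(closure);                                  // steps 1-2
        SimpleCallable<V> remote = (SimpleCallable<V>) deserialize(request);  // step 3
        V result = remote.call();                                             // step 4
        byte[] response = serialize(result);                                  // steps 5-6
        return (V) deserialize(response);                                     // step 7
    }

    public static void main(String[] args) {
        Properties copy = simulateCall(new GetSystemProperties());
        System.out.println(copy.getProperty("java.version"));
    }
}
```

Note that the caller receives a deserialized copy of the result, not the object that the closure returned.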

Behind the scenes, the remoting framework takes care of class file transmission, exception chaining, and other low-level details.

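Do not use anonymous inner classes to implement callables. An anonymous inner class that uses members of its enclosing instance keeps an implicit reference to that instance, so the enclosing object would be serialized along with the closure. Use either top-level classes or static nested classes.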
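
One way to see why this matters is to try serializing both kinds of closure with plain Java serialization. The sketch below is self-contained and uses hypothetical names (SimpleCallable, Owner, Greeting, canSerialize); it does not touch the remoting library, and it assumes the enclosing object is not Serializable, which is common for plugin classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Illustration only: none of these names belong to the Jenkins remoting API.
class AnonymousCaptureSketch {

    /** Hypothetical stand-in for hudson.remoting.Callable. */
    interface SimpleCallable<V> extends Serializable {
        V call();
    }

    /** Static nested class: carries only the data handed to its constructor. */
    static final class Greeting implements SimpleCallable<String> {
        private static final long serialVersionUID = 1L;
        private final String name;

        Greeting(String name) {
            this.name = name;
        }

        @Override
        public String call() {
            return "hello from " + name;
        }
    }

    /** Stands in for a plugin object that is deliberately not Serializable. */
    static final class Owner {
        private final String name;

        Owner(String name) {
            this.name = name;
        }

        SimpleCallable<String> anonymous() {
            // Reading the outer field forces a hidden reference to Owner.this,
            // and serialization follows that reference.
            return new SimpleCallable<String>() {
                @Override
                public String call() {
                    return "hello from " + name;
                }
            };
        }

        SimpleCallable<String> nested() {
            // Only the String is copied into the closure.
            return new Greeting(name);
        }
    }

    /** Returns false if Java serialization rejects some object in the graph. */
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Owner owner = new Owner("builder");
        System.out.println("anonymous: " + canSerialize(owner.anonymous()));
        System.out.println("nested:    " + canSerialize(owner.nested()));
    }
}
```

Even when the enclosing object happens to be serializable, the anonymous version would ship the whole object to the other side instead of the one field the closure needs.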

Implicit Remoting

Since distributed computing is a complex topic, Jenkins has several key abstractions in place to make this aspect of Jenkins somewhat transparent.

FilePath

For simple plugins, doing remoting at the file access layer is the easiest and most transparent way to achieve distribution-safe code. For this reason, Jenkins introduces the hudson.FilePath class to perform file access (instead of java.io.File). Unlike File, a FilePath can point to any file or directory on the master or on any of the slaves. The methods defined on FilePath work correctly even when the file it refers to is on a remote machine.

Launcher

hudson.Launcher is another abstraction for implicit distribution. This class plays a role similar to java.lang.ProcessBuilder, except that it can launch a process on a remote JVM.

Performance Considerations

Because FilePath and Launcher are used pervasively throughout Jenkins core, plugin developers sometimes don't even notice that their code already behaves correctly in a distributed environment.

However, this simple approach may carry a performance penalty. Often you can achieve better performance by moving the code to where the data is (which is what a closure gives you), instead of moving the data to where the code is (which is what the implicit data remoting in FilePath gives you).

So a well-written Jenkins plugin should use explicit closures to perform a block of work remotely and send only a summary back to the master, instead of moving large amounts of data over the network.

A good example of this can be seen in the JUnitResultArchiver class, which performs XML parsing on the remote machine and just sends back the resulting objects to the master.
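
To make the trade-off concrete, here is a small sketch in the same spirit: a closure scans a log file where it lives and returns two counters instead of the file contents. Everything in it is hypothetical (SimpleFileCallable, SummarizeLog, LogSummary, the "FAILED" marker, and the sample log format); in a real plugin the closure would be a file callable passed to FilePath.act, and this is not how JUnitResultArchiver itself is written.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.PrintWriter;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Illustration only: these names are hypothetical, not Jenkins API.
class SummarySketch {

    /** Hypothetical stand-in for a file callable that runs where the file lives. */
    interface SimpleFileCallable<V extends Serializable> extends Serializable {
        V invoke(File f) throws IOException;
    }

    /** The small result that would travel back to the master. */
    static final class LogSummary implements Serializable {
        private static final long serialVersionUID = 1L;
        final int lines;
        final int failures;

        LogSummary(int lines, int failures) {
            this.lines = lines;
            this.failures = failures;
        }
    }

    /** Scans the file locally and keeps only two counters. */
    static final class SummarizeLog implements SimpleFileCallable<LogSummary> {
        private static final long serialVersionUID = 1L;

        @Override
        public LogSummary invoke(File f) throws IOException {
            int lines = 0;
            int failures = 0;
            try (BufferedReader reader = Files.newBufferedReader(f.toPath(), StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines++;
                    if (line.contains("FAILED")) {
                        failures++;
                    }
                }
            }
            return new LogSummary(lines, failures);
        }
    }

    /** Writes a throwaway log in which every tenth line is marked FAILED. */
    static File writeSampleLog(int lineCount) {
        try {
            File log = File.createTempFile("sample", ".log");
            log.deleteOnExit();
            try (PrintWriter out = new PrintWriter(log, "UTF-8")) {
                for (int i = 0; i < lineCount; i++) {
                    out.println("test " + i + (i % 10 == 0 ? " FAILED" : " PASSED"));
                }
            }
            return log;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the closure against the file, wrapping I/O failures. */
    static LogSummary summarize(File log) {
        try {
            return new SummarizeLog().invoke(log);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Number of bytes Java serialization needs for the given value. */
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray().length;
    }

    public static void main(String[] args) {
        File log = writeSampleLog(1000);
        LogSummary summary = summarize(log);
        System.out.println(summary.lines + " lines, " + summary.failures + " failures");
        System.out.println("file bytes: " + log.length() + ", summary bytes: " + serializedSize(summary));
    }
}
```

The serialized summary stays the same size however large the log grows, whereas reading the file through implicit remoting would move every byte across the network.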

Passing Objects That Are Not Serializable

When using FilePath.act, Channel.call, and so on, sometimes the objects you want to pass to the closure are not serializable. What do you do about that?

If you control those classes, and if those classes can be made serializable, consider making them serializable.

Another common strategy is to create a serializable factory class, pass this factory instance, then re-create an equivalent instance on the other JVM:

private static final class SerializableHttpClientFactory implements Serializable {
    private int timeout;

    SerializableHttpClientFactory(int timeout) {
        this.timeout = timeout;
    }

    HttpClient create() throws SocketException {
        HttpClient client = new HttpClient();
        client.setTimeout(timeout);
        return client;
    }

    private static final long serialVersionUID = 1L;
}
 
private static final class UseClient extends MasterToSlaveFileCallable<Void> {
    private final SerializableHttpClientFactory factory;
    UseClient(SerializableHttpClientFactory factory) {
        this.factory = factory;
    }
    @Override
    public Void invoke(File f, VirtualChannel channel) throws IOException, InterruptedException {
        // Re-create the client on the JVM that runs this callable, then use it here.
        HttpClient client = factory.create();
        return null;
    }
    private static final long serialVersionUID = 1L;
}

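void foo(FilePath ws) throws IOException, InterruptedException {
    // act() sends UseClient to the machine that holds ws and runs invoke() there.
    ws.act(new UseClient(new SerializableHttpClientFactory(50)));
}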

Gotchas

  • On a slave, usually only a part of the Jenkins object graph is available. This means that, for instance, Jenkins.getInstance() must not be called there. To work around this, grab all the information you need on the master side, store it in final fields of the callable, and read it from those fields when the closure runs, so that only those bits get sent to the slave. See this mail on the devlist.

TBD

  • Asynchronous calls
  • FilePath.act
  • Pipes
  • Remotable objects, like TaskListener, FilePath.
  • Exporting a proxy
  • your suggestions welcome