Thursday, November 5, 2009

Ant integration with application servers is too hard

I spend way too much of my time at work wrestling with Ant scripts that try to integrate with application servers.

Some application servers do a better job of this than others, but overall the current state of the art in this area is still as described in the Ant manual:



<parallel>
  <wlrun ... >
  <sequential>
    <sleep seconds="30"/>
    <junit fork="true" forkmode="once" ... >
    <wlstop/>
  </sequential>
</parallel>

This example represents a typical pattern for testing a server application. In one thread the server is started (the <wlrun> task). The other thread consists of three tasks which are performed in sequence. The <sleep> task is used to give the server time to come up. Another task which is capable of validating that the server is available could be used in place of the <sleep> task. The <junit> test harness then runs, again in its own JVM. Once the tests are complete, the server is stopped (using <wlstop> in this example), allowing both threads to complete. The <parallel> task will also complete at this time and the build will then continue.


Let's stop for a bit and critique this approach:

  • First, we have the <parallel> and <sequential> tasks, which are intricate and easy to get wrong. As the Ant manual itself says,

    Anyone trying to run large Ant task sequences in parallel ... is implicitly taking on the task of identifying and fixing all concurrency bugs [in] the tasks that they run. ... Accordingly, while this task has uses, it should be considered an advanced task ...

  • Secondly, consider the ugly <sleep> call: why did we have to sleep? How long do we need to sleep? What happens if we sleep for too long, or for not long enough? As the Ant manual notes, there are sometimes ways around this, but they require assistance from the application server.

  • Lastly, what happens when something fails? How do you ensure that, having started the application server, you can reliably shut it down? What happens if you try to shut it down, but it never actually started up? And so forth.
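
As a rough sketch of the workaround for the second point, Ant's <waitfor> task with an <http> condition can poll the server instead of sleeping for a fixed time. The URL, port, timings and property names below are assumptions chosen for illustration; nothing here is specific to any particular application server.

```xml
<project name="wait-for-server" default="wait" basedir=".">

  <!-- Assumption: the server answers HTTP at this URL once it is ready. -->
  <property name="server.url" value="http://localhost:7001/"/>

  <target name="wait">
    <!-- Poll every 2 seconds, for at most 2 minutes, until the URL
         returns a response with a status code below 400. -->
    <waitfor maxwait="2" maxwaitunit="minute"
             checkevery="2" checkeveryunit="second"
             timeoutproperty="server.timed.out">
      <http url="${server.url}"/>
    </waitfor>

    <!-- server.timed.out is set only if the wait gave up. -->
    <fail if="server.timed.out"
          message="Server at ${server.url} did not respond within 2 minutes."/>
  </target>

</project>
```

Inside the manual's <parallel> pattern, the <waitfor> element would take the place of <sleep>. What to do there when the wait times out is still the open question raised in the last point: failing the build before <wlstop> runs leaves the server thread running.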



Furthermore, there are many other interactions that one needs to have with an application server beyond just starting and stopping it:

  • What's the status of this application server? Is it up or down?

  • Deploy or undeploy an application to the server. Query the current version of a deployed application; re-deploy a different version of an application, either with or without stopping and re-starting the application and/or the server.

  • Find out if the server has encountered any errors; capture the diagnostic error logs from the server.

  • Adjust the configuration of the server: give it different resources, change its operating parameters, etc.

  • Install or un-install an application server from scratch.


And many more.
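
The first item on that list, the up-or-down check, can at least be approximated with core Ant alone. The sketch below uses the <condition> task with a <socket> test; the host, port and property names are assumptions, and an open port only shows that something is listening, not that the server is healthy.

```xml
<project name="server-status" default="status" basedir=".">

  <!-- Assumptions: host and port where the server listens. -->
  <property name="server.host" value="localhost"/>
  <property name="server.port" value="7001"/>

  <target name="status">
    <!-- Sets server.state to "up" if a TCP connection succeeds,
         otherwise to "down". -->
    <condition property="server.state" value="up" else="down">
      <socket server="${server.host}" port="${server.port}"/>
    </condition>
    <echo message="Server ${server.host}:${server.port} is ${server.state}"/>
  </target>

</project>
```

The other items on the list have no such generic answer in core Ant; they depend on whatever tasks or tools each application server provides.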

All of these tasks are routine jobs that I'd like to be able to reliably automate, and over the years (decades!!) I have made some progress in doing so.

But still, I think of all those hours spent trying to write and maintain reliable Ant automation scripts for application server integration.

Is there a better way?

3 comments:

  1. The sequential/parallel stuff is only needed because wlrun never returns.

    You can replace the sleep with a waitfor if there is some condition that you could check (like some URL becoming available for the http condition).

    Steve Loughran has started to write a task for functional testing, which is in Ant's svn trunk (under taskdefs/optional/testing) and takes care of all the orchestration stuff. AFAIU it isn't ready for public consumption yet.

  2. Yes, the <funtest> task in SVN_HEAD does
    -the parallel app server start
    -a wait for a condition to be met
    -the junit tests
    -a teardown/finally clause to kill the app server

    Inside the task all that parallel/sequential stuff is set up for you. Furthermore, the exceptions in the junit tests are retained and take priority over anything in teardown.

    I added this task as a rewrite of the SmartFrog <functionaltest> task; it needs help on tests and documentation. Please come and help.

  3. Bryan: You should take a look at the Nolio Automation Center (www.noliosoft.com). This solution is exactly what you're looking for and has answers to all the issues you raised.
