My clients are typically concerned about batch process run times. As their business has grown, they want to process more and more data. Or, the users are now more savvy and want that data sooner. Failing to make SLAs is bad, bad, bad.
Generally, a batch stream is made up of a number of serially run jobs, and each job may have a number of serial activities. But which ones should you focus on?
Experience has shown that the largest opportunities are generally found by identifying the largest percentage contributors to the overall processing time. After all, a 10% improvement in a 20-minute job saves only 2 minutes, while a 10% gain in a three-hour job saves 18 minutes.
Plotting the process run times in a Pareto chart makes an easily understood graphical representation that can help the technical and management teams focus attention on the right processes.
Consider the following example data of 14 batch jobs.
| job | start | end | minutes |
|---|---|---|---|
| LS | 2008-10-12 05:37:17 | 2008-10-12 06:02:19 | 25.0 |
| P1 | 2008-10-12 06:04:32 | 2008-10-12 09:15:00 | 190.5 |
| SSP1 | 2008-10-12 09:26:33 | 2008-10-12 12:35:56 | 189.4 |
| SSP2 | 2008-10-12 12:37:13 | 2008-10-12 12:48:08 | 10.9 |
| SEB | 2008-10-12 12:49:27 | 2008-10-12 13:04:41 | 15.2 |
| DN | 2008-10-12 13:10:08 | 2008-10-12 18:13:04 | 302.9 |
| CM1 | 2008-10-12 21:14:15 | 2008-10-13 02:13:26 | 299.2 |
| CM2 | 2008-10-13 02:14:37 | 2008-10-13 02:59:13 | 44.6 |
| CM3 | 2008-10-13 03:00:15 | 2008-10-13 03:03:04 | 2.8 |
| CM4 | 2008-10-13 03:04:07 | 2008-10-13 03:05:26 | 1.3 |
| R1 | 2008-10-13 03:23:08 | 2008-10-13 03:24:10 | 1.0 |
| R2 | 2008-10-13 03:25:12 | 2008-10-13 03:33:05 | 7.9 |
| R3 | 2008-10-13 03:34:13 | 2008-10-13 07:43:40 | 249.5 |
| SDT | 2008-10-13 08:50:59 | 2008-10-13 09:03:28 | 12.5 |
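
The minutes column follows directly from the start and end timestamps. Here is a minimal Ruby sketch of that calculation. The helper name `elapsed_minutes` is my own for illustration and is not part of pareto.rb, and the timestamps are treated as UTC purely to keep daylight-saving shifts out of the arithmetic.

```ruby
# Hypothetical helper (not part of pareto.rb): elapsed minutes between two
# "YYYY-MM-DD HH:MM:SS" timestamps, rounded to one decimal place.
def elapsed_minutes(start_str, end_str)
  # Pull out year, month, day, hour, minute, second and build UTC times.
  start_time = Time.utc(*start_str.scan(/\d+/).map(&:to_i))
  end_time   = Time.utc(*end_str.scan(/\d+/).map(&:to_i))
  ((end_time - start_time) / 60.0).round(1)
end

puts elapsed_minutes('2008-10-12 05:37:17', '2008-10-12 06:02:19') # LS
puts elapsed_minutes('2008-10-12 21:14:15', '2008-10-13 02:13:26') # CM1, spans midnight
```

Subtracting full timestamps rather than times of day means a job that runs past midnight, such as CM1, needs no special handling.
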
When plotted as a Pareto chart, it is easy to see that the three largest jobs (DN, CM1, and R3) together take about 63% of the total time.
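
The share taken by the largest jobs can also be checked without a chart. The sketch below uses the minutes from the table above. The constant `JOB_MINUTES` and the method `top_share` are names I made up for this example. For this data the top three come to roughly 63% of the total.

```ruby
# Minutes per job, copied from the table above.
JOB_MINUTES = {
  'LS' => 25.0, 'P1' => 190.5, 'SSP1' => 189.4, 'SSP2' => 10.9,
  'SEB' => 15.2, 'DN' => 302.9, 'CM1' => 299.2, 'CM2' => 44.6,
  'CM3' => 2.8, 'CM4' => 1.3, 'R1' => 1.0, 'R2' => 7.9,
  'R3' => 249.5, 'SDT' => 12.5
}.freeze

# Hypothetical helper: names of the n longest jobs and their combined
# share of the total run time, as a percentage.
def top_share(minutes_by_job, n)
  total = minutes_by_job.values.sum
  top   = minutes_by_job.sort_by { |_job, mins| -mins }.first(n)
  [top.map(&:first), 100.0 * top.sum { |_job, mins| mins } / total]
end

names, share = top_share(JOB_MINUTES, 3)
puts "Top three: #{names.join(', ')}"
puts format('Share of total: %.1f%%', share)
```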

I've found it easiest to use a script-based approach to generating the chart rather than a spreadsheet tool such as Excel or OpenOffice.org Calc. Why? It is very simple (it sorts and accumulates the data for me), and it produces consistent output graphs every time. The data is first processed by a custom Ruby script (attached), then plotted using the Ploticus tool. This script and approach, by the way, aren't specific to process run times. One could use them for quality control or other analysis.
To use it, you'll need Ruby, Ploticus, and the script in the attachment. The script (pareto.rb) generates a Pareto chart from tab-delimited, two-column data on stdin:
(1) label
(2) numeric qty
The script will sort the data by (2), then add the following columns:
(3) percentage component of value column (0-100)
(4) accumulated percentage (last one should be 100)
It will then create the Ploticus script on stdout with the data embedded (which will plot (1), (2), (4)).
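
The sort-and-accumulate step described above can be sketched in a few lines of Ruby. This is an illustration under my own assumptions, not the code from the attached pareto.rb. The method names `parse_tsv` and `pareto_rows` are hypothetical, the sort is largest first (the usual Pareto convention), and the Ploticus output stage is left out entirely.

```ruby
# Hypothetical helper: parse tab-delimited "label<TAB>qty" lines into
# [label, qty] pairs, skipping blank lines.
def parse_tsv(text)
  text.each_line.map(&:chomp).reject(&:empty?).map do |line|
    label, qty = line.split("\t", 2)
    [label, Float(qty)]
  end
end

# Hypothetical helper: sort pairs by quantity, largest first, and append
# (3) the percentage of the total and (4) the accumulated percentage.
def pareto_rows(pairs)
  total = pairs.sum { |_label, qty| qty }.to_f
  running = 0.0
  pairs.sort_by { |_label, qty| -qty }.map do |label, qty|
    running += qty
    [label, qty, 100.0 * qty / total, 100.0 * running / total]
  end
end

# A three-job subset of the example data, in the two-column input format.
sample = "LS\t25.0\nDN\t302.9\nCM1\t299.2\n"
pareto_rows(parse_tsv(sample)).each do |label, qty, pct, cum|
  puts format("%s\t%.1f\t%.1f\t%.1f", label, qty, pct, cum)
end
```

Accumulating the raw quantities and dividing once per row, rather than summing the rounded percentages, keeps the last accumulated value at 100 apart from floating-point noise.
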
To use the script (pareto.rb), do something like the following:
$ ./pareto.rb -t 'Batch Job Runtimes' -y 'minutes' < jobs.txt | pl -stdin -png -o pareto.png
Or if you're on Windows, you'll need to invoke ruby directly so the pipes work. For example:
d:\> ruby pareto.rb -t "Batch Job Runtimes" -y "minutes" < jobs.txt | pl -stdin -png -o pareto.png
The script takes a few command-line arguments. Simply use the traditional "-h" to see them:
$ ./pareto.rb -h
Usage: ./pareto.rb [options]
    -v, --[no-]verbose               Run verbosely
    -y, --ylabel=LABEL               Y-axis label
    -t, --title=TITLE                Graph title
        --qtycolor=COLOR             Color of quantity bars
        --pctcolor=COLOR             Color of cumulative percent line
        --plotheight=INCHES          Height of plot area
I hope you find this helpful.
| Attachment | Size |
|---|---|
| batch_job_pareto.zip | 5.3 KB |