JMonks.org
Getting Started - Batch Framework

Overview

Batch Framework provides an environment for developers to write and execute batch jobs. It offers several ways to configure batch jobs, a logging manager that takes care of job logging, different repository implementations to send and receive data between jobs, a management agent to manage and monitor running batch jobs, and different controllers for writing your batch jobs. The following diagram illustrates the framework's control flow when the pool job controller is used.

Batch Framework control flow diagram



The rest of this page explains each concept in the framework.



System Requirements

Batch Framework can be run on any JRE starting from 1.4.2.



Download

Batch Framework 1.0 beta version can be downloaded here.

Download Batch_Framework_1.0_beta.zip from the available downloadable files. This distribution contains the framework jar (batch_framework_1.0_beta.jar), all the jars required at runtime, and the source code, which can be added to the debug path if needed.



Installation

Follow these steps to install the Batch Framework.

  • Extract the downloaded Batch_Framework_1.0_beta.zip file into any directory.
  • Add the batch_framework_1.0_beta.jar from the extracted directory to the classpath.
  • Add all the jars in the lib directory to the classpath.
  • Add the framework-config.xml and batch-config.xml files from the conf directory to the classpath.

This is the basic configuration needed to get going with the Batch Framework. Many more options are available to customize the framework and the running environment; they are discussed in the sections that follow.



My First Batch Job

The best way to understand the framework is to write a first batch job. So, let's write one that prints "Hello World". The following steps explain how to write this batch job and execute it using the framework.

  1. Write the following Java code in com.mycompany.jobs.HelloWorldJob.java in your favourite IDE and compile it.

    package com.mycompany.jobs;

    import org.jmonks.batch.framework.controller.basic.BasicJobProcessor;
    import org.jmonks.batch.framework.JobContext;
    import org.jmonks.batch.framework.ErrorCode;

    public class HelloWorldJob extends BasicJobProcessor
    {
        public HelloWorldJob()
        {
        }

        public ErrorCode process(JobContext context)
        {
            System.out.println("Hello World");
            return ErrorCode.JOB_COMPLETED_SUCCESSFULLY;
        }

        public long getTotalRecordsCount()
        {
             return 10;
        }

        public long getProcessedRecordsCount()
        {
             return 5;
        }

        public Object getProcessorState()
        {
             return "give job info being processed.";
        }
    }

  2. Add the following configuration to the batch-config.xml file in the classpath.

    <job-config job-name="helloworld" status="active">
        <job-controller job-controller-class-name="org.jmonks.batch.framework.controller.basic.BasicJobController">
            <basic-job-processor basic-job-processor-class-name="com.mycompany.jobs.HelloWorldJob" thread-count="1"/>
        </job-controller>
        <job-logging-config>
            <job-logger-config logger-name="com.mycompany.jobs" logger-level="DEBUG"/>
        </job-logging-config>
    </job-config>

  3. Run the following command from the command prompt, assuming the CLASSPATH variable holds all the jars and files mentioned in the Installation section.


    java -classpath %CLASSPATH% org.jmonks.batch.framework.Main job-name=helloworld
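
The getters in step 1 return fixed values for brevity. In a real job they would presumably report actual progress to the monitoring tools. The following is a framework-independent sketch of that idea, assuming Java 8 or later. The class ProgressTrackingJob and its run method are hypothetical names used for illustration only; they are not part of the Batch Framework API, and only the three getter names are taken from the example above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a job processor; it does not extend any framework class.
class ProgressTrackingJob {
    private final List<String> records;
    // Updated by the processing thread, read by whoever monitors the job.
    private final AtomicLong processed = new AtomicLong();
    private volatile String state = "not started";

    ProgressTrackingJob(List<String> records) {
        this.records = records;
    }

    // Processes every record and keeps the counters current while doing so.
    void run() {
        for (String record : records) {
            state = "processing " + record;
            // Business logic for a single record would go here.
            processed.incrementAndGet();
        }
        state = "finished";
    }

    long getTotalRecordsCount() {
        return records.size();
    }

    long getProcessedRecordsCount() {
        return processed.get();
    }

    Object getProcessorState() {
        return state;
    }

    public static void main(String[] args) {
        ProgressTrackingJob job = new ProgressTrackingJob(Arrays.asList("a", "b", "c"));
        job.run();
        System.out.println(job.getProcessedRecordsCount() + " of " + job.getTotalRecordsCount());
    }
}
```

The counters are thread-safe because a monitoring thread may read them while the job thread is still writing.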



Job Controllers

Job controllers, as the name implies, define how a job should be written and how it will be executed (controlled) at runtime. The controller is the most important component in the framework, and it is the component that developers care about and work with directly.

Each job controller defines its own model for writing a batch job and determines how that logic is executed at runtime. Along with the controller logic, controllers support the admin applications by providing information useful for monitoring and management. They provide a set of interfaces for developers to implement and put their business logic in, and they define how the batch job is configured in the job configuration. Currently, the framework provides two controllers, named PoolJobController and BasicJobController.

PoolJobController

This controller's logic revolves around a job pool, a job loader and one or more job processors. Each batch job has a designated pool, a loader loads all the data that needs to be processed into the pool, and one or more job processors pick the data from the pool and process it. The controller provides three interfaces for implementing a batch job: JobPool, which holds the job data; PoolJobLoader, which loads the job data into the JobPool; and PoolJobProcessor, which picks the data from the pool and processes it.

For a quick understanding, let's write a batch job "process-integers" that loads 10 Integer objects into the pool and processes them using 3 job processors.

  1. Write a loader class IntegerJobLoader.java extending the AbstractPoolJobLoader class. AbstractPoolJobLoader implements PoolJobLoader so that developers do not have to implement the management APIs themselves.

    package com.mycompany.jobs;

    import org.jmonks.batch.framework.controller.pool.AbstractPoolJobLoader;
    import org.jmonks.batch.framework.JobContext;
    import org.jmonks.batch.framework.ErrorCode;

    public class IntegerJobLoader extends AbstractPoolJobLoader
    {
        public IntegerJobLoader()
        {
        }

        public ErrorCode loadPool(JobContext jobContext)
        {
             for(int i=0;i<10;i++)
                 loadJobData(new Integer(i));
             loadJobData(null);

             return ErrorCode.JOB_COMPLETED_SUCCESSFULLY;
        }

        public long getTotalJobDataCount()
        {
             return 10;
        }
    }

  2. Write a processor class IntegerJobProcessor.java extending the AbstractPoolJobProcessor class. AbstractPoolJobProcessor implements PoolJobProcessor so that developers do not have to implement the management APIs themselves.

    package com.mycompany.jobs;

    import org.jmonks.batch.framework.controller.pool.AbstractPoolJobProcessor;
    import org.jmonks.batch.framework.JobContext;
    import org.jmonks.batch.framework.ErrorCode;

    public class IntegerJobProcessor extends AbstractPoolJobProcessor
    {
        public IntegerJobProcessor()
        {
        }

        public void initialize(JobContext jobContext)
        {
        }

        public ErrorCode process(Object jobData)
        {
             Integer value=(Integer)jobData;
             // Perform business logic on jobData.
             System.out.println("Received Value = " + value.toString());

             return ErrorCode.JOB_COMPLETED_SUCCESSFULLY;
        }

        public void cleanup()
        {
        }
    }

  3. Decide on a JobPool implementation. The framework is intended to come with different implementations of JobPool, and you choose one by configuring it in the job configuration as shown in the next step. Currently, the framework has one JobPool implementation, CollectionJobPool, which uses Java collections as the pool.

  4. Add the following configuration to the job configuration file, which is batch-config.xml in this discussion.

    <job-config job-name="process-integers" status="active">
         <job-controller job-controller-class-name="org.jmonks.batch.framework.controller.pool.PoolJobController">
             <pool-job-loader pool-job-loader-class-name="com.mycompany.jobs.IntegerJobLoader">
                 <property key="any-config-param">any-config-value</property>
             </pool-job-loader>
             <pool-job-processor pool-job-processor-class-name="com.mycompany.jobs.IntegerJobProcessor" thread-count="3">
                 <property key="any-config-param">any-config-value</property>
             </pool-job-processor>
             <job-pool job-pool-class-name="org.jmonks.batch.framework.controller.pool.CollectionJobPool">
                 <property key="pool-size">1000</property>
             </job-pool>
         </job-controller>
        <job-logging-config>
            <job-logger-config logger-name="com.mycompany.jobs" logger-level="INFO"/>
        </job-logging-config>
    </job-config>

  5. Run the following command from the command prompt, assuming the CLASSPATH variable holds all the jars and files mentioned in the Installation section.


    java -classpath %CLASSPATH% org.jmonks.batch.framework.Main job-name=process-integers

Please see the org.jmonks.batch.framework.controller.pool package for additional details on PoolJobController and its interfaces.
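
To make the loader, pool and processor flow easier to picture, here is a framework-independent sketch built on java.util.concurrent, assuming Java 8 or later. It is not how PoolJobController or CollectionJobPool is implemented. The class PoolFlowSketch, the method runJob and the END_OF_DATA marker are hypothetical names. The loader example above calls loadJobData(null) after the last item, which appears to mark the end of the data; how the framework passes that signal on to several processors is not covered here, so the sketch simply enqueues one marker per worker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Framework-independent sketch of the loader -> pool -> processors flow.
class PoolFlowSketch {
    // Marker object telling a worker that no more data will arrive.
    private static final Object END_OF_DATA = new Object();

    // Loads `count` integers into a bounded pool and processes them with
    // `threads` workers. Returns the number of items that were processed.
    static int runJob(int count, int threads) {
        final BlockingQueue<Object> pool = new LinkedBlockingQueue<>(1000);
        final AtomicInteger processed = new AtomicInteger();

        // Processors: each worker takes items from the pool until it sees the marker.
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                try {
                    for (Object jobData = pool.take(); jobData != END_OF_DATA; jobData = pool.take()) {
                        // Business logic on jobData would go here.
                        processed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }

        try {
            // Loader: put all job data into the pool, then one marker per worker.
            for (int i = 0; i < count; i++) {
                pool.put(Integer.valueOf(i));
            }
            for (int t = 0; t < threads; t++) {
                pool.put(END_OF_DATA);
            }
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running the job", e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("Processed " + runJob(10, 3) + " items");
    }
}
```

The bounded queue plays the role of the "pool-size" setting: the loader blocks when the pool is full, so a fast loader cannot exhaust memory while slower processors catch up.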

BasicJobController

This controller allows the developer to execute simple business logic. It provides BasicJobProcessor, which declares a process method. All you need to do is implement that method with your business logic.

The My First Batch Job section uses this processor to write its business logic. Follow the instructions in that section to write batch jobs using BasicJobController.

Please see the org.jmonks.batch.framework.controller.basic package for additional details on BasicJobController and its interfaces.


Controlling the Logging

Logging is one of the most important activities in writing batch jobs. The logged information is very useful when debugging batch job failures. Batch Framework comes with a logging manager which takes care of all logging for the batch jobs. The following is the default logging configuration from the framework configuration file (framework-config.xml). The rest of this section explores this configuration.

<framework-logging-config
        framework-logging-level="DEBUG"
        job-logging-directory="/batchserver/logs"
        job-base-package-name="com.mycompany.batch"
        job-logging-level="INFO"/>

The framework does its own logging to a "batch_framework.log" file in the directory given by the "job-logging-directory" attribute. Initially, this is set to "/batchserver/logs"; it can be changed to wherever you want the logs written. The default logging level for all framework messages is "DEBUG", as specified by the "framework-logging-level" attribute.

Along with its own logging, the framework provides a logging facility for all batch jobs. For each batch job, it creates a directory inside the framework logging directory (the directory specified in the "job-logging-directory" attribute) when the job executes for the first time. Each invocation of the batch job creates a separate file, named by appending the timestamp to the job name. The "job-base-package-name" attribute value is used to create a single logger for all batch jobs, and its logging level defaults to "INFO", as specified by the "job-logging-level" attribute. Both values can be changed according to your requirements. Please see org.jmonks.batch.framework.LoggingManager for additional details.

A single logger for all jobs is not enough for everyone. There could be scenarios where the logging level needs to be controlled for each batch job. To satisfy those requirements, the framework lets you control the logging level of each batch job by declaring additional loggers in the job configuration. The following job configuration defines its own loggers to control the logging.

<job-config job-name="helloworld" status="active">
    <job-controller job-controller-class-name="org.jmonks.batch.framework.controller.basic.BasicJobController">
        <basic-job-processor basic-job-processor-class-name="com.mycompany.jobs.HelloWorldJob" thread-count="1"/>
    </job-controller>
    <job-logging-config>
         <job-logger-config logger-name="com.mycompany.jobs" logger-level="TRACE"/>
         <job-logger-config logger-name="com.mycompany.jobs.hello" logger-level="ERROR"/>
    </job-logging-config>
</job-config>
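
The effect of declaring a parent logger and a more specific child logger can be illustrated with the JDK's own java.util.logging package. This is only a sketch: this page does not say which logging library the framework uses internally, so java.util.logging serves as a stand-in, on the assumption that the framework's loggers follow the same dotted-name hierarchy. The class LoggerLevelSketch and the logger name "com.mycompany.jobs.other" are hypothetical, and Level.FINEST and Level.SEVERE are used as rough equivalents of TRACE and ERROR.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of hierarchical logger levels using java.util.logging as a stand-in.
class LoggerLevelSketch {
    // Keep strong references: java.util.logging may discard loggers nobody holds.
    static final Logger JOBS = Logger.getLogger("com.mycompany.jobs");
    static final Logger HELLO = Logger.getLogger("com.mycompany.jobs.hello");
    static final Logger OTHER = Logger.getLogger("com.mycompany.jobs.other");

    // Mirrors the two job-logger-config entries from the job configuration.
    static void configure() {
        JOBS.setLevel(Level.FINEST);   // rough equivalent of TRACE
        HELLO.setLevel(Level.SEVERE);  // rough equivalent of ERROR
        // OTHER gets no explicit level, so it inherits the level of JOBS.
    }

    public static void main(String[] args) {
        configure();
        System.out.println("hello logs INFO: " + HELLO.isLoggable(Level.INFO));
        System.out.println("other logs FINE: " + OTHER.isLoggable(Level.FINE));
    }
}
```

In this sketch the more specific logger overrides the level of its parent package, while classes elsewhere under com.mycompany.jobs keep the parent's level.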

In addition, the framework allows the logging level of every job to be changed at runtime through the management and monitoring applications.



Job Configuration

Job configuration can be done either in XML files or in any database that supports JDBC connectivity. Batch Framework provides two factories, and you configure the one that fits your needs. The following XML snippet in the framework configuration (framework-config.xml) controls how jobs are configured.

<job-config-factory-config job-config-factory-class-name="org.jmonks.batch.framework.config.xml.XMLJobConfigFactory">
        <property key="job-config-file-classpath-location">batch-config.xml</property>
</job-config-factory-config>

By default, the framework is configured with XMLJobConfigFactory, which reads the job configuration from the batch-config.xml file available on the classpath. All of this can be controlled in the framework configuration file.

XML Configuration

org.jmonks.batch.framework.config.xml.XMLJobConfigFactory allows us to configure the batch jobs in XML files. It can read the XML job configuration file either from the classpath or from an absolute path in the file system, depending on the property key used in "job-config-factory-config". Use the key "job-config-file-classpath-location" to have the factory read the configuration file from the classpath, and use the key "job-config-file-absolute-location" to have it read the file from the file system; in that case the value must be an absolute path. The following XML snippet configures the factory with an absolute path.

<job-config-factory-config job-config-factory-class-name="org.jmonks.batch.framework.config.xml.XMLJobConfigFactory">
        <property key="job-config-file-absolute-location">/batchserver/config/batch-config.xml</property>
</job-config-factory-config>

Please see org.jmonks.batch.framework.config.xml.XMLJobConfigFactory for additional details, and check out that package for the configuration classes, which explain the format of the job configuration.
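
The difference between the two property keys can be sketched in plain Java as follows, assuming Java 8 or later. This is an illustration of the general technique and not the actual XMLJobConfigFactory code. The class ConfigLocationSketch and the method open are hypothetical, and checking the classpath key before the absolute key is an assumption of the sketch; only the two key names come from this page.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Map;

// Sketch: resolve a configuration file from the classpath or from an absolute path.
class ConfigLocationSketch {
    static final String CLASSPATH_KEY = "job-config-file-classpath-location";
    static final String ABSOLUTE_KEY = "job-config-file-absolute-location";

    // Opens the configuration named by whichever of the two keys is present.
    // The classpath key is checked first; that order is an assumption.
    static InputStream open(Map<String, String> properties) {
        String classpathLocation = properties.get(CLASSPATH_KEY);
        if (classpathLocation != null) {
            InputStream in = ConfigLocationSketch.class.getClassLoader()
                    .getResourceAsStream(classpathLocation);
            if (in == null) {
                throw new UncheckedIOException(
                        new FileNotFoundException("Not on the classpath: " + classpathLocation));
            }
            return in;
        }
        String absoluteLocation = properties.get(ABSOLUTE_KEY);
        if (absoluteLocation != null) {
            try {
                return new FileInputStream(absoluteLocation);
            } catch (FileNotFoundException e) {
                throw new UncheckedIOException(e);
            }
        }
        throw new IllegalArgumentException(
                "Neither " + CLASSPATH_KEY + " nor " + ABSOLUTE_KEY + " is set");
    }
}
```

A classpath location is resolved relative to the classpath roots, so it keeps working when the application is moved, whereas an absolute location ties the configuration to one path on the machine.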

Database Configuration

org.jmonks.batch.framework.config.db.DBJobConfigFactory allows us to keep the job configuration in any JDBC-compliant database. Currently, SQL scripts for MySQL and Oracle are provided to create the tables for the job configuration.

MySQL

To configure the framework to use MySQL for the job configuration, follow these steps.

  1. Execute the run_all.sql script available in <extracted_dir>/bin/dbscripts/mysql directory.
  2. Change the "job-config-factory-config" element in the framework configuration as shown below, adjusting the values to your MySQL database settings.

    <job-config-factory-config job-config-factory-class-name="org.jmonks.batch.framework.config.db.DBJobConfigFactory">
            <property key="jdbc-driver-class-name">com.mysql.jdbc.Driver</property>
            <property key="jdbc-url">jdbc:mysql://localhost:3306/batchserver</property>
            <property key="username">root</property>
            <property key="password">password</property>
    </job-config-factory-config>

Oracle

To configure the framework to use Oracle for the job configuration, follow these steps.

  1. Execute the run_all.sql script available in <extracted_dir>/bin/dbscripts/oracle directory.
  2. Change the "job-config-factory-config" element in the framework configuration as shown below, adjusting the values to your Oracle database settings.

    <job-config-factory-config job-config-factory-class-name="org.jmonks.batch.framework.config.db.DBJobConfigFactory">
            <property key="jdbc-driver-class-name">oracle.jdbc.driver.OracleDriver</property>
            <property key="jdbc-url">jdbc:oracle:thin:@hostname:1521:instancename</property>
            <property key="username">scott</property>
            <property key="password">tiger</property>
    </job-config-factory-config>

Please see org.jmonks.batch.framework.config.db.DBJobConfigFactory for additional details, and check out that package for the configuration classes, which explain the format of the job configuration.
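
Before pointing the framework at a database, it can help to check the four JDBC values on their own. The following sketch, assuming Java 8 or later, uses only the standard java.sql API. It is not how DBJobConfigFactory is implemented; the class JdbcSettingsSketch and its methods are hypothetical, and only the four property keys are taken from the configuration above. Calling connect requires the matching JDBC driver jar on the classpath and a reachable database.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Sketch: validate the JDBC properties and open a test connection with them.
class JdbcSettingsSketch {
    static final List<String> REQUIRED_KEYS = Arrays.asList(
            "jdbc-driver-class-name", "jdbc-url", "username", "password");

    // Returns the required keys that are absent or blank, in declaration order.
    static List<String> missingKeys(Map<String, String> properties) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = properties.get(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    // Loads the driver class and opens a connection using the given properties.
    static Connection connect(Map<String, String> properties)
            throws ClassNotFoundException, SQLException {
        List<String> missing = missingKeys(properties);
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException("Missing properties: " + missing);
        }
        Class.forName(properties.get("jdbc-driver-class-name"));
        return DriverManager.getConnection(
                properties.get("jdbc-url"),
                properties.get("username"),
                properties.get("password"));
    }
}
```

A ClassNotFoundException from connect points to a missing driver jar, while an SQLException points to a wrong URL, user name or password.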

Repository Configuration

The repository is used to share data between batch jobs, to save job statistics, and to store the information used by the management and monitoring applications. The framework comes with a default configuration that uses DB4O (Database for Objects) as the repository, saved in the "/batchserver/repository" directory. Multiple implementations of the repository are available, and you can switch between them by changing the framework configuration.

DB4O - Standalone Mode

The framework comes with this configuration by default, set up in the framework configuration as follows. If needed, the directory can be changed to any location.

<repository-config repository-class-name="org.jmonks.batch.framework.repository.db4o.Db4oRepository">
        <property key="db4o-directory">/batchserver/repository</property>
</repository-config>

DB4O - Client Server Mode

org.jmonks.batch.framework.repository.db4o.ClientServerDb4oRepository allows us to use a DB4O database running in client-server mode. A script to create a DB4O server is planned for a future phase.

Please see org.jmonks.batch.framework.repository.db4o.ClientServerDb4oRepository for additional details on how to configure it in the framework configuration file.

MySQL

org.jmonks.batch.framework.repository.jdbc.JdbcRepository allows us to use any JDBC-compliant database as the repository. To configure the framework to use MySQL for the repository, follow these steps.

  1. Execute the run_all.sql script available in the <extracted_dir>/bin/dbscripts/mysql directory. (If you have already done this for the job configuration, skip this step.)
  2. Change the "repository-config" element in the framework configuration as shown below, adjusting the values to your MySQL database settings.

    <repository-config repository-class-name="org.jmonks.batch.framework.repository.jdbc.JdbcRepository">
            <property key="jdbc-driver-class-name">com.mysql.jdbc.Driver</property>
            <property key="jdbc-url">jdbc:mysql://localhost:3306/batchserver</property>
            <property key="username">root</property>
            <property key="password">password</property>
    </repository-config>

Oracle

org.jmonks.batch.framework.repository.jdbc.JdbcRepository allows us to use any JDBC-compliant database as the repository. To configure the framework to use Oracle for the repository, follow these steps.

  1. Execute the run_all.sql script available in the <extracted_dir>/bin/dbscripts/oracle directory. (If you have already done this for the job configuration, skip this step.)
  2. Change the "repository-config" element in the framework configuration as shown below, adjusting the values to your Oracle database settings.

    <repository-config repository-class-name="org.jmonks.batch.framework.repository.jdbc.JdbcRepository">
            <property key="jdbc-driver-class-name">oracle.jdbc.driver.OracleDriver</property>
            <property key="jdbc-url">jdbc:oracle:thin:@hostname:1521:instancename</property>
            <property key="username">scott</property>
            <property key="password">tiger</property>
    </repository-config>

The management and monitoring applications use the repository heavily for their operations, so it is beneficial to use a repository that works in client-server mode, which makes it easier to run these applications.

JMX Connector Configuration

JobManagementAgent uses the JobConnectorHelper configuration in the framework configuration file to create the JMX connector server and register its information in the lookup location. The framework configuration file comes with the connector helper class org.jmonks.batch.framework.management.jmxmp.RepositoryJMXMPConnectorHelper.

JMXMP - Repository

org.jmonks.batch.framework.management.jmxmp.RepositoryJMXMPConnectorHelper uses the JMXMP connector to create the connector servers and uses the repository to save the connector server information. The framework uses this connector helper by default. Here is the configuration.

<job-connector-config job-connector-helper-class-name="org.jmonks.batch.framework.management.jmxmp.RepositoryJMXMPConnectorHelper"/>



Feedback

I try my best to keep this tutorial understandable and up to date with the releases. If you see any mistakes or inconsistencies in the examples, please send me a message (if you are a member of SourceForge) or an email.

