
Migrating the Batch Module

Standard Support for Mule 4.1 ended on November 2, 2020, and this version of Mule reached its End of Life on November 2, 2022, when Extended Support ended.

Deployments of new applications to CloudHub that use this version of Mule are no longer allowed. Only in-place updates to applications are permitted.

MuleSoft recommends that you upgrade to the latest version of Mule 4 that is in Standard Support so that your applications run with the latest fixes and security enhancements.

The Batch module was introduced to handle Extract, Transform, Load (ETL) functionality.

What’s covered in this section:

  • Migrating the Batch Element
  • Migrating Record Vars to Vars
  • Support for Non-Java DataSets
  • XML Changes to Batch Elements and Attributes for Mule 4

Migrating the Batch Element

In Mule 3, Batch was a top-level element, at the same level as a flow. In Mule 4, the <batch:job> element is a scope that lives inside a flow so that it is easier to understand, invoke dynamically, and interact with other Mule components.

This also means that there is no <batch:input> anymore. The role of <batch:input> is now filled by whatever processors run in the flow before the <batch:job> is triggered.

  • <batch:job>: Even though Batch is a scope, it has a name. No Mule app can have two <batch:job> definitions with the same name. This allows better tracking and management when a single flow defines more than one job.

  • <batch:execute>: This operation no longer exists. Instead, invoke whatever flow contains the batch job (see the sketch after the Mule 4 example below).

Mule 3 example
<batch:job name="batchJob">
    <batch:input>
        <poll frequency="10000">
            <sfdc:query ..../>
        </poll>
    </batch:input>
    <batch:process-records>
        <batch:step name="batchStep1">
            <event processor/>
            <event processor/>
        </batch:step>
        <batch:step name="batchStep2">
            <event processor/>
            <event processor/>
        </batch:step>
    </batch:process-records>
</batch:job>
Mule 4 example
<flow name="flowOne">
  <scheduler>
    <scheduling-strategy>
      <fixed-frequency frequency="10000"/>
    </scheduling-strategy>
  </scheduler>
  <batch:job jobName="batchJob">
    <batch:process-records>
      <batch:step name="batchStep1">
        <event processor/>
        <event processor/>
      </batch:step>
      <batch:step name="batchStep2">
        <event processor/>
        <event processor/>
      </batch:step>
    </batch:process-records>
  </batch:job>
</flow>
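
Because <batch:execute> is gone, dynamic invocation now means calling the flow that contains the batch job, typically with a Flow Reference. The following is a minimal sketch of that pattern, assuming hypothetical flow and job names (triggerFlow, runBatchFlow, myBatchJob):

<!-- Instead of <batch:execute>, reference the flow that contains the batch job -->
<flow name="triggerFlow">
  <scheduler>
    <scheduling-strategy>
      <fixed-frequency frequency="60000"/>
    </scheduling-strategy>
  </scheduler>
  <flow-ref name="runBatchFlow"/>
</flow>

<flow name="runBatchFlow">
  <batch:job jobName="myBatchJob">
    <batch:process-records>
      <batch:step name="step1">
        <logger message="#[payload]"/>
      </batch:step>
    </batch:process-records>
  </batch:job>
</flow>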

Migrating Record Vars to Vars

Unlike a regular flow, Batch processes data at the record level instead of the message level.

In Mule 3, you use record-level variables (recordVars) to attach variables to a particular record, just like you use Flow variables (flowVars) for the Mule message.

In Mule 4, variables (vars) replace recordVars. Variables are part of the Mule event in Mule 4. From within the <batch:step>, you can modify a variable that existed before the <batch:job> began executing, and you can do it for one particular record, without side effects on the other records.

The Mule 3 example here uses recordVars within an expression in a For Each component:

Mule 3 Example
<batch:step name="commitStep">
  <batch:commit size="10">
    <foreach>
      <expression-component>
        record.payload = 'foo';
        record.recordVars['marco'] = 'polo';
      </expression-component>
    </foreach>
  </batch:commit>
</batch:step>

The Mule 4 example here uses vars within an expression in a For Each component:

Mule 4 Example
<batch:job jobName="batchJob">
  <batch:process-records>
    <batch:step name="batchStep">
      <batch:aggregator doc:name="batchAggregator" size="10">
        <foreach doc:name="For Each">
          <script:execute engine="groovy">
            <script:code>
              vars['marco'] = 'polo'
              vars['record'].payload = 'foo'
            </script:code>
          </script:execute>
        </foreach>
      </batch:aggregator>
    </batch:step>
  </batch:process-records>
</batch:job>
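
The Groovy script above modifies the variables programmatically. When no scripting is needed, a simpler way to attach a value to the record being processed is the <set-variable> component inside the step, as in this minimal sketch (the variable name is illustrative):

<batch:job jobName="batchJob">
  <batch:process-records>
    <batch:step name="batchStep">
      <!-- Each record processed by this step gets its own copy of the variable -->
      <set-variable variableName="marco" value="polo"/>
    </batch:step>
  </batch:process-records>
</batch:job>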

Support for Non-Java DataSets

In Mule 3, the batch input phase required a Java structure. So, for input data in formats such as XML or JSON, you first had to transform the payload to Java and configure DataWeave to stream that transformation to avoid running out of memory on large datasets.

For example, suppose you have this source data pushed to your app through an HTTP call:

Example: JSON Input Data
[
    {
        "name": "Luke Skywalker",
        "darkSide": true
    },
    {
        "name": "Ben Solo",
        "darkSide": true
    },
    {
        "name": "Obi-Wan Kenobi",
        "darkSide": false
    }
]

In Mule 3, you needed to transform that JSON to Java before passing it to the batch job, something like this:

Mule 3 Example
<batch:job name="forceJob">
   <batch:input>
     <http:listener path="/forceWielders" config-ref="forceListener" />
     <dw:transform-message>
       <dw:set-payload><![CDATA[%dw 1.0
         %output application/java
         ---
         payload]]></dw:set-payload>
     </dw:transform-message>
   </batch:input>
   .....
</batch:job>

In Mule 4, Batch can automatically determine that the payload is a JSON array and perform the splitting on its own, for example:

Mule 4 Example
<flow name="useTheForceBatch">
  <http:listener path="/forceWielders" config-ref="forceListener" />
  <batch:job jobName="forceJob">
    ....
  </batch:job>
</flow>

Because Mule 4 uses an automatic streaming framework, you no longer have to configure streaming yourself. So, when you migrate to Mule 4, you can drop the transformation step entirely.

XML Changes to Batch Elements and Attributes for Mule 4

  • Camel Case attributes: Following the Mule 4 DSL guidelines, and in order to improve consistency, all DSL attributes have been changed to camel case. For example, max-failed-records is now maxFailedRecords, accept-policy is acceptPolicy, and so on.

  • MuleSoft removed the filter-expression parameter from the <batch:step> element. This attribute was deprecated in Mule 3.6 and should be replaced with the accept-expression parameter.

  • The <batch:commit> element is now called <batch:aggregator>. The sketch after this list shows these changes side by side.
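
Taken together, these DSL changes look roughly like the following before-and-after sketch (the attribute values and the logger are illustrative):

Mule 3 example
<batch:job name="batchJob" max-failed-records="10">
  <batch:process-records>
    <batch:step name="batchStep1" accept-policy="ALL">
      <batch:commit size="100">
        <logger message="#[payload]"/>
      </batch:commit>
    </batch:step>
  </batch:process-records>
</batch:job>

Mule 4 example
<batch:job jobName="batchJob" maxFailedRecords="10">
  <batch:process-records>
    <batch:step name="batchStep1" acceptPolicy="ALL">
      <batch:aggregator size="100">
        <logger message="#[payload]"/>
      </batch:aggregator>
    </batch:step>
  </batch:process-records>
</batch:job>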

See Also