Automate Command Line Dataloader using ANT with Dynamic arguments and SOQL query

There are multiple ways to load data into Salesforce: Workbench, Jitterbit, the APIs, and so on. Salesforce also provides a standard tool, Data Loader, which supports both a user interface and a command line interface. The UI-based Data Loader needs manual interaction from an end user, so if you want to automate data-related operations, the better choice is the Command Line Data Loader. Assuming you already have basic knowledge of how the CLI Data Loader works, this article goes into more advanced territory.

Automate Command Line Data Loader

The basic steps for the CLI Data Loader are:

  1. Download and install Data Loader
  2. Generate a secret key file
  3. Generate an encrypted password using the secret key
  4. Create “process-conf.xml” (here is the problem)
  5. Create a field mapping (.sdl) file, if required
  6. Execute Data Loader from the command line
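Steps 2, 3 and 6 boil down to a few commands run from the Data Loader bin directory. A rough sketch follows — the install path, seed text, config directory and process name are examples, so adjust them to your setup:

```bat
rem Run from the Data Loader bin directory, for example:
rem C:\Program Files (x86)\salesforce.com\Data Loader\bin

rem Step 2: generate a secret key from any seed text and save it
encrypt.bat -g somekey > key.txt

rem Step 3: encrypt your password + security token with that key
encrypt.bat -e yourpasswordYOURTOKEN key.txt

rem Step 6: run a process defined in process-conf.xml
rem (first argument: config directory, second: process name)
process.bat "C:\dataloader\config" Account
```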

If you repeat the same operation every time, there is no issue. But what if you need to modify the SOQL every time? What if you need to extract data, modify some values, and upload them back to Salesforce? There are endless operations you may want to perform with Data Loader, but each time you would need to manually generate “process-conf.xml”, manually update the extracted file, and then upload it. There are many tools available that will auto-generate “process-conf.xml”, but that is still a manual operation every time.

We will automate all of this with the help of the ANT tool. This article is inspired by this post.

You will need to download ANT and add its bin directory to your PATH environment variable.

Also, create a secret key file and use it to generate your encrypted password; check this post to see how.

The next step is to create a configuration file containing the server URL, username and password.

build.properties

# build.properties
sfSandbox.serverurl = https://test.salesforce.com
sfProd.serverurl = https://login.salesforce.com
#Password generated using encrypt.bat
#Example - generate key:      encrypt.bat -g somekey
#Save the above output in key.txt
#Example - generate password: encrypt.bat -e yourpassword key.txt
sf.prod.profileName.username = yoursalesforce@domain.com
sf.prod.profileName.password = your_encrypted_password_generated_with_securitytoken

template-process-conf.xml (to extract records)

<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
<beans>
    <bean id="_object_" class="com.salesforce.dataloader.process.ProcessRunner" singleton="false">
    <description>TemplatedCsvExtract extracts to a CSV file.</description>
    <property name="name" value="TemplatedCsvExtract"/>
    <property name="configOverrideMap">
    <map>
        <entry key="process.encryptionKeyFile" value="config/key.txt"/>
        <entry key="sfdc.endpoint" value="_serverURL_"/>
        <entry key="sfdc.username" value="_userName_"/>
        <entry key="sfdc.password" value="_password_"/>
        <entry key="sfdc.debugMessages" value="false"/>
        <entry key="sfdc.timeoutSecs" value="6000"/>
        <entry key="sfdc.extractionRequestSize" value="_batchSize_"/>
        <entry key="process.operation" value="extract"/>
        <entry key="dataAccess.type" value="csvWrite"/>
        <entry key="sfdc.entity" value="_object_"/>
        <entry key="sfdc.extractionSOQL" value="_soql_"/>
        <entry key="dataAccess.name" value="_file_"/>
        <entry key="sfdc.debugMessagesFile" value="_logFile_"/>
    </map>
    </property>
    </bean>
</beans>

This template will be used to create the actual “process-conf.xml” for extracting records.

template-process-conf_update.xml (to update records)

<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
<beans>
    <bean id="_object_" class="com.salesforce.dataloader.process.ProcessRunner" singleton="false">
    <description>TemplatedCsvUpdate updates records from a CSV file.</description>
    <property name="name" value="TemplatedCsvExtract"/>
    <property name="configOverrideMap"> 
       <map>
            <entry key="process.encryptionKeyFile" value="config/key.txt"/>
            <entry key="sfdc.endpoint" value="_serverURL_"/>
            <entry key="sfdc.username" value="_userName_"/>
            <entry key="sfdc.password" value="_password_"/>
            <entry key="sfdc.debugMessages" value="false"/>
            <entry key="sfdc.timeoutSecs" value="6000"/>
            <entry key="sfdc.extractionRequestSize" value="_batchSize_"/>
            <entry key="process.operation" value="update"/>
            <entry key="dataAccess.type" value="csvRead"/>
            <entry key="sfdc.entity" value="_object_"/>
            <entry key="dataAccess.name" value="_file_"/>
            <entry key="sfdc.debugMessagesFile" value="_logFile_"/>
            <entry key="process.mappingFile"  value="_mappingFile"/>
            <entry key="process.outputError"  value="_opError_"/>
            <entry key="process.outputSuccess" value="_opSuccess_"/>
        </map>
     </property> 
    </bean>
</beans>

This template will be used to create the actual “process-conf.xml” for updating records.

The above two XML files act as templates in which we will replace parameters like “_serverURL_”, “_userName_”, etc. using ANT.

Field Mapping file – AccountFields.sdl

Id=Id
Name=Name
Status__c=Status__c

Actual ANT file “build.xml”

<?xml version="1.0" encoding="UTF-8"?>
<project name="Export" default="all">
    <property file="build.properties"/>
    <property environment="env"/>
    <tstamp>
        <format property="todayDate" pattern="MM-dd-yyyy_HH-mm-ss" />
        <format property="todayDate_only" pattern="MM-dd-yyyy" />
    </tstamp>
    <!-- subtract 2 days from Today -->
    <tstamp>
        <format property="twoDaysAgo" pattern="yyyy-MM-dd" offset="-2" unit="day"/>
    </tstamp>

    <macrodef name="export_Account">
        <attribute name="file"/>
        <attribute name="object"/>
        <attribute name="soql"/>
        <attribute name="userName"/>
        <attribute name="password"/>
        <attribute name="serverURL"/>
        <attribute name="batchSize"/>
        <attribute name="limit"/> 

        <sequential>
            <echo message="Exporting @{object}"/>
            <mkdir dir="exports/${todayDate_only}"/>
            <mkdir dir="logs/${todayDate_only}"/>
            <input
                message="Enter Year, example: 2014"
                addproperty="date.year"
            />
            <!-- month and day must be two digits so the assembled
                 SOQL datetime literal is valid (yyyy-MM-dd) -->
            <input
                message="Enter Month as two digits, example: 05"
                validargs="01,02,03,04,05,06,07,08,09,10,11,12"
                addproperty="date.month"
            />
            <input
                message="Enter Day as two digits, example: 09"
                addproperty="date.day"
            />
            <copy file="config/template-process-conf.xml" tofile="config/process-conf.xml" overwrite="true" failonerror="true"/>
            <replace file="config/process-conf.xml">
                <replacefilter token="_object_" value="@{object}"/>
                <replacefilter token="_soql_" value="@{soql} ${date.year}-${date.month}-${date.day}T00:00:00.000+0000 LIMIT @{limit}"/>
                <replacefilter token="_file_" value="exports/${todayDate_only}/@{file}.csv"/>
                <replacefilter token="_serverURL_" value="@{serverURL}"/>
                <replacefilter token="_userName_" value="@{username}"/>
                <replacefilter token="_password_" value="@{password}"/>
                <replacefilter token="_batchSize_" value="@{batchSize}"/>
                <replacefilter token="_logFile_" value="logs/${todayDate_only}/@{file}_log.csv"/>
            </replace>
            <java classname="com.salesforce.dataloader.process.ProcessRunner" classpath="dataloader-28.0.2-uber.jar" failonerror="true">
                <sysproperty key="salesforce.config.dir" value="config"/>
                <arg line="process.name=@{object}"/>
            </java>
            <copy file="exports/${todayDate_only}/@{file}.csv" tofile="exports/${todayDate_only}/@{file}_Updated.csv" overwrite="true" failonerror="true"/>
            <replace file="exports/${todayDate_only}/@{file}_Updated.csv">
                <replacefilter token="UW Created" value="Ready for Sales"/>
                <replacefilter token="UW Executed" value="Ready for Sales"/>
            </replace>
        </sequential>
    </macrodef>
    <macrodef name="update_Account">
        <attribute name="file"/>
        <attribute name="object"/>
        <attribute name="userName"/>
        <attribute name="password"/>
        <attribute name="serverURL"/>
        <attribute name="batchSize"/>
        <attribute name="mappingFile"/>

        <sequential>
            <echo message="Updating @{object}"/>
            <mkdir dir="exports/${todayDate_only}"/>
            <mkdir dir="logs/${todayDate_only}"/>
            <copy file="config/template-process-conf_update.xml" tofile="config/process-conf.xml" overwrite="true" failonerror="true"/>
            <replace file="config/process-conf.xml">
                <replacefilter token="_object_" value="@{object}"/>
                <replacefilter token="_file_" value="exports/${todayDate_only}/@{file}_Updated.csv"/>
                <replacefilter token="_serverURL_" value="@{serverURL}"/>
                <replacefilter token="_userName_" value="@{username}"/>
                <replacefilter token="_password_" value="@{password}"/>
                <replacefilter token="_batchSize_" value="@{batchSize}"/>
                <replacefilter token="_logFile_" value="logs/${todayDate_only}/@{file}_log.csv"/>
                <replacefilter token="_opSuccess_" value="logs/${todayDate_only}/@{file}_log_Success.csv"/>
                <replacefilter token="_opError_" value="logs/${todayDate_only}/@{file}_log_error.csv"/>
                <replacefilter token="_mappingFile" value="@{mappingFile}"/>
            </replace>
            <java classname="com.salesforce.dataloader.process.ProcessRunner" classpath="dataloader-28.0.2-uber.jar" failonerror="true">
                <sysproperty key="salesforce.config.dir" value="config"/>
                <arg line="process.name=@{object}"/>
            </java>
        </sequential>
    </macrodef>
    <target name="all">
        <export_Account
            file="Account_backup-${todayDate}"
            object="Account"
            soql="Select Id, Name,Status__c from Account WHERE CreatedDate >  "
            userName="${sf.prod.profileName.username}"
            password="${sf.prod.profileName.password}"
            serverURL="${sfProd.serverurl}"
            batchSize="200"
            limit = "1000" /> 

        <update_Account
            file="Account_backup-${todayDate}"
            object="Account"
            userName="${sf.prod.profileName.username}"
            password="${sf.prod.profileName.password}"
            serverURL="${sfProd.serverurl}"
            batchSize="1"
            mappingFile="config/AccountFields.sdl"/>
    </target>
</project>

How does it all fit and work?

Automate Command Line Data Loader
  1. Execute the command “ant all”, or create a batch file with this command.
  2. ANT asks for user input: year, month and day.
  3. Based on the input provided, it builds the SOQL to get all Accounts created after that date.
  4. It creates “process-conf.xml”, dynamically populating the SOQL, export file location, username, password, etc.
  5. It creates two folders if they do not exist – “exports” and “logs”.
  6. Inside them, it creates a folder named after the date, to maintain data backups.
  7. The Data Loader command is executed to start exporting Accounts.
  8. Once the Accounts are exported to a CSV file, a copy of the file is created and the “Status__c” field values are replaced with “Ready for Sales”.
  9. Once the file is updated, ANT creates a new “process-conf.xml” from the “template-process-conf_update.xml” template to update the Account records.
  10. The Data Loader command is executed to start updating Accounts.

How to schedule the Command Line Data Loader to run automatically?

Create a batch file with your ANT command – in this case, “ant all”. Schedule this file using the Windows Task Scheduler; check this useful post for more information.
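One caveat: the build.xml above prompts for the year, month and day, which would block an unattended run. ANT skips an <input> task when its property is already set, so the batch file can pass the dates on the command line instead. A minimal sketch (the project path and dates are examples):

```bat
rem run-dataloader.bat - sketch only; adjust the project path and dates
cd /d C:\dataloader\project
call ant all -Ddate.year=2014 -Ddate.month=05 -Ddate.day=01
```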

Comments

  • kirubakaran s

    Small correction to the template-process-conf_update.xml file at line 7 – the property should be closed before that, I guess.

  • sunder

    Jitendra – I’ve a question. Have you tried running this command line Data Loader interface from Jenkins? I can run the command line interface successfully using the Windows batch file option on Jenkins. But the issue I’ve seen is that if the Metadata API throws any error, it is not reflected back to Jenkins. So overall Jenkins shows success even though process.bat failed due to a duplicate row insertion.

    • Hi Sunder, even though Jenkins is able to execute ANT, its most appropriate use is with Metadata: if a deployment fails, ANT raises an error and Jenkins knows. Data Loader is different. If the password and server connection are OK, the Data Loader run is considered successful, and any row-level errors only go into an error CSV file. If you want to use Data Loader with Jenkins, then after Data Loader executes, read the error file; if it has any error rows, raise ANT's FAIL task so that Jenkins knows. https://ant.apache.org/manual/Tasks/fail.html
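    That check can be sketched in ANT roughly like this (file and target names are examples; the approach relies on <loadfile> not setting its property when the loaded content is empty):

    ```xml
    <!-- Sketch: fail the build when the Data Loader error CSV contains
         data rows beyond its header line, so Jenkins marks the job failed -->
    <target name="check-errors">
        <loadfile property="error.rows" srcFile="logs/error.csv" failonerror="false">
            <filterchain>
                <!-- drop the CSV header; anything left is a real error row -->
                <headfilter lines="-1" skip="1"/>
            </filterchain>
        </loadfile>
        <fail message="Data Loader reported row-level errors" if="error.rows"/>
    </target>
    ```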

  • Sharmila

    Hi Jitendra,
    I have tried to update the CSV in the same way as lines 58-59 of build.xml. From my understanding, it should be updated in @{file}_Updated.csv before the file is used for the “Update”. However, this does not happen in my case.

    May I know whether the ‘UW Created’ or ‘UW Executed’ is the API Name or the field name?
    How does it update the Status__c so that the value is set to ‘Ready for Sales’?

    I have tried to replace lines 58-59, but my isAttentionNeeded__c in the @{file}_Updated.csv is still the same.

    Really appreciate if you can help on this. Thanks!

    • Hi Sharmila,
      ‘UW Created’ and ‘UW Executed’ are text values in the CSV file – nothing but field values. I am doing a blank update of these values. The replace is not very intelligent in this example: if you try to replace “isActionNeeded__c” with false, it will search for the literal text “isActionNeeded__c” and replace it with false. I suspect you need to add some extra logic to achieve this.

  • Satyendra Vishwakarma

    I need to download the latest version of the automated Data Loader.
    I am not able to find the link to download it – please share the link here if anybody knows.

    • You can download Data Loader after logging in to your Salesforce org or Developer org and navigating to the tools section.