Considerations and Best practices for Salesforce Dataloading

Although we are doing data loading in Salesforce from ages, its very common to miss some steps or considerations. Therefore thought to list some of best practices and considerations I follow before or during data load process.

Salesforce Dataloader
Salesforce Data loader Considerations

  • Developer note field – I mostly create a text field on every object where data is being loaded. This field helps to identify existing data verses newly loaded data. This field can be deleted after few weeks of data loading. Content in this field is like  Data loaded on 21-JanMock B data loaded on 29 Jan by xyz  etc.
  • Legacy primary key – This is external Id field which contains primary key of external system. It helps to relate data between Salesforce and legacy system if needed.
  • Set Inactive users as Owner – We can load historical data with Inactive users. This is great feature if you don’t want to loose audit information. It can be enabled using User Interface options – “Update Records with Inactive Owners”.
  • Load Audit fields : Use permission set / profile to enable editing of audit fields like created date and created by fields.  We can enable it at profile or permission set level using permission “Set Audit Fields upon Record Creation”.
  • Audit fields can only set while inserting records. Make sure this is considered, otherwise we need to delete record and insert back again to populate audit fields.
  • Make sure local computers timezone is set to Hawaii (article link). Restart your system and start dataloader. Dataloader should show locale as Hawaii. You can also set timezone as GMT manually for data loader.
  • If you are using SQL Server, then convert date to UTC and then load datetime into Salesforce.
  • If you are using SQL Server, You can use this code to convert T-SQL date to Salesforce date format – CONVERT(NVARCHAR,GETDATE(),126)
  • For supported Datetime format read more in detail at Johan’s blog
  • If because of internet issue, connection drops and load fails in between, then in dataloader setting, you can restart from where connection dropped. It shows that what was success count and we can enter row number from where we want dataloader to restart processing. (shown in below image)
  • If you are using SQL Server, then DBAmp or Cozyroc can be used to load data into Salesforce.
  • If number of fields between legacy system and Salesforce doesn’t match, then instead of ignoring legacy system’s field, you can create a hidden text area field and populate it with csv values.
  • While loading address in Account or Contact, most of time number of address are more than 2. Salesforce supports two address only, out of box (Shipping and Billing address). In these cases, you might end up creating custom child object to store multiple addresses.
  • Legacy note in Salesforce is replaced by Enhanced notes. To load Enhanced notes using Dataloader, follow this post. If using other tools then we need to load notes in Base64 format in ContentNote object and then link ContentNote with record using junction object ContentDocumentLink object.
  • If there are attachments to be loaded, then it will take different approach. Check this article to use Dataloader to load attachments.
  • If file size is less than 25 MB then we can use attachments other wise you might need to use Chatter Files.
  • If any of field contains Rich Text, then we have to be little careful here. CSV file containing Rich text doesn’t work properly. you might need to think on some alternatives like converting rich text to simple text.
  • Loading attachment may cause heap size memory issues, in those cases try to minimize batch size.
Salesforce DataLoader - Start from where connection dropped
Salesforce DataLoader – Start from where connection dropped

Fixing Errors :

If you get below error

CSV Error: Invalid CSV file format. Please select a different file.
com.sforce.async.CSVReader$CSVParseException: Not expecting more text after end quote

Then it can be fixed following below points

  • In CSV file, one column has comma. I normally replace comma by semicolon (If accepted by client) or you can enclose it in double quotes
  • In CSV file, one column has double quotes. I normally replace double quotes by single quotes.
  • CSV file contains non ASCII characters. It can be removed by searching using expression [^\x00-\x7F]+. Read more here.

Related posts

2 thoughts on “Considerations and Best practices for Salesforce Dataloading”

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.