Salesforce Integration with SQL Server Integration Services

On April-14, 2018,  I had a chance to conduct online session on demonstrating integration between SQL Server Integration Services (SSIS) and Salesforce with the help of COZYROC connector.

This session covered below topics

  • Introduction to ETL
  • What is SQL Server Integration Services (SSIS)
  • COZYROC connector
  • Demo of extracting and updating records in Salesforce
  • Demo of downloading attachments from Salesforce
  • Demo of uploading attachments in Salesforce
  • Best Practices – How to configure SSIS package to avoid hardcoding
  • Best Practices – enabling logging

Definition of Frequently Used Database Architecture Related Terms

  1. Data Warehouse

Data warehouse is also known as Enterprise Data Warehouse (EDW). Data warehouse is used as source for Business Intelligent’s reporting and analysis. Data Warehouse system collects data from multiple sources and contains historical data for trend analysis reporting. ETL tool is used mostly to build Data Warehouse and interfaces around it. Data Warehouse acts as Single Version of truth.

Data warehouse overview (From Wikipedia)
Data warehouse overview (From Wikipedia)

2. Operational Data Store (ODS)

Operational Data Store is frequently confused and definition is overlapped with Data Warehouse. Some of my clients had used word ODS instead of Data Warehouse, which got me confused on number of occasion. As per my understanding & research, ODS is used to integrate data from multiple systems and feed it to Data Warehouse. Data Warehouse consist of complete history of data, whereas ODS contains latest or recent data (short window of data). Data load frequency in ODS is mostly hourly whereas data load frequency in Data Warehouse mostly is nightly because of data volume. Most important reason to have ODS in your company is ability to run report realtime, where source system does not have required reporting capabilities.

3. Data Mart

Data warehouse can contain many Data Marts. Mostly Data mart is created per business line or system that needs data from Data Warehouse. Indirectly we can say, Data Mart is access layer used to get data out of Data Warehouse by other systems.

4. Data Lake

Term Data Lake was coined by James Dixon, CTO of Pentaho to compare with Data Mart. As per James, Data Mart have several problems mostly related to data silos. Data Lake is method of storing data from sources in its actual or raw format that could be Relational Data, XML, flat files or even binary files. Other tools like ETL, access Data Lake as per need for reporting or analysis purposes.



Merge Two Data in ETL project of SSIS

Before reading this tutorial, please go through the first article on creating ETL project in Business Intelligent  (BI) tool of visual studio for SSIS.

For this tutorial we will need two data sources. I am taking one flat file and another excel file so that new user can understand that extraction can be done from different types of datasources.

Text file contains “CustomerId, Subscription Start Date, Subscription End Date” and Excel file Contains “CustomerId, Segment, CustomerIdSpace” columns.

Observe that both data sources contains one common column named “CustomerId”.  For a while don’t count the column “CustomerIdSpace” of Excel Data Source.

Create New Business Intelligent project in Visual Studio 2005 and Drag “Data Flow Task” from tools to package.

Double click on “Data Flow Task” and new tab will opened. Drag “Flat File Source” and “Excel source” from tool box.

Double click on “Flat File Source” and “Excel Source” and create new Connection for excel file and text file respectively.

Drag two “Sort” control from “Data Flow Transformation” section of the toolbox below both sources.

Double click on “Sort” control and select the column “CustomerId” for both “Sort” Control.

Now Drag the “Merge Join” control from tool box and drop green arrow from both “Sort” Control to the “Merge Join” control. Select the type of Join, in this case we have selected “inner join“, also select the columns which should be exported in Output .

Here, we are not going to write the result in file, but we will use “derived column” control after merge join and add “data Viewer” as discussed in previous article to view the output.

The final snapshot of the ETL package will look like:

Merge Two Data in ETL project of SSIS
Merge Two Data in ETL project of SSIS

Categorized as SSIS Tagged ,

Create Simple ETL Project in SSIS – Filter Records

In Previous article, we have seen the basics of SSIS. In this article, we will create our first basic application of filtering data from csv file.

As stated, Business intelligence tool is installed with SQL Server 2005 and best part is that it is integrated with Visual Studio 2005. So developer get the benefit of SSIS feature in Visual Studio power to work in disconnected environment.

In this article, we will read the csv file and get the required records and write back into another csv file.  csv file is attached in the source code in this article.

To start the project, click new Project in Visual Studio and select “Business Intelligence Projects” and then “Integration Services Project” as shown in below figure.

Create Business Intelligence Projects in Visual Studio
Create Business Intelligence Projects in Visual Studio
Categorized as SSIS Tagged