Import data from Salesforce and others to Apache Kylin using JDBC

Introduction

Apache Kylin is an open source distributed analytics engine that provides OLAP on massive data sets in Hadoop/Spark and lets you query them in less than a second. Beyond Hadoop/Spark, Kylin can also connect to data in RDBMS sources such as SQL Server, MySQL, and Postgres using JDBC drivers.

Kylin is not limited to RDBMS sources. Using Progress DataDirect JDBC Connectors, you can connect it to a variety of data sources, including SaaS sources like Salesforce, JIRA, Oracle Eloqua, and Google Analytics, and RDBMS sources like SQL Server, DB2, MySQL, Postgres, and OpenEdge. In this tutorial, we show how to connect to Salesforce and sync its tables, but the same steps apply to any Progress DataDirect JDBC Connector.

Install Progress DataDirect Salesforce JDBC Driver

  1. Download the Progress DataDirect Salesforce JDBC driver.
  2. To install the driver, run the .jar installer package, either by executing the following command in a terminal or by double-clicking the jar file.
    java -jar PROGRESS_DATADIRECT_JDBC_SF_ALL.jar
  3. This launches an interactive Java installer, which you can use to install the Salesforce JDBC driver to your desired location as either a licensed or an evaluation installation.

Configure Salesforce connectivity for Kylin

  1. Kylin uses Sqoop to load the data into HDFS, so you need to copy the DataDirect Salesforce JDBC driver you just installed to two locations:
    1. $SQOOP_HOME/lib
    2. $KYLIN_HOME/ext
  2. Next, configure the Salesforce connection in the kylin.properties file. Open $KYLIN_HOME/conf/kylin.properties and add the following configuration:

    kylin.source.default=8
    kylin.source.jdbc.connection-url=jdbc:datadirect:sforce://login.salesforce.com;TransactionMode=ignore
    kylin.source.jdbc.driver=com.ddtek.jdbc.sforce.SForceDriver
    kylin.source.jdbc.dialect=default
    kylin.source.jdbc.user=<username>
    kylin.source.jdbc.pass=<password>
    kylin.source.jdbc.sqoop-home=<Your SQOOP_HOME directory>
    kylin.source.jdbc.filed-delimiter=|
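
The two configuration steps above can be scripted. The following is a minimal sketch, not an official procedure: the function names are hypothetical, the jar path you pass in depends on where the installer placed the driver, and creating the target directories when they are missing is an assumption about what you want.

```shell
# Sketch only: function names are illustrative and not part of Kylin,
# Sqoop, or DataDirect.

# Copy the driver jar into the two directories named in step 1.
# Assumes SQOOP_HOME and KYLIN_HOME are already exported.
install_driver_jar() {
  local jar="$1"
  if [ ! -f "$jar" ]; then
    echo "driver jar not found: $jar" >&2
    return 1
  fi
  if [ -z "${SQOOP_HOME:-}" ] || [ -z "${KYLIN_HOME:-}" ]; then
    echo "SQOOP_HOME and KYLIN_HOME must be set" >&2
    return 1
  fi
  # Assumption: create the directories if they do not exist yet.
  mkdir -p "$SQOOP_HOME/lib" "$KYLIN_HOME/ext"
  cp "$jar" "$SQOOP_HOME/lib/" && cp "$jar" "$KYLIN_HOME/ext/"
}

# Print every JDBC source key from step 2 that has no "key=" line
# in the given properties file. Prints nothing when all are present.
missing_kylin_jdbc_keys() {
  local props="$1" key
  for key in kylin.source.default \
             kylin.source.jdbc.connection-url \
             kylin.source.jdbc.driver \
             kylin.source.jdbc.dialect \
             kylin.source.jdbc.user \
             kylin.source.jdbc.pass \
             kylin.source.jdbc.sqoop-home \
             kylin.source.jdbc.filed-delimiter; do
    grep -q "^${key}=" "$props" || echo "$key"
  done
}
```

For example, you might call `install_driver_jar /path/to/driver.jar` (a placeholder path) and then `missing_kylin_jdbc_keys "$KYLIN_HOME/conf/kylin.properties"`. An empty result from the second call suggests every key listed above is present; it does not validate the values.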

 

Sync tables from Salesforce

  1. Go to the Kylin Web UI at http://localhost:7070 and open the Model tab. Under the Model tab, open the Data sources tab and click the “Load Table from Tree” icon.
  2. You should now see a pop-up listing the schemas and tables that represent the data in your Salesforce instance.


    [Screenshot: Table select]

  3. Choose the tables you want and click Sync. You are now ready to build models and create OLAP cubes, so that you can query your Salesforce data with low latency from your BI tools using Kylin.
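
If you prefer to confirm the sync from a terminal rather than the Web UI, a sketch along these lines may help. It assumes Kylin's REST API exposes a table listing at /kylin/api/tables with a project parameter, and that the stock ADMIN:KYLIN credentials are still in place; both are assumptions to check against your own installation, and the function names and the KYLIN_AUTH variable are hypothetical.

```shell
# Sketch only: the endpoint path, default credentials, and function
# names are assumptions, not details taken from this tutorial.

# Build the table-listing URL for a project, tolerating a trailing
# slash on the base URL.
kylin_tables_url() {
  local base="${1%/}" project="$2"
  printf '%s/kylin/api/tables?project=%s\n' "$base" "$project"
}

# Fetch the table list; curl exits non-zero on HTTP errors.
# Override the credentials with KYLIN_AUTH=user:password.
list_synced_tables() {
  curl --silent --show-error --fail \
    --user "${KYLIN_AUTH:-ADMIN:KYLIN}" \
    "$(kylin_tables_url "$1" "$2")"
}
```

A call might look like `list_synced_tables http://localhost:7070 my_project`, where `my_project` is a placeholder for the project you synced the tables into.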

Feel free to try the DataDirect Salesforce JDBC driver to bring your Salesforce data into Apache Kylin for faster querying, or other Progress DataDirect JDBC drivers as your needs require. Let us know if you have any questions or issues; we will be happy to help.

