RapidMiner to Salesforce Using DataDirect Cloud JDBC

April 13, 2016
RapidMiner is a powerful analytics provider, but it becomes more powerful with Salesforce. Learn how to connect your data sources and use RapidMiner to the fullest.

RapidMiner is a leading predictive analytics platform. Its GUI makes it easy to visualize, transform, and analyze data, with a focus on data mining, text mining, and predictive analytics.

RapidMiner helps organizations include predictive analytics in any business process. Salesforce, which is now almost synonymous with SaaS, is a web-based CRM solution. It specializes in managing the sales cycle, which includes managing leads and customers, organizing marketing campaigns, analyzing performance, and tracking revenue.

Salesforce holds a lot of customer information that an organization will need in its data analysis. Our DataDirect Cloud connectivity service lets you bring a number of SaaS and on-premises sources into RapidMiner, which greatly extends the data RapidMiner can work with. The tutorial below shows how to integrate Salesforce data into RapidMiner.

Setting Up a DataDirect Cloud Salesforce Connection

Setting up a Salesforce Account

  1. If you do not have a Salesforce account, you can register for a free trial here.
  2. Reset your security token by navigating to:

    [Your Name] -> Setup -> Personal Information -> Reset Security Token

  3. Save the security token that is sent to your registered email address.

Setting up a DataDirect Cloud account

  1. If you do not already have a DataDirect Cloud account, you can register for a free trial here.
  2. Once you log in to the Progress Pacific dashboard, choose "Connect Data"

[Screenshot: DataDirect Connect Data]

Configuring Salesforce access from DataDirect Cloud

  1. The first step is to create a data source definition. Choose ‘Data Sources’ in the left pane, then click the ‘+New Data Source’ button.
  2. From the list of available sources, choose Salesforce as your cloud data source.
  3. In the next window as shown below, provide:
    1. Name: SalesforceDB (or a name of your choice)
    2. User Id and Password associated with your Salesforce account
    3. Security token of your account
    4. Salesforce Login URL: login.salesforce.com

    [Screenshot: Create Salesforce Data Source]

  4. Now, click on Test Connection. If it is successful, save this data source.

Your DataDirect Cloud account can now access your Salesforce data. For more information, refer to the following blog, which provides more details on Salesforce connectivity.

Setting Up a DataDirect Cloud RapidMiner Connection

Installing RapidMiner and DataDirect Cloud JDBC driver

  1. In your DataDirect Cloud account, find the JDBC driver on the downloads page. Choose the JDBC driver for your OS (Windows/Linux).
  2. Install this driver. You can find the documentation right below the drivers.
  3. Next, if you do not have RapidMiner installed, you can download it here. Install the version that your system supports.
  4. You can also get 14 days of access to RapidMiner Studio Professional by signing up for a trial version.

Configuring DataDirect Cloud JDBC driver in RapidMiner

  1. Navigate to the database drivers in the following path:
    Connections -> Manage Database Drivers -> Add
  2. In the window pane below, provide the following information:
    1. Name: D2C Salesforce Driver (or Your Preferred Name)
    2. URL prefix: jdbc:datadirect:ddcloud://
    3. Port: 443
    4. Schema separator:   ;
      (NOTE: the default / will not work)
    5. Jar File: Locate the DataDirect Cloud jar file under

[Screenshot: Configuring DataDirect Cloud JDBC driver in RapidMiner]

Configuring RapidMiner Database Connections

  1. Navigate to the New Connection page from either Connections -> Manage Database Connections -> New, or File -> Add Data -> From Database -> New Connection.
  2. On the New Connection page, provide the following information:
    1. Name: Salesforce D2C (Your Preferred name)
    2. Database System: D2C Salesforce Driver (Or the name you gave to your driver)
    3. Host: service.datadirectcloud.com
    4. Port: 443
    5. Database Scheme: databaseName=SalesforceDB  (replace SalesforceDB with the DB name you chose while configuring Salesforce data source in DataDirect Cloud)
    6. Provide the Username and Password for your DataDirect Cloud account

      [Screenshot: Configuring RapidMiner Database Connections]

  3. Test this connection. Once you have successfully established a connection with DataDirect Cloud, your Salesforce data is available for analysis in RapidMiner.
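
To make the driver and connection settings above concrete, here is a minimal Java sketch of how those fields plausibly combine into a single JDBC URL. The concatenation order (prefix, host, port, schema separator, database scheme) is an assumption about how RapidMiner assembles the URL, and the class and method names are hypothetical. The `connect` method only works when the DataDirect Cloud driver jar is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

/** Sketch: assemble the JDBC URL from the fields entered in RapidMiner. */
class D2cConnectionSketch {

    // Values taken from the driver and connection dialogs in this tutorial.
    static final String URL_PREFIX = "jdbc:datadirect:ddcloud://";
    static final String HOST = "service.datadirectcloud.com";
    static final int PORT = 443;
    static final String SCHEMA_SEPARATOR = ";"; // the default "/" will not work

    /** Joins prefix, host, port, separator, and database scheme (assumed order). */
    static String buildUrl(String databaseScheme) {
        return URL_PREFIX + HOST + ":" + PORT + SCHEMA_SEPARATOR + databaseScheme;
    }

    /**
     * Opens a connection with DataDirect Cloud credentials.
     * Requires the DataDirect Cloud JDBC driver jar on the classpath.
     */
    static Connection connect(String dataSourceName, String user, String password)
            throws SQLException {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        return DriverManager.getConnection(buildUrl("databaseName=" + dataSourceName), props);
    }

    public static void main(String[] args) {
        // Prints the URL for the data source name used in this tutorial.
        System.out.println(buildUrl("databaseName=SalesforceDB"));
    }
}
```

Seeing the full URL helps explain why the schema separator must be `;`: the `databaseName=...` part is passed as a driver property after the host and port rather than as a path segment.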

Analyzing Salesforce Data in RapidMiner

  1. Create a new process: File -> New Process.
  2. In the repository pane, navigate to the data you are looking for. In this example I will access Salesforce Account information: DB -> Salesforce D2C -> Example Sets -> SFORCE.ACCOUNT
  3. Click the table to view its contents. Note the BILLINGCITY and BILLINGSTATE values in Row No. 1.

    [Screenshot: Analyzing Salesforce Data in RapidMiner-1]

  4. Go back to the design view. Next, drag this table into the Process panel.
    Note: You may have to open the Process panel from View -> Show Panel -> Process
  5. You can rename the Retrieve operator if you would like.
  6. Next, type Set Data in the Operators panel. Drag the ‘Set Data’ operator into the Process panel.
  7. Connect the output of the Retrieve operator to this operator.
  8. Set the example index to 1, the attribute name to BILLINGCITY, and the value to Austin, as shown below:

    [Screenshot: Analyzing Salesforce Data in RapidMiner-2]

  9. Click Edit List (0) and add entries as shown below:

    [Screenshot: Analyzing Salesforce Data in RapidMiner-3]

  10. Connect the output of Set Data to the res button at the right corner of the Process Panel. The design will look as shown below:

    [Screenshot: Analyzing Salesforce Data in RapidMiner-4]

  11. Click the play button and you will see that the data is modified in the example table as shown below:

    [Screenshot: Analyzing Salesforce Data in RapidMiner-5]
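
For readers who think in code, the effect of the Set Data step can be sketched in plain Java. This is only an in-memory illustration, not RapidMiner's implementation: the class name, the method, and the sample rows are hypothetical, and it assumes the example index is 1-based, matching "Row No. 1" above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: overwrite attribute values of one example, similar in spirit to Set Data. */
class SetDataSketch {

    /**
     * Returns a copy of the rows in which the example at the 1-based
     * exampleIndex has the given attribute values overwritten.
     */
    static List<Map<String, String>> setData(List<Map<String, String>> rows,
                                             int exampleIndex,
                                             Map<String, String> newValues) {
        if (exampleIndex < 1 || exampleIndex > rows.size()) {
            throw new IndexOutOfBoundsException("No example at index " + exampleIndex);
        }
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> row : rows) {
            result.add(new LinkedHashMap<>(row)); // copy so the input stays untouched
        }
        result.get(exampleIndex - 1).putAll(newValues);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows; real values would come from SFORCE.ACCOUNT.
        List<Map<String, String>> rows = new ArrayList<>();
        Map<String, String> first = new LinkedHashMap<>();
        first.put("BILLINGCITY", "Sample City A");
        rows.add(first);
        Map<String, String> second = new LinkedHashMap<>();
        second.put("BILLINGCITY", "Sample City B");
        rows.add(second);

        // Same settings as step 8: example index 1, BILLINGCITY set to Austin.
        List<Map<String, String>> updated =
                setData(rows, 1, Collections.singletonMap("BILLINGCITY", "Austin"));
        System.out.println(updated.get(0).get("BILLINGCITY"));
    }
}
```

The sketch returns a modified copy and leaves its input unchanged. Additional attribute values for the same row would go into the same map.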

This example relies on the DataDirect Cloud connectivity service. Whether you are connecting to other SaaS sources or to an on-premises data source behind a firewall, DataDirect Cloud lets you do it. The connectivity service currently supports more than 50 data sources, including SaaS and cloud sources, relational databases, and big data sources. You can connect any of those sources to RapidMiner without changing any application code. For more information, please get in touch with one of our experts.


Nishanth Kadiyala

Nishanth Kadiyala is a Technical Marketing Manager at Progress. He received his B.Tech degree from IIT Guwahati and his MBA from UNC Chapel Hill. He has worked with several technologies, including database design, SQL querying, and cloud computing. Currently, he is committed to educating enterprises about standards-based connectivity via ODBC, JDBC, ADO.NET, and OData. He is also proficient with DataDirect Hybrid Connectivity Services: DataDirect Cloud and Hybrid Data Pipeline. You can stay in touch with him through Twitter.
