Idaliz Baez presents three data integration demos: Apache Sqoop over JDBC, Spark SQL over ODBC, and Salesforce data in Spark DataFrames.
Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. This project successfully graduated from the Incubator in March of 2012 and is now a top-level Apache project.
In this first demo, I will show you how to ingest external data into Hadoop using Apache Sqoop and the DataDirect JDBC drivers.
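As a rough illustration of what such an ingest can look like, here is a minimal sketch that assembles a `sqoop import` command line for a generic JDBC driver. The flags (`--connect`, `--driver`, `--username`, `--password-file`, `--table`, `--target-dir`, `--num-mappers`) are standard Sqoop import options. The host, database, table, paths, driver class name, and the helper `build_sqoop_import` are placeholders assumed for this example, not values taken from the demo; check the connection URL and driver class against your driver's documentation.

```python
import shlex


def build_sqoop_import(jdbc_url, driver_class, username, password_file,
                       table, target_dir, num_mappers=1):
    """Return the argument list for a `sqoop import` run (hypothetical helper)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,            # JDBC URL of the source database
        "--driver", driver_class,         # JDBC driver class to load
        "--username", username,
        "--password-file", password_file, # keeps the password off the command line
        "--table", table,                 # source table to import
        "--target-dir", target_dir,       # HDFS directory for the imported files
        "--num-mappers", str(num_mappers),
    ]


# All values below are assumed placeholders for illustration only.
argv = build_sqoop_import(
    jdbc_url="jdbc:datadirect:sqlserver://dbhost:1433;DatabaseName=sales",
    driver_class="com.ddtek.jdbc.sqlserver.SQLServerDriver",
    username="etl_user",
    password_file="/user/etl/.db_password",
    table="CUSTOMERS",
    target_dir="/data/raw/customers",
    num_mappers=4,
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(argv))
```

Building the command as a list rather than a single string avoids quoting problems with the semicolons that often appear in JDBC URLs. The list can be passed directly to `subprocess.run` on a machine where Sqoop is installed.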
Although Apache Sqoop has a certain level of out-of-the-box support, this support is very limited. Here are some of the key features you are missing out on if you stick with out-of-the-box capabilities:
In the second demo, I’m going to walk you through accessing Hadoop data through Spark SQL with any of your beloved BI/Reporting applications.
To conclude the demos, I will show you how to access data for Spark across relational, cloud, SaaS and NoSQL data sources utilizing JDBC connectivity.
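For readers who want a feel for the JDBC path into Spark, the sketch below prepares the options for Spark's built-in `jdbc` data source and loads them into a DataFrame. The option keys (`url`, `driver`, `dbtable`, `user`, `password`, `fetchsize`) are standard Spark JDBC reader options. The Salesforce URL, driver class, table name, credentials, and both helper functions are assumptions made for this example and should be verified against your driver's documentation.

```python
def jdbc_read_options(url, driver, table, user, password, fetch_size=1000):
    """Build the option map for Spark's `jdbc` data source (hypothetical helper)."""
    return {
        "url": url,                    # JDBC connection URL
        "driver": driver,              # JDBC driver class on the Spark classpath
        "dbtable": table,              # table (or subquery) to expose as a DataFrame
        "user": user,
        "password": password,
        "fetchsize": str(fetch_size),  # rows fetched per round trip
    }


def load_dataframe(spark, options):
    """Read a DataFrame over JDBC, given an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# All values below are assumed placeholders for illustration only.
opts = jdbc_read_options(
    url="jdbc:datadirect:sforce://login.salesforce.com",
    driver="com.ddtek.jdbc.sforce.SForceDriver",
    table="SFORCE.ACCOUNT",
    user="user@example.com",
    password="change-me",
)

# With PySpark installed and the driver jar on the classpath, usage would be:
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("jdbc-demo").getOrCreate()
#   accounts = load_dataframe(spark, opts)
#   accounts.show(5)
```

Because only the URL and driver class change between sources, the same two helpers can in principle be pointed at relational, cloud, SaaS, or NoSQL sources that expose a JDBC driver.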
In these demos, we covered JDBC connectivity for Apache Sqoop, ODBC connectivity to Spark SQL, and Salesforce data in Spark DataFrames. If you would like to test any of these solutions for yourself, we offer free trials for each of them. Get started with high-performance data connectivity today!
Idaliz Baez is a Sales Engineer with Progress. After receiving her undergraduate degree in Civil and Environmental Engineering from Duke University, she spent a year at NASA Goddard Space Flight Center gaining on-the-job experience before returning to Duke to pursue her Master of Engineering Management degree.