It’s now easier than ever to work with external data in Apache Hive. Learn how you can quickly connect Hive to Salesforce with Progress DataDirect.
With the inclusion of the JDBC Storage Handler, Hive now makes it easier to access and query your data from external sources. In this tutorial, we’ll walk through the steps of connecting Hive to an external Salesforce instance using a Progress DataDirect JDBC connector.
Apache Hive is one of the most popular open source data warehouses in use today. Built to run on Hadoop at big-data scale and sporting a user-friendly, SQL-like query interface, Hive is a fantastic resource for managing and analyzing large datasets. Hive’s earliest users include Facebook, Netflix and Amazon. If it can handle the amount of data these companies are creating, it will likely handle yours as well.
Starting with version 2.3, Hive introduced a powerful new feature called the JDBC Storage Handler. It lets you connect to and query any data source for which you have a JDBC connector. This is immensely helpful, because you will invariably need to manage and analyze more than just what resides in your data warehouse. And while Hive has always had some limited capability to handle external (as opposed to managed) data, this upgrade makes doing so easier and more seamless.
It’s great to talk about this new product feature, but it’s better to actually start working with it! My colleague Saikrishna Bobba has assembled instructions to get you up and running quickly. In this example, he’s going to walk you through the steps of connecting Apache Hive to your Salesforce instance using the Progress DataDirect Salesforce JDBC Connector.
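As a rough preview of what a JDBC-backed table definition can look like, here is a minimal Java sketch that assembles the Hive DDL as a string. It is not taken from the tutorial. The handler class and the hive.sql.* property names follow the Hive JDBC Storage Handler documentation as we understand it, while the driver class, connection URL, database type value, credentials, table name and columns are illustrative assumptions that you should check against your own Hive version and DataDirect driver documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Assembles a CREATE EXTERNAL TABLE statement for a JDBC-backed Hive table. */
class HiveJdbcTableDdl {

    // Storage handler class name used in the STORED BY clause.
    static final String HANDLER = "org.apache.hive.storage.jdbc.JdbcStorageHandler";

    /** Wraps a value in double quotes, escaping backslashes and double quotes. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds the DDL from a table name, column name/type pairs and table properties. */
    static String build(String table, Map<String, String> columns, Map<String, String> props) {
        StringJoiner cols = new StringJoiner(", ");
        columns.forEach((name, type) -> cols.add(name + " " + type));

        StringJoiner tblProps = new StringJoiner(",\n  ");
        props.forEach((key, value) -> tblProps.add(quote(key) + " = " + quote(value)));

        return "CREATE EXTERNAL TABLE " + table + " (" + cols + ")\n"
                + "STORED BY '" + HANDLER + "'\n"
                + "TBLPROPERTIES (\n  " + tblProps + "\n)";
    }

    public static void main(String[] args) {
        // Hypothetical columns for a Salesforce Account object.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "string");
        columns.put("name", "string");

        // All values below are placeholders (assumptions), not tested settings.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hive.sql.database.type", "MYSQL"); // placeholder type value
        props.put("hive.sql.jdbc.driver", "com.ddtek.jdbc.sforce.SForceDriver");
        props.put("hive.sql.jdbc.url", "jdbc:datadirect:sforce://login.salesforce.com");
        props.put("hive.sql.dbcp.username", "your-user@example.com");
        props.put("hive.sql.dbcp.password", "your-password-and-token");
        props.put("hive.sql.query", "SELECT Id, Name FROM Account");

        System.out.println(build("salesforce_account", columns, props));
    }
}
```

Running the class prints a statement you could adapt and paste into a Hive session once the driver jar is available to Hive. Building the DDL in code keeps the connection settings in one place, though in practice you would avoid hard-coding credentials.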
Once you’ve walked through it, you’ll be able to use this process to connect Hive to any external source for which you have a JDBC connector. Get started today with a free trial download of our DataDirect JDBC drivers and see what data you can bring into Apache Hive!
James Goodfellow is a Senior Product Marketing Manager at Progress and focuses his efforts on the DataDirect suite of solutions. Through his tenure at companies like Progress and SAS, he has spent the bulk of his career launching successful marketing campaigns for data and analytics products. James blogs here and around the web on topics such as data connectivity, analytics, IoT, visualization and machine learning. You can follow him on Twitter at @jcgoodfellow.
Copyright © 2018 Progress Software Corporation and/or its subsidiaries or affiliates.
All Rights Reserved.