Hive 2.3.0 introduced a new feature called the JDBC Storage Handler. The idea is to use a generic JDBC driver to expose tables from external data sources as Hive tables, so that you can run HiveQL queries against data that resides in JDBC-accessible tables and join it with data from other systems exposed through the same storage handler.
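To make the idea concrete, here is a minimal HiveQL sketch of an external table backed by the JDBC storage handler, followed by a join against a regular Hive table. The host, database, credentials, and the table and column names (`customers_mysql`, `customers`, `web_orders`) are hypothetical placeholders. The DataDirect driver class and URL format are shown as we understand them, so check them against your driver documentation. Columns are declared as STRING to keep the sketch conservative.

```sql
-- Sketch only: host, database, credentials, and table/column names are hypothetical.
-- The JDBC driver jar must be available on Hive's classpath before running this.
CREATE EXTERNAL TABLE customers_mysql (
  id   STRING,
  name STRING,
  city STRING
)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
  "hive.sql.database.type" = "MYSQL",
  -- Assumed DataDirect MySQL driver class and URL format; verify for your driver version.
  "hive.sql.jdbc.driver"   = "com.ddtek.jdbc.mysql.MySQLDriver",
  "hive.sql.jdbc.url"      = "jdbc:datadirect:mysql://mysql-host:3306;DatabaseName=sales",
  "hive.sql.dbcp.username" = "hive_user",
  "hive.sql.dbcp.password" = "hive_password",
  -- The query whose result set backs the Hive table; its columns match the DDL above.
  "hive.sql.query"         = "SELECT id, name, city FROM customers"
);

-- Join the JDBC-backed table with a (hypothetical) native Hive table.
SELECT c.name, c.city, COUNT(*) AS order_count
FROM customers_mysql c
JOIN web_orders o ON o.customer_id = c.id
GROUP BY c.name, c.city;
```

Because the table is external, dropping it in Hive removes only the Hive metadata. The rows stay in MySQL and are fetched over JDBC when the table is queried.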
In this tutorial, we will walk you through how to use Hive’s native JDBC storage handler to query relational data in MySQL and SaaS data in Salesforce from Hive. We will be using the high-performance Progress DataDirect MySQL and Salesforce JDBC drivers to enable this.
The following software is required:
Install Progress DataDirect MySQL and Salesforce JDBC drivers
You can create external tables for tables in any data source using Progress DataDirect JDBC drivers and query that data from Hive through its native JDBC storage handler. We hope this tutorial helps you work with Hive’s native JDBC storage handler. Feel free to comment if you have any questions.