Home Services Partners Company
Instant Oracle Big Data apps with Database Gateway for ODBC to Hadoop Hive

Instant Oracle Big Data apps with Database Gateway for ODBC to Hadoop Hive

April 11, 2013 0 Comments

One challenge writing about big data is that content can get outdated before you even finish the first sentence!  Thanks to the Oracle Database Gateway for ODBC and the DataDirect Apache Hadoop Hive ODBC driver, I am confident this article is still a great read.  This connectivity shifts "data gravity" back to your Oracle applications by federating data from Hadoop Hive.  A common use case from Oracle shops has been to perform lookups against historical transactional data in Hadoop making it no longer necessary to schedule on demand load jobs to physically move the data.

Get started today

1. Assume Oracle Database which includes the Database Gateway for ODBC (DG4ODBC) or standalone DG4ODBC is installed.

2. Download and test-connect the DataDirect ConnectXE for ODBC Apache Hadoop Hive driver on Linux.

3. Configure Database Gateway for ODBC with the exceptions noted below in the init<sid>.ora file for Hadoop Hive.

Oracle Gateway to Hadoop configuration files from video for copy/paste reference

$ORACLE_HOME/network/admin/tnsames.ora

CLOUDERA =
 (description =
 (address = (protocol = tcp)(host = localhost)(port = 1521))
 (connect_data =
 (sid = Cloudera)
 )
 (hs = ok)
 )

$ORACLE_HOME/network/admin/Listener.ora

(SID_DESC =
 (SID_NAME=Cloudera)
 (ORACLE_HOME=/u01/app/oracle/product/11.2.0/xe)
 (PROGRAM=dg4odbc)
 (ENV=LD_LIBRARY_PATH=/opt/Progress/DataDirect/Connect64_for_ODBC_71/lib)
 )

$ORACLE_HOME/hs/admin/init<sid>.ora

#
# HS init parameters
#
HS_FDS_CONNECT_INFO = Cloudera
HS_FDS_TRACE_LEVEL = OFF
HS_FDS_SHAREABLE_NAME = /opt/Progress/DataDirect/Connect64_for_ODBC_71/lib/libodbc.so
#Properties below are for DataDirect Hadoop Hive ODBC Gateway
HS_FDS_QUOTE_IDENTIFIER="`"
HS_FDS_TRANSACTION_MODEL=READ_ONLY
#
# ODBC specific environment variables
#
set ODBCINI=/opt/Progress/DataDirect/Connect64_for_ODBC_71/odbc.ini
#
# Environment variables required for the non-Oracle system
#
set LD_LIBRARY_PATH=/opt/Progress/DataDirect/Connect64_for_ODBC_71/lib

/opt/Progress/DataDirect/Connect64_for_ODBC_71/odbc.ini

[Cloudera]
Driver=/opt/Progress/DataDirect/Connect64_for_ODBC_71/lib/ddhive27.so
Description=DataDirect 7.1 Apache Hive Wire Protocol
ArraySize=16384
Database=default
DefaultLongDataBuffLen=1024
EnableDescribeParam=0
HostName=nc-lnx84.datadirect.com
LoginTimeout=30
LogonID=
MaxVarcharSize=2147483647
Password=
PortNumber=9321
RemoveColumnQualifiers=0
StringDescribeType=12
TransactionMode=0
UseCurrentSchema=0
WireProtocolVersion=2

Login to SQLPlus to create database link:

SQL> create database link bigdata connect to "test01" identified by "test01" using 'Cloudera';
Database link created.
Sumit Sakar

Sumit Sarkar

Sumit Sarkar is a Chief Data Evangelist at Progress, with over 10 years experience working in the data connectivity field. The world's leading consultant on open data standards connectivity with cloud data, Sumit's interests include performance tuning of the data access layer for which he has developed a patent pending technology for its analysis; business intelligence and data warehousing for SaaS platforms; and data connectivity for aPaaS environments, with a focus on standards such as ODBC, JDBC, ADO.NET and ODATA. He is an IBM Certified Consultant for IBM Cognos Business Intelligence and TDWI member. He has presented sessions on data connectivity at various conferences including Dreamforce, Oracle OpenWorld, Strata Hadoop, MongoDB World and SAP Analytics and Business Objects Conference, among many others. 

Comments
Comments are disabled in preview mode.