Connect to Amazon Redshift using ODBC from Python on Linux

Introduction

In this tutorial, we will walk through how to connect to Amazon Redshift from Python on a Linux machine.

Install unixODBC

1.       Install the unixODBC package by running the below command.

a.       For CentOS:

sudo yum install unixODBC-devel unixODBC

b.       For Ubuntu/Debian:

sudo apt-get install unixODBC-dev unixODBC

2.       After installing unixODBC, copy the contents of /home/<user>/Progress/DataDirect/ODBC_80_64bit/odbcinst.ini and paste them in the file at /etc/odbcinst.ini

Install DataDirect Amazon Redshift ODBC driver

1.       Download the Progress DataDirect ODBC Driver for Amazon Redshift.

2.       Extract the package by running the below command

tar -xvf PROGRESS_DATADIRECT_ODBC_REDSHIFT_LINUX_64.tgz

3.       Install the driver by running the executing the bin file

./ PROGRESS_DATADIRECT_ODBC_8.0_LINUX_64_INSTALL.bin

4.       After installation, go to the install folder and run the shell script odbc.sh or odbc.csh to install required environment variables.

5.       This will set three environment variables. Verify if these are present before going further.

[progress@centos7264 ODBC_80_64bit]$ echo $LD_LIBRARY_PATH && echo $ODBCINI && echo $ODBCINST
/home/progress/Progress/DataDirect/ODBC_80_64bit/lib:/home/progress/Progress/DataDirect/ODBC_80_64bit/jre/lib/amd64/server
/home/progress/Progress/DataDirect/ODBC_80_64bit/odbc.ini
/home/progress/Progress/DataDirect/ODBC_80_64bit/odbcinst.ini

 

 

Connect to Redshift from Python

1.       To access your Redshift from Python, you must install pyodbc package. Install it by running the below command

pip install pyodbc

2.       Now use a sample Python program like below to access your data from Redshift

import pyodbc
 
conn = pyodbc.connect('Driver={DataDirect 8.0 Amazon Redshift Wire Protocol}; HostName=redshift-cluster-1.cy1mp8nn6ntk.us-west-2.redshift.amazonaws.com; Database=dev; UID=awsuser; PWD=Galaxy472; Port=5439')
 
cursor = conn.cursor()
 
## Create Tables
cursor.execute("CREATE TABLE Track ( TrackId INT NOT NULL, Name VARCHAR(200) NOT NULL, AlbumId INT, MediaTypeId INT NOT NULL, GenreId INT, Composer VARCHAR(220), Milliseconds INT NOT NULL, Bytes INT, UnitPrice NUMERIC(10,2) NOT NULL);")
 
cursor.execute("INSERT INTO Track (TrackId, Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, UnitPrice) VALUES (1, 'For Those About To Rock (We Salute You)', 1, 1, 1, 'Angus Young, Malcolm Young, Brian Johnson', 343719, 11170334, 0.99);")
 
conn.commit()
 
##Access Data using SQL
cursor.execute("select * from Track")
while True:
    row = cursor.fetchone()
    if not row:
        break
    print(row)
 
##Access Data using SQL
cursor.execute("select * from Artist")
while True:
    row = cursor.fetchone()
    if not row:
        break
    print(row)

Feel free to download the Progress DataDirect ODBC Driver for Amazon RedShift and try it out. If you have any questions please contact us and we will be happy to help you.

ODBC TUTORIAL

Connect to Amazon Redshift using ODBC from Python on Linux

View all Tutorials

Connect any application to any data source anywhere

A product specialist will be glad to get in
touch with you