Export SAS Data Sets into operational database using Bulk Load

Export SAS Data Sets into operational database using Bulk Load

June 03, 2011 0 Comments

One of the use cases for the ground breaking DataDirect bulk load functionality is to load data into a data warehouse at lightning fast speeds.  After speaking with prospects and existing customers, the use case works in the reverse direction as well, i.e. organizations want to move big data from SAS data set libraries into an operational database for use with existing corporate applications and websites.

SAS bulk load to SQL Server from Unix/Linux

To illustrate, the SAS/Access ODBC interface queries your data warehouse to solve complex business problems, achieve key objectives and more effectively manage your information assets.  Users can then move the data back into the data warehouse using the SAS BULKLOAD=YES.  However, how can the solution data sets from SAS servers running on Unix/Linux be consumed by applications running against an operational SQL Server database?

Answer: Progress DataDirect SQL Server bulk load.

There are 3 easy ways to export big data from SAS using DataDirect Connect SQL Server ODBC drivers 6.1 out of the box without additional software to expand your investment in SAS intelligence.   And don't forget about our SQL Server Wire Protocol driver on Windows since it can load data with ease faster than native utilities such as BCP currently leveraged by SAS BULKLOAD.

1. Using SAS with codeless configuration:

a. On Unix/Linux, set EnableBulkLoad=1 connection property in the data source for supported drivers in the odbc.ini configuration.

b. On Windows, check "Enable Bulk Load" box in the "bulk" tab of the ODBC driver setup for supported drivers.

c. Insert data using SAS/Access with the generic ODBC interface.

The driver transparently converts ODBC parameter array inserts into the bulk API of the target data source without any application changes.  Keep reading if performance is not what you expect, or if you have additional requirements.

2. Windows using out-of-box bulk streaming sample:

Use bulk streaming  sample with the SAS ODBC driver as the source data source to leverage bulk load functionality with codeless configuration.

a. Setup SAS ODBC driver data source as "SAS_source", and target data source with a Connect for ODBC driver version 6.1 that supports bulk load.  For this example, we'll use the SQL Server Wire Protocol driver with data source as "SQLServer_destination"

b. Check EnableBulkLoad in Bulk tab of driver setup.

c. Run %INSTALL_DIR%samplesbulkstrmbulkstrm.exe and configure source connection string, SQL to generate result set, and destination connection string.

Export SAS Dataset to SQL Server

d. Click "Bulk Load Table".  The driver transparently transforms the source result set into the required format for the target bulk API of the target data source.

3. Unix/Linux using out-of-box bulk sample:

Export SAS data sets to CSV file, and use the bulk sample to load the file.  The bulkstrm sample exists on Unix/Linux, however the source data source using the SAS ODBC driver is only available for 32-bit Windows in version 9.23 and previous.

a. Export SAS data sets to CSV.  Here are 5 ways to do the export.

b. Create a bulk load configuration file for the CSV file.  See the chapter on the bulk load configuration file in the User's Guide for reference.  Below is an example:

"http://www.datadirect.com/ns/bulk/BulkData.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">            "false">INTEGERCOL           "true">VARBINCOL           externalfilecodepage="Windows-1252" nullable="true">VCHARCOL           externalfilecodepage="Windows-1252" nullable="true">UNIVCHARCOL

c. Launch the interactive bulk sample in $INSTALL_DIR/samples/bulk and pass a connection string.

d. Choose "Load"

e. Specify the file names and accept default values.

Bulk Load on Unix/Linux to SQL Server

And let's say you run into an error after loading billions of records after an hour.  With our bulk API, you can specify how many errors/warnings you can tolerate; and additionally replay the bulk insert at the point of failure!

Pretty cool, huh?  With DataDirect Bulk Load's unique standards based implementation, the above steps will work seamlessly against DB2, Sybase and Oracle as well. Download a free 15 day trial of the latest Connect for ODBC drivers today for Windows/Unix/Linux and show your colleagues today!!

Sumit Sakar

Sumit Sarkar

Sumit Sarkar is a Chief Data Evangelist at Progress, with over 10 years experience working in the data connectivity field. The world's leading consultant on open data standards connectivity with cloud data, Sumit's interests include performance tuning of the data access layer for which he has developed a patent pending technology for its analysis; business intelligence and data warehousing for SaaS platforms; and data connectivity for aPaaS environments, with a focus on standards such as ODBC, JDBC, ADO.NET and ODATA. He is an IBM Certified Consultant for IBM Cognos Business Intelligence and TDWI member. He has presented sessions on data connectivity at various conferences including Dreamforce, Oracle OpenWorld, Strata Hadoop, MongoDB World and SAP Analytics and Business Objects Conference, among many others. 

Comments are disabled in preview mode.
Latest Stories in
Your Inbox
More From Progress
OData: The Fastest Way to RESTify Your Databases
OData: The Fastest Way to RESTify Your Databases
Download Whitepaper
The Business Value of Kinvey
Download Whitepaper
Finding App ROI Through Time And Money
Watch Webinar