Progress DataDirect Hybrid Data Pipeline is a data access server that provides simple, secure access to cloud and on-premises data sources, such as RDBMS, Big Data, and NoSQL. Hybrid Data Pipeline allows business intelligence tools and applications to use ODBC, JDBC, or OData to access data from supported data sources. Hybrid Data Pipeline can be installed in the cloud or behind a firewall and can then be configured to work with applications and data sources in nearly any business environment. Progress DataDirect Hybrid Data Pipeline consists of four primary, separately installed components.
- The Hybrid Data Pipeline server provides access to multiple data sources through a single, unified interface. The server can be hosted on premises or in the cloud.
- The On-Premises Connector enables Hybrid Data Pipeline to establish a secure connection from the cloud to an on-premises data source.
- The ODBC driver enables ODBC applications to communicate with a data source through the Hybrid Data Pipeline server.
- The JDBC driver enables JDBC applications to communicate with a data source through the Hybrid Data Pipeline server.
4.3.0 Release Notes
Enhancements
Security
LDAP authentication
Hybrid Data Pipeline now supports integration with Active Directory for user authentication over the LDAP protocol. Administrators can define an LDAP authentication configuration by supplying the server details and can then configure users to authenticate through LDAP instead of the default internal authentication.
To get started with LDAP authentication, take the following steps:
- Create an authentication service of type 3 using the Authentication APIs. Once the authentication service has been created, note the authentication service ID.
- Create users tagged to the authentication service ID. There are several different ways of creating users. Refer to the User guide for details.
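The following is a minimal sketch of what the first step might look like as a call to the Authentication APIs from a Java 11+ client. The host name, resource path (/api/admin/auth/services), payload attribute names, and credentials are assumptions for illustration only; consult the Authentication API reference for the actual endpoint and request body.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateLdapAuthService {
    public static void main(String[] args) throws Exception {
        // Hypothetical payload: attribute names are assumptions; only the
        // service type (3 = LDAP) comes from the release notes above.
        String body = "{"
                + "\"name\": \"corpLdap\","
                + "\"authDefinitionType\": 3,"
                + "\"attributes\": {"
                + "\"targetUrl\": \"ldap://ldap.example.com:389\","
                + "\"securityAuthentication\": \"simple\""
                + "}}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://hdp.example.com/api/admin/auth/services")) // assumed path
                .header("Content-Type", "application/json")
                .header("Authorization", "Basic ZDJjYWRtaW46c2VjcmV0") // example admin credentials
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The response is expected to contain the authentication service ID,
        // which is then referenced when creating users tied to this service.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Users created afterward would reference the returned service ID so that their logins are validated against LDAP rather than the internal account store.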
Permissions
• Support for a permissions API has been added. The Permissions API enables administrators to manage permissions through the Users, Roles, and DataSource APIs. In addition, the Permissions API allows administrators to create data sources on behalf of users and manage end user access to data source details. Administrators can also specify whether change password functionality and the SQL editor are exposed in the Web UI.
Password policy
• Support for a password policy has been added.
Tomcat Upgrade
• The Hybrid Data Pipeline server and On-Premises Connector have been upgraded to install and use Tomcat 8.5.28.
Hybrid Data Pipeline Server
New response file options
GUI | Console | Definition
D2C_ADMIN_PASSWORD | D2C_ADMIN_PASSWORD_CONSOLE | Specifies the password for the default administrator.
D2C_USER_PASSWORD | D2C_USER_PASSWORD_CONSOLE | Specifies the password for the default user.
Web UI
• Product Information. When you are using the evaluation version of the product, the Web UI now displays evaluation timeout information as 'xx Days Remaining'.
• Version Information. The product version information now includes details about the license type. This can be seen under the version information section of the UI. The license type is also returned when you query for version information via the version API.
Beta support for third party JDBC drivers
• With the 4.3 release, Hybrid Data Pipeline enables users to plug third party JDBC drivers into Hybrid Data Pipeline and access data using those drivers. This beta feature supports access via JDBC, ODBC, and OData clients with the Teradata JDBC driver. If you are interested in setting up this feature as you evaluate Hybrid Data Pipeline, please contact our sales department.
Apache Hive
• Enhancements
• Enhanced to optimize the performance of fetches.
• Enhanced to support the Binary, Char, Date, Decimal, and Varchar data types.
• Enhanced to support HTTP mode, which allows you to access Apache Hive data sources using HTTP/HTTPS requests. HTTP mode can be configured using the new Transport Mode and HTTP Path parameters.
• Enhanced to support cookie-based authentication for HTTP connections. Cookie-based authentication can be configured using the new Enable Cookie Authentication and Cookie Name parameters.
• Enhanced to support Apache Knox.
• Enhanced to support Impersonation and Trusted Impersonation using the Impersonate User parameter.
• The Batch Mechanism parameter has been added. When Batch Mechanism is set to multiRowInsert, the driver executes a single insert for all the rows contained in a parameter array. MultiRowInsert is the default setting and provides substantial performance gains when performing batch inserts (see the sketch following this list).
• The Catalog Mode parameter allows you to determine whether the native catalog functions are used to retrieve information returned by DatabaseMetaData functions. In the default setting, Hybrid Data Pipeline employs a balance of native functions and driver-discovered information for the optimal balance of performance and accuracy when retrieving catalog information.
• The Array Fetch Size parameter improves performance and reduces out of memory errors. Array Fetch Size can be used to increase throughput or, alternately, improve response time in Web-based applications.
• The Array Insert Size parameter provides a workaround for memory and server issues that can sometimes occur when inserting a large number of rows that contain large values.
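As a point of reference for the Batch Mechanism enhancement above, the following sketch shows a standard JDBC batch insert against a Hive data source exposed through Hybrid Data Pipeline. The connection URL syntax, host, data source name, and credentials are assumptions for illustration; with Batch Mechanism left at its multiRowInsert default, the batched rows are expected to be sent as a single insert.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class HiveBatchInsertSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical Hybrid Data Pipeline JDBC URL; see d2cjdbcreadme.txt for
        // the actual URL syntax and connection properties.
        String url = "jdbc:datadirect:ddhybrid://hdp.example.com:8080;"
                   + "hybridDataPipelineDataSource=MyHiveSource";

        try (Connection con = DriverManager.getConnection(url, "d2cuser", "secret");
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO sales (id, region, amount) VALUES (?, ?, ?)")) {
            for (int i = 1; i <= 1000; i++) {
                ps.setInt(1, i);
                ps.setString(2, "EMEA");
                ps.setInt(3, i * 10);
                ps.addBatch(); // rows accumulate in the parameter array
            }
            // With Batch Mechanism at its multiRowInsert default, the batch is
            // expected to be executed as a single multi-row insert.
            ps.executeBatch();
        }
    }
}
```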
• Certifications
• Certified with Hive 2.0.x, 2.1.x
• Apache Hive data store connectivity has been certified with the following distributions:
• Cloudera (CDH) 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 5.10, 5.11, 5.12
• Hortonworks (HDP) 2.3, 2.4, 2.5
• IBM BigInsights 4.1, 4.2, 4.3
• MapR 5.2
Version and distribution support
• Hive versions 1.0 and higher are supported. Support for earlier versions has been deprecated.
• The HiveServer2 protocol and higher is supported. As a result:
• Support for the HiveServer1 protocol has been deprecated.
• The Wire Protocol Version parameter has been deprecated.
• Support has been deprecated for the following distributions:
• Amazon Elastic MapReduce (Amazon EMR) 2.1.4, 2.24-3.1.4, 3.2-3.7
• Cloudera's Distribution Including Apache Hadoop (CDH) 4.0, 4.1, 4.2, 4.5, 5.0, 5.1, 5.2, 5.3
• Hortonworks (HDP), versions 1.3, 2.0, 2.1, 2.2
• IBM BigInsights 3.0
• MapR Distribution for Apache Hadoop 1.2, 2.0
• Pivotal Enterprise HD 2.0.1, 2.1
IBM DB2
• Certifications
• Certified with DB2 V12 for z/OS
• Certified with dashDB (IBM Db2 Warehouse on Cloud)
Oracle Marketing Cloud (Oracle Eloqua)
• Data type support. The following data types are supported for the Oracle Eloqua data store.
• BOOLEAN
• DECIMAL
• INTEGER
• LONG
• LONGSTRING
• STRING
Oracle Sales Cloud
• Data type support. The following data types are supported for the Oracle Sales Cloud data store.
• ARRAY
• BOOLEAN
• DATETIME
• DECIMAL
• DURATION
• INTEGER
• LARGETEXT
• LONG
• TEXT
• URL
Known Issues
FIPS compliance with the On-Premises Connector
• The On-Premises Connector is not currently FIPS compliant. Therefore, any connections made to an on-premises data source through an On-Premises Connector will not be fully FIPS compliant.
Performing a silent installation - Log file issue
• When performing a silent installation, if the deployment script fails, no 'SilentInstallError.log' file is written. You can check '<installation directory>/ddcloud/final.log' to determine the installation status.
The use of wildcards in SSL server certificates
- The Hybrid Data Pipeline service will not by default connect to a backend data store that has been configured for SSL when a wildcard is used to identify the server name in the SSL certificate. If a server certificate contains a wildcard, the following error will be returned.
There is a problem connecting to the DataSource. SSL handshake failed:
sun.security.validator.ValidatorException: PKIX path building failed:
sun.security.provider.certpath.SunCertPathBuilderException: unable to find
valid certification path to requested target
To work around this issue, the exact string (with wildcard) in the server certificate can be specified with the Host Name in Certificate option when configuring your data source through the Hybrid Data Pipeline user interface or management API.
Load balancer port limitation
- Either port 80 for non-SSL environments, or port 443 for SSL environments, must be used in the configuration of a load balancer used to support a Hybrid Data Pipeline cluster. Non-standard ports in the configuration of a load balancer are not currently supported.
Web UI
- When a data source is configured with OData Version 4, the OData Schema Map version is 'odata_mapping_v3', and the map does not contain an "entityNameMode" property, any further editing of the OData schema map adds "entityNameMode":"pluralize". This affects how entity names are referred to in OData queries. To avoid this, set the entityNameMode to the preferred mode whenever a data source is created or edited. Alternatively, if you want to use the default "Guess" mode, remove the "entityNameMode" property from the OData schema map JSON when saving the data source.
- If a Hybrid Data Pipeline administrator creates a user with a password that contains a percentage mark (%), the new user may face issues while trying to log in. In addition, Hybrid Data Pipeline functionality may not work as expected.
- When an administrator tries to add new users using the Add Users window, the Password and Confirm Password fields occasionally do not appear properly in the popup window.
- COPY DETAILS functionality is not currently working in Internet Explorer 11 due to a limitation with the third party plugin Clipboard.js on bootstrap modals. More details on this can be found at https://github.com/zenorocha/clipboard.js/wiki/Known-Issues.
Management API
- When the Limits API (throttling) is used to set a row limit and createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE) is being used, a row-limit-exceeded error is returned at the row limit instead of one row beyond the limit (see the sketch following this list). For example, if a row limit is set at 45 rows and a scrollable, insensitive result set exceeds the specified limit, the connectivity service returns the following error on the 45th row instead of the expected 46th row: The limit on the number of rows that can be returned from a query -- 45 -- has been exceeded.
- If a Hybrid Data Pipeline administrator creates a user with a password that contains a percentage mark (%), the new user may face issues while trying to log in. In addition, Hybrid Data Pipeline functionality may not work as expected.
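The following sketch, referenced from the Limits API item above, shows the pattern that triggers this behavior. The connection URL, host, credentials, and table name are assumptions for illustration; the data source is assumed to have a 45-row limit configured through the Limits API.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class RowLimitScrollableSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; a row limit of 45 is assumed to have
        // been set for this user or data source through the Limits API.
        String url = "jdbc:datadirect:ddhybrid://hdp.example.com:8080;"
                   + "hybridDataPipelineDataSource=MySource";

        try (Connection con = DriverManager.getConnection(url, "d2cuser", "secret");
             Statement stmt = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
                                                  ResultSet.CONCUR_READ_ONLY);
             ResultSet rs = stmt.executeQuery("SELECT * FROM orders")) {
            int row = 0;
            while (rs.next()) {
                row++; // with this known issue, the error is raised on row 45
                       // rather than on the expected row 46
            }
        } catch (SQLException e) {
            System.err.println("Row limit error: " + e.getMessage());
        }
    }
}
```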
OData
- Functions are not currently supported for $orderby.
- OData functions are not supported with the On-Premises Connector.
- Functions with default parameters are not working.
- For DB2, the BOOLEAN data type does not work with functions in OData.
- For SQL Server and DB2, the OData data types Edm.Date and Edm.TimeOfDay do not work in Power BI if the function is selected from the list of function imports and parameter values are provided. However, Power BI allows 'Edm.Date' and 'Edm.TimeOfDay' types for function imports when they are passed directly in the OData feed. A workaround is available for the Edm.TimeOfDay type: columns that are exposed as Edm.TimeOfDay should be mapped as "TimeAsString" in the ODataSchemaMap. In this case, Power BI works as expected.
- In a load balancer environment, when invoking a function import (as opposed to a function) that takes a datetimeoffset parameter, the ':' characters in the time value must be URL encoded (see the sketch at the end of this list). For example, the following request returns an error:
http://NC-HDP-U13/api/odata4/D2C_ORACLE_ODATAv4_FUNCT/ODATA_FUNC_GTABLE_DATE(DATEIN=1999-12-31T00:00:00Z,INTEGERIN=5)
The correctly encoded URL must look like the following:
http://NC-HDP-U13/api/odata4/D2C_ORACLE_ODATAv4_FUNCT/ODATA_FUNC_GTABLE_DATE(DATEIN=1999-12-31T00%3A00%3A00Z,INTEGERIN=5)
- When invoking a function import (as opposed to a function) that returns null using Power BI, a data format error is returned. The resolution to this issue is being discussed internally as well as with Microsoft.
- OData 4.0 support for $expand does not work with the following data stores: Salesforce, Dynamics CRM, SugarCRM, Rollbase, Google Analytics, and Oracle Service Cloud.
- $expand only supports one level deep. Take for example the following entity hierarchy:
Customers
|-- Orders
| |-- OrderItems
|-- Contacts
The following queries are supported:
Customers?$expand=Orders
Customers?$expand=Contacts
Customers?$expand=Orders,Contacts
However, this query is not supported:
Customers?$expand=Orders,OrderItems
OrderItems is a second-level entity with respect to Customers. To query Orders and OrderItems, the query must be rooted at Orders. For example:
Orders?$expand=OrderItems
Orders(id)?$expand=OrderItems
- The Hybrid Data Pipeline OData model asynchronous API incorrectly returns zero instead of the actual percent complete when querying the status of a model that is being generated.
- When manually editing the ODataSchemaMap value, the table names and column names specified in the value are case-sensitive. The case of the table and column names must match the case of the tables and column names reported by the data source.
Note: It is highly recommended that you use the OData Schema Editor to generate the value for the ODataSchemaMap data source option. The Schema Editor takes care of table and column name casing and other syntactic details.
- The $expand clause is not supported with OpenEdge data sources when filtering for more than a single table.
- The day, endswith, and cast functions are not working when specified in a $filter clause when querying a DB2 data source.
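The sketch below relates to the load balancer item earlier in this list and shows one way to produce the encoded datetimeoffset value, assuming a Java client; java.net.URLEncoder leaves letters, digits, '-' and '.' untouched and encodes ':' as %3A.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodeFunctionImportUrl {
    public static void main(String[] args) {
        // Host and function import name are taken from the example above.
        String base = "http://NC-HDP-U13/api/odata4/D2C_ORACLE_ODATAv4_FUNCT/"
                    + "ODATA_FUNC_GTABLE_DATE";

        // Encode the datetimeoffset value so the ':' characters become %3A.
        String dateIn = URLEncoder.encode("1999-12-31T00:00:00Z", StandardCharsets.UTF_8);

        // Prints ...(DATEIN=1999-12-31T00%3A00%3A00Z,INTEGERIN=5)
        System.out.println(base + "(DATEIN=" + dateIn + ",INTEGERIN=5)");
    }
}
```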
On-Premises Connector
- FIPS compliance with the On-Premises Connector: The On-Premises Connector is not currently FIPS compliant. Therefore, any connections made to an on-premises data source through an On-Premises Connector will not be fully FIPS compliant.
- External authentication with the On-Premises Connector: External authentication services are not currently supported when connecting to data sources using the On-Premises Connector.
- If User Account Control is enabled on your Windows machine and you installed the On-Premises Connector in a system folder (such as Windows or Program Files), you must run the On-Premises Connector Configuration Tool in administrator mode.
- When using Kerberos with Microsoft Dynamics, the JRE installed with the On-Premises Connector must be configured to run with Kerberos. Take the following steps to configure the JRE.
- Download the zip file containing the new version of the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files for JDK/JRE 8 at http://www.oracle.com/technetwork/java/javase/downloads/jce8-download-2133166.html.
- Unzip the file into the \jre\lib\security directory to update the Java security policy files to support 256-bit encryption:
- C:\<installdir>\jre\lib\security\local_policy.jar
- C:\<installdir>\jre\lib\security\US_export_policy.jar
- Uninstalling and re-installing the On-Premises Connector causes the Connector ID of the On-Premises Connector to change. Any Hybrid Data Pipeline data sources using the old Connector ID must be updated to use the new Connector ID. Installing to a new directory allows both the old and new On-Premises Connector to exist side-by-side. However, you must update the Connector ID option in previously-defined Hybrid Data Pipeline data sources to point to the new On-Premises Connector. In addition, you must update Connector Id wherever it was used, such as the definitions of Group Connectors and Authorized Users. Note that upgrading an existing installation of the On-Premises Connector maintains the Connector ID.
- When upgrading the On-Premises Connector, if the specified user installation directory contains a hyphen “-”, the upgrade will fail. To work around this issue, avoid using hyphen “-” in the user installation directory name. If your existing On-Premises Connector installation directory name contains a hyphen, you must uninstall the existing On-Premises Connector and then perform a new install rather than attempting to upgrade the existing On-Premises Connector installation.
JDBC Driver
- If you attempt to install the JDBC driver in GUI mode to the default installation directory but do not have appropriate permissions for the default directory, the installer indicates that the installation has succeeded when in fact the driver has not been installed. When attempting an installation under the same circumstance but in console mode, the proper error message is displayed.
- The default value for the Service connection property does not connect to the Hybrid Data Pipeline server. To connect, set Service to the Hybrid Data Pipeline server in your connection URL (see the sketch following this list).
- The JDBC 32-bit silent installation fails on Windows 10. Use the standard installation instead.
- Executing certain queries against MS Dynamics CRM may result in a "Communication failure. Protocol error."
- When using JNDI data sources, encryptionMethod must be configured through setExtendedOptions (see the sketch following this list).
- See the d2cjdbcreadme.txt file installed with the JDBC driver for more information.
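The following sketch illustrates the two items above: setting the Service host in the connection URL and passing encryptionMethod through setExtendedOptions when using a JNDI-style data source. The URL syntax, data source class name, and setter names are assumptions for illustration; the d2cjdbcreadme.txt file documents the actual names.

```java
import java.sql.Connection;
import java.sql.DriverManager;

// The package and class name below are assumptions for illustration;
// check d2cjdbcreadme.txt for the driver's actual data source class.
import com.ddtek.jdbcx.ddhybrid.DDHybridDataSource;

public class HdpJdbcConnectSketch {
    public static void main(String[] args) throws Exception {
        // URL-based connection: Service must point at the Hybrid Data Pipeline
        // server, since the property's default value does not connect.
        String url = "jdbc:datadirect:ddhybrid://hdp.example.com:8080;"
                   + "hybridDataPipelineDataSource=MySource";
        try (Connection con = DriverManager.getConnection(url, "d2cuser", "secret")) {
            System.out.println("Connected via URL: " + !con.isClosed());
        }

        // JNDI-style data source: encryptionMethod goes through
        // setExtendedOptions rather than a dedicated setter.
        DDHybridDataSource ds = new DDHybridDataSource();
        ds.setServerName("hdp.example.com");           // assumed setter
        ds.setPortNumber(8080);                        // assumed setter
        ds.setExtendedOptions("encryptionMethod=SSL");
        try (Connection con = ds.getConnection("d2cuser", "secret")) {
            System.out.println("Connected via DataSource: " + !con.isClosed());
        }
    }
}
```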
ODBC Driver
- The default ODBC.INI generated by the installer is missing required entries for Service=, PortNumber=, and HybridDataPipelineDataSource=.
- Console mode installation is supported only on UNIX.
- When you first install a driver, you are given the option to install a default data source for that driver. We recommend that you install default data sources when you first install the drivers. If you do not install the default data source at this time, you will be unable to install a default data source for this driver later. To install a default data source for a driver after the initial installation, you must uninstall the driver and then reinstall it.
- See the d2codbcreadme.txt file installed with the ODBC driver for more information.
All data stores
- It is recommended that Login Timeout not be disabled (set to 0) for a data source.
- Using setByte to set parameter values fails when the data store does not support the TINYINT SQL type. Use setShort or setInt to set the parameter value instead of setByte.
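A minimal sketch of the setShort workaround, assuming a data store without a TINYINT SQL type; the connection URL, table, and column names are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class TinyIntWorkaroundSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:datadirect:ddhybrid://hdp.example.com:8080;"
                   + "hybridDataPipelineDataSource=MySource"; // hypothetical URL
        byte flag = 1;

        try (Connection con = DriverManager.getConnection(url, "d2cuser", "secret");
             PreparedStatement ps = con.prepareStatement(
                     "UPDATE accounts SET active = ? WHERE id = ?")) {
            // ps.setByte(1, flag);   // fails when the data store lacks TINYINT
            ps.setShort(1, flag);     // widen to SMALLINT via setShort instead
            ps.setInt(2, 42);
            ps.executeUpdate();
        }
    }
}
```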
Google Analytics
- A validation message is not displayed when a user enters a Start Date value less than the End Date value on the Create/Update Google Analytics page.
- Once a Google Analytics OAuth profile is created for a specific Google account, changing the Google Account associated with the profile results in "the configuration options used to open the database do not match the options used to create the database" error being returned for any existing data sources.
Microsoft Dynamics CRM
- Executing certain queries against MS Dynamics CRM with the JDBC driver may result in a "Communication failure. Protocol error."
- Testing has shown the following two errors from Microsoft Dynamics CRM Online when executing queries against the ImportData and TeamTemplate tables:
- Attribute errortype on Entity ImportData is of type picklist but has Child Attributes Count 0
- Attribute issystem on Entity TeamTemplate is of type bit but has Child Attributes Count 0
Note: We have filed a case with Microsoft and are waiting to hear back about the cause of the issue.
- The initial on-premises connection when the relational map is created can take some time. It is even possible to receive an error "504: Gateway Timeout". When this happens, Hybrid Data Pipeline continues to build the map in the background such that subsequent connection attempts are successful and have full access to the relational map.
OpenEdge 10.2b
- Setting the MaxPooledStatements data source option in an OpenEdge data store to a value other than zero can cause statement not prepared errors to be returned in some situations.
Oracle Marketing Cloud (Oracle Eloqua)
- Data store issues
- There are known issues with Batch Operations.
- The Update/Delete implementation can update only one record at a time. Because of this, the number of API calls executed depends on the number of records that are updated or deleted by the query, plus the number of API calls required to fetch the IDs for those records.
- Lengths of certain text fields are reported as higher than the actual lengths supported in Oracle Eloqua.
- We are currently working with Oracle to resolve the following issues with the Oracle Eloqua REST API.
- AND operators that involve different columns are optimized. In other cases, the queries are only partially optimized.
- OR operators on the same column are optimized. In other cases, the queries are completely post-processed.
- The data store is not able to insert or update the NULL value to any field explicitly.
- The data store is unable to update a few fields. They are always reported as NULL after an update.
- Oracle Eloqua uses a double colon (::) as an internal delimiter for multivalued Select fields. Hence when a value with the semi-colon character (;) is inserted or updated into a multivalued Select field, the semicolon character gets converted into the double colon character.
- Query SELECT count (*) from template returns incorrect results.
- Oracle Eloqua APIs do not populate the correct values in CreatedBy and UpdatedBy fields. Instead of user names, they contain a Timestamp value.
- Only equality filters on id fields are optimized. All other filter conditions are not working correctly with Oracle Eloqua APIs and the data store is doing post-processing for such filters.
- Filters on Non-ID Integer fields and Boolean fields are not working correctly. Hence the driver needs to post-process all these queries.
- The data store does not distinguish between NULL and empty string. Therefore, null fields are often reported back as empty strings.
- Values with special characters such as curly braces ({,}), back slash (\), colon (:), slash star (/*) and star slash (*/) are not supported in where clause filter value.
Oracle Sales Cloud
- Currently, passing filter conditions to Oracle Sales Cloud works only for simple, single column conditions. If there are multiple filters with 'AND' and 'OR', only partial or no filters are passed to Oracle Sales Cloud.
- Oracle Sales Cloud reports the data type of String and Date fields as String. Therefore, when such fields are filtered or ordered in Hybrid Data Pipeline, they are treated as String values. However, when filter conditions are passed to Oracle Sales Cloud, Oracle Sales Cloud can distinguish between the actual data types and apply Date specific comparisons to Date fields. Therefore, query results can differ depending on whether filters have been passed down to Oracle Sales Cloud or processed by Hybrid Data Pipeline.
- There appears to be a limitation with the Oracle Sales Cloud REST API concerning the >=, <=, and != comparison operators when querying String fields. Therefore, Hybrid Data Pipeline has not been optimized to pass these comparison operators to Oracle Sales Cloud. We are working with Oracle on this issue.
- There appears to be a limitation with the Oracle Sales Cloud REST API concerning queries with filter operations on Boolean fields. Therefore, Hybrid Data Pipeline has not been optimized to pass filter operations on Boolean fields to Oracle Sales Cloud. We are working with Oracle on this issue.
- The drivers currently report ATTACHMENT type fields in the metadata but do not support retrieving data for these fields. These fields are set to NULL.
- Join queries between parent and child tables are not supported.
- Queries on child tables whose parent has a composite primary key are not supported. For example, the children of ACTIVITIES_ACTIVITYCONTACT and LEADS_PRODUCTS are not accessible.
- Queries on the children of relationship objects are not supported. For example, the children of ACCOUNTS_RELATIONSHIP, CONTACTS_RELATIONSHIP, and HOUSEHOLDS_RELATIONSHIP are not accessible.
- Queries on grandchildren with multiple sets of Parent IDs and Grand Parent IDs used in an OR clause are not supported. For example, the following query is not supported.
Select * From ACCOUNTS_ADDRESS_ADDRESSPURPOSE
Where (ACCOUNTS_PARTYNUMBER = 'OSC_12343' AND
ACCOUNTS_ADDRESS_ADDRESSNUMBER = 'AUNA-2XZKGH')
OR (ACCOUNTS_PARTYNUMBER = 'OSC_12344' AND
ACCOUNTS_ADDRESS_ADDRESSNUMBER = 'AUNA-2YZKGH')
- When querying documented objects like "CATALOGPRODUCTITEMS" and "CATEGORYPRODUCTITEMS", no more than 500 records are returned, even when more records may be present. This behavior is also seen with some custom objects. We are currently working with Oracle support to resolve this issue.
- A query on OPPORTUNITIES_CHILDREVENUE_PRODUCTS or LEADS_PRODUCTGROUPS with a filter on the primary key column returns 0 records even when more records are present. We are currently working with Oracle support to resolve this issue.
- Queries that contain subqueries returning more than 100 records are not supported. For example, the following query is not supported.
Select * From ACCOUNTS_ADDRESS
Where ACCOUNTS_PARTYNUMBER
In (Select Top 101 PARTYNUMBER From ACCOUNTS)
- When you create custom objects, your Oracle Sales Cloud administrator must enable these objects for REST API access through Application Composer. Otherwise, you will not be able to query against these custom objects.
Oracle Service Cloud
- When you create a custom object, your Oracle Service Cloud administrator must enable all four columns of the Object Fields tab of the Object Designer, or you cannot query against the custom objects.
- The initial connection when the relational map is created can take some time. It is even possible to receive an error "504: Gateway Timeout". When this happens, Hybrid Data Pipeline continues to build the map in the background such that subsequent connection attempts are successful and have full access to the relational map.
SugarCRM
- Data sources that are using the deprecated enableExportMode option will still see a problem until they are migrated to the new data source configuration.
- Data source connections by default now use Export Mode to communicate with the SugarCRM server, providing increased performance when querying large sets of data. Bulk export mode causes NULL values for currency columns to be returned as the value 0. Because of this, there is no way to differentiate between a NULL value and 0 when operating in export mode. This can be a problem when using currency columns in SQL statements, because Hybrid Data Pipeline must satisfy some filter conditions on queries, such as with operations like =, <>, >, >=, <, <=, IS NULL, and IS NOT NULL. For example, suppose a currency column in a table in SugarCRM has 3 NULL values and 5 values that are 0. When a query is executed to return all NULL values (SELECT * FROM <table> WHERE <currency column> IS NULL), then 3 rows are returned. However, if a query is executed to return all rows where the column performs an arithmetic operation (SELECT * FROM <table> WHERE <currency column> + 1 = 1), then all 8 records are returned because the 3 NULL values are seen as 0.