Configure an AWS Load Balancer for Progress Hybrid Data Pipeline

Introduction

Accessing on-premises data from the cloud often brings with it many security and availability headaches, but with DataDirect Hybrid Data Pipeline it is now possible to securely access data behind any firewall while still leveraging the benefits of advanced cloud load balancing.

 

In this tutorial, you’ll learn how to quickly configure Amazon’s Application Load Balancer (ALB) to achieve both the scalability and reliability of a clustered Hybrid Data Pipeline installation by leveraging our support of the WebSockets protocol.

Download and Install a Hybrid Data Pipeline Cluster

This document assumes you are already familiar with installing a Hybrid Data Pipeline cluster on 2 or more EC2 instances that meet the minimum requirements, including the use of an external configuration database and shared storage. It also assumes familiarity with configuring VPCs and Security Groups, and with basic administration within the AWS environment.

General information about installing a single node in AWS can be found here, while details on a cluster installation can be found in our Installation Guide.

Note that there are a few changes with the DataDirect Hybrid Data Pipeline installation specific to supporting cloud load balancers:

  • There is a new option for Load Balancers. Be sure to choose option 3 if using a WebSockets-based load balancer such as the Amazon ALB.
  • Make sure to load your PEM file (SSL cert) on the HDP server as well as on the AWS Application Load Balancer. This ensures the correct redistributable files are created to support installation of the On-Premises Connector and the hybrid ODBC/JDBC drivers.
  • Ensure your “Load Balancing Host Name” matches the hostname in your SSL cert.
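
As a quick sanity check for the last point, the following minimal sketch compares a host name against the names listed in a certificate. It is a simplified illustration, not the validation logic used by HDP or AWS: the function names are hypothetical, and it only handles exact names and a single left-most wildcard label.

```python
def hostname_matches(cert_name: str, host: str) -> bool:
    """Return True if `host` is covered by `cert_name`.

    Simplified rule (an assumption for this sketch): names match
    case-insensitively, and a leading "*" label stands for exactly
    one left-most label of the host name.
    """
    cert_labels = cert_name.lower().rstrip(".").split(".")
    host_labels = host.lower().rstrip(".").split(".")
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] == "*":
        # Refuse overly broad wildcards such as "*.com".
        return len(cert_labels) > 2 and cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


def covered_by_cert(cert_names, host: str) -> bool:
    """Check `host` against every name (CN/SAN) found in the certificate."""
    return any(hostname_matches(name, host) for name in cert_names)
```

For example, `covered_by_cert(["*.example.com"], "hdp.example.com")` is true, while a nested name such as `a.b.example.com` is not covered by that wildcard.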

Configuring an Application Load Balancer (ALB) in AWS

For the purposes of this document, it is assumed you are installing a 2-node cluster that is reachable from the outside world only over SSH (port 22), and that both nodes are in the same VPC/subnet.

  1. Create a new Application Load Balancer
    1. Provide a name
    2. Scheme is internet-facing
    3. IP address type is ipv4
    4. Choose availability zone(s) of your preference
  2. Configure 4 Target Groups
    1. Under Load Balancing, click Target Groups
      1. You will create 2 target groups containing all nodes and 1 target group for each individual node. For a 2-node cluster, there will be 4 target groups.
    2. Create New Target Group (#1)
      1. Name: hdp-nodes-http
      2. Protocol: HTTP
      3. Port: 8080
      4. Target Type: Instance
      5. Attributes
        1. Stickiness: Enabled
        2. Duration: 5 minutes
      6. Health Checks
        1. Protocol: HTTP
        2. Path: /api/healthcheck
        3. Advanced settings
          1. Port: traffic port
          2. Healthy threshold: 2
          3. Unhealthy threshold: 2
          4. Timeout: 5 seconds
          5. Interval: 30 seconds
          6. Success Codes: 200
      7. Register Targets
        1. Select the Targets tab and click Edit
        2. Select both HDP nodes and click Add to Registered
    3. Create New Target Group (#2)
      1. Name: hdp-notification-pool
      2. Protocol: HTTP
      3. Port: 11280
      4. Target Type: Instance
      5. Health Checks
        1. Protocol: HTTP
        2. Path: /
        3. Advanced settings
          1. Port: traffic port
      6. Stickiness: Disabled
      7. Register Targets
        1. Select all HDP nodes and click Add to Registered
    4. Create New Target Group (#3)
      1. Name: hdp-opa-node1
      2. Protocol: HTTP
      3. Port: 40501
      4. Target Type: Instance
      5. Health Checks
        1. Protocol: HTTP
        2. Path: /
        3. Advanced settings
          1. Port: traffic port
      6. Stickiness: Disabled
      7. Register Targets
        1. Register ONLY HDP node #1
    5. Create New Target Group (#4)
      1. Name: hdp-opa-node2
      2. Protocol: HTTP
      3. Port: 40501
      4. Target Type: Instance
      5. Health Checks
        1. Protocol: HTTP
        2. Path: /
        3. Advanced settings
          1. Port: traffic port
      6. Stickiness: Disabled
      7. Register Targets
        1. Register ONLY HDP node #2
  3. Add Listener to ALB
    1. Go to Load Balancers and select the Listeners tab
    2. Click Add Listener
    3. Select HTTPS for port 443
    4. Under Default Actions, set Forward to the hdp-nodes-http target group created above
    5. Select a security policy (default: ELBSecurityPolicy-2016-08)
    6. Select the correct SSL certificate for your domain
    7. Create/attach a security group to the Application Load Balancer
      1. Allow inbound TCP on 443 for 0.0.0.0/0
  4. Create Forwarding Rules
    1. Click view/edit rules under the Rules column for the HTTPS listener
    2. Add these 3 rules in this order (for a total of 4 rules, counting the default rule that already exists):
      1. Path: /connect/ /connect/opa_ip-10-1-1-153_ec2_internal_40501
        1. Replace ip-10-1-1-153 with the corresponding value for your HDP node #1, taken from its internal DNS name.
        2. Forward to hdp-opa-node1
      2. Path: /connect/ /connect/opa_ip-10-1-1-95_ec2_internal_40501
        1. Replace ip-10-1-1-95 with the corresponding value for your HDP node #2, taken from its internal DNS name.
        2. Forward to hdp-opa-node2
      3. Path: /connect/X_DataDirect_Notification_Server
        1. Forward to hdp-notification-pool
      4. The final rule should already exist
        1. HTTPS 443: all other requests forward to hdp-nodes-http
  5. Verify Targets
    1. Once the ALB has been configured with the listener and target group rules, the targets in each target group should show as healthy, provided the instances are online and HDP is running.
  6. Configure DNS
      1. Add a CNAME to your DNS to point your hostname to the ALB’s public DNS hostname. It is shown under the Basic Configuration of the Load Balancer as “DNS Name.”
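
The four target groups described above can also be scripted instead of clicked through in the console. The sketch below builds the request parameters as plain dictionaries and, optionally, sends them with boto3's `elbv2` client. It is an illustration under stated assumptions, not part of the product documentation: boto3 is a third-party package, the `lb_cookie` stickiness type is assumed (the tutorial only says stickiness is enabled for 5 minutes), and the VPC ID, instance IDs and region are placeholders you must supply.

```python
def target_group_params(name, port, vpc_id, health_path="/"):
    """Keyword arguments for elbv2.create_target_group()."""
    return {
        "Name": name, "Protocol": "HTTP", "Port": port, "VpcId": vpc_id,
        "TargetType": "instance", "HealthCheckProtocol": "HTTP",
        "HealthCheckPort": "traffic-port", "HealthCheckPath": health_path,
    }


def stickiness_attributes(enabled, duration_seconds=300):
    """Attributes for elbv2.modify_target_group_attributes()."""
    if not enabled:
        return [{"Key": "stickiness.enabled", "Value": "false"}]
    return [
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},  # assumed type
        {"Key": "stickiness.lb_cookie.duration_seconds",
         "Value": str(duration_seconds)},
    ]


def build_target_groups(vpc_id):
    """Describe the 4 groups; `node` is None for all nodes, else a 0-based index."""
    main = target_group_params("hdp-nodes-http", 8080, vpc_id, "/api/healthcheck")
    main.update(HealthyThresholdCount=2, UnhealthyThresholdCount=2,
                HealthCheckTimeoutSeconds=5, HealthCheckIntervalSeconds=30,
                Matcher={"HttpCode": "200"})
    return [
        {"params": main, "sticky": True, "node": None},
        {"params": target_group_params("hdp-notification-pool", 11280, vpc_id),
         "sticky": False, "node": None},
        {"params": target_group_params("hdp-opa-node1", 40501, vpc_id),
         "sticky": False, "node": 0},
        {"params": target_group_params("hdp-opa-node2", 40501, vpc_id),
         "sticky": False, "node": 1},
    ]


def apply_target_groups(vpc_id, instance_ids, region):
    """Create the groups in AWS and return {group name: target group ARN}."""
    import boto3  # third-party; imported lazily so the builders work without it
    elbv2 = boto3.client("elbv2", region_name=region)
    arns = {}
    for group in build_target_groups(vpc_id):
        created = elbv2.create_target_group(**group["params"])
        arn = created["TargetGroups"][0]["TargetGroupArn"]
        elbv2.modify_target_group_attributes(
            TargetGroupArn=arn, Attributes=stickiness_attributes(group["sticky"]))
        ids = instance_ids if group["node"] is None else [instance_ids[group["node"]]]
        elbv2.register_targets(TargetGroupArn=arn, Targets=[{"Id": i} for i in ids])
        arns[group["params"]["Name"]] = arn
    return arns
```

Keeping the builders separate from `apply_target_groups` makes it possible to review or diff the intended configuration before anything is created in the account.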

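The forwarding rules above follow a regular pattern, so they can be generated from the nodes' internal DNS names. The sketch below is a hedged illustration: it assumes the per-node path is the single pattern `/connect/opa_<internal DNS name with dots replaced by underscores>_40501` (inferred from the example paths in this tutorial), that rule priorities 1 to 3 reproduce the order given, and that target group ARNs are supplied by the caller. The helper names are hypothetical; only `create_rule` is a boto3 call.

```python
def opa_path(internal_dns_name, port=40501):
    """Path for one node, e.g. ip-10-1-1-153.ec2.internal -> /connect/opa_ip-10-1-1-153_ec2_internal_40501."""
    return "/connect/opa_{}_{}".format(internal_dns_name.replace(".", "_"), port)


def build_rules(node_dns_names):
    """One rule per node, then the notification server rule, in priority order."""
    rules = []
    for index, dns_name in enumerate(node_dns_names, start=1):
        rules.append({"priority": index,
                      "path": opa_path(dns_name),
                      "target_group": "hdp-opa-node{}".format(index)})
    rules.append({"priority": len(node_dns_names) + 1,
                  "path": "/connect/X_DataDirect_Notification_Server",
                  "target_group": "hdp-notification-pool"})
    return rules


def rule_params(listener_arn, rule, target_group_arns):
    """Keyword arguments for elbv2.create_rule()."""
    return {
        "ListenerArn": listener_arn,
        "Priority": rule["priority"],
        "Conditions": [{"Field": "path-pattern", "Values": [rule["path"]]}],
        "Actions": [{"Type": "forward",
                     "TargetGroupArn": target_group_arns[rule["target_group"]]}],
    }


def apply_rules(listener_arn, node_dns_names, target_group_arns, region):
    """Create the rules on an existing HTTPS listener."""
    import boto3  # third-party; imported lazily so the builders work without it
    elbv2 = boto3.client("elbv2", region_name=region)
    for rule in build_rules(node_dns_names):
        elbv2.create_rule(**rule_params(listener_arn, rule, target_group_arns))
```

The listener's default action (forward everything else to hdp-nodes-http) is not created here, because it already exists once the HTTPS listener has been added.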

We hope this tutorial helped you create a cloud-based solution to OData-enable both your on-premises and cloud data sources using Progress Hybrid Data Pipeline and Amazon’s Application Load Balancer. You can now have security, scalability and reliability together in a single data access solution that lets you bring sources such as Oracle, Postgres, MySQL, DB2 and SQL Server out from behind the firewall. If you have any questions, please feel free to contact us.
