Query any REST API using JDBC quickly – Editing the auto generated schema (Part 3)

Introduction

In the Part 1 and Part 2 of the tutorial series, we have learned how you can sample and connect to multiple endpoints and query the data using SQL, without you having to write a line of code. If you are new, we would recommend you to start from Part 1 of the tutorial series.

When you use Autonomous REST connector, it strives to a create correct schema information such as table name, column names, data types and assign a primary key, which you shouldn’t have any problem using it. But, as you start querying the API using SQL, there can be times where you might want to rename a table or field name to a value that avoids confusion for end users or change the data type for a certain field to a type that you think is appropriate or  change the primary key to a different field in the generated table. In this tutorial, we will be walk you through on how to do this in a way that makes sense to you or your end user.

Get the auto generated REST configuration

  1. In the part 2 of this tutorial series, we have used a REST config file to specify the table names and the URLs to sample as shown below.


    {
         "categories":"https://api.yelp.com/v3/categories",
         "events":"https://api.yelp.com/v3/events",
    }
     


  2. This is the REST config file in its simplest form. When you connect using this configuration file, Autonomous REST connector will sample all the URLs, normalize the JSON structure to relational schema and infers the metadata by sampling. All the mappings which it infers are stored in the memory and you can access it by querying the table _CONFIGURATION table.
  3. Run the below query to get the auto generated REST configuration created by Autonomous REST Connector.
    SELECT * FROM _CONFIGURATION


  4. Copy the data from CONTENT column to notepad and paste it. It should look as below

    {
        "categories":{
            "#path":[
                "https://api.yelp.com/v3/categories /categories"
            ],
            "alias":"VarChar(48)",
            "title":"VarChar(52),#key",
            "parent_aliases[]":"VarChar(34)",
            "country_whitelist[]":"VarChar(3)",
            "country_blacklist[]":"VarChar(3)"
        },
        "events":{
            "#path":[
                "https://api.yelp.com/v3/events"
            ],
            "events[]":{
                "attending_count":"Integer",
                "category":"VarChar(64)",
                "cost":"VarChar(64)",
                "cost_max":"VarChar(64)",
                "description":"VarChar(238)",
                "event_site_url":"VarChar(310)",
                "id":"VarChar(75)",
                "image_url":"VarChar(102)",
                "interested_count":"Integer",
                "is_canceled":"Boolean",
                "is_free":"Boolean",
                "is_official":"Boolean",
                "latitude":"Double",
                "longitude":"Double",
                "name":"VarChar(64)",
                "tickets_url":"VarChar(64)",
                "time_end":"Timestamp(0)",
                "time_start":"Timestamp(0)",
                "location":{
                    "address1":"VarChar(64)",
                    "address2":"VarChar(64)",
                    "address3":"VarChar(64)",
                    "city":"VarChar(64)",
                    "zip_code":"Integer",
                    "country":"VarChar(64)",
                    "state":"VarChar(64)",
                    "display_address[]":"VarChar(64)",
                    "cross_streets":"VarChar(64)"
                },
                "business_id":"VarChar(64)"
            },
            "total":"Integer,#key"
        },
        "business":{
            "#path":[
            ],
            "businesses[]":{
                "id":"VarChar(64)",
                "alias":"VarChar(64)",
                "name":"VarChar(64)",
                "image_url":"VarChar(102)",
                "is_closed":"Boolean",
                "url":"VarChar(295)",
                "review_count":"Integer",
                "categories[]":{
                    "alias":"VarChar(21)",
                    "title":"VarChar(33)"
                },
                "rating":"Double",
                "coordinates":{
                    "latitude":"Double",
                    "longitude":"Double"
                },
                "transactions[]":"VarChar(64)",
                "price":"VarChar(64)",
                "location":{
                    "address1":"VarChar(64)",
                    "address2":"VarChar(64)",
                    "address3":"VarChar(64)",
                    "city":"VarChar(64)",
                    "zip_code":"Integer",
                    "country":"VarChar(64)",
                    "state":"VarChar(64)",
                    "display_address[]":"VarChar(40)"
                },
                "phone":"VarChar(64)",
                "display_phone":"VarChar(64)",
                "distance":"Double"
            },
            "total":"Integer,#key",
            "region":{
                "center":{
                    "longitude":"Double",
                    "latitude":"Double"
                }
            },
            "location":{
                "#type":"VarChar(64),#key",
                "#virtual":true,
                "#default":"27617",
                "#eq":"location"
            }
        }
    }

  5. Save the above configuration in a file called YelpAuto.rest

Using the updated REST config file

  1. To use the updated REST config file, change your JDBC URL to
    jdbc:datadirect:autorest:config=C:\<path-to-file>\YelpAuto.rest;


  2. This would make the Autonomous REST connector to use this configuration instead of sampling the API every time you connect.

Understanding the REST Config file

  1. If you notice, Autonomous REST connector has updated the configuration to include the all the properties and nested objects from the response and assigned a data type for all the fields automatically.
  2. If you remember, we sampled three Yelp endpoints – Categories, Events and Business search. If you look in to the REST config file, it should have three JSON objects in the root of the config file, which represent each of these endpoints.
  3. The #path parameter is where you will find the URL that you wanted to sample by Autonomous REST Connector.
  4. For each endpoint in the config file, you will find that Autonomous REST connector has assigned a primary key by adding #key parameter. For example, look at field ‘title’ in the categories endpoint marked as a primary key.
  5. For business search endpoint, we had to send in a value via query parameter to get the results, in this case it was location. If you look at the configuration generated for business endpoint, you will find the below snippet.


    "location":{
                "#type":"VarChar(64),#key",
                "#virtual":true,
                "#default":"27617",
                "#eq":"location"
            }


  6. In the above snippet,
    1. #type represents the data type,
    2. #virtual represents that this is a virtual field and this field doesn’t appear in the response from the API in a simple key value pair.
    3. #default represents the default value Autonomous REST connector uses if nothing is provided for location in SQL query. If you remember in the part 2 of the tutorial, we set the value to 27617 when sampling it, so this was taken as a default value.
    4. #eq represents the query parameter name that it must use to send in the value using the API.

Change the Primary Key

  1. You can easily change the Primary key to a different field by using #key. Let’s say we want to change the primary key for categories table, which has ‘title’ as the Primary key.
  2. All you must do is remove the #key next to ‘title’ and add it to alias, which I want to be my new primary key. After you do that the configuration should like below
    "categories":{
            "#path":[
                "https://api.yelp.com/v3/categories /categories"
            ],
            "alias":"VarChar(48),#key",
            "title":"VarChar(52)",
            "parent_aliases[]":"VarChar(34)",
            "country_whitelist[]":"VarChar(3)",
            "country_blacklist[]":"VarChar(3)"
        }
  3. When you reconnect next time, the changes will take place and your new primary key will be alias.

Change column name

  1. If you want to change the column name of a table, you can do it using the below syntax                   
    "old_column_name<new_column_name>":"VarChar(52) "


  2. Here’s an example of renaming the column ‘alias’ to ‘categorycode’ in the Categories table

    "categories":{
            "#path":[
                "https://api.yelp.com/v3/categories /categories"
            ],
            "alias<categorycode>":"VarChar(48),#key",
            "title":"VarChar(52)",
            "parent_aliases[]":"VarChar(34)",
            "country_whitelist[]":"VarChar(3)",
            "country_blacklist[]":"VarChar(3)"
        }


  3. Your changes should take effect after you re-connect.

Change Data type

  1. You can also change the data type of a column if you think Autonomous REST Connector didn’t assign the proper data type.
  2. You can assign any data types that are supported by the connector here.
  3. For example, if you want to change data type of column alias to LongVarchar(32000), you can do it as shown below.
    "categories":{
            "#path":[
                "https://api.yelp.com/v3/categories /categories"
            ],
            "alias<categorycode>":"LongVarChar(32000),#key",
            "title":"VarChar(52)",
            "parent_aliases[]":"VarChar(34)",
            "country_whitelist[]":"VarChar(3)",
            "country_blacklist[]":"VarChar(3)"
        }


  4. Your changes should take effect after you re-connect.

Change the table names

  1. When Autonomous REST Connector samples your API endpoints, sometimes you end up with ambiguous table names when it gets normalized to relational schema due to nested JSON objects having same names. In those cases, you would want to change the table names to avoid confusion.
  2. When you sampled the Categories, Events and Businesses endpoints, you might have already noticed that there are two similarly named tables CATEGORIES and CATEGORIES_1.
    1. CATEGORIES is the table that was created by sampling Categories endpoint. It contains all the categories that Yelp offers.
    2. CATEGORIES_1 is one of the table that was created by sampling Businesses endpoint. If you look at the response from businesses endpoint, it contains a nested JSON array of categories to which that business is applicable.
  3. Let’s change the CATEGORIES_1 table name to BUSINESS_CATEGORY to avoid confusion. To do that you need to follow this syntax

    Syntax:

    "old_table_name<new_table_name>[]":{}


    Example

    "business":{
            "#path":[
            ],
            "businesses[]":{
                "id":"VarChar(64)",
                "alias":"VarChar(64)",
                "name":"VarChar(64)",
                "image_url":"VarChar(102)",
                "is_closed":"Boolean",
                "url":"VarChar(295)",
                "review_count":"Integer",
                "categories<business_category>[]":{
                    "alias":"VarChar(21)",
                    "title":"VarChar(33)"
                },
                "rating":"Double",
                "coordinates":{
                    "latitude":"Double",
                    "longitude":"Double"
                },
                "transactions[]":"VarChar(64)",
                "price":"VarChar(64)",
                "location":{
                    "address1":"VarChar(64)",
                    "address2":"VarChar(64)",
                    "address3":"VarChar(64)",
                    "city":"VarChar(64)",
                    "zip_code":"Integer",
                    "country":"VarChar(64)",
                    "state":"VarChar(64)",
                    "display_address[]":"VarChar(40)"
                },
                "phone":"VarChar(64)",
                "display_phone":"VarChar(64)",
                "distance":"Double"
            },
            "total":"Integer,#key",
            "region":{
                "center":{
                    "longitude":"Double",
                    "latitude":"Double"
                }
            },
            "location":{
                "#type":"VarChar(64),#key",
                "#virtual":true,
                "#default":"27617",
                "#eq":"location"
            }
        }


  4. Your changes should take effect after you re-connect.

 

Feel free to play with the REST config  file as per your needs and if you have any issues or questions, please contact us. We will be happy to help you. In the next tutorial (Part 4), we will look at how you can configure Autonomous REST Connector to page through all the results from endpoint beyond the first page.

JDBC TUTORIAL

Query any REST API using JDBC quickly – Editing the auto generated schema (Part 3)

View all Tutorials

Connect any application to any data source anywhere

A product specialist will be glad to get in
touch with you