Data integration and application integration, while not the same, are closely related. The road to application integration has always been the Application Programming Interface (API). For decades, software developers have made their applications friendly to integration by creating, publishing, and documenting a public API so that their customers could integrate their apps in their own specific ways, including ways the software developer couldn’t anticipate in advance.
While APIs were once little more than a technology amenity or utility, today they are strategic. APIs make applications and services more attractive to enterprise buyers, because their own developers can integrate the product with in-house applications. This allows those developers to focus on customization rather than building raw capabilities.
In fact, APIs also help build a product’s ecosystem in general. They tend to attract independent developers who like to tinker and imbue their own projects with advanced features. But they also allow software vendors to develop a partner channel with consulting shops that can use the API to perform custom implementations. This further extends the appeal of the app or service to enterprise customers.
In both scenarios, APIs provide a path for querying data from the application and mashing it up with other data the customer may have. That includes data from in-house applications and databases as well as data from other SaaS applications. As such, APIs provide the gateway to becoming a data-driven business. But not all APIs are created equal. Let’s explore some of the nuances now.
Conventional Web APIs
Originally, APIs were developed for specific programming languages and were used by applications that ran on the same computer or server as the code behind the API. Over the last fifteen or so years, though, APIs have increasingly been based on Internet standards and, specifically, on protocols developed for the web.
These web APIs were first referred to as Web Services and were sometimes based on complex protocols. Over the last ten years or so, they have simplified into something called REST APIs. REST stands for “Representational State Transfer,” but in practice it indicates a web API approach that leverages the web’s basic HTTP operations for reading, writing, and deleting data.
Most enterprise SaaS applications have REST APIs nowadays. For any good developer, using a REST API is fairly straightforward. It provides a way to query data and create certain transactions, among other operations, all in code, without having to perform these tasks manually through the user interface.
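As a minimal sketch of what working "all in code" can look like, the Python below builds (but does not send) HTTP requests for the basic REST operations. The base URL, the invoices resource, and the build_request helper are hypothetical names chosen for illustration only. A real SaaS API documents its own paths, payloads, and authentication.

```python
import json
import urllib.request

# Hypothetical base URL; a real SaaS vendor documents its own.
BASE_URL = "https://api.example.com/v1"

# Conventional mapping of basic operations to HTTP methods in a REST-style API.
METHODS = {"read": "GET", "create": "POST", "update": "PUT", "delete": "DELETE"}


def build_request(operation, resource, payload=None):
    """Build (without sending) an HTTP request for one REST-style operation."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        # Request bodies are commonly JSON; this is an assumption, not a rule.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}/{resource}",
        data=data,
        headers=headers,
        method=METHODS[operation],
    )


read_req = build_request("read", "invoices/42")
create_req = build_request("create", "invoices", {"customer": "ACME", "total": 120.5})

print(read_req.get_method(), read_req.full_url)
print(create_req.get_method(), create_req.full_url)
# Sending would be urllib.request.urlopen(read_req); it is skipped here
# because the URL is fictional.
```

The point of the sketch is that the same small set of HTTP verbs covers querying and transacting, which is why REST APIs feel straightforward to developers.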
REST APIs Are Not Built for Analytics
Using several such APIs can be an effective way to coalesce data from a number of sources. But there are downsides to this approach.
One problem is that each API has its own documentation and its own conventions. Developers who use multiple such APIs need to learn the intricacies of each and, as they switch between them, quickly adjust to the different metaphors and organization of each one.
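To make the cost of differing conventions concrete, here is a small sketch in Python. The two response shapes are invented for illustration: a hypothetical CRM API and a hypothetical billing API that describe the same kind of customer record with different envelopes, field names, and date formats. Each source needs its own adapter before the data can be combined.

```python
from datetime import datetime, timezone

# Hypothetical payloads from two different SaaS APIs (invented for illustration).
crm_response = {
    "data": [{"Id": "c-1", "FullName": "Ada Lovelace", "CreatedDate": "2023-01-15"}]
}
billing_response = {
    "results": [{"customer_id": 7, "name": "Alan Turing", "created_at": 1673740800}]
}


def from_crm(resp):
    """Adapter for the CRM shape: 'data' envelope, PascalCase fields, ISO dates."""
    return [
        {"id": str(r["Id"]), "name": r["FullName"], "created": r["CreatedDate"]}
        for r in resp["data"]
    ]


def from_billing(resp):
    """Adapter for the billing shape: 'results' envelope, snake_case, Unix time."""
    return [
        {
            "id": str(r["customer_id"]),
            "name": r["name"],
            # Convert Unix seconds (UTC) to an ISO date to match the CRM format.
            "created": datetime.fromtimestamp(r["created_at"], tz=timezone.utc)
            .date()
            .isoformat(),
        }
        for r in resp["results"]
    ]


# Only after both adapters run do the records share one schema.
customers = from_crm(crm_response) + from_billing(billing_response)
for customer in customers:
    print(customer)
```

Every additional source adds another adapter of this kind, along with the documentation reading needed to write it correctly.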
Another deficiency in the REST API approach is that it has been designed more for data movement than for efficient querying. And while it is possible to create a secondary database to stage and integrate data extracted from various applications and their APIs, there are significant disadvantages to doing so.
Wholesale data movement is expensive and often performs poorly. In addition, creating a copy of the data introduces risks and liabilities. To begin with, the copied and original data can fall out of sync. Replicating granular data can also violate regulatory requirements, especially if sensitive data fields are involved. Pulling summarized data from an application avoids these inefficiencies and liabilities, but many REST APIs don’t provide summarization functionality.
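The following Python sketch illustrates why missing summarization forces wholesale data movement. It assumes a hypothetical paginated endpoint, simulated here by the in-memory FAKE_PAGES list and the fetch_page function (both invented for illustration), that returns only granular order rows. To get a total per region, the client has to pull every row and aggregate locally.

```python
from collections import defaultdict

# Stand-in for a paginated REST endpoint that returns only granular rows.
# In practice each page would be a separate HTTP call.
FAKE_PAGES = [
    [{"region": "EMEA", "amount": 100.0}, {"region": "APAC", "amount": 40.0}],
    [{"region": "EMEA", "amount": 60.0}],
]


def fetch_page(page_number):
    """Return one page of order rows, or an empty list past the last page."""
    return FAKE_PAGES[page_number] if page_number < len(FAKE_PAGES) else []


def total_by_region():
    """Aggregate client-side, counting how many rows had to be moved."""
    totals = defaultdict(float)
    rows_moved = 0
    page = 0
    while True:
        rows = fetch_page(page)
        if not rows:
            break
        for row in rows:
            totals[row["region"]] += row["amount"]
        rows_moved += len(rows)
        page += 1
    return dict(totals), rows_moved


totals, rows_moved = total_by_region()
print(totals, rows_moved)
```

In this toy case three rows are moved to produce two numbers. With millions of rows the ratio becomes far worse, and every row pulled is a copy that can drift out of sync or expose sensitive fields.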
All of this serves as a disincentive for developers to go through the integration process. With each new data source, there’s risk, unpleasantness, and new complexity. And even to the extent that developers are willing to put up with this, they’re the only ones who can. Business analysts and data scientists, by and large, are not in a position to use REST APIs themselves. As a result, they are dependent on developers to get access to queryable data. This rules out self-service approaches and is reminiscent of the days of highly centralized BI.
Figure 1. Deficiencies of REST APIs for data integration