The Most Critical Big Data Technology You Don’t Understand

April 27, 2016

Today’s blog post is courtesy of David Loshin. An expert in all things data, he discusses cross-platform integration challenges here.

David is a recognized thought leader and expert consultant in the areas of analytics, big data, data governance, data quality, master data management, and business intelligence. He is also a prolific author regarding business intelligence best practices.

We’re excited to have him discussing cross-platform integration, a topic we take very seriously. David also invites you to participate in a survey, which we strongly encourage you to take, as he is at the cutting edge of BI and Big Data and your feedback will help drive mutual success for all.

The Productionalization Chasm

During the past year we seem to have reached the tipping point for adoption of big data and Hadoop as part of the enterprise application framework. Many companies are crossing the chasm between experimentation and productionalization, building more and more production applications on Apache ecosystem components, including Hadoop-based and Spark-based capabilities.

This trend seems to bode well for a broader adoption of the Hadoop stack in ways that are fully integrated into the enterprise. Yet there are some important issues that keep some organizations holding back. Perhaps they are hesitant to transition to less-than-conventional infrastructure. Maybe there is confusion surrounding the rapid change in the Apache big data stack. Perhaps there is still a dearth of skilled team members with the right experience to design and develop the applications.

Data Access through Cross-Platform Integration

However, one of the biggest issues lurking beneath the surface is the basic fact that big data systems need access to data. That data might be resident in a data warehouse, or in the data layer of a transaction processing system. It may be sitting on the mainframe, on a server, or on people’s desktops. It might be resident in a SaaS system, or parked somewhere in the Cloud. It might be streaming from any of a multitude of continuous feeds. And all of that data needs to land in the big data environment before it can be made available for reporting and analytics.

That need can be summarized in a concise phrase: cross-platform integration, or the middleware used to streamline access to, and harmonize, the data that originates from a wide variety of sources. That data may be structured or unstructured, on-premises or off-premises, static or streaming. Yet in conversations with people considering adopting big data technology, I have found that few have really considered the complexity of integrating data from multiple sources and landing it in a big data environment in a way that is usable.
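To make the idea of harmonizing concrete, here is a minimal sketch in Python. It assumes two hypothetical sources, a CSV extract from a data warehouse and a line-delimited JSON feed from a SaaS system, and maps both onto one common record shape before landing them together. The field names (`CUST_ID`, `AMT`, `customerId`, `amountCents`), the sample data, and the target schema are all invented for illustration; real integration middleware handles far more than this.

```python
import csv
import io
import json
from typing import Dict, Iterator, List

# Hypothetical target schema: customer_id (str), amount (float), source (str).

def from_warehouse_csv(text: str) -> Iterator[Dict]:
    """Normalize rows from an assumed warehouse CSV extract."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "customer_id": row["CUST_ID"].strip(),  # trim stray padding
            "amount": float(row["AMT"]),            # amounts arrive as text
            "source": "warehouse",
        }

def from_saas_json(lines: str) -> Iterator[Dict]:
    """Normalize events from an assumed line-delimited JSON SaaS feed."""
    for line in lines.splitlines():
        if not line.strip():
            continue  # skip blank lines in the feed
        event = json.loads(line)
        yield {
            "customer_id": str(event["customerId"]),   # numeric id -> string
            "amount": event["amountCents"] / 100.0,    # cents -> currency units
            "source": "saas",
        }

def land(*streams: Iterator[Dict]) -> List[Dict]:
    """Collect harmonized records; a stand-in for writing to the big data store."""
    records: List[Dict] = []
    for stream in streams:
        records.extend(stream)
    return records

# Invented sample data for the two hypothetical sources.
WAREHOUSE_EXTRACT = "CUST_ID,AMT\n 1001 ,25.50\n1002,10.00\n"
SAAS_FEED = (
    '{"customerId": 1001, "amountCents": 1999}\n'
    '{"customerId": 1003, "amountCents": 500}\n'
)

landed = land(from_warehouse_csv(WAREHOUSE_EXTRACT), from_saas_json(SAAS_FEED))
for record in landed:
    print(record)
```

Even in this toy case, each source needs its own rules for identifiers, units, and types, which hints at why the effort grows quickly as the number and variety of sources increase.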

Broad-based data accessibility and integration are hard, and few people understand that there are software alternatives that can simplify the process. But they should, since without the data, big data technology is useless. We are hoping to get a better understanding of the types of challenges that people confront as they explore and adopt Hadoop and other big data techniques.

Participate in this Hadoop Survey

My research company, DecisionWorx, in partnership with The Bloor Group, is looking for your input to help us understand where organizations are in their Hadoop journey. Please take the time to complete our new survey on Hadoop Productionalization. In it, we solicit your feedback about your corporate experience in evaluating different software distributions and vendor offerings, the relative ease of designing and developing applications, expectations about cost, and many other facets of integrating this innovative technology into the enterprise.

David Loshin

David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of analytics, big data, data governance, data quality, master data management, and business intelligence. Along with consulting on numerous data management projects over the past 15 years, David is a prolific author of books and papers on data management and business intelligence best practices. He is a frequent invited speaker at conferences and web seminars and on sponsored web sites and TechTarget channels, and he shares additional notes and articles at www.dataqualitybook.com.
