True Democratization of Analytics with Meta-Learning

August 11, 2017

Many solutions claim to democratize analytics, but most do so only within narrow constraints. A meta-learning approach democratizes without those limits.

The democratization of analytics has become a popular term, and a quick Google search turns up plenty of results on the need to empower more people with analytics and on the rise of citizen data scientists. The ability to easily make better use of your (constantly growing) pool of data is a critical driver of business success, but many of the existing solutions that claim to democratize analytics only do so within severe limits. If you have a complex business scenario and expect revolutionary insights from these solutions, it’s easy to come away disappointed.

However, the democratization of analytics isn’t just a buzzword that refers to a narrow approach. It’s possible to do so much more. Let’s quickly review the current state of the market that you’re likely familiar with, and then dive into our proposed solution.

Lightweight Solutions that Oversimplify

One type of solution is marketed as simple because it works in an environment business leaders are already familiar with, like Excel or Tableau. These solutions tend to be lightweight and are really about generating a digestible report with little effort. That’s all well and good, but it democratizes report generation and lightweight analysis rather than enabling you to develop truly predictive scenarios that require Machine Learning.

Narrowly Defined Analytics as a Service

Another option that is gaining adoption is to use pre-trained, out-of-the-box models for image analysis and classification, speech-to-text conversion, and translation services. While these make certain limited use cases available to more organizations, they don’t actually democratize the predictive analytics needed for business-specific time-series scenarios.

Cloud Environments that are only a Framework

Finally, there are numerous cloud vendors that take care of managing the infrastructure necessary for Big Data analytics and Machine Learning, whether it’s hosting Hadoop/MapReduce, Spark, etc., providing managed database support, or hosting machine intelligence software libraries like TensorFlow. At the end of the day, these options are really democratizing the infrastructure necessary to support Machine Learning—they aren’t democratizing the Data Scientist lifecycle itself, something we discuss in detail a little later in the post.

But What about More Sophisticated Business Scenarios?  

The solutions above may technically “democratize” some form of analytics, but they fall short in democratizing Machine Learning for individual business use cases like predictive maintenance for the Industrial IoT, improving patient outcomes in healthcare, detecting fraud in financial services, etc. So while simple scenarios are becoming a commodity, business scenarios that provide the most value are beyond the reach of most organizations.

Why?

Because the Machine Learning or Data Scientist lifecycle is complex. A successful implementation includes a business requirements phase, data preparation, data modeling, and production deployment work. The last three phases are particularly resource intensive.

  • The data preparation phase involves collecting, cleansing, and transforming the data, and separate sets of data are required for training, testing, and scoring.
  • The data modeling phase is especially demanding and involves feature engineering, algorithm selection, testing, tuning and model optimization. These steps need to be repeated until the models reach an acceptable level of quality.
  • Then there is the deployment—you have to take the models and deploy them in production using operational data. The work doesn’t end there, as you must continuously review and revise the models to keep up with changes in the environment.
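To give a sense of why the modeling phase is so repetitive, here is a minimal sketch of the loop described above: hold out test data, try several algorithms and parameter settings, and stop once the error is acceptable. Everything in it is an assumption made for illustration, including the two toy forecasting algorithms, the `select_model` helper, and the synthetic series. It is not how any particular product implements the lifecycle.

```python
from statistics import mean

def split(series, test_fraction=0.25):
    """Data preparation: hold out the tail of the series for testing."""
    cut = int(len(series) * (1 - test_fraction))
    return series[:cut], series[cut:]

def moving_average(history, window):
    """Candidate algorithm 1: forecast the mean of the last `window` points."""
    return mean(history[-window:])

def exp_smoothing(history, alpha):
    """Candidate algorithm 2: simple exponential smoothing."""
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def evaluate(forecast, param, train, test):
    """Walk forward through the test set and return the mean absolute error."""
    history = list(train)
    errors = []
    for actual in test:
        errors.append(abs(forecast(history, param) - actual))
        history.append(actual)  # the true value becomes part of the history
    return mean(errors)

def select_model(series, candidates, target_mae):
    """Try each (algorithm, parameter) pair; stop early once quality is acceptable."""
    train, test = split(series)
    best = None
    for name, forecast, params in candidates:
        for param in params:
            mae = evaluate(forecast, param, train, test)
            if best is None or mae < best[2]:
                best = (name, param, mae)
            if mae <= target_mae:
                return best
    return best

# Hypothetical search space: algorithm selection plus tuning values.
CANDIDATES = [
    ("moving_average", moving_average, [2, 4, 8]),
    ("exp_smoothing", exp_smoothing, [0.2, 0.5, 0.8]),
]

# Synthetic series: an upward trend with an alternating wobble.
series = [10 + 0.5 * t + (1 if t % 2 else -1) for t in range(40)]
best_name, best_param, best_mae = select_model(series, CANDIDATES, target_mae=0.0)
print(best_name, best_param, round(best_mae, 3))
```

Even this toy version evaluates six model variants for a single series. A real scenario adds feature engineering, many more algorithms, and retraining as the environment changes, which is where the manual effort piles up.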

It’s pretty clear that this is a completely different challenge that the options described above can’t address. While there are cloud options that will manage the infrastructure, and there are tools that make the data scientist more efficient, there is a dearth of solutions that tackle the democratization of complex Machine Learning.

The need for democratization is driven by the amount of time and resources it takes to do this work manually, even with a team of data scientists. For organizations that don’t have data scientists, it is a non-starter given traditional tools and solutions.

Enter Machine Learning and Meta-Learning

It’s evident that there is a need for a better way forward when it comes to solving these complex business challenges. Data scientists have to be freed from the laborious day-to-day grind that consumes so much of their time, so that they can support more business scenarios in less time.

Progress DataRPM is designed specifically to meet this need. With an innovative machine-automated approach, we are able to automate a range of complex tasks that the solutions above simply can’t.

  • DataRPM uses a meta-data approach to remember, share and apply learnings from the model experiments. This approach speeds up the iterative process required to build and test models, and has also been shown to increase the accuracy of production analytic results.
  • DataRPM also leverages a novel approach for detecting failures. Traditional methods build models that predict future failures, or the need for optimization, strictly from past failures. That provides poor coverage because it can’t predict random failures (which are the predominant type of failure). DataRPM instead models normal behavior and then detects deviations from normal. These are flagged as potential problems that can be managed effectively by the business. This intelligence is then fed back into the model so that it is continuously improved based on production data.
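The idea in the second bullet (model normal behavior, flag deviations, feed results back) can be sketched in a few lines. This is only an illustration under stated assumptions: it swaps in a plain z-score test, and the `NormalBehaviorModel` class, its threshold, and the sensor readings are all hypothetical. It is not DataRPM’s actual algorithm.

```python
from statistics import mean, stdev

class NormalBehaviorModel:
    """Hypothetical baseline: learns the mean and spread of healthy readings."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # z-score above which a reading is flagged
        self.readings = []

    def fit(self, normal_readings):
        """Learn what normal looks like from readings taken in healthy operation."""
        self.readings = list(normal_readings)
        self._refresh()

    def _refresh(self):
        self.mu = mean(self.readings)
        # Guard against a zero spread when all readings are identical.
        self.sigma = max(stdev(self.readings), 1e-9)

    def score(self, value):
        """Distance from normal, measured in standard deviations."""
        return abs(value - self.mu) / self.sigma

    def is_anomaly(self, value):
        return self.score(value) > self.threshold

    def feedback(self, value):
        """Feed a reading confirmed as normal back into the model."""
        self.readings.append(value)
        self._refresh()

# Made-up temperature readings from a healthy machine.
model = NormalBehaviorModel()
model.fit([70.1, 69.8, 70.3, 70.0, 69.9, 70.2, 70.1, 69.7])

# No failure history is needed: anything far from normal is flagged.
flags = [model.is_anomaly(v) for v in [70.0, 70.4, 75.0]]
print(flags)  # [False, False, True]

# Production data confirmed as normal refines the baseline over time.
model.feedback(70.4)
```

Because the model is trained only on normal behavior, it needs no examples of past failures, which is what lets this style of approach cover failures that have never been seen before.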

This solution allows your team to focus on the most strategic and actionable part of the process, which is analyzing and assessing the results. Whether you currently employ data scientists or not, it reduces the amount of time you need to allocate to creating and evaluating complex models.

Rather than constrain analytics and generate a simple or limited result, the meta-learning approach looks fully at the unique problems facing your business, is flexible enough to be adapted to new problems as they arise and is constantly improving. By automating some of the most arduous components of data analysis, you’re free to focus on delivering the insights and outcomes you need—quickly. It's all part of our cognitive-first vision for business applications. You can learn more about our platform for cognitive predictive maintenance here.

Mark Troester

Mark Troester is the Vice President of Strategy at Progress. He guides the strategic go-to-market efforts for the Progress cognitive-first strategy. Mark has extensive experience in bringing application development and big data products to market. Previously, he led product marketing efforts at Sonatype, SAS and Progress DataDirect. Before these positions, Mark worked as a developer and developer manager for start-ups and enterprises alike. You can find him on LinkedIn or @mtroester on Twitter.
