Top 3 'Must Knows' for Hadoop


Posted on August 03, 2012

  As companies look for ways to leverage the massive amounts of data they have collected, Hadoop is quickly becoming all the rage. But as it becomes synonymous with Big Data, it’s important to step back and consider what Hadoop is really about.

  1. Hadoop stores massive amounts of data. Hadoop is an open-source (that is, free) storage system that can manage extremely large quantities of all kinds of data, structured or unstructured. Just how much data can it handle, exactly? Well, Facebook recently announced that just one of its Hadoop clusters is 100 petabytes in size. Maria Korolov has calculated that to be over a thousand years of video at a data transfer rate of roughly 1 gig per hour. Thinking about how much data that is can be dizzying, hence the moniker ‘Big Data.’
  2. Hadoop saves money. Because it’s free, Hadoop is extremely cost-effective compared with traditional data extraction, transformation and loading (ETL) processes. It can also extract data from nearly any type of system and load it into a database. So how does it work? Hadoop runs on a group of machines that don’t share memory or disks, known as a cluster. This makes it possible to use many different servers simultaneously to work on the same sets of data, thereby dramatically increasing the speed at which you can process and store the data.
  3. Hadoop aids in data analysis. Yes, Hadoop can handle large quantities of data. But then what? Well, Hadoop can process and analyze a large amount of data at very high speeds, boiling it down quickly to be consumed and questioned. Think of Hadoop as a waiting room for your data, just hanging out there waiting to be asked questions it has interesting answers for. Once you have someone who can ask the questions and communicate effectively with it (like a data scientist), you can start analyzing problems, such as finding and fixing all of Boston’s potholes or helping recruiters sift through dozens of resumes. Hadoop lays the foundation for gaining insight from your data, allowing you to make intelligent decisions about your business, your community, and your world, and we can all benefit from that.
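
To make the "many servers working on the same data" idea from point 2 a bit more concrete, here is a minimal sketch of the map, shuffle, and reduce steps commonly used for processing on a Hadoop cluster. It runs in a single Python process and does not use Hadoop's actual API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions chosen only for illustration. Each input chunk stands in for a block of data that a separate machine would handle.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit (word, 1) pairs for one chunk of text, as a single worker might."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Group the values emitted by all workers under their shared key."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine each key's values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    # On a real cluster each chunk would be mapped on a different machine;
    # here the chunks are simply processed one after another.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    # Hypothetical sample data: two chunks standing in for two blocks of a file.
    chunks = ["big data is big", "hadoop stores big data"]
    print(word_count(chunks))
```

Because each call to `map_phase` looks only at its own chunk, the chunks can in principle be processed on machines that share neither memory nor disks, which is the property the cluster description above relies on.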

Here at DataDirect we're pretty excited to see what the world is doing with Hadoop and how we can help. We've been leading the charge in the Data world for years and have some cool ideas around Hadoop and how we're helping customers get more out of their piles of Big Data. Stay tuned to DataDirect news, as we will soon be announcing a new Hadoop driver that will allow users to directly access Hadoop (through Hive) to enable maximum insights when analyzing their Big Data.

Jesse Davis


As Senior Director of Research & Development, Jesse is responsible for the daily operations, product development initiatives and forward-looking research for Progress DataDirect. Jesse has spent nearly 20 years creating enterprise data products and has served as an expert on several industry standards including JDBC, J2EE, DRDA and OData. Jesse holds a Bachelor of Science degree in Computer Engineering from North Carolina State University.

