What’s Next for Apache Hadoop Data Management and Governance

Hadoop – the data processing engine based on MapReduce – is being superseded by new processing engines: Apache Tez, Apache Storm, Apache Spark and others. YARN makes any data processing future possible.

But Hadoop the platform – thanks to YARN as its architectural center – is the future for data management, with a selection of best-fit processing engines for any given use case: from batch to interactive to real-time. More than that, Hadoop has become a movement. It is the center of gravity for everyone engaged in the challenge of big data.

Hadoop Then. A MapReduce Processing Engine.

Traditional Hadoop uses HDFS for scalable storage and MapReduce as the sole framework on which every workload runs. In this era, mappers and reducers ruled, and an ecosystem of tools climbed onto the elephant: Hive for SQL processing and Pig for scripting data flows, to name just two. Early Hadoop vendors also focused on bolting on basic levels of operations management, security, and other capabilities.

Hadoop Now. YARN enables versatile processing options for the Enterprise.

YARN is the prerequisite for Enterprise Hadoop, providing resource management and a central platform to deliver consistent operations, security, and data governance tools across Hadoop clusters.

YARN also extends the power of Hadoop to incumbent and new technologies found within the data center so that they can take advantage of cost-effective, linear-scale storage and processing. It gives ISVs and developers a consistent framework for writing data access applications that run IN Hadoop.

Hadoop Next. The future of data management.

This means that Enterprise Hadoop powered by YARN is truly an extensible PLATFORM that facilitates both ongoing innovation and mainstream adoption by enterprises of all types and sizes, across a wide range of production use cases at scale. As we know from this week, Apache Spark has garnered tremendous interest, much as Apache Storm continues to do. The point is that new data engines will continue to emerge, and YARN is there to provide a clean and easy way for that innovation to plug into Hadoop so that enterprises can benefit.

So repeat after me:

Traditional Hadoop and the era of batch-only mappers and reducers are dead!

YARN makes any data processing future possible.
