Tag Archives: big data

Hadoop Ecosystem

When discussing the Hadoop ecosystem, the first name that comes to mind is MapReduce, since it is the foundation on which the entire Hadoop framework relies. Data processing in Hadoop is carried out with the MapReduce algorithm, which plays a central role in the platform. For writing this MapReduce algorithm, two […]
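To make the MapReduce model concrete, here is a minimal single-process sketch of its map, shuffle, and reduce phases in plain Python; the word-count task, input lines, and function names are illustrative choices, not actual Hadoop API code.

```python
from collections import defaultdict

def map_phase(line):
    # map: emit a (word, 1) pair for every word in a line of input
    for word in line.split():
        yield (word, 1)

def shuffle(pairs):
    # shuffle: group all emitted values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # reduce: combine all values for one key into a final result
    return (key, sum(values))

lines = ["big data big", "hadoop data"]
pairs = [p for line in lines for p in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'big': 2, 'data': 2, 'hadoop': 1}
```

In real Hadoop, the framework runs the map and reduce functions in parallel across the cluster and handles the shuffle between them; the sketch only shows the data flow.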

What’s Next for Apache Hadoop Data Management and Governance

Hadoop – the data processing engine based on MapReduce – is being superseded by new processing engines: Apache Tez, Apache Storm, Apache Spark, and others. YARN makes any data processing future possible. But Hadoop the platform – thanks to YARN as its architectural center – is the future for data management, with a selection of […]

The Importance of Apache Drill to the Big Data Ecosystem

You might be wondering what bearing a history lesson may have on a technology project such as Apache Drill. In order to truly appreciate Apache Drill, it is important to understand the history of the projects in this space, as well as the design principles and the goals of its implementation. The lessons that have been […]

How SQOOP-1272 Can Help You Move Big Data from Mainframe to Apache Hadoop

Apache Sqoop provides a framework to move data between HDFS and relational databases in parallel using Hadoop’s MapReduce framework. As Hadoop becomes more popular in enterprises, there is a growing need to move data from non-relational sources, such as mainframe datasets, into Hadoop. Possible reasons for this include: HDFS is used simply as an […]
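For context, SQOOP-1272 added an `import-mainframe` tool to Sqoop for pulling mainframe sequential datasets into HDFS. A hedged sketch of such an invocation follows; the host, dataset, and directory names are placeholders, not real systems.

```shell
# Hypothetical invocation of Sqoop's mainframe import tool;
# all names below are placeholders.
sqoop import-mainframe \
  --connect mainframe.example.com \
  --dataset SOME.DATASET.NAME \
  --target-dir /data/mainframe_import \
  --num-mappers 4
```

As with a regular `sqoop import`, the `--num-mappers` option controls how many parallel map tasks perform the transfer.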

Drill into Your Big Data Today with Apache Drill

Big data techniques are becoming mainstream in an increasing number of businesses, but how do people get self-service, interactive access to their big data? And how do they do this without having to train their SQL-literate employees to be advanced developers? One solution is to take advantage of the rapidly maturing open source, open community […]

What are the prerequisites for Big Data Hadoop?

Working directly with Java APIs can be tedious and error-prone. It also restricts usage of Hadoop to Java programmers. Hadoop offers two solutions for making Hadoop programming easier. Pig is a programming language that simplifies the common tasks of working with Hadoop: loading data, expressing transformations on the data, and storing the final results. […]
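As an illustration of how Pig simplifies those three tasks, here is a minimal Pig Latin sketch of a load-transform-store pipeline; the file paths and relation names are hypothetical examples, not part of the original post.

```pig
-- load: read lines from a hypothetical input file
lines   = LOAD 'input.txt' AS (line:chararray);
-- transform: split lines into words and count each word
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
grouped = GROUP words BY word;
counts  = FOREACH grouped GENERATE group AS word, COUNT(words) AS n;
-- store: write the final results out
STORE counts INTO 'output';
```

Pig compiles a script like this into MapReduce jobs, so the author never writes mapper or reducer classes by hand.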

Who can become a Hadoop professional?

System administrators can learn some Java skills as well as cloud services management skills to start working with Hadoop installation and operations. DBAs and ETL data architects can learn Apache Pig and related technologies to develop, operate, and optimize the massive data flows going into the Hadoop system. BI analysts and data analysts can learn SQL and Hive […]