HBase, also known as the Hadoop Database, is an open-source NoSQL database management system. It is a distributed, non-relational, column-oriented database that uses the Hadoop Distributed File System (HDFS) as its persistence store for big data projects.
It is a top-level Apache project that began at Powerset, which needed to process massive amounts of data for natural language search. It is modeled after Google's Bigtable. HBase runs on top of HDFS and is well suited to fast read and write operations on large datasets, with high throughput and low latency.
HBase provides a robust, reliable mechanism for managing huge amounts of data across a distributed environment.
Like a traditional database system, HBase comprises a set of tables. Each table contains rows and columns. Each row is identified by a unique row key, which plays the role of a primary key, and every access to an HBase table goes through this row key. The intersection of a row and a column is called a cell.
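The table, row, column, and cell vocabulary above can be illustrated with a small in-memory sketch. This is a toy model only, not the HBase client API: the `ToyTable` class, its method names, and the sample row keys are all hypothetical and chosen purely for illustration.

```python
# Toy in-memory model of a table. This is NOT the HBase client API;
# every name here is hypothetical and exists only to illustrate the terms.
from typing import Dict, Optional


class ToyTable:
    """Maps row key -> {column name -> cell value}."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, str]] = {}

    def put(self, row_key: str, column: str, value: str) -> None:
        # Create the row on first write, then set the cell.
        self._rows.setdefault(row_key, {})[column] = value

    def get_cell(self, row_key: str, column: str) -> Optional[str]:
        # Every lookup starts from the row key; a missing cell yields None.
        return self._rows.get(row_key, {}).get(column)


table = ToyTable()
table.put("user#1001", "name", "Ada")
table.put("user#1001", "city", "London")
table.put("user#1002", "name", "Linus")

print(table.get_cell("user#1001", "city"))  # cell at row "user#1001", column "city"
print(table.get_cell("user#1002", "city"))  # this row has no "city" cell
```

Note that the second row simply has no `city` cell. Rows in this sketch do not have to share the same set of columns.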
Features of HBase
Stores huge data: HBase can hold a very large number of tables, each with many rows and columns, which lets it store large volumes of information. This capacity comes from being layered on Hadoop clusters of commodity hardware.
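It is a columnar database: data is organized into tables of rows and columns, which on the surface resembles a typical relational database management system (RDBMS). Unlike an RDBMS, however, HBase groups columns into column families and stores the data of each family together.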
Real-time access to data: HBase offers real-time access to data in Hadoop. It lets you look up individual records and produce analytic reports across a massive amount of data.
A form of storage: HBase keeps all of its data in HDFS, distributed across the data nodes of the cluster.
Where to Use HBase
- Apache HBase can be used to gain random, real-time read/write access to Big Data.
- It can also be used to host relatively large tables on top of clusters of commodity hardware.
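To make "random read/write access" more concrete, here is a minimal sketch of a store that keeps its row keys sorted, so that it can serve both single-key lookups and scans over a key range. HBase itself keeps rows ordered by row key, but this code is only an in-memory analogy: the `SortedKeyStore` class and its methods are hypothetical and do not correspond to the HBase API.

```python
# In-memory analogy for random access plus range scans over sorted row keys.
# NOT the HBase API; SortedKeyStore and its methods are hypothetical.
import bisect
from typing import Dict, Iterator, List, Optional, Tuple


class SortedKeyStore:
    def __init__(self) -> None:
        self._keys: List[str] = []         # row keys, kept in sorted order
        self._values: Dict[str, str] = {}  # row key -> stored value

    def put(self, key: str, value: str) -> None:
        # Random write: insert a new key in sorted position, or overwrite.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key: str) -> Optional[str]:
        # Random read: direct lookup by row key.
        return self._values.get(key)

    def scan(self, start: str, stop: str) -> Iterator[Tuple[str, str]]:
        # Range scan: yield rows with start <= key < stop, in key order.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


store = SortedKeyStore()
for key, value in [("row-003", "c"), ("row-001", "a"), ("row-002", "b")]:
    store.put(key, value)
store.put("row-002", "B")  # overwrite an existing row in place

print(store.get("row-002"))
print(list(store.scan("row-001", "row-003")))
```

Because the keys are kept sorted, a scan only needs to locate the start of the range and read forward. This is one reason row key design matters for workloads that read contiguous ranges.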
Applications of HBase
- HBase is a good fit for write-heavy applications.
- HBase can also be used for providing fast random access to available data.
- Companies such as Twitter, Facebook, Yahoo, and Adobe use HBase internally.