Understanding HBase

Column in HBase

An HBase table can have multidimensional columns: each column family holds any number of columns.
So if you are used to an RDBMS, the HBase table structure may be hard to understand at first. Expressing it as JSON helps a little.

1. Run hbase shell

hadoop@delmonte:~$ hbase shell

2. Create Table

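You can create a table and its column families in two ways.

With per-family options (TTL is in seconds, so 2592000 is 30 days):

hbase> create 'blogposts', {NAME => 'post'}, {NAME => 'image', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}

Or with just the column family names:

hbase> create 'blogposts', 'post', 'image'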
Expressed as JSON
blogposts = {
}

3. Add Data

hbase> put 'blogposts', 'post1', 'post:title', 'Hello World'
hbase> put 'blogposts', 'post1', 'post:author', 'The Author'
hbase> put 'blogposts', 'post1', 'post:body', 'This is a blog post'
hbase> put 'blogposts', 'post1', 'image:header', 'image1.jpg'
hbase> put 'blogposts', 'post1', 'image:bodyimage', 'image2.jpg'
Expressed as JSON
blogposts = {
  'post1':{ // row
    'post':{ // column family
      'title':'Hello World', // cell
      'author':'The Author', // cell
      'body':'This is a blog post' // cell
    },
    'image':{ // column family
      'header':'image1.jpg', // cell
      'bodyimage':'image2.jpg' // cell
    }
  }
}
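
The same nested shape can be modeled with plain Python dictionaries. This is only a rough sketch to make the structure concrete: the `put` helper below is hypothetical and is not part of any HBase client library.

```python
# A toy model of an HBase table: row -> column family -> column -> value.
# put() is a hypothetical helper for illustration, not an HBase client call.

def put(table, row, column, value):
    """Store value under 'family:column' for the given row."""
    family, qualifier = column.split(":", 1)
    table.setdefault(row, {}).setdefault(family, {})[qualifier] = value

blogposts = {}
put(blogposts, "post1", "post:title", "Hello World")
put(blogposts, "post1", "post:author", "The Author")
put(blogposts, "post1", "post:body", "This is a blog post")
put(blogposts, "post1", "image:header", "image1.jpg")
put(blogposts, "post1", "image:bodyimage", "image2.jpg")

# Walk down the three levels: row, column family, column.
print(blogposts["post1"]["post"]["title"])
```

Each `put` only names a row and a 'family:column' pair, so new columns appear inside a family without any schema change.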

4. Look at the Data

hbase> get 'blogposts', 'post1'
COLUMN          CELL
image:bodyimage timestamp=1229953133260, value=image2.jpg
image:header    timestamp=1229953110419, value=image1.jpg
post:author     timestamp=1229953071910, value=The Author
post:body       timestamp=1229953072029, value=This is a blog post
post:title      timestamp=1229953071791, value=Hello World
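
The output above lists one line per cell, sorted by column name, with a timestamp next to each value. A small Python sketch of that behavior, assuming a toy in-memory table and hypothetical `put` and `get` helpers (not the real HBase API), might look like this:

```python
import time

# Toy model: each cell keeps a timestamp next to its value, and get()
# lists the cells of a row sorted by 'family:column'.
# put() and get() are hypothetical helpers, not HBase client calls.

def put(table, row, column, value, timestamp=None):
    if timestamp is None:
        # Assumption: milliseconds since the epoch, like the values shown above.
        timestamp = int(time.time() * 1000)
    table.setdefault(row, {})[column] = (timestamp, value)

def get(table, row):
    cells = table.get(row, {})
    return [(column, ts, value) for column, (ts, value) in sorted(cells.items())]

blogposts = {}
put(blogposts, "post1", "post:title", "Hello World", 1229953071791)
put(blogposts, "post1", "image:header", "image1.jpg", 1229953110419)
put(blogposts, "post1", "image:bodyimage", "image2.jpg", 1229953133260)

for column, ts, value in get(blogposts, "post1"):
    print(f"{column} timestamp={ts}, value={value}")
```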

Row in HBase

In HBase, keep in mind that a table can contain more than a billion rows. To find a particular row, HBase uses a key called the ‘rowkey’. It plays a role similar to the ‘hashkey’ in a hash table, except that HBase stores rows sorted by rowkey.

Many people recommend MD5 for making a ‘rowkey’, and there are reasons for that.

First of all, MD5 turns a long name into a short, fixed-length value (16 bytes, usually written as 32 hexadecimal characters).
e.g. The URL ‘http:// do-buffalo-buffalo-buffalo-buffalo-buffalo-buffalo-buffalo-buffalo.com’ can become ‘739729cc1870c16e78c1cb1395bf2bc4’.
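
As a minimal sketch of this idea, Python's standard `hashlib` module can produce such a key. The function name `md5_rowkey` and the sample inputs are made up for illustration.

```python
import hashlib

def md5_rowkey(name: str) -> str:
    """Return the MD5 digest of name as 32 hexadecimal characters."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()

# However long the input is, the key has the same length.
short_key = md5_rowkey("a")
long_key = md5_rowkey("http://example.com/" + "buffalo-" * 50)
print(len(short_key), len(long_key))
```

The raw digest is 16 bytes; the hexadecimal form is twice as long but easier to read in the shell.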

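Second, if you increase the ‘rowkey’ monotonically, like ‘r1’, ‘r2’, ‘r3’ … ‘rn’, you will run into a problem called ‘RegionServer Hotspotting’: because rows are stored sorted by rowkey, new rows all land in the same region, so a single RegionServer takes all the writes. It is described really well in comic form here.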
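
The effect can be imitated with a small simulation. This is a simplified sketch, not how HBase assigns regions internally: it assumes four regions with made-up split points and just counts which region each key would sort into.

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical split points: four regions covering the sorted key space.
SPLIT_POINTS = ["4", "8", "c"]

def region_for(rowkey: str) -> int:
    """Index of the region whose sorted key range contains rowkey."""
    return bisect.bisect_right(SPLIT_POINTS, rowkey)

sequential = [f"r{i}" for i in range(1, 1001)]
hashed = [hashlib.md5(k.encode("utf-8")).hexdigest() for k in sequential]

# Sequential keys all sort after the last split point, so one region gets them all.
print(Counter(region_for(k) for k in sequential))
# Hashed keys start with varied hex characters, so they spread over the regions.
print(Counter(region_for(k) for k in hashed))
```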

Of course, you can also make a ‘rowkey’ from a combination of values like this. The choice depends on what your site wants to store in HBase.
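
One possible combination, shown here only as an illustration, puts a short hash prefix in front of readable fields. The layout, the `|` separator, and the field names `user_id` and `post_date` are all assumptions, not something HBase prescribes.

```python
import hashlib

def composite_rowkey(user_id: str, post_date: str) -> str:
    """Build '<hash prefix>|<user_id>|<post_date>' (hypothetical layout)."""
    # The prefix spreads different users across the key space.
    prefix = hashlib.md5(user_id.encode("utf-8")).hexdigest()[:4]
    return f"{prefix}|{user_id}|{post_date}"

key = composite_rowkey("raphael", "2012-01-31")
print(key)
```

With this layout, rows for the same user share a prefix and sort next to each other, ordered by date.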


Published by

Raphael, Eom

