Column in HBase
An HBase table can have multidimensional columns: columns are grouped into column families, and each family holds any number of named cells.
So if you come from an RDBMS background, the HBase table structure may be hard to grasp at first. Expressing it as JSON helps a little.
1. Run hbase shell
hadoop@delmonte:~$ hbase shell
2. Create Table
You can create a table and its column families in two ways.
hbase> create 'blogposts', {NAME => 'post'}, {NAME => 'image', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
or
hbase> create 'blogposts', 'post', 'image'
Expressed as JSON:
blogposts = { 'post':{}, 'image':{} }
3. Add Data
hbase> put 'blogposts', 'post1', 'post:title', 'Hello World'
hbase> put 'blogposts', 'post1', 'post:author', 'The Author'
hbase> put 'blogposts', 'post1', 'post:body', 'This is a blog post'
hbase> put 'blogposts', 'post1', 'image:header', 'image1.jpg'
hbase> put 'blogposts', 'post1', 'image:bodyimage', 'image2.jpg'
Expressed as JSON:
blogposts = {
  'post1': {                             // row
    'post': {                            // column family
      'title': 'Hello World',            // cell
      'author': 'The Author',            // cell
      'body': 'This is a blog post'      // cell
    },
    'image': {                           // column family
      'header': 'image1.jpg',            // cell
      'bodyimage': 'image2.jpg'          // cell
    }
  }
}
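The nested structure above can be mimicked with plain Python dictionaries. This is only a toy model of the logical layout, not how HBase actually stores data; the helper names `make_table`, `put` and `get` are made up for illustration and merely echo the shell commands.

```python
# Toy in-memory model: table -> rowkey -> column family -> qualifier -> value.
# All helpers here are hypothetical; they only imitate the shape of the shell commands.

def make_table(families):
    """Create an empty 'table' that accepts only the given column families."""
    return {"families": set(families), "rows": {}}

def put(table, rowkey, column, value):
    """Store a value under 'family:qualifier', similar to the shell's put."""
    family, qualifier = column.split(":", 1)
    if family not in table["families"]:
        raise KeyError(f"unknown column family: {family}")
    row = table["rows"].setdefault(rowkey, {})
    row.setdefault(family, {})[qualifier] = value

def get(table, rowkey):
    """Return the row as a flat {'family:qualifier': value} dict, sorted by column name."""
    row = table["rows"].get(rowkey, {})
    flat = {f"{fam}:{q}": v for fam, cols in row.items() for q, v in cols.items()}
    return dict(sorted(flat.items()))

blogposts = make_table(["post", "image"])
put(blogposts, "post1", "post:title", "Hello World")
put(blogposts, "post1", "post:author", "The Author")
put(blogposts, "post1", "image:header", "image1.jpg")

for column, value in get(blogposts, "post1").items():
    print(column, "=", value)
```

Note that the column families are fixed when the table is created, while the qualifiers after the colon (`title`, `author`, ...) can be added freely with each `put`.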
4. Look at the Data
hbase> get 'blogposts', 'post1'
COLUMN                CELL
 image:bodyimage      timestamp=1229953133260, value=image2.jpg
 image:header         timestamp=1229953110419, value=image1.jpg
 post:author          timestamp=1229953071910, value=The Author
 post:body            timestamp=1229953072029, value=This is a blog post
 post:title           timestamp=1229953071791, value=Hello World
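Each cell in the output carries a timestamp because HBase keeps cell values as timestamped versions, and the `VERSIONS => 1` setting from the create example limits how many of them are kept. A rough sketch of that idea in Python follows; `put_versioned` is a hypothetical helper, and the second value and timestamp are invented for the demonstration.

```python
import time

MAX_VERSIONS = 1  # mirrors VERSIONS => 1 on the 'image' family in the create example

def put_versioned(cell_versions, value, timestamp=None, max_versions=MAX_VERSIONS):
    """Add a (timestamp, value) pair and keep only the newest max_versions entries."""
    if timestamp is None:
        timestamp = int(time.time() * 1000)  # milliseconds, like the shell output
    cell_versions.append((timestamp, value))
    cell_versions.sort(key=lambda tv: tv[0], reverse=True)  # newest first
    del cell_versions[max_versions:]  # drop anything beyond the version limit
    return cell_versions

header = []
put_versioned(header, "image1.jpg", timestamp=1229953110419)
put_versioned(header, "image1-new.jpg", timestamp=1229953999999)  # invented second write
print(header)
```

With a limit of one version, the second write replaces the first from the reader's point of view; a larger limit would keep older values reachable by timestamp.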
Row in HBase
In HBase, keep in mind that a table can consist of more than a billion rows. To find a particular row, HBase uses a key called the 'rowkey'. It plays a role similar to the key in a hash table.
Many people recommend MD5 for building a 'rowkey', and there are reasons for this.
First of all, MD5 shortens a long name to a fixed 16 bytes.
e.g. The URL 'http://do-buffalo-buffalo-buffalo-buffalo-buffalo-buffalo-buffalo-buffalo.com' can become '739729cc1870c16e78c1cb1395bf2bc4' (the 16-byte digest written as 32 hexadecimal characters).
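As a small illustration of the shortening step, the sketch below hashes a long name with Python's standard `hashlib`. The function name `md5_rowkey` is just a label chosen for this example.

```python
import hashlib

def md5_rowkey(name: str) -> bytes:
    """Return the 16-byte MD5 digest of a name, usable as a fixed-length rowkey."""
    return hashlib.md5(name.encode("utf-8")).digest()

url = "http://do-buffalo-buffalo-buffalo-buffalo-buffalo-buffalo-buffalo-buffalo.com"
key = md5_rowkey(url)

print(len(url), "characters ->", len(key), "bytes")
print(key.hex())  # the same digest written as 32 hexadecimal characters
```

Whatever the length of the input, the key is always 16 bytes, which keeps rowkeys compact and uniform in size.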
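Second, if you use monotonically increasing rowkeys such as 'r1', 'r2', 'r3' … 'rn', you will run into a problem called 'RegionServer hotspotting': rows are stored sorted by rowkey, so consecutive keys all land on the same region server. It is explained very well, with comics, in the references below.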
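The effect can be simulated in a few lines. This is a deliberately simplified model, assuming four regions that split the key space on the first byte of the rowkey; real region boundaries are chosen by HBase and change as regions split.

```python
import bisect
import hashlib
from collections import Counter

# Assumed layout: four regions that split the key space on the first key byte.
SPLIT_POINTS = [b"\x40", b"\x80", b"\xc0"]

def region_for(rowkey: bytes) -> int:
    """Index of the region whose sorted key range contains rowkey."""
    return bisect.bisect_right(SPLIT_POINTS, rowkey)

def load_per_region(rowkeys):
    """Count how many keys land in each region."""
    return Counter(region_for(k) for k in rowkeys)

sequential = [f"r{i}".encode() for i in range(1, 1001)]   # r1, r2, ... r1000
hashed = [hashlib.md5(k).digest() for k in sequential]    # same keys, MD5-hashed

print("sequential:", dict(load_per_region(sequential)))
print("hashed:    ", dict(load_per_region(hashed)))
```

In this model every sequential key falls into one region, while the hashed keys spread over all four. The price of hashing is that rows are no longer stored in their natural order, so range scans over the original names stop being efficient.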
Of course, you can also build a 'rowkey' by combining several fields (see the references below). The choice depends on what your site wants to store in HBase.
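One possible combination, sketched here purely as an assumption, is a short hash prefix followed by a readable identifier and a timestamp. The field names `user_id` and `timestamp_ms` and the byte layout are hypothetical and not taken from any particular schema.

```python
import hashlib

def composite_rowkey(user_id: str, timestamp_ms: int) -> bytes:
    """Hypothetical layout: 4-byte MD5 prefix + user_id + 8-byte big-endian timestamp."""
    user_bytes = user_id.encode("utf-8")
    prefix = hashlib.md5(user_bytes).digest()[:4]  # spreads different users across the key space
    return prefix + user_bytes + timestamp_ms.to_bytes(8, "big")

first = composite_rowkey("alice", 1229953071791)
later = composite_rowkey("alice", 1229953133260)

print(first.hex())
print(later.hex())
print(first < later)  # keys of one user sort by time
```

With this layout, all rows of one user share a prefix and stay next to each other in time order, while different users are scattered by the hash prefix.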
References for COLUMN IN HBASE
http://www.evanconkle.com/2011/11/hbase-tutorial-creating-table/
http://wiki.apache.org/hadoop/Hbase/Shell
http://jimbojw.com/wiki/index.php?title=Understanding_Hbase_and_BigTable
References for ROW IN HBASE
http://hbase.apache.org/book/rowkey.design.html
http://entireboy.egloos.com/viewer/4689269
http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
http://hbase.apache.org/book/schema.casestudies.html