HBase
HBase is a high-performance, column-oriented database built on the Hadoop framework.
An HBase table consists of rows and column families, and each column family groups several columns together. It therefore looks like an ordinary SQL table in which related columns are grouped.
The column families must be declared as part of the table schema, while the columns within them can be created on demand.
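To make the data model concrete, here is a toy in-memory sketch in Java. It is not the HBase client API: the class `ToyTable` and its `put`/`get` methods are made up for illustration, and timestamps, versions and persistence are left out. It only shows that column families are fixed when the table is created, while columns inside a family appear on the first write.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of an HBase table (hypothetical, NOT the HBase API). */
public class ToyTable {
    // row key -> column family -> column name -> value
    private final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();
    // Column families are part of the schema and fixed at creation time.
    private final Set<String> families;

    public ToyTable(String... families) {
        this.families = new HashSet<>(Arrays.asList(families));
    }

    /** Stores a value; the column is created on demand, the family must already exist. */
    public void put(String rowKey, String family, String column, String value) {
        if (!families.contains(family)) {
            throw new IllegalArgumentException("Unknown column family: " + family);
        }
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(column, value);
    }

    /** Returns the stored value, or null if the row, family or column is missing. */
    public String get(String rowKey, String family, String column) {
        return rows.getOrDefault(rowKey, Collections.emptyMap())
                   .getOrDefault(family, Collections.emptyMap())
                   .get(column);
    }

    public static void main(String[] args) {
        ToyTable people = new ToyTable("personal data", "professional data");
        people.put("42", "personal data", "city", "New York");
        people.put("42", "professional data", "company", "Example inc.");
        System.out.println(people.get("42", "personal data", "city"));
    }
}
```

Writing to a family that was not declared in the constructor is rejected, which mirrors the rule that families belong to the schema.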
Download HBase, extract the archive somewhere, add its bin folder to the PATH and set the JAVA_HOME environment variable.
Set at least JAVA_HOME in conf/hbase-env.sh inside the installation folder. In conf/hbase-site.xml, point the following two properties to folders of your choice (/tmp is cleared on reboot, so pick another location for data you want to keep):
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///tmp/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/tmp/zookeeper</value>
  </property>
</configuration>
Test if it works
Now start it
# start-hbase.sh
This failed for me at first. It helped to make sure that JAVA_HOME is set in hbase-env.sh and that hbase-site.xml points to valid folders (they need not exist yet, but HBase must be able to create them), and then to stop and start HBase again.
Create the table 'people' with the column families 'personal data' and 'professional data'. They contain no columns yet.
# create 'people', 'personal data', 'professional data'
Show the new table
# describe 'people'
List all tables
# list
Count the rows in the table
# count 'people'
Put some entries into the table. The columns are created on the fly within the specified column families.
# put 'people','42','personal data:name','Mr. John Doe'
# put 'people','42','personal data:city','New York'
# put 'people','42','professional data:company','Example inc.'
List the whole table
# scan 'people'
Get the row with the given ID
# get 'people', '42'
COLUMN CELL
personal data:city timestamp=1431089145062, value=New York
personal data:name timestamp=1431089144977, value=Mr. John Doe
professional data:company timestamp=1431089146357, value=Example inc.
3 row(s) in 0.0160 seconds
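Delete a full row
# deleteall 'people', '42'
Disable a table, which is required before you can drop it
# disable 'people'
Enable a disabled table again
# enable 'people'
Drop the table (it must be disabled first)
# drop 'people'
Read commands line by line from a file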
Table variables
Normally you would create and use a table like this
# create 'myTable', 'colA'
# put 'myTable', 'id', 'colA', 'value'
# scan 'myTable'
# describe 'myTable'
# disable 'myTable'
# drop 'myTable'
You can also assign the table to a variable when creating it and use that instead, which saves you from repeating the table name in every command
# t = create 'myTable', 'colA'
# t.put 'id', 'colA', 'value'
# t.scan
# t.describe
# t.disable
# t.drop
You can also get such a variable for an existing table
# t = get_table 'myTable'
You can even fetch several tables and run commands on each of them
# tables = list
# tables.map { |t| describe t ; scan t }
HBase Filter
From the HBase Shell
From Java code
final Scan scan = new Scan();
scan.setFilter(filter);
WTF
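Convert a date to an HBase timestamp (milliseconds since the Unix epoch) in the shell
# import java.text.SimpleDateFormat
# import java.text.ParsePosition
# SimpleDateFormat.new("yy/MM/dd HH:mm:ss").parse("08/08/16 20:56:29", ParsePosition.new(0)).getTime()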
1218920189000
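The shell call above goes through JRuby to the ordinary Java classes, so the same conversion can be sketched in plain Java. The class and method names (`HBaseTimestamps`, `toTimestamp`, `toText`) are hypothetical. The sketch pins the time zone to UTC, which is an assumption that matches the output shown in this section; with another default time zone the shell would return a different number.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper for converting between date strings and epoch milliseconds. */
public class HBaseTimestamps {
    private static final String PATTERN = "yy/MM/dd HH:mm:ss";

    // SimpleDateFormat is not thread-safe, so a fresh instance is built per call.
    private static SimpleDateFormat utcFormat() {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: values are in UTC
        return format;
    }

    /** Parses a date string in UTC and returns milliseconds since the epoch. */
    public static long toTimestamp(String text) {
        try {
            return utcFormat().parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not in format " + PATTERN + ": " + text, e);
        }
    }

    /** Formats epoch milliseconds back into the same pattern, in UTC. */
    public static String toText(long timestampMillis) {
        return utcFormat().format(new Date(timestampMillis));
    }

    public static void main(String[] args) {
        long ts = toTimestamp("08/08/16 20:56:29");
        System.out.println(ts);         // prints 1218920189000
        System.out.println(toText(ts)); // prints 08/08/16 20:56:29
    }
}
```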
# import java.util.Date
# Date.new(1218920189000).toString()
"Sat Aug 16 20:56:29 UTC 2008"