A great hello-world tutorial explaining how to get started with HBase and Hadoop can be found here.
This is my summary of the post, with some notes:
Installing the SSH server:
sudo apt-get install openssh-server
Create the Hadoop user:
sudo addgroup hadoop
sudo adduser --ingroup hadoop huser
Generate the user public keys:
#Log in as the Hadoop user
sudo -i -u huser
#Create the Hadoop user's key pair (empty passphrase)
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
#Append the generated public key to ~/.ssh/authorized_keys
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Setting Up HDFS
#Create a directory to hold the HDFS files
mkdir /home/huser/my_hdfs_folder
Update the Hadoop config with the HDFS directory.
Note: hadoop.tmp.dir is used as the base for temporary directories, both locally and in HDFS.
The following configuration sets the directory created above as the HDFS directory.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/huser/my_hdfs_folder</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://ubuntu:8020</value>
  </property>
</configuration>
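To double-check which values a site file actually defines, something like the following could be used. It is only a sketch built on the JDK's DOM parser, not Hadoop's own Configuration class, and the names SiteConfigReader and readProperties are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads the name/value pairs of a Hadoop-style site file.
public class SiteConfigReader {

    // Parses a <configuration> document into name -> value pairs, in file order.
    static Map<String, String> readProperties(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        // Same two properties as in the configuration above.
        String xml = "<?xml version=\"1.0\"?>"
                + "<configuration>"
                + "<property><name>hadoop.tmp.dir</name>"
                + "<value>/home/huser/my_hdfs_folder</value></property>"
                + "<property><name>fs.default.name</name>"
                + "<value>hdfs://ubuntu:8020</value></property>"
                + "</configuration>";
        for (Map.Entry<String, String> e : readProperties(xml).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Running it against the real file would only need the string replaced by the file's contents.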
#Format the HDFS
/usr/local/hadoop/bin/hadoop namenode -format
#Start the single-node Hadoop instance
/usr/local/hadoop/bin/start-all.sh
Check that the Hadoop instance is alive at the following URL:
http://ubuntu:50070/dfshealth.jsp
Setting up HBase
HBase needs a directory inside HDFS.
We create it with the hadoop fs -mkdir command, for example (a relative path such as myHbase is created under the user's HDFS home directory, here /user/huser):
/usr/local/hadoop/bin/hadoop fs -mkdir myHbase
The new HDFS directory must be referenced in the HBase site configuration file, hbase-site.xml:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://ubuntu:8020/user/huser/myHbase</value>
    <description></description>
  </property>
  <property>
    <name>hbase.master</name>
    <value>ubuntu:60000</value>
    <description></description>
  </property>
</configuration>
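The hbase.rootdir value is just the fs.default.name address followed by the absolute HDFS path of the directory. The sketch below shows how the relative name passed to fs -mkdir turns into that value, assuming the default HDFS home directory /user/<user>. The class HdfsPathSketch and its resolve method are hypothetical and do not call Hadoop at all.

```java
// Hypothetical helper: rebuilds the hbase.rootdir value from its parts.
public class HdfsPathSketch {

    // Relative HDFS paths are resolved against the user's home directory, /user/<user>.
    static String resolve(String fsDefaultName, String user, String path) {
        String absolute = path.startsWith("/") ? path : "/user/" + user + "/" + path;
        // Drop a trailing slash on the file system address to avoid "//" in the result.
        String base = fsDefaultName.endsWith("/")
                ? fsDefaultName.substring(0, fsDefaultName.length() - 1)
                : fsDefaultName;
        return base + absolute;
    }

    public static void main(String[] args) {
        // Values taken from the two configuration files in this post.
        System.out.println(resolve("hdfs://ubuntu:8020", "huser", "myHbase"));
    }
}
```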
Start the HBase DB
/usr/local/hbase/bin/start-hbase.sh
Check that it is alive:
http://ubuntu:60010/master-status
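If no browser is at hand, a quick way to see whether the daemons are up is to try opening a TCP connection to the ports used in this post. This is a rough sketch with plain JDK sockets. It only tells you that something is listening, not that the service is healthy, and the class name PortCheck is made up for this example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
public class PortCheck {

    // Returns true if a connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or the host name could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "ubuntu"; // host name used throughout this post
        System.out.println("HDFS (8020):          " + isListening(host, 8020, 2000));
        System.out.println("Hadoop web UI (50070): " + isListening(host, 50070, 2000));
        System.out.println("HBase master (60000):  " + isListening(host, 60000, 2000));
        System.out.println("HBase web UI (60010):  " + isListening(host, 60010, 2000));
    }
}
```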
Starting the Shell
/usr/local/hbase/bin/hbase shell
Create and Update a simple DB
#Create a new table named myBlogs with a column family BlogText
create 'myBlogs','BlogText'
#Insert some data
put 'myBlogs','Ruby','BlogText:1','About ruby bla bla.'
put 'myBlogs','Ruby','BlogText:2','about {|X| bla bal.'
put 'myBlogs','Ruby','BlogText:3','for loops.'
put 'myBlogs','Python','BlogText:1','iter tools .'
The following code queries the HBase table created above:
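It helped me to picture what these puts build: each row key maps to column families, and each family maps qualifiers to values, all kept in sorted order. The sketch below models that with nested sorted maps. It is only a mental model in plain Java (the class BlogTableModel is made up), and it ignores that real HBase stores raw bytes and keeps timestamped versions of each cell.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the table: row -> family -> qualifier -> value.
public class BlogTableModel {

    // Mimics the shell's put: the column is given as "family:qualifier".
    static void put(Map<String, Map<String, Map<String, String>>> table,
                    String row, String column, String value) {
        String[] parts = column.split(":", 2);
        table.computeIfAbsent(row, r -> new TreeMap<>())
             .computeIfAbsent(parts[0], f -> new TreeMap<>())
             .put(parts[1], value);
    }

    public static void main(String[] args) {
        Map<String, Map<String, Map<String, String>>> table = new TreeMap<>();
        put(table, "Ruby", "BlogText:1", "About ruby bla bla.");
        put(table, "Ruby", "BlogText:2", "about {|X| bla bal.");
        put(table, "Ruby", "BlogText:3", "for loops.");
        put(table, "Python", "BlogText:1", "iter tools .");

        // Rows come back in sorted key order, so Python is listed before Ruby.
        for (Map.Entry<String, Map<String, Map<String, String>>> row : table.entrySet()) {
            System.out.println(row.getKey() + " -> " + row.getValue());
        }
    }
}
```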
package my.learn.hbase;

import java.util.NavigableMap;
import java.util.NavigableSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTableFactory;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseReadMyBlogsData {

    public static final byte[] TablemyBlogs = Bytes.toBytes("myBlogs");
    // The column family
    public static final byte[] BlogText_FAMILY = Bytes.toBytes("BlogText");

    private void ShowTheBlogsText() throws Exception {

        // Loads the hbase-site.xml config
        Configuration config = HBaseConfiguration.create();
        // Factory for creating HTable instances
        HTableFactory factory = new HTableFactory();

        // Throws an exception if HBase is not reachable
        HBaseAdmin.checkHBaseAvailable(config);

        // Link to the table
        HTableInterface table = factory.createHTableInterface(config, TablemyBlogs);

        // Used to retrieve rows from the table
        Scan scan = new Scan();

        // Scan through each row in the table
        ResultScanner rs = table.getScanner(scan);
        try {
            // Loop through each retrieved row
            for (Result r = rs.next(); r != null; r = rs.next()) {
                // Print out the row key
                System.out.println("Key: " + Bytes.toString(r.getRow()));

                // Loop over the row's qualifiers; the "Ruby" row has 1, 2 and 3
                NavigableMap<byte[], byte[]> familyMap = r.getFamilyMap(BlogText_FAMILY);
                // This is the set of qualifier keys
                NavigableSet<byte[]> keySet = familyMap.navigableKeySet();

                // Print out the value stored under each qualifier
                for (byte[] key : keySet) {
                    System.out.println("\t Definition: " + Bytes.toString(key)
                            + ", Value:"
                            + Bytes.toString(r.getValue(BlogText_FAMILY, key)));
                }
            }
        } finally {
            // Release the scanner and the table even if the scan fails
            rs.close();
            table.close();
        }
    }

    public static void main(String[] args) throws Exception {
        new HBaseReadMyBlogsData().ShowTheBlogsText();
    }
}
Notes:
The HBaseAdmin class provides an interface for managing HBase table metadata, plus general administrative functions such as creating, dropping, listing, enabling and disabling tables.
HBaseAdmin can also be used to add and drop a table's column families.