Apache Accumulo is a sorted, distributed key-value store built on top of
Apache Hadoop, ZooKeeper, and Apache Thrift. It provides cell-level access
labels and a server-side programming mechanism.
We will cover the following topics regarding the Accumulo database:
· Installation and configuration of Accumulo.
· Running Accumulo.
· Example of usage.
· Java API
1. The toolchain requirements for Accumulo are Java (1.6 or higher), Hadoop, and ZooKeeper (3.3.3
or higher). In this tutorial we use Java 1.7.0, Hadoop 1.0.3, and ZooKeeper 3.4.5.
3. Extract the Accumulo files, e.g. to /specific/disk1/temp/Accumulo/.
5. Extract the ZooKeeper files, e.g. to /specific/disk1/temp/Zookeeper/.
6. Go to the conf folder in the ZooKeeper directory and create a file called zoo.cfg.
Insert the following lines into zoo.cfg:
tickTime=2000
maxClientCnxns=100
dataDir=/var/zookeeper
clientPort=2181
# Change dataDir to the directory where ZooKeeper should store its data files,
# e.g. dataDir=/specific/disk1/temp/zookeeper/conf/zookeeper
Save the file and close it.
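The zoo.cfg step can also be scripted. The following is a minimal sketch, not part of the ZooKeeper distribution: ZK_CONF_DIR and ZK_DATA_DIR are hypothetical variable names introduced here for illustration, and the default output folder is a scratch directory so that running the sketch does not touch a real installation. The settings themselves are the ones listed in this step.

```shell
#!/bin/sh
# Sketch: write zoo.cfg with the settings used in this tutorial.
# ZK_CONF_DIR and ZK_DATA_DIR are hypothetical names (assumptions of this
# sketch); point ZK_CONF_DIR at the real conf folder once you are satisfied.
ZK_CONF_DIR="${ZK_CONF_DIR:-./zookeeper-conf-sketch}"
ZK_DATA_DIR="${ZK_DATA_DIR:-/var/zookeeper}"

mkdir -p "$ZK_CONF_DIR"

# The heredoc expands $ZK_DATA_DIR, so dataDir follows the variable above.
cat > "$ZK_CONF_DIR/zoo.cfg" <<EOF
tickTime=2000
maxClientCnxns=100
dataDir=$ZK_DATA_DIR
clientPort=2181
EOF

echo "wrote $ZK_CONF_DIR/zoo.cfg"
```

To place the data files elsewhere, set ZK_DATA_DIR before running the script, e.g. ZK_DATA_DIR=/specific/disk1/temp/zookeeper/conf/zookeeper.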
8. Extract the Hadoop files, e.g. to /specific/disk1/temp/Hadoop/.
9. Go to the conf folder in the Hadoop directory and edit the following files:
· Insert the following lines into core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
· Insert the following lines into hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
· Insert the following lines into mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
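· If you would like to change the default location where Hadoop stores its local data files, add the
following property inside the <configuration> element of core-site.xml, replacing DesignatedPath with
the directory of your choice (dfs.data.dir is an HDFS setting, so hdfs-site.xml is the more
conventional place for it):
<property>
  <name>dfs.data.dir</name>
  <value>DesignatedPath</value>
</property>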
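The three single-property files from step 9 can be generated with a small script. This is only a sketch under stated assumptions: write_site_xml and OUT_DIR are hypothetical names introduced here, the output goes to a scratch folder by default, and each call overwrites the target file, so review the result before copying it into the real Hadoop conf folder.

```shell
#!/bin/sh
# Sketch: generate the single-property Hadoop site files shown in step 9.
# write_site_xml and OUT_DIR are hypothetical names, not part of Hadoop.
write_site_xml() {
    # $1 = output file, $2 = property name, $3 = property value
    cat > "$1" <<EOF
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

OUT_DIR="${OUT_DIR:-./hadoop-conf-sketch}"
mkdir -p "$OUT_DIR"

# Values are the ones used in this tutorial (single machine, one replica).
write_site_xml "$OUT_DIR/core-site.xml"   fs.default.name    hdfs://localhost:9000
write_site_xml "$OUT_DIR/hdfs-site.xml"   dfs.replication    1
write_site_xml "$OUT_DIR/mapred-site.xml" mapred.job.tracker localhost:9001

echo "wrote site files to $OUT_DIR"
```

Because the helper writes exactly one property per file, it does not cover the optional dfs.data.dir setting; that property would have to be added by hand to the generated file.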
10. From the Accumulo directory, copy one of the example configurations available in the
conf/examples folder into the conf folder, e.g.:
cp conf/examples/512MB/native-standalone/* conf
If you are configuring a larger cluster you will need to create the configuration files yourself and propagate the changes to the $ACCUMULO_HOME/conf directories:
· Create a "slaves" file in $ACCUMULO_HOME/conf/.
This is a list of machines where tablet servers and loggers will run.
· Create a "masters" file in $ACCUMULO_HOME/conf/.
This is a list of machines where the master server will run.
· Create conf/accumulo-env.sh following the template of conf/examples/3GB/native-standalone/accumulo-env.sh.
11. Edit the JAVA_HOME, HADOOP_HOME, and ZOOKEEPER_HOME values in
conf/accumulo-env.sh so that each points to the corresponding installation
directory, e.g.:
ZOOKEEPER_HOME=/specific/disk1/temp/zookeeper
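A wrong path in one of these variables is an easy mistake to make, so it can help to verify them before starting anything. The following is a minimal sketch of such a check; check_home is a hypothetical helper written for this tutorial, and it only tests that each variable is set and names an existing directory.

```shell
#!/bin/sh
# Sketch: verify that the home variables used by accumulo-env.sh point to
# existing directories. check_home is a hypothetical helper, not an
# Accumulo command.
check_home() {
    # $1 = variable name, $2 = its current value
    if [ -n "$2" ] && [ -d "$2" ]; then
        echo "$1 ok: $2"
    else
        echo "$1 is unset or not a directory: '$2'"
        return 1
    fi
}

status=0
check_home JAVA_HOME      "${JAVA_HOME:-}"      || status=1
check_home HADOOP_HOME    "${HADOOP_HOME:-}"    || status=1
check_home ZOOKEEPER_HOME "${ZOOKEEPER_HOME:-}" || status=1
echo "check finished with status $status"
```

Run it in a shell where the three variables are exported with the same values you put into conf/accumulo-env.sh; a status of 0 means all three directories exist.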
12. Edit conf/accumulo-site.xml and set the ZooKeeper servers in the
instance.zookeeper.host property. Set the value of the property to the IP
addresses of the machines that run ZooKeeper (you need at least one machine
running ZooKeeper).
E.g., a configuration with one ZooKeeper server:
<property>
  <name>instance.zookeeper.host</name>
  <description>comma separated list of zookeeper servers</description>
</property>
E.g., a configuration with two ZooKeeper servers:
<property>
  <name>instance.zookeeper.host</name>
  <value>132.67.104.169:2181,132.67.104.158:2181</value>
  <description>comma separated list of zookeeper servers</description>
</property>
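The value of instance.zookeeper.host is a comma-separated list of host:port pairs. As a small illustration of that format, here is a sketch that builds the string from a port and a list of hosts; zk_host_list is a hypothetical helper name, and the port 2181 matches the clientPort set in zoo.cfg earlier.

```shell
#!/bin/sh
# Sketch: build the comma-separated host:port list expected by
# instance.zookeeper.host. zk_host_list is a hypothetical helper.
zk_host_list() {
    # $1 = client port, remaining arguments = ZooKeeper hosts
    port="$1"
    shift
    list=""
    for host in "$@"; do
        # Prepend a comma only when the list already has an entry.
        list="${list:+$list,}$host:$port"
    done
    printf '%s\n' "$list"
}

zk_host_list 2181 132.67.104.169 132.67.104.158
```

The printed string can be pasted into the <value> element of the property.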
1. Now let's bring up the Accumulo server.
Once ZooKeeper and Hadoop are configured correctly on the machine, you may start the
ZooKeeper, Hadoop, and Accumulo servers, in that order. Run each command from the home
directory of the corresponding tool.
Run ZooKeeper:
bin/zkServer.sh start (you may stop it with: bin/zkServer.sh stop)
Run Hadoop:
o bin/hadoop namenode -format
o bin/start-all.sh (you may stop it with: bin/stop-all.sh)
Run Accumulo:
o bin/accumulo init (enter the instance name and password when prompted; in our example we set them to accum/accum)
o bin/start-all.sh (you may stop it with: bin/stop-all.sh)
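The start order can be captured in a small wrapper script. The following is a sketch under several assumptions: run_in, start_stack, and the DRY_RUN switch are hypothetical names introduced here, the default home directories are the example paths used in this tutorial, and the one-time steps (bin/hadoop namenode -format and bin/accumulo init) are deliberately left out because they should only be run on a fresh installation. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: start ZooKeeper, Hadoop, and Accumulo in the order used above.
# DRY_RUN, run_in, and start_stack are hypothetical names for this sketch.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
ZOOKEEPER_HOME="${ZOOKEEPER_HOME:-/specific/disk1/temp/zookeeper}"
HADOOP_HOME="${HADOOP_HOME:-/specific/disk1/temp/Hadoop}"
ACCUMULO_HOME="${ACCUMULO_HOME:-/specific/disk1/temp/Accumulo}"

run_in() {
    # $1 = home directory, remaining arguments = command to run there
    dir="$1"
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "(cd $dir && $*)"
    else
        (cd "$dir" && "$@")
    fi
}

start_stack() {
    # One-time setup (namenode -format, accumulo init) is not repeated here.
    run_in "$ZOOKEEPER_HOME" bin/zkServer.sh start
    run_in "$HADOOP_HOME"    bin/start-all.sh
    run_in "$ACCUMULO_HOME"  bin/start-all.sh
}

start_stack
```

Set DRY_RUN=0 to actually execute the commands once the printed sequence looks right.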
2. You may check that Hadoop is running correctly through its monitor page, which in our
case is served on port 50070.
3. You may check that Accumulo is running correctly through its monitor page.
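Both monitor pages can also be probed from the command line. The sketch below assumes that curl is installed; monitor_url and check_monitor are hypothetical helper names. Port 50070 is the Hadoop port mentioned above, while 50095 is an assumption based on the default monitor port of Accumulo 1.x releases, so adjust it if your configuration differs.

```shell
#!/bin/sh
# Sketch: probe a monitor page over HTTP with curl.
# monitor_url and check_monitor are hypothetical helpers for this tutorial.
monitor_url() {
    # $1 = host, $2 = port
    echo "http://$1:$2/"
}

check_monitor() {
    # Succeeds if the page answers without an HTTP error within 5 seconds.
    curl --silent --fail --max-time 5 --output /dev/null \
        "$(monitor_url "$1" "$2")"
}

# Usage (ports: 50070 from this tutorial, 50095 assumed Accumulo default):
#   check_monitor localhost 50070 && echo "Hadoop monitor is up"
#   check_monitor localhost 50095 && echo "Accumulo monitor is up"
```

A successful probe only shows that the web page responds; the page itself is still the place to look for details such as live nodes or tablet servers.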