Saturday, May 21, 2016

Partitioning and Bucketing in Hive

Partitioning is often used to distribute load horizontally; it gives a performance benefit and helps organize data in a logical fashion. For example, suppose we are dealing with a large employee table and often run queries with WHERE clauses that restrict the results to a particular country or department. For faster query response the Hive table can be PARTITIONED BY (country STRING, dept STRING). Partitioning changes how Hive structures the data storage: Hive now creates subdirectories reflecting the partitioning structure, such as .../employees/country=ABC/dept=XYZ. If a query limits the results to employees from country ABC, it only scans the contents of the country=ABC directory. This can dramatically improve query performance, but only if the partitioning scheme reflects common filtering. Partitioning is very useful in Hive; however, a design that creates too many partitions may optimize some queries while being detrimental to other important queries. Another drawback of having too many partitions is the large number of Hadoop files and directories that are created unnecessarily, and the resulting overhead on the NameNode, which must keep all file-system metadata in memory.
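A minimal HiveQL sketch of this idea (the employees columns here are illustrative):

-- Hypothetical partitioned table; country and dept become directory levels rather than data columns
CREATE TABLE employees (
  name   STRING,
  salary FLOAT,
  emp_id INT
)
PARTITIONED BY (country STRING, dept STRING);

-- Only the country=ABC/dept=XYZ subdirectory is scanned for this query
SELECT name, salary
FROM employees
WHERE country = 'ABC' AND dept = 'XYZ';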
Bucketing is another technique for decomposing data sets into more manageable parts. For example, suppose a table using the date as the top-level partition and the employee_id as the second-level partition leads to too many small partitions. Instead, if we bucket the employee table and use employee_id as the bucketing column, the value of this column is hashed into a user-defined number of buckets. Records with the same employee_id are always stored in the same bucket. Assuming the number of distinct employee_id values is much greater than the number of buckets, each bucket will contain many employee_ids. While creating the table you specify CLUSTERED BY (employee_id) INTO XX BUCKETS, where XX is the number of buckets. Bucketing has several advantages: the number of buckets is fixed, so it does not fluctuate with the data; if two tables are bucketed by employee_id, Hive can create logically correct samples; and bucketing also enables efficient map-side joins.
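A small sketch of the bucketed variant (table and column names are illustrative; hive.enforce.bucketing is needed on Hive releases before 2.0 so that inserts actually honour the bucket definition):

CREATE TABLE employees_bucketed (
  name        STRING,
  salary      FLOAT,
  employee_id INT
)
CLUSTERED BY (employee_id) INTO 32 BUCKETS;

-- Required before Hive 2.0 so that INSERT writes the 32 bucket files correctly
SET hive.enforce.bucketing = true;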

The question and answer below give a clearer idea of how partitioning and bucketing work together.

Question : Suppose we specify 32 buckets in the CLUSTERED BY clause and the CREATE TABLE statement also contains a PARTITIONED BY clause. How are partitions and buckets managed together? Is the number of partitions limited to 32, or are 32 buckets created for each partition? Is every bucket an HDFS file?

Answer : A Hive table can have both partitioning and bucketing. The number of partitions is not limited by the bucket count; instead, 32 buckets are created inside each partition. And yes, every bucket is stored as an HDFS file under its partition directory.
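A sketch of the combined DDL and the directory layout it implies (table name and paths are illustrative):

CREATE TABLE employees_part_bucket (
  name        STRING,
  salary      FLOAT,
  employee_id INT
)
PARTITIONED BY (country STRING, dept STRING)
CLUSTERED BY (employee_id) INTO 32 BUCKETS;

-- Resulting layout: 32 bucket files inside every partition directory, e.g.
-- .../employees_part_bucket/country=ABC/dept=XYZ/000000_0
-- .../employees_part_bucket/country=ABC/dept=XYZ/000001_0
-- ...
-- .../employees_part_bucket/country=ABC/dept=XYZ/000031_0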

Hive Join strategies

An SQL JOIN clause combines rows from two or more tables based on a common field between them. Hive Query Language (HQL or HiveQL) joins work along the same lines, and they are a key factor in the optimization and performance of Hive queries. Choosing the right join based on the data and the business need is the key principle for improving Hive query performance.
Let's try to understand how a join works in Hive execution. In general, a join operation is compiled into a MapReduce task that involves a map stage and a reduce stage. A mapper reads from the join tables and emits join key/join value pairs into an intermediate file. Hadoop sorts and merges these pairs in what is called the shuffle stage. The reducer takes the sorted results as input and does the actual join work. The shuffle stage is very expensive, so avoiding it improves task performance. In short, the join is a clause that combines the records of two tables (or data sets), as in the example below.
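For the strategies below, a simple query of this shape serves as a running example (the employees and departments tables are hypothetical):

-- With no hints or special settings, this compiles to a shuffle (common) join
SELECT e.name, d.dept_name
FROM employees e
JOIN departments d
  ON e.dept_id = d.dept_id;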
Shuffle Join / Common Join:
When: 
  • It's the default choice and it always works
How:
  • Each mapper reads a part of one of the tables
  • Buckets and sorts on the join key
  • Sends one bucket to each reducer
  • The join is done on the reduce side
Pointers:
  • Works every time
Map Join:
When:
  • One table is small enough to fit in memory
  • To save the shuffle & reduce stages
  • To do the join work only in the map stage
  • Suitable when joining against small tables, to optimize the task
How:
  • Reads the small table into an in-memory hash table
  • Streams through part of the big table
  • Joins each record against the hash table
  • All of the work is performed by the mappers alone; there is no separate reduce stage
Pointers:
  • Very fast, but it’s limited
  • Using the Distributed Cache solves the scaling limitation
  • Most of the improvement came from removing the JDBM component
  • No need to use a persistent hashtable for the map join
  • Can be forced with the MAPJOIN query hint (see the example after this list)
  • RIGHT OUTER JOIN / FULL OUTER JOIN are not possible
  • It has no reduce task and it can handle only one key at a time
  • set hive.auto.convert.join; If true automatically converts the joins to mapjoins at run time
  • set hive.auto.convert.join.noconditionaltask; If true no longer a need to provide the map-join hint
  • set hive.auto.convert.join.noconditionaltask.size; It controls the size of table to fit in memory
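A hedged sketch of both ways to get a map join, using the same hypothetical tables (the size threshold is illustrative, and departments is assumed to be the small table):

-- Let Hive convert the join automatically at run time
SET hive.auto.convert.join = true;
SET hive.auto.convert.join.noconditionaltask = true;
SET hive.auto.convert.join.noconditionaltask.size = 10000000;

-- Or force it with the MAPJOIN hint
SELECT /*+ MAPJOIN(d) */ e.name, d.dept_name
FROM employees e
JOIN departments d
  ON e.dept_id = d.dept_id;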
Left Semi Join:
When:
  • To get the functionality of IN/EXISTS subquery semantics (see the example after this list)
How:
  • It's a generic join that only checks for the existence of a match (an inner join returning left-side columns only)
  • Once a match is found, it stops scanning for further matching records of that key
Pointers:
  • The right-hand-side table can only be referenced in the ON clause, not in the WHERE or SELECT clauses
  • Right semi-joins are not supported in Hive
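A minimal sketch with the same hypothetical tables; it returns employees whose dept_id exists in departments:

-- Equivalent to: SELECT name FROM employees WHERE dept_id IN (SELECT dept_id FROM departments)
SELECT e.name
FROM employees e
LEFT SEMI JOIN departments d
  ON e.dept_id = d.dept_id;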
Bucket Map Join:
When:
  • Total table/partition size is big, so a plain map join is not suitable
  • Tables are not sorted the same way
  • Tables are bucketed the same way
  • Joining on the bucketing columns
How:
  • Works together with map join; all join tables are bucketed
  • The number of buckets in one table must be a multiple of the number of buckets in the other
  • Bucket columns == Join columns
  • Only matching buckets of all small tables are replicated onto each mapper
Pointers:
  • set hive.optimize.bucketmapjoin; If true, Bucket Map Join is activated (see the sketch after this list)
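A hedged sketch with hypothetical orders/customers tables; 8 divides 32, so the buckets can be matched up:

CREATE TABLE orders    (order_id INT, cust_id INT)  CLUSTERED BY (cust_id) INTO 32 BUCKETS;
CREATE TABLE customers (cust_id INT, name STRING)   CLUSTERED BY (cust_id) INTO 8 BUCKETS;

SET hive.optimize.bucketmapjoin = true;

SELECT /*+ MAPJOIN(c) */ o.order_id, c.name
FROM orders o
JOIN customers c
  ON o.cust_id = c.cust_id;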
Sort Merge Bucket (SMB) Join:
When:
  • Sorted the same
  • Bucketed the same
  • Joining on the same sort/bucket columns
How:
  • Reads a bucket from each table
  • Process the row with the lowest value
Pointers:
  • Very efficient if applicable (see the DDL sketch after this list)
  • Both map & reduce tasks are used
  • set hive.input.format; If it's org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat, the SMB join is activated
  • set hive.auto.convert.sortmerge.join=true
  • set hive.optimize.bucketmapjoin = true
  • set hive.optimize.bucketmapjoin.sortedmerge = true
  • set hive.auto.convert.sortmerge.join.noconditionaltask=true
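A sketch of DDL that makes two hypothetical tables eligible for an SMB join; both are bucketed and sorted on the join key with the same bucket count:

CREATE TABLE orders_smb (order_id INT, cust_id INT)
  CLUSTERED BY (cust_id) SORTED BY (cust_id ASC) INTO 32 BUCKETS;

CREATE TABLE customers_smb (cust_id INT, name STRING)
  CLUSTERED BY (cust_id) SORTED BY (cust_id ASC) INTO 32 BUCKETS;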
Sort Merge Bucket Map(SMB Map) Join:
When:
  • Sorted the same
  • Bucketed the same
  • Joining on the sort/bucket on the same/equal columns
  • No limit on file/partition/table size
How:
  • A partitioned table might slow this down, because each single key in the mapper needs a small chunk
  • Works together with bucket map join
  • Bucket columns == Join columns == Sort columns
  • Small tables are read on demand
  • NOT hold entire small tables in memory
  • Can perform outer join
Pointers:
  • set hive.auto.convert.sortmerge.join=true
  • set hive.optimize.bucketmapjoin = true
  • set hive.optimize.bucketmapjoin.sortedmerge = true
  • set hive.auto.convert.sortmerge.join.noconditionaltask=true
  • set hive.auto.convert.sortmerge.join.bigtable.selection.policy=org.apache.hadoop.hive.ql.optimizer.TableSizeBasedBigTableSelectorForAutoSMJ
Skew Join:
When:
  • Need to join two very large tables
  • The join is bottlenecked on the reducer that gets the skewed key
How:
  • If a small number of skewed keys make up a significant percentage of the data, they are handled separately (via a follow-up map join) so they do not become bottlenecks
Pointers:
  • Because of the partial results, the data has to be read and written twice
  • The user needs to be aware of the skew in the data and set this up manually; the settings sketched after this list let Hive handle it at run time
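A hedged sketch of the runtime skew-join settings (the key-count threshold is illustrative):

-- Keys that appear more than hive.skewjoin.key times are treated as skewed and joined in a follow-up job
SET hive.optimize.skewjoin = true;
SET hive.skewjoin.key = 100000;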
Cartesian Product Join:
When:
  • To generate every combination of records from the joined tables
How:
  • Not optimized in MapReduce
  • Computes the full Cartesian product before the WHERE clause is applied
Pointers:
  • set hive.mapred.mode=strict; helps prevent users from unknowingly submitting a Cartesian product query (see the sketch after this list)
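A small sketch of how strict mode guards against this, using the same hypothetical tables; a join without an ON condition produces a Cartesian product and is rejected:

SET hive.mapred.mode = strict;

-- Fails in strict mode because there is no join condition
SELECT e.name, d.dept_name
FROM employees e
JOIN departments d;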

Tuesday, May 6, 2014

Accumulo - Example/Java API



1.        Use the following command to prompt the client shell:
bin/accumulo shell -u root
Then enter the password for your Accumulo instance; in our example the instance name and password are both set to accum/accum.
2.        You may type help for the list of available commands.
3.        You can create a new table using the createtable command.
4.        E.g., create a new table called usertable:  "createtable usertable"
 





·         The following running examples are from Apache Accumulo.
You may find more examples at: http://accumulo.apache.org/1.4/examples/.
5.     Creating a new user:
A user does not by default have permission to create a table.
root@instance> createuser username
Enter new password for 'username': ********
Please confirm new password for 'username': ********
root@instance> user username
Enter password for user username: ********
username@instance> createtable usertable
06 10:48:47,931 [shell.Shell] ERROR: org.apache.accumulo.core.client.AccumuloSecurityException: Error PERMISSION_DENIED - User does not have permission to perform this action
username@instance> userpermissions
System permissions:
Table permissions (!METADATA): Table.READ
username@instance>
6.     Granting permissions to a user:
username@instance> user root
Enter password for user root: ********
root@instance> grant -s System.CREATE_TABLE -u username
root@instance> user username
Enter password for user username: ********
username@instance> createtable usertable
username@instance> userpermissions
System permissions: System.CREATE_TABLE
Table permissions (!METADATA): Table.READ
Table permissions (usertable): Table.READ, Table.WRITE, Table.BULK_IMPORT, Table.ALTER_TABLE, Table.GRANT, Table.DROP_TABLE
username@instance usertable>
7.     Inserting data with visibilities:
usage: insert <row> <colfamily> <colqualifier> <value> [-?] [-l <expression>] [-t <timestamp>]   
description: inserts a record   
  -?,-help  display this help   
  -l,-authorization-label <expression>  formatted authorization label expression   
  -t,-timestamp <timestamp>  timestamp to use for insert.
Visibilities are boolean AND (&) and OR (|) combinations of authorization tokens. Authorization tokens are arbitrary strings taken from a restricted ASCII character set. Parentheses are required to specify order of operations in visibilities.
username@instance usertable> insert row f1 q1 v1 -l A
username@instance usertable> insert row f2 q2 v2 -l A&B
username@instance usertable> insert row f3 q3 v3 -l apple&carrot|broccoli|spinach
06 11:19:01,432 [shell.Shell] ERROR: org.apache.accumulo.core.util.BadArgumentException: cannot mix | and & near index 12 apple&carrot|broccoli|spinach ^
username@instance usertable> insert row f3 q3 v3 -l (apple&carrot)|broccoli|spinach
username@instance usertable>

8.     Scanning with authorizations:
usage: scan [-?] [-b <start-row>] [-c <<columnfamily>[:<columnqualifier>]>] [-e   
 <end-row>] [-np] [-s <comma-separated-authorizations>] [-st]   
description: scans the table, and displays the resulting records   
  -?,-help  display this help   
  -b,-begin-row <start-row>  begin row (inclusive)   
  -c,-columns <<columnfamily>[:<columnqualifier>]>  comma-separated columns   
  -e,-end-row <end-row>  end row (inclusive)   
  -np,-no-pagination  disables pagination of output   
  -s,-scan-authorizations <comma-separated-authorizations>  scan authorizations   
  -st,-show-timestamps  enables displaying timestamps

Authorizations are sets of authorization tokens. Each Accumulo user has authorizations and each Accumulo scan has authorizations. Scan authorizations are only allowed to be a subset of the user's authorizations. By default, a user's authorizations set is empty.

username@instance usertable> scan
username@instance usertable> scan -s A
06 11:43:14,951 [shell.Shell] ERROR: java.lang.RuntimeException: org.apache.accumulo.core.client.AccumuloSecurityException: Error BAD_AUTHORIZATIONS - The user does not have the specified authorizations assigned
username@instance usertable>

9.     Setting authorizations for a user:
usage: setauths [-?] -c | -s <comma-separated-authorizations> [-u <user>]   
description: sets the maximum scan authorizations for a user   
  -?,-help  display this help   
  -c,-clear-authorizations  clears the scan authorizations   
  -s,-scan-authorizations <comma-separated-authorizations>  set the scan authorizations   
  -u,-user <user>  user to operate on

username@instance usertable> setauths -s A
06 11:53:42,056 [shell.Shell] ERROR: org.apache.accumulo.core.client.AccumuloSecurityException: Error PERMISSION_DENIED - User does not have permission to perform this action
username@instance usertable>
A user cannot set authorizations unless the user has the System.ALTER_USER permission. The root user has this permission.
username@instance usertable> user root
Enter password for user root: ********
root@instance usertable> setauths -s A -u username
root@instance usertable> user username
Enter password for user username: ********
username@instance usertable> scan -s A
row f1:q1 [A]    v1
username@instance usertable> scan
row f1:q1 [A]    v1
username@instance usertable>
The default authorizations for a scan are the user's entire set of authorizations.
username@instance usertable> user root
Enter password for user root: ********
root@instance usertable> setauths -s A,B,broccoli -u username
root@instance usertable> user username
Enter password for user username: ********
username@instance usertable> scan
row f1:q1 [A]    v1
row f2:q2 [A&B]    v2
row f3:q3 [(apple&carrot)|broccoli|spinach]    v3
username@instance usertable> scan -s B
username@instance usertable>

If you want, you can limit a user to inserting only data that they themselves can read. This is enforced with the following constraint.

username@instance usertable> user root
Enter password for user root: ******
root@instance usertable> config -t usertable -s table.constraint.1=org.apache.accumulo.core.security.VisibilityConstraint   
root@instance usertable> user username
Enter password for user username: ********
username@instance usertable> insert row f4 q4 v4 -l spinach                                                               
Constraint Failures:        ConstraintViolationSummary(constrainClass:org.apache.accumulo.core.security.VisibilityConstraint, violationCode:2, violationDescription:User does not have authorization on column visibility, numberOfViolatingMutations:1)
username@instance usertable> insert row f4 q4 v4 -l spinach|broccoli
username@instance usertable> scan
row f1:q1 [A]    v1
row f2:q2 [A&B]    v2
row f3:q3 [(apple&carrot)|broccoli|spinach]    v3
row f4:q4 [spinach|broccoli]    v4
username@instance usertable>

Java API

1.        Row operations – all the basic row operations: create table, insert, delete, etc.

/*
 * Author : Amal Babu
 * Description : Row operations include create,insert,delete in JAVA API
 * */

package com.amal.accumulo;

import java.io.IOException;
import java.util.Map.Entry;

import org.apache.accumulo.core.Constants;
import org.apache.accumulo.core.client.AccumuloException;
import org.apache.accumulo.core.client.AccumuloSecurityException;
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.MutationsRejectedException;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.TableExistsException;
import org.apache.accumulo.core.client.TableNotFoundException;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.io.Text;
import org.apache.log4j.Logger;

/**
 * A demonstration of reading entire rows and deleting entire rows.
 */
public class RowOperations {
 
  private static final Logger log = Logger.getLogger(RowOperations.class);
 
  private static Connector connector;
  private static String table = "accumulotable";
  private static BatchWriter bw;
 
  public static void main(String[] args) throws AccumuloException, AccumuloSecurityException, TableExistsException, TableNotFoundException,
      MutationsRejectedException {
    if (args.length != 4) {
      log.error("Usage: <instance name> <zoo keepers> <username> <password>");
      return;
    }
   
    // First the setup work
    connector = new ZooKeeperInstance(args[0], args[1]).getConnector(args[2], args[3].getBytes());
   
    // lets create an example table
    connector.tableOperations().create(table);
   
    // lets create 3 rows of information
    Text row1 = new Text("row1");
    Text row2 = new Text("row2");
    Text row3 = new Text("row3");
   
    // Which means 3 different mutations
    Mutation mut1 = new Mutation(row1);
    Mutation mut2 = new Mutation(row2);
    Mutation mut3 = new Mutation(row3);
   
    // And we'll put 4 columns in each row
    Text col1 = new Text("1");
    Text col2 = new Text("2");
    Text col3 = new Text("3");
    Text col4 = new Text("4");
   
    // Now we'll add them to the mutations
    mut1.put(new Text("column"), col1, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
    mut1.put(new Text("column"), col2, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
    mut1.put(new Text("column"), col3, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
    mut1.put(new Text("column"), col4, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
   
    mut2.put(new Text("column"), col1, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
    mut2.put(new Text("column"), col2, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
    mut2.put(new Text("column"), col3, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
    mut2.put(new Text("column"), col4, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
   
    mut3.put(new Text("column"), col1, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
    mut3.put(new Text("column"), col2, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
    mut3.put(new Text("column"), col3, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
    mut3.put(new Text("column"), col4, System.currentTimeMillis(), new Value("This is the value for this key".getBytes()));
   
    // Now we'll make a Batch Writer
    bw = connector.createBatchWriter(table, 100000l, 30l, 1);
   
    // And add the mutations
    bw.addMutation(mut1);
    bw.addMutation(mut2);
    bw.addMutation(mut3);
   
    // Force a send
    bw.flush();
   
    // Now lets look at the rows
    Scanner rowThree = getRow(new Text("row3"));
    Scanner rowTwo = getRow(new Text("row2"));
    Scanner rowOne = getRow(new Text("row1"));
   
    // And print them
    log.info("This is everything");
    printRow(rowOne);
    printRow(rowTwo);
    printRow(rowThree);
    System.out.flush();
   
    // Now lets delete rowTwo with the iterator
    rowTwo = getRow(new Text("row2"));
    deleteRow(rowTwo);
   
    // Now lets look at the rows again
    rowThree = getRow(new Text("row3"));
    rowTwo = getRow(new Text("row2"));
    rowOne = getRow(new Text("row1"));
   
    // And print them
    log.info("This is row1 and row3");
    printRow(rowOne);
    printRow(rowTwo);
    printRow(rowThree);
    System.out.flush();
   
    // Should only see the two rows
    // Now lets delete rowOne without passing in the iterator
   
    deleteRow(row1);
   
    // Now lets look at the rows one last time
    rowThree = getRow(new Text("row3"));
    rowTwo = getRow(new Text("row2"));
    rowOne = getRow(new Text("row1"));
   
    // And print them
    log.info("This is just row3");
    printRow(rowOne);
    printRow(rowTwo);
    printRow(rowThree);
    System.out.flush();
   
    // Should only see rowThree
   
    // Always close your batchwriter
   
    bw.close();
   
    // and lets clean up our mess
    connector.tableOperations().delete(table);
   
    // fin~
   
  }
 
  /**
   * Deletes a row given a text object
   *
   * @param row
   * @throws TableNotFoundException
   * @throws AccumuloSecurityException
   * @throws AccumuloException
   */
  private static void deleteRow(Text row) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
    deleteRow(getRow(row));
  }
 
  /**
   * Deletes a row, given a Scanner of JUST that row
   *
   * @param scanner
   */
  private static void deleteRow(Scanner scanner) throws MutationsRejectedException {
    Mutation deleter = null;
    // iterate through the keys
    for (Entry<Key,Value> entry : scanner) {
      // create a mutation for the row
      if (deleter == null)
        deleter = new Mutation(entry.getKey().getRow());
      // the remove function adds the key with the delete flag set to true
      deleter.putDelete(entry.getKey().getColumnFamily(), entry.getKey().getColumnQualifier());
    }
    // guard against an empty scanner, which would otherwise leave deleter null
    if (deleter != null) {
      bw.addMutation(deleter);
      bw.flush();
    }
  }
 
  /**
   * Just a generic print function given an iterator. Not necessarily just for printing a single row
   *
   * @param scanner
   */
  private static void printRow(Scanner scanner) {
    // iterates through and prints
    for (Entry<Key,Value> entry : scanner)
      log.info("Key: " + entry.getKey().toString() + " Value: " + entry.getValue().toString());
  }
 
  /**
   * Gets a scanner over one row
   *
   * @param row
   * @return
   * @throws TableNotFoundException
   * @throws AccumuloSecurityException
   * @throws AccumuloException
   * @throws IOException
   */
  private static Scanner getRow(Text row) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
    // Create a scanner
    Scanner scanner = connector.createScanner(table, Constants.NO_AUTHS);
    // Say start key is the one with key of row
    // and end key is the one that immediately follows the row
    scanner.setRange(new Range(row));
    return scanner;
  }
 
}

How to execute ?

·         Build it as a runnable jar and copy it to the /accumulo/lib/ext folder.
·         /accumulo/bin/accumulo com.amal.accumulo.RowOperations <instance name> <zoo keepers> <username> <password>

2.        Adding cell level security – JAVA API
/*
 * Author : Amal Babu
 * Description : Cell-level security - inserting values with column visibilities and scanning with authorizations using the Java API
 * */

package com.amal.accumulo;

import java.io.IOException;
import java.util.Map.Entry;

import org.apache.accumulo.core.client.AccumuloException;
import org.apache.accumulo.core.client.AccumuloSecurityException;
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.MutationsRejectedException;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.TableExistsException;
import org.apache.accumulo.core.client.TableNotFoundException;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.core.security.ColumnVisibility;
import org.apache.hadoop.io.Text;
import org.apache.log4j.Logger;

public class CellLevelSecurity {

        private static String table;
        private static final Logger log = Logger.getLogger(CellLevelSecurity.class);

        private static Connector connector;
        // private static String table = "accumulotable";
        private static BatchWriter bw;
        // default value
        private static String securityLevelsWhichNeedsToAccess;

        public static void main(String[] args) throws AccumuloException,
                        AccumuloSecurityException, TableExistsException,
                        TableNotFoundException, MutationsRejectedException {
                if (args.length != 6) {
                        log.error("Usage: <instance name> <zoo keepers> <username> <password> <accumulotablename> <securitylevel>");
                        return;
                }

                // First the setup work
                connector = new ZooKeeperInstance(args[0], args[1]).getConnector(
                                args[2], args[3].getBytes());
                table = args[4];
                securityLevelsWhichNeedsToAccess = args[5];
                String[] arrayAuthorizations = securityLevelsWhichNeedsToAccess
                                .split(",");
                // listOfAccessQualifiers = new
                // ArrayList<String>(Arrays.asList(splitString));
                // lets create an example table
                connector.tableOperations().create(table);

                System.out.println("Table created successfully...!!");
                // lets create 3 rows of information
                Text row1 = new Text("row1");
                Text row2 = new Text("row2");

                Mutation mut1 = new Mutation(row1);
                Mutation mut2 = new Mutation(row2);

                Text colFam_name = new Text("Name");
                Text colQual_name = new Text("Identifier");
                ColumnVisibility colVis_name = new ColumnVisibility("public");
                // Now we'll add them to the mutations
                mut1.put(colFam_name, colQual_name, colVis_name,
                                new Value("Amal".getBytes()));

                Text colFam_pass = new Text("Password");
                Text colQual_pass = new Text("Secret");
                ColumnVisibility colVis_pass = new ColumnVisibility("private");
                // Now we'll add them to the mutations
                mut2.put(colFam_pass, colQual_pass, colVis_pass,
                                new Value("amal@123".getBytes()));

                // Now we'll make a Batch Writer
                bw = connector.createBatchWriter(table, 100000l, 30l, 1);

                // And add the mutations
                bw.addMutation(mut1);
                bw.addMutation(mut2);

                // Force a send
                bw.flush();

                // Now lets look at the rows
                Scanner rowOne = getRow(new Text("row1"), arrayAuthorizations);
                Scanner rowTwo = getRow(new Text("row2"), arrayAuthorizations);

                // And print them
                log.info("This is everything");
                printRow(rowOne);
                printRow(rowTwo);
                System.out.flush();
                bw.close();

                // and lets clean up our mess
                connector.tableOperations().delete(table);

        }

        /**
         * Just a generic print function given an iterator. Not necessarily just for
         * printing a single row
         *
         * @param scanner
         */
        private static void printRow(Scanner scanner) {
                // iterates through and prints
                for (Entry<Key, Value> entry : scanner)
                        log.info("Key: " + entry.getKey().toString() + " Value: "
                                        + entry.getValue().toString());
        }

        /**
         * Gets a scanner over one row
         *
         * @param row
         * @return
         * @throws TableNotFoundException
         * @throws AccumuloSecurityException
         * @throws AccumuloException
         * @throws IOException
         */
        private static Scanner getRow(Text row, String[] authorizations)
                        throws AccumuloException, AccumuloSecurityException,
                        TableNotFoundException {
                // Create a scanner
                Authorizations auth = new Authorizations(authorizations);

                Scanner scanner = connector.createScanner(table, auth);
                // Say start key is the one with key of row
                // and end key is the one that immediately follows the row
                scanner.setRange(new Range(row));
                return scanner;
        }

}
How to execute ?

·         Build it as a runnable jar and copy it to the /accumulo/lib/ext folder.

·         /accumulo/bin/accumulo com.amal.accumulo.CellLevelSecurity <instance name> <zoo keepers> <username> <password> <accumulotablename> <securitylevel>