| commit | 64abb33fada8748bdb7bfca09cb3b41a31acfbb5 | |
|---|---|---|
| author | Mike Miller <mmiller@apache.org> | Tue Jul 09 12:37:00 2019 -0400 |
| committer | Mike Miller <mmiller@apache.org> | Tue Jul 09 12:37:00 2019 -0400 |
| tree | 5b549a596fb61c599ece8718f6fb97e9bddf121f | |
| parent | f38a6db89fe6e26ce0b0fdba0305bbae1dfc943f | |

Update Travis config

* Log keeps filling up so switching maven to use "-q" seems to be the only thing that works

Follow the steps below to run the Accumulo examples:
1. Clone this repository:

       git clone https://github.com/apache/accumulo-examples.git

2. Follow Accumulo's quickstart to install and run an Accumulo instance. Accumulo has an accumulo-client.properties file in conf/ that must be configured, because the examples use this file to connect to your instance.

3. Review conf/env.sh.example to see if you need to customize it. If ACCUMULO_HOME and HADOOP_HOME are set in your shell, you may be able to skip this step. Make sure ACCUMULO_CLIENT_PROPS is set to the location of your accumulo-client.properties.

       cp conf/env.sh.example conf/env.sh
       vim conf/env.sh

4. Build the examples repo and copy the examples jar to Accumulo's lib/ext directory:

       ./bin/build
       cp target/accumulo-examples.jar /path/to/accumulo/lib/ext/
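
The setup steps above can be sanity-checked before moving on. The sketch below is not part of this repository: `missing_env_vars` and `install_examples_jar` are hypothetical helper names, and a POSIX shell is assumed. The first function reports which of ACCUMULO_HOME, HADOOP_HOME, and ACCUMULO_CLIENT_PROPS are still unset. The second wraps the jar copy so that a failed build or a wrong lib/ext path produces a clear message.

```shell
# Hypothetical helpers (not shipped with accumulo-examples).

# Print the names of the expected variables that are unset or empty,
# separated by spaces. Prints an empty line when all are set.
missing_env_vars() {
  missing=""
  for name in ACCUMULO_HOME HADOOP_HOME ACCUMULO_CLIENT_PROPS; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="${missing:+$missing }$name"
    fi
  done
  printf '%s\n' "$missing"
}

# Copy the built jar into a lib/ext directory.
#   $1: path to the jar, e.g. target/accumulo-examples.jar
#   $2: destination directory, e.g. /path/to/accumulo/lib/ext
# Returns 1 with a message on stderr if either path is wrong.
install_examples_jar() {
  jar="$1"
  ext_dir="$2"
  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar (did ./bin/build succeed?)" >&2
    return 1
  fi
  if [ ! -d "$ext_dir" ]; then
    echo "not a directory: $ext_dir" >&2
    return 1
  fi
  cp "$jar" "$ext_dir/"
}
```

With these defined in your shell, `missing_env_vars` printing an empty line means all three variables are set, and `install_examples_jar target/accumulo-examples.jar /path/to/accumulo/lib/ext` performs the copy from the last step.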
Each Accumulo example has its own documentation and instructions for running it, which are linked below.
When running the examples, remember the tips below:
- Examples are run using the runex or runmr commands, which are located in the bin/ directory of this repo. The runex command is a simple script that uses the examples shaded jar to run a class. The runmr command starts a MapReduce job in YARN.
- Several examples use the accumulo and accumulo-util commands, which are expected to be on your PATH. These commands are found in the bin/ directory of your Accumulo installation.

Each example below highlights a feature of Apache Accumulo.

| Example | Description |
|---|---|
| batch | Using the batch writer and batch scanner |
| bloom | Creating a bloom filter enabled table to increase query performance |
| bulkIngest | Ingesting bulk data using map/reduce jobs on Hadoop |
| classpath | Using per-table classpaths |
| client | Using table operations, reading and writing data in Java. |
| combiner | Using example StatsCombiner to find min, max, sum, and count. |
| compactionStrategy | Configuring a compaction strategy |
| constraints | Using constraints with tables. Limit the mutation size to avoid running out of memory |
| deleteKeyValuePair | Deleting a key/value pair and verifying the deletion in RFile. |
| dirlist | Storing filesystem information. |
| export | Exporting and importing tables. |
| filedata | Storing file data. |
| filter | Using the AgeOffFilter to remove records more than 30 seconds old. |
| helloworld | Inserting records both inside and outside map/reduce jobs, and reading records between two rows. |
| isolation | Using the isolated scanner to ensure partial changes are not seen. |
| regex | Using MapReduce and Accumulo to find data using regular expressions. |
| reservations | Using conditional mutations to implement a simple reservation system. |
| rgbalancer | Using a balancer to spread groups of tablets within a table evenly |
| rowhash | Using MapReduce to read a table and write to a new column in the same table. |
| sample | Building and using sample data in Accumulo. |
| shard | Using the intersecting iterator with a term index partitioned by document. |
| spark | Using Accumulo as input and output for Apache Spark jobs |
| tabletofile | Using MapReduce to read a table and write one of its columns to a file in HDFS. |
| terasort | Generating random data and sorting it using Accumulo. |
| uniquecols | Using MapReduce to count unique columns in Accumulo |
| visibility | Using visibilities (or combinations of authorizations). Also shows user permissions. |
| wordcount | Using MapReduce and Accumulo to do a word count on text files |
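
Several of the examples listed above call the accumulo and accumulo-util commands, so it can help to confirm that both are on your PATH before starting. The following is a minimal sketch assuming a POSIX shell; `missing_commands` is a hypothetical helper name, not something provided by this repository or by Accumulo.

```shell
# Hypothetical helper: print, one per line, each named command that
# cannot be found on PATH. Prints nothing when all are available.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Usage: missing_commands accumulo accumulo-util
```

If either name is printed, adding the bin/ directory of your Accumulo installation to PATH is the likely fix.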
This repository can be used to test Accumulo release candidates. See docs/release-testing.md.