Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This project has two modules:
In addition to falcon server and prism, running full falcon regression requires three clusters. Each of these clusters must have:
Before running the tests, a Merlin.properties file must be created and populated with cluster details. Create the file at:
falcon/falcon-regression/merlin/src/main/resources/Merlin.properties
Populate it with prism-related properties:
```
#prism properties
prism.oozie_url = http://node-1.example.com:11000/oozie/
prism.oozie_location = /usr/lib/oozie/bin
prism.qa_host = node-1.example.com
prism.service_user = falcon
prism.hadoop_url = node-1.example.com:8020
prism.hadoop_location = /usr/lib/hadoop/bin/hadoop
prism.hostname = http://node-1.example.com:15000
prism.storeLocation = hdfs://node-1.example.com:8020/apps/falcon
```
Specify the clusters that you would be using for testing:
```
servers = cluster1,cluster2,cluster3
```
For each cluster, specify its properties:
```
#cluster1 properties
cluster1.oozie_url = http://node-1.example.com:11000/oozie/
cluster1.oozie_location = /usr/lib/oozie/bin
cluster1.qa_host = node-1.example.com
cluster1.service_user = falcon
cluster1.password = rgautam
cluster1.hadoop_url = node-1.example.com:8020
cluster1.hadoop_location = /usr/lib/hadoop/bin/hadoop
cluster1.hostname = http://node-1.example.com:15000
cluster1.cluster_readonly = webhdfs://node-1.example.com:50070
cluster1.cluster_execute = node-1.example.com:8032
cluster1.cluster_write = hdfs://node-1.example.com:8020
cluster1.activemq_url = tcp://node-1.example.com:61616?daemon=true
cluster1.storeLocation = hdfs://node-1.example.com:8020/apps/falcon
cluster1.colo = default
cluster1.namenode.kerberos.principal = nn/node-1.example.com@none
cluster1.hive.metastore.kerberos.principal = hive/node-1.example.com@none
cluster1.hcat_endpoint = thrift://node-1.example.com:9083
cluster1.service_stop_cmd = /usr/lib/falcon/bin/falcon-stop
cluster1.service_start_cmd = /usr/lib/falcon/bin/falcon-start
```
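Because Merlin.properties is a plain Java properties file, a small script can sanity-check it for missing per-cluster keys before a long test run. The sketch below is purely illustrative (the list of required keys is an assumption, and this helper is not part of falcon-regression):

```shell
#!/bin/sh
# Illustrative sketch (not shipped with falcon-regression): for every
# cluster named in `servers`, check that a minimal set of properties
# is defined, and report any that are missing.
check_props() {
  props="$1"
  # Subset of keys the tests read; extend as needed (assumed list).
  required="oozie_url hadoop_url hostname storeLocation"
  servers=$(sed -n 's/^servers *= *//p' "$props" | tr ',' ' ')
  for s in $servers; do
    for k in $required; do
      grep -q "^$s\.$k *=" "$props" || echo "missing: $s.$k"
    done
  done
}

# Demo on a deliberately incomplete file:
cat > /tmp/demo.properties <<'EOF'
servers = cluster1
cluster1.oozie_url = http://node-1.example.com:11000/oozie/
cluster1.hadoop_url = node-1.example.com:8020
cluster1.hostname = http://node-1.example.com:15000
EOF
check_props /tmp/demo.properties
```

Running the demo reports the one key the sample file omits (`cluster1.storeLocation`); a clean file produces no output.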
To avoid cleaning the root tests directory before every test, set:
```
clean_tests_dir=false
```
On all clusters, as the user that started the falcon server, run:
```
hdfs dfs -mkdir -p /tmp/falcon-regression-staging
hdfs dfs -chmod 777 /tmp/falcon-regression-staging
hdfs dfs -mkdir -p /tmp/falcon-regression-working
hdfs dfs -chmod 755 /tmp/falcon-regression-working
```
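When you have several clusters, it can help to wrap these four commands in a function you run once per cluster. The sketch below uses the same paths and permissions as the commands above; the fallback stub that echoes the commands when `hdfs` is not on the PATH is an illustrative convenience for dry runs, not part of any official tooling:

```shell
#!/bin/sh
# Sketch: set up the staging and working dirs with one call per cluster.
# When the hdfs binary is absent (e.g. on a dev laptop), a stub echoes
# the commands instead so the script can be dry-run safely.
command -v hdfs >/dev/null 2>&1 || hdfs() { echo "hdfs $*"; }

setup_dirs() {
  hdfs dfs -mkdir -p /tmp/falcon-regression-staging
  hdfs dfs -chmod 777 /tmp/falcon-regression-staging
  hdfs dfs -mkdir -p /tmp/falcon-regression-working
  hdfs dfs -chmod 755 /tmp/falcon-regression-working
}

setup_dirs
```

Repeat (or ssh-invoke) `setup_dirs` on each cluster as the user that started the falcon server.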
After creating the Merlin.properties file, run the tests with:
```
cd falcon-regression
mvn clean test -Phadoop-2
```
Profiles Supported: hadoop-2
To run a specific test:
```
mvn clean test -Phadoop-2 -Dtest=EmbeddedPigScriptTest
```
If you want to use a specific version of any component, specify it with -D, e.g.:
```
mvn clean test -Phadoop-2 -Doozie.version=4.1.0 -Dhadoop.version=2.6.0
```
ACL tests require multiple user accounts to be set up:
```
other.user.name=root
falcon.super.user.name=falcon
falcon.super2.user.name=falcon2
```
ACL tests also require the group name of the current user:
```
current_user.group.name=users
```
For testing with Kerberos, set keytab properties for the different users:
```
current_user_keytab=/home/qa/hadoopqa/keytabs/qa.headless.keytab
falcon.super.user.keytab=/home/qa/hadoopqa/keytabs/falcon.headless.keytab
falcon.super2.user.keytab=/home/qa/hadoopqa/keytabs/falcon2.headless.keytab
other.user.keytab=/home/qa/hadoopqa/keytabs/root.headless.keytab
```
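A missing or unreadable keytab usually only surfaces partway through a secure-cluster run, so it can be worth checking the configured paths up front. The helper below is an illustrative sketch, not shipped code; it extracts every `*keytab` property value and reports paths that do not exist or are not readable:

```shell
#!/bin/sh
# Sketch: report keytab paths from Merlin.properties that are not
# readable files. Illustrative only -- not part of falcon-regression.
check_keytabs() {
  props="$1"
  sed -n 's/^[a-z_.0-9]*keytab *= *//p' "$props" | while read -r path; do
    [ -r "$path" ] || echo "unreadable keytab: $path"
  done
}

# Demo with one real file and one bogus path:
touch /tmp/qa.headless.keytab
cat > /tmp/keytabs.properties <<'EOF'
current_user_keytab=/tmp/qa.headless.keytab
other.user.keytab=/tmp/no-such.keytab
EOF
check_keytabs /tmp/keytabs.properties
```

On the demo file this prints only the bogus path; a fully provisioned host produces no output.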
If you wish to contribute to falcon regression, it's as easy as it gets. All test classes must be added to the directory:
falcon/falcon-regression/merlin/src/test/java
This directory contains subdirectories such as prism, ui, security, etc., which contain tests specific to those aspects of falcon. Any general test can be added directly to the parent directory above. If you wish to write a series of tests for a new feature, feel free to create a new subdirectory. Your test can use the various process/feed/cluster/workflow templates present in:
falcon/falcon-regression/merlin/src/test/resources
or you can add your own bundle of XMLs in this directory. Please avoid redundancy of any resource.
Each test class can contain multiple related tests. Let us look at a sample test class; refer to the comments in the code for guidance:
```java
//The License note must be added to each test
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.falcon.regression;

import org.apache.falcon.regression.core.bundle.Bundle;
import org.apache.falcon.regression.core.helpers.ColoHelper;
import org.apache.falcon.regression.core.response.ServiceResponse;
import org.apache.falcon.regression.core.util.AssertUtil;
import org.apache.falcon.regression.core.util.BundleUtil;
import org.apache.falcon.regression.testHelper.BaseTestClass;
import org.testng.annotations.AfterMethod;
import org.testng.annotations.BeforeMethod;
import org.testng.annotations.Test;

@Test(groups = "embedded")
//Every test class must inherit the BaseTestClass. This class
//helps using properties mentioned in Merlin.properties, in the test.
public class FeedSubmitTest extends BaseTestClass {

    private ColoHelper cluster = servers.get(0);
    private String feed;

    @BeforeMethod(alwaysRun = true)
    public void setUp() throws Exception {
        //Several Util classes are available, such as BundleUtil, which for example
        //has been used here to read the ELBundle present in
        //falcon/falcon-regression/src/test/resources
        bundles[0] = BundleUtil.readELBundle();
        bundles[0].generateUniqueBundle();
        bundles[0] = new Bundle(bundles[0], cluster);

        //submit the cluster
        ServiceResponse response =
            prism.getClusterHelper().submitEntity(bundles[0].getClusters().get(0));
        AssertUtil.assertSucceeded(response);
        feed = bundles[0].getInputFeedFromBundle();
    }

    @AfterMethod(alwaysRun = true)
    public void tearDown() {
        removeBundles();
    }

    //Java docs must be added for each test function, explaining what the function does
    /**
     * Submit correctly adjusted feed. Response should reflect success.
     *
     * @throws Exception
     */
    @Test(groups = {"singleCluster"})
    public void submitValidFeed() throws Exception {
        ServiceResponse response = prism.getFeedHelper().submitEntity(feed);
        AssertUtil.assertSucceeded(response);
    }

    /**
     * Submit and remove feed. Try to submit it again. Response should reflect success.
     *
     * @throws Exception
     */
    @Test(groups = {"singleCluster"})
    public void submitValidFeedPostDeletion() throws Exception {
        ServiceResponse response = prism.getFeedHelper().submitEntity(feed);
        AssertUtil.assertSucceeded(response);
        response = prism.getFeedHelper().delete(feed);
        AssertUtil.assertSucceeded(response);
        response = prism.getFeedHelper().submitEntity(feed);
        AssertUtil.assertSucceeded(response);
    }
}
```
This class, as the name suggests, tests the feed submission aspect of Falcon. It contains multiple test functions, all of which are test cases for the same feature. This organisation of code must be maintained.
In order to manipulate feeds, processes and clusters in the various tests, objects of the classes FeedMerlin, ProcessMerlin and ClusterMerlin can be used. Existing functions that use these objects, such as setProcessInput, setFeedValidity, setProcessConcurrency, setInputFeedPeriodicity etc. in Bundle.java, should serve most purposes.
As for other utilities: HadoopUtil creates and deletes HDFS directories and adds data on HDFS, OozieUtil queries Oozie for coordinator/workflow status, TimeUtil generates lists of dates and directories to aid data creation, HCatUtil provides HCatalog-related helpers, and many others make writing tests easy.
Coding conventions are strictly followed. Use the checkstyle XML present in falcon/checkstyle/src/main/resources/falcon in your project to avoid checkstyle errors.
Some tests switch user to run commands as a different user. The location of the binary used to switch users is configurable:
```
windows.su.binary=ExecuteAs.exe
```
For full falcon regression runs, it might be desirable to pull all Oozie job info and logs at the end of the test. This can be done by configuring Merlin.properties:
```
log.capture.oozie = true
log.capture.oozie.skip_info = false
log.capture.oozie.skip_log = true
log.capture.location = ../
```
Dumping entities generated by falcon
To dump the entities that falcon generates, add -Dmerlin.dump.staging to the Maven command. For example:
```
mvn clean test -Phadoop-2 -Dmerlin.dump.staging=true
```