commit | 0c4a93fa8c685e332aaf60460bc50690bd042e7a | [log] [tgz] |
---|---|---|
author | Prasad Nagaraj Subramanya <prasadns14@gmail.com> | Mon Sep 18 10:56:44 2017 -0700 |
committer | Abhishek Girish <agirish@apache.org> | Mon Sep 18 10:56:44 2017 -0700 |
tree | 291e09ec43323cb5c9c3c8a0824c62d4e3799e12 | |
parent | 27aa15918f8f4880340f9df0632945ca359a1bb8 [diff] |
MD-734: Remove multiple copies of execHive.sh (#322)
Test Framework for SQL on Hadoop technologies. Currently supports Apache Drill, a schema-free SQL query engine for Hadoop, NoSQL and cloud storage.
The framework is built for regression, integration & sanity testing. Includes test coverage (with baselines) for core Drill functionality, and supported features. And are used by the Apache Drill community for pre-commit regression and part of the release criteria.
Refer to Github documentation on how to clone a repository.
cd
to an existing test suite).query1.sql
) and expected result files (ex: query1.e_tsv
) are co-located and share the same name.A testcase consists a query file and a baseline file containing the expected result. The query file contains a list of SQL statements. Only one query statement that proceeded with a “--@test” line (can't contain “;”) is verified against the baseline. If no query meets that requirement, the middle query is verified. Here is an example:
-- Some setup queries setup query one; ... --@test The following query will be verified. select * from table; -- Some teardown queries teardown query one; ...
A test definition file is a JSON file that defines one or a set of tests within a directory. The framework scans for files with .json extension in the specified location(s) and executes all test(s) defined.
In the root directory of you repository, issue
bin/build_framework -Pdownload
to build the project and also download any dependent datasets configured in pom.xml
. If you already downloaded the dependent datasets, you can simply issue
bin/build_framework
In the root directory of your repository, execute the following command to run tests:
bin/run_tests -s <suites> -g <groups> -t <Timeout> -x <Exclude> -n <Concurrency> -d
Example:
The Drill version in a regex based baseline file can be substituted with “{DRILL_VERSION}”. At runtime, it will be replaced with actual Drill version and verified against the running version.
The Test Framework is able to verify the order in which rows are returned for some SQL statements that have an order-by clause. If the columns in the order-by clause are also projected (present) in the output, then the Test Framework may be able to check the order of the rows that are returned. Here are some requirements for a query to be validated by the Test Framework:
Here are some queries that can be validated:
complex.json
order by id limit 10;+-----+-------+ | id | gbyi | +-----+-------+ | 1 | 1 | | 2 | 2 | | 3 | 3 | | 4 | 4 | | 5 | 5 | | 6 | 6 | | 7 | 7 | | 8 | 8 | | 9 | 9 | | 10 | 10 | +-----+-------+
This query can be validated because the results are ordered by the “id” column, and the “id” column is projected in the output. The Test Framework can examine the output and verify that the rows are in order.
data.json
d order by d.uid) s order by s.evnts.event_time, s.evnts.campaign_id;+-------+--------------------------------------------------------------------------------------------------+ | uid | evnts | +-------+--------------------------------------------------------------------------------------------------+ | 1 | {“event_name”:“e1_name”,“event_time”:1000000,“type”:“cmpgn1”} | | 1 | {“event_name”:“e2_name”,“event_time”:2000000,“type”:“cmpgn4”,“evnt_id”:“e2”,“campaign_id”:“c1”} | | null | {“event_name”:“e2_name”,“event_time”:2000000,“type”:“cmpgn4”,“evnt_id”:“e2”,“campaign_id”:“c1”} | | 1 | {“event_name”:“e3_name”,“event_time”:3000000,“type”:“cmpgn1”,“evnt_id”:“e3”,“campaign_id”:“c1”} | | null | {“event_name”:“e3_name”,“event_time”:3000000,“type”:“cmpgn1”,“evnt_id”:“e3”,“campaign_id”:“c1”} | | null | {“event_name”:“e4_name”,“event_time”:4000000,“type”:“cmpgn1”,“evnt_id”:“e4”,“campaign_id”:“c1”} | | 1 | {“event_name”:“e5_name”,“event_time”:5000000,“type”:“cmpgn3”,“evnt_id”:“e5”,“campaign_id”:“c2”} | | null | {“event_time”:6000000,“type”:“cmpgn9”,“evnt_id”:“e6”,“campaign_id”:“c1”} | | 1 | {“event_name”:“e6_name”,“event_time”:6000000,“type”:“cmpgn9”,“evnt_id”:“e6”} | | null | {“event_name”:“e7_name”,“event_time”:7000000,“type”:“cmpgn3”,“evnt_id”:“e7”,“campaign_id”:“c1”} | | null | {“event_name”:“e8_name”,“event_time”:8000000,“type”:“null”,“evnt_id”:“e8”,“campaign_id”:“c2”} | | 1 | {“event_name”:“e8_name”,“event_time”:8000000,“type”:“cmpgn2”,“evnt_id”:“e8”,“campaign_id”:“c2”} | | null | {“event_time”:9000000,“type”:“cmpgn4”,“evnt_id”:“e9”,“campaign_id”:“c2”} | | 1 | {“event_name”:“e9_name”,“event_time”:9000000,“type”:“cmpgn4”,“evnt_id”:“e9”} | | 1 | {“event_name”:“e7_name”,“type”:“cmpgn3”,“evnt_id”:“e7”,“campaign_id”:“c1”} | | 1 | {“event_name”:“e4_name”,“type”:“cmpgn1”,“evnt_id”:“e4”,“campaign_id”:“c1”} | | null | {“event_name”:“e1_name”,“type”:“cmpgn9”,“campaign_id”:“c1”} | | null | {“event_name”:“e5_name”,“type”:“cmpgn2”,“evnt_id”:“e5”,“campaign_id”:“c2”} | +-------+--------------------------------------------------------------------------------------------------+
This query can be validated because the results are ordered by the data in the “evnts” column, and the “evnts” column is projected in the output. The Test Framework can parse the JSON string in the “evnts” column and examine the event_time and campaign_id values.
These queries cannot be validated:
in
zeroin, t.ooa[1].fl.f1 flf1, t.ooa[1].fl.f2 flf2, t.ooa[1].in
onein, t.ooa[2].a.aa.aaa, t.ooa[2].b.bb.bbb, t.ooa[2].c.cc.ccc from complex.json
t where t.ooa[2].b.bb.bbb is not null order by t.ooa[2].c.cc.ccc limit 10;+-------+--------+---------+-------------+-----------+--------+------------+------------+------------+ | gbyt | id | zeroin | flf1 | flf2 | onein | EXPR$6 | EXPR$7 | EXPR$8 | +-------+--------+---------+-------------+-----------+--------+------------+------------+------------+ | aaa | 10 | null | null | null | 10 | aaa 10 | bbb 10 | ccc 10 | | ooos | 1000 | null | null | null | 1000 | aaa 1000 | bbb 1000 | ccc 1000 | | nul | 10002 | null | null | null | 10002 | aaa 10002 | bbb 10002 | ccc 10002 | | sba | 10003 | 10003 | 10003.6789 | 154351.0 | 10003 | aaa 10003 | bbb 10003 | ccc 10003 | | str | 10008 | 10008 | null | null | 10008 | aaa 10008 | bbb 10008 | ccc 10008 | | fl | 10009 | 10009 | null | null | 10009 | aaa 10009 | bbb 10009 | ccc 10009 | | saa | 1001 | null | 1001.6789 | 64331.0 | 1001 | aaa 1001 | bbb 1001 | ccc 1001 | | soa | 10023 | null | null | null | 10023 | aaa 10023 | bbb 10023 | ccc 10023 | | nul | 10028 | null | 10028.6789 | 154601.0 | 10028 | aaa 10028 | bbb 10028 | ccc 10028 | | ooos | 10029 | null | 10029.6789 | 154611.0 | 10029 | aaa 10029 | bbb 10029 | ccc 10029 | +-------+--------+---------+-------------+-----------+--------+------------+------------+------------+
This query cannot be validated because the order-by has an expression, “t.ooa[2].c.cc.ccc”. The Test Framework cannot evaluate the array reference “ooa[2]”.
complex.json
order by gbyi limit 10;+------+ | id | +------+ | 106 | | 121 | | 91 | | 46 | | 61 | | 31 | | 1 | | 76 | | 16 | | 136 | +------+
This query cannot be validated because the “gbyi” column is not projected so the Test Framework cannot determine what the order should be.
Zhiyong
Chun
Krystal
Ramana
Abhishek
Vicky
Suresh
Khurram
Rahul
Steven
Jacques
Jason
Sudheesh
Robert