| ## Structure of Test Framework |
| <pre><code> |
| drill-test-framework |
| |_ framework |
| |_ resources |
| |_ Functional (default location for test suites) |
| |_ Advanced (test suites requiring large datasets) |
| |_ Datasources (datasets and scripts) |
| |_ bin |
| |_ build_framework (script used to compile the framework) |
| |_ run_tests (script used to execute tests) |
| </code></pre> |
| |
| ## Adding Tests |
| 1. In the [Functional](framework/resources/Functional) / [Advanced](framework/resources/Advanced) directory, create a directory for a new test suite (or `cd` to an existing test suite). |
| 2. In the test suite directory, add testcases, expected results, and test definition file(s). Optionally include test suite sub directories to organize tests. |
| 3. Pairs of testcase (ex: `query1.sql`) and expected result files (ex: `query1.e_tsv`) are co-located and share the same name. |
| 4. You could generate expected result files using Postgres or any such database. |
| 5. In the [Datasources][framework/resources/Datasources] directory, create corresponding datasource directories and copy over any required scripts and datasets required by the tests. |
| 6. Refer to [README](framework/resources/Advanced/README.md) for downloading datasets required for the Advanced tests. |
| |
| ## Add testcases |
| A testcase consists a query file and a baseline file containing the expected result. The query file contains a list of SQL statements. Only one query statement that proceeded with a "--@test" line (can't contain ";") is verified against the baseline. If no query meets that requirement, the middle query is verified. Here is an example: |
| |
| ``` |
| -- Some setup queries |
| setup query one; |
| ... |
| --@test The following query will be verified. |
| select * from table; |
| -- Some teardown queries |
| teardown query one; |
| ... |
| ``` |
| |
| |
| ## Structure of test definition files. |
| A test definition file is a JSON file that defines one or a set of tests within a directory. The framework scans for files with .json extension in the specified location(s) and executes all test(s) defined. |
| |
| <pre><code> |
| { |
| "testId": "...", // Unique identification string for test suite. Ex: window_function_tests |
| "type": "group", // Group of tests |
| "negative": false, // Tests are positive or negative (expected success or expected failures) |
| "description": "...", // Description for test suites |
| "submit-type": "jdbc", // Currently the support types: jdbc, odbc, script |
| "script": "Functional/script_location/script.sh paras" // Specify the test script to execute |
| "queryType": "sql", |
| "timeout": null, |
| "categories": [ |
| "...", // Labels for categorizing test suites. Ex: smoke |
| "..." |
| ], |
| "dependencies": [ |
| "...", // Additional dependencies in setup. Ex: hive, hbase, maprdb |
| "..." |
| ], |
| "matrices": [ |
| { |
| "query-file": `".*.sql"`, // File name pattern of query files (REGEX). The example will match any filename with ".sql" extension |
| "schema": "dfs.tmp", // Drill storage plugin.workspace, or just the storage plugin name in case of hive, hbase |
| "output-format": "tsv", // Delimiter to seperate columns in expected result files |
| "expected-file": `".*.e_tsv"`, // File name pattern of expected result files (REGEX) |
| "username": "user", |
| "password": "pass", |
| "verification-type": [ // Supported type "in-memory", "regex" |
| "in-memory", |
| "..." |
| ] |
| } |
| ], |
| "datasources": [ |
| { |
| "mode": "cp", // Use "cp" to copy datasets |
| "src": "`<Source path on local file system>`", |
| "dest": "`<Destination path on DFS>`" |
| }, |
| { |
| "mode": "gen", // Use "gen" to execute scripts |
| "src": "`<Source path on local file system>`", |
| "dest": "" |
| }, |
| ... |
| ] |
| } |
| </code></pre> |
| |
| ### Special handling for Drill version in regex based baseline verification |
| |
| The Drill version in a regex based baseline file can be substituted with "{DRILL_VERSION}". |
| At runtime, it will be replaced with actual Drill version and verified against the running version. |
| |
| ### Special handling for order-by tests |
| |
| The Test Framework is able to verify the order in which rows are returned for |
| some SQL statements that have an order-by clause. If the columns in the order-by |
| clause are also projected (present) in the output, then the Test Framework may |
| be able to check the order of the rows that are returned. Here are some |
| requirements for a query to be validated by the Test Framework: |
| |
| 1) All columns/fields in the order-by clause must appear in the projection list |
| 2) Expressions cannot be used in the order-by clause. Things like |
| "order by column.field[2]". The [2] indicates an expression which is the |
| third element in the field array. |
| 3) Referencing a field within a json string in a column is more complicated. |
| Most cases are supported. Some cases may not work. |
| 4) If a query references more than one table, then use aliases for each column |
| in the projection list, and reference these aliases in the order-by clause. |
| Using aliases is a good practice in general when verifying an order-by clause. |
| 5) The order-by clause cannot be followed by another SQL operation except for |
| limit. If the order-by clause is followed by an offset or collate, for |
| example, it might not work. |
| |
| Here are some queries that can be validated: |
| |
| 1) select id, gbyi from `complex.json` order by id limit 10; |
| |
| | id | gbyi | |
| |-----|-------| |
| | 1 | 1 | |
| | 2 | 2 | |
| | 3 | 3 | |
| | 4 | 4 | |
| | 5 | 5 | |
| | 6 | 6 | |
| | 7 | 7 | |
| | 8 | 8 | |
| | 9 | 9 | |
| | 10 | 10 | |
| |
| This query can be validated because the results are ordered by the "id" column, |
| and the "id" column is projected in the output. The Test Framework can |
| examine the output and verify that the rows are in order. |
| |
| 2) select * from |
| (select d.uid uid, flatten(d.events) evnts from `data.json` d order by d.uid) s |
| order by s.evnts.event_time, s.evnts.campaign_id; |
| |
| | uid | evnts | |
| |-------|--------------------------------------------------------------------------------------------------| |
| | 1 | {"event_name":"e1_name","event_time":1000000,"type":"cmpgn1"} | |
| | 1 | {"event_name":"e2_name","event_time":2000000,"type":"cmpgn4","evnt_id":"e2","campaign_id":"c1"} | |
| | null | {"event_name":"e2_name","event_time":2000000,"type":"cmpgn4","evnt_id":"e2","campaign_id":"c1"} | |
| | 1 | {"event_name":"e3_name","event_time":3000000,"type":"cmpgn1","evnt_id":"e3","campaign_id":"c1"} | |
| | null | {"event_name":"e3_name","event_time":3000000,"type":"cmpgn1","evnt_id":"e3","campaign_id":"c1"} | |
| | null | {"event_name":"e4_name","event_time":4000000,"type":"cmpgn1","evnt_id":"e4","campaign_id":"c1"} | |
| | 1 | {"event_name":"e5_name","event_time":5000000,"type":"cmpgn3","evnt_id":"e5","campaign_id":"c2"} | |
| | null | {"event_time":6000000,"type":"cmpgn9","evnt_id":"e6","campaign_id":"c1"} | |
| | 1 | {"event_name":"e6_name","event_time":6000000,"type":"cmpgn9","evnt_id":"e6"} | |
| | null | {"event_name":"e7_name","event_time":7000000,"type":"cmpgn3","evnt_id":"e7","campaign_id":"c1"} | |
| | null | {"event_name":"e8_name","event_time":8000000,"type":"null","evnt_id":"e8","campaign_id":"c2"} | |
| | 1 | {"event_name":"e8_name","event_time":8000000,"type":"cmpgn2","evnt_id":"e8","campaign_id":"c2"} | |
| | null | {"event_time":9000000,"type":"cmpgn4","evnt_id":"e9","campaign_id":"c2"} | |
| | 1 | {"event_name":"e9_name","event_time":9000000,"type":"cmpgn4","evnt_id":"e9"} | |
| | 1 | {"event_name":"e7_name","type":"cmpgn3","evnt_id":"e7","campaign_id":"c1"} | |
| | 1 | {"event_name":"e4_name","type":"cmpgn1","evnt_id":"e4","campaign_id":"c1"} | |
| | null | {"event_name":"e1_name","type":"cmpgn9","campaign_id":"c1"} | |
| | null | {"event_name":"e5_name","type":"cmpgn2","evnt_id":"e5","campaign_id":"c2"} | |
| |
| This query can be validated because the results are ordered by the data in the |
| "evnts" column, and the "evnts" column is projected in the output. The Test |
| Framework can parse the JSON string in the "evnts" column and examine the |
| event_time and campaign_id values. |
| |
| |
| These queries cannot be validated: |
| |
| 1) select t.gbyt, t.id, t.ooa[0].`in` zeroin, t.ooa[1].fl.f1 flf1, t.ooa[1].fl.f2 flf2, t.ooa[1].`in` onein, t.ooa[2].a.aa.aaa, t.ooa[2].b.bb.bbb, t.ooa[2].c.cc.ccc from `complex.json` t where t.ooa[2].b.bb.bbb is not null order by t.ooa[2].c.cc.ccc limit 10; |
| |
| | gbyt | id | zeroin | flf1 | flf2 | onein | EXPR$6 | EXPR$7 | EXPR$8 | |
| |-------|--------|---------|-------------|-----------|--------|------------|------------|------------| |
| | aaa | 10 | null | null | null | 10 | aaa 10 | bbb 10 | ccc 10 | |
| | ooos | 1000 | null | null | null | 1000 | aaa 1000 | bbb 1000 | ccc 1000 | |
| | nul | 10002 | null | null | null | 10002 | aaa 10002 | bbb 10002 | ccc 10002 | |
| | sba | 10003 | 10003 | 10003.6789 | 154351.0 | 10003 | aaa 10003 | bbb 10003 | ccc 10003 | |
| | str | 10008 | 10008 | null | null | 10008 | aaa 10008 | bbb 10008 | ccc 10008 | |
| | fl | 10009 | 10009 | null | null | 10009 | aaa 10009 | bbb 10009 | ccc 10009 | |
| | saa | 1001 | null | 1001.6789 | 64331.0 | 1001 | aaa 1001 | bbb 1001 | ccc 1001 | |
| | soa | 10023 | null | null | null | 10023 | aaa 10023 | bbb 10023 | ccc 10023 | |
| | nul | 10028 | null | 10028.6789 | 154601.0 | 10028 | aaa 10028 | bbb 10028 | ccc 10028 | |
| | ooos | 10029 | null | 10029.6789 | 154611.0 | 10029 | aaa 10029 | bbb 10029 | ccc 10029 | |
| |
| This query cannot be validated because the order-by has an expression, |
| "t.ooa[2].c.cc.ccc". The Test Framework cannot evaluate the array reference |
| "ooa[2]". |
| |
| 2) select id from `complex.json` order by gbyi limit 10; |
| |
| | id | |
| |------| |
| | 106 | |
| | 121 | |
| | 91 | |
| | 46 | |
| | 61 | |
| | 31 | |
| | 1 | |
| | 76 | |
| | 16 | |
| | 136 | |
| |
| This query cannot be validated because the "gbyi" column is not projected so |
| the Test Framework cannot determine what the order should be. |