commit	71a2d79bb61fba46b4466b626596a431a992cfbc
author	wankunde <wankunde@163.com>	Fri Nov 01 21:56:28 2019 +0800
committer	William Guo <guoyp@apache.org>	Fri Nov 01 21:56:28 2019 +0800
tree	3730e1332ea5c75a57af47db07f27294ddacc697
parent	0119a19c93d0607ecfda2c5e029169e689c2bfd0

[GRIFFIN-295] Limit the memory used by test case

The container memory size is 3G in Travis, but our test cases always use more than 3G of memory, so `Cannot allocate memory` is thrown.

```
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000fe980000, 23592960, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 23592960 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/travis/build/apache/griffin/measure/hs_err_pid11948.log
#
[ timer expired, abort... ]
```

There are two kinds of programs in our tests: the Maven main program, and the tests run by maven-surefire-plugin and scalatest-maven-plugin. If memory is unlimited, test cases will occupy as much memory as possible, especially Spark jobs. Spark jobs do not free memory until a full GC occurs, even after the Spark context has been stopped, so we need to limit the memory used by test cases.

We can limit the memory used by Maven by setting `export MAVEN_OPTS=" -Xmx1024m -XX:ReservedCodeCacheSize=128m"`, and we can limit the memory used by Spark job tests by configuring the maven-surefire-plugin and scalatest-maven-plugin.

For example, before we limit the memory used, the Maven program occupies 1.5G of memory and the Spark job occupies 1.8G of memory.

<img width="1153" alt="1" src="https://user-images.githubusercontent.com/3626747/67956554-40108e00-fc2f-11e9-83de-d0840fb42cb7.png">
<img width="1150" alt="2" src="https://user-images.githubusercontent.com/3626747/67956567-46066f00-fc2f-11e9-8a73-6d141be28e70.png">

After we limit the memory used, the Maven program occupies 1G of memory and the Spark job occupies 1G of memory.

<img width="1142" alt="3" src="https://user-images.githubusercontent.com/3626747/67956579-4999f600-fc2f-11e9-9cd4-9032966ca923.png">
<img width="1139" alt="4" src="https://user-images.githubusercontent.com/3626747/67956586-4dc61380-fc2f-11e9-800b-1d26d637a479.png">

Author: wankunde <wankunde@163.com>

Closes #546 from wankunde/testcase_memory_limit.
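The commit message mentions configuring maven-surefire-plugin and scalatest-maven-plugin but does not show the settings. A minimal sketch of what such a `pom.xml` fragment might look like is given below. It assumes the limit is applied through each plugin's `argLine` option (JVM options for the forked test process). The heap and code cache sizes are illustrative assumptions chosen to match the 1G figure above, not the values taken from this commit.

```xml
<!-- Sketch only: the -Xmx and code cache values are assumptions, not the commit's actual settings. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-plugin</artifactId>
      <configuration>
        <!-- JVM options passed to the forked JVM that runs the Java tests -->
        <argLine>-Xmx1g -XX:ReservedCodeCacheSize=128m</argLine>
      </configuration>
    </plugin>
    <plugin>
      <groupId>org.scalatest</groupId>
      <artifactId>scalatest-maven-plugin</artifactId>
      <configuration>
        <!-- JVM options passed to the forked JVM that runs the ScalaTest (Spark) tests -->
        <argLine>-Xmx1g -XX:ReservedCodeCacheSize=128m</argLine>
      </configuration>
    </plugin>
  </plugins>
</build>
```

Capping the forked test JVMs separately from `MAVEN_OPTS` matters because `MAVEN_OPTS` only bounds the Maven main process. The forked test processes get their own heap settings.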

README.md

Apache Griffin

Data quality (DQ) is a key criterion for many data consumers such as IoT and machine learning; however, there is no standard agreement on how to determine “good” data. Apache Griffin is a model-driven data quality service platform where you can examine your data on demand. It provides a standard process to define data quality measures, executions and reports, allowing those examinations to run across multiple data systems. When you don't trust your data, or are concerned that poorly controlled data can negatively impact critical decisions, you can use Apache Griffin to ensure data quality.

Getting Started

Quick Start

You can try running Griffin in Docker by following the docker guide.

Environment for Dev

Follow the Apache Griffin Development Environment Build Guide to set up a development environment.
If you want to contribute code to Griffin, please follow the Apache Griffin Development Code Style Config Guide to keep the code style consistent.

Deployment at Local

If you want to deploy Griffin in your local environment, please follow the Apache Griffin Deployment Guide.

Community

For more information about Griffin, please visit our website at: griffin home page.

You can contact us via email:

dev-list: dev@griffin.apache.org
user-list: users@griffin.apache.org

You can also subscribe to the latest information by sending an email to the subscription address of the dev-list or the user-list:

dev-subscribe@griffin.apache.org
users-subscribe@griffin.apache.org

You can access our issues on the JIRA page.

Contributing

See How to Contribute for details on how to contribute code, documentation, etc.

Here's the most direct way to get your work merged into Apache Griffin.

Fork the project from GitHub
Clone down your fork
Implement your feature or bug fix and commit changes
Push the branch up to your fork
Send a pull request to Apache Griffin master branch