GitHub limits its REST API to 5,000 calls per hour. As a result, collecting commit data through the GitHub API can take hours for a repo that has 10,000+ commits. To speed this up, DevLake introduces GitExtractor, a new plugin that collects git data by cloning the git repo instead of calling GitHub APIs.
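To get a rough sense of why API-based collection is slow, the arithmetic can be sketched as follows. The 5,000 calls-per-hour limit comes from the text above; `calls_per_commit` is a hypothetical parameter, since the number of API calls DevLake actually makes per commit is not stated here.

```python
import math

# Hourly REST API limit quoted in the text above.
GITHUB_CALLS_PER_HOUR = 5000


def estimated_hours(total_commits: int, calls_per_commit: float = 1.0) -> int:
    """Lower-bound estimate of the hours needed to stay within the rate limit.

    calls_per_commit is an illustrative assumption, not a measured DevLake value.
    """
    total_calls = math.ceil(total_commits * calls_per_commit)
    return math.ceil(total_calls / GITHUB_CALLS_PER_HOUR)


if __name__ == "__main__":
    # Try a few assumed call counts per commit for a 10,000-commit repo.
    for calls in (1, 2, 3):
        print(calls, estimated_hours(10_000, calls))
```

Cloning the repo avoids this budget entirely for commit data, which is the motivation for GitExtractor.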
Starting from v0.10.0, DevLake collects GitHub data with 2 separate plugins: the GitHub plugin, which calls the GitHub API, and GitExtractor, which clones the git repo.
Note that the GitLab plugin still collects commits via API by default, since GitLab has a much higher API rate limit.
This doc details the process of collecting GitHub data in v0.10.0. We're working on simplifying this process in upcoming releases.
Before you start, please make sure all services are running.
There are 3 steps.
Visit config-ui at http://localhost:4000 and click the GitHub icon.
Click the default connection ‘Github’ in the list.
Configure the connection by providing your GitHub API endpoint URL and your personal access token(s).
Click ‘Test Connection’ and, once the test succeeds, click ‘Save Connection’.
[Optional] Help DevLake understand your GitHub data by customizing the data enrichment rules shown below.
Pull Request Enrichment Options
Type: For PRs with a label that matches the given regular expression, the type property is set to the value of the first submatch. For example, with Type set to type/(.*)$, a PR with the label type/bug gets the type bug, and a PR with the label type/doc gets the type doc.
Component: Same as above, but for the component property.
Issue Enrichment Options
Severity: Same as above, but for issue.severity.
Component: Same as above.
Priority: Same as above.
Requirement: For issues with a label that matches the given regular expression, the type property is set to REQUIREMENT. Unlike PR.type, the submatch is ignored. For Issue Management Analysis, people tend to focus on 3 types (Requirement/Bug/Incident), but the concrete naming varies from repo to repo and over time, so we standardize them to help analysts build general-purpose metrics.
Bug: Same as above, with type set to BUG.
Incident: Same as above, with type set to INCIDENT.
Click ‘Save Settings’.
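The enrichment rules described above can be illustrated with a minimal Python sketch. This is not DevLake's implementation: the function names, the use of a search-style match, the "first matching label wins" behavior, and the Requirement/Bug/Incident precedence are all assumptions made for illustration, and the `kind/...` patterns in the usage example are hypothetical.

```python
import re
from typing import List, Optional


def first_submatch(labels: List[str], pattern: str) -> Optional[str]:
    """PR-style rule: return the first capture group of the first matching label.

    Assumption: the pattern is searched anywhere in the label, first match wins.
    """
    regex = re.compile(pattern)
    for label in labels:
        match = regex.search(label)
        if match and match.groups():
            return match.group(1)
    return None


def issue_type(labels: List[str], requirement: str, bug: str, incident: str) -> Optional[str]:
    """Issue-style rule: any matching label maps to a fixed, standardized type.

    The submatch is ignored here; the precedence order is an assumption.
    """
    rules = (("REQUIREMENT", requirement), ("BUG", bug), ("INCIDENT", incident))
    for type_name, pattern in rules:
        if any(re.search(pattern, label) for label in labels):
            return type_name
    return None


if __name__ == "__main__":
    # PR type taken from the submatch of type/(.*)$
    print(first_submatch(["priority/high", "type/bug"], r"type/(.*)$"))
    # Issue type standardized regardless of the repo's own naming (hypothetical labels)
    print(issue_type(["kind/feature"], r"kind/feature", r"kind/bug", r"kind/incident"))
```

The contrast between the two functions mirrors the text: PR properties keep whatever the capture group contains, while issue types are collapsed into three fixed values so that metrics stay comparable across repos.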

You'll be redirected to the newly created pipeline.

Wait for the pipeline to finish (progress 100%).

Select the GitExtractor plugin, enter your Git URL, and select the Repository ID from the dropdown menu.
Click ‘Run Pipeline’ and wait until it's finished.
Click View Dashboards in the top left corner of config-ui. The default Grafana username and password are both admin.

Please see How to create recurring pipelines for details.