These commands should work by copy-pasting.
# clone the repo ~ $ git clone --single-branch --branch main git@github.com:apache/lucene-jira-archive.git # move to the tool's directory ~ $ cd lucene-jira-archive/migration/ # download and unarchive the GitHub importable data (i will upload the tgz) migration $ wget https://home.apache.org/~tomoko/github-import-data.tgz migration $ tar xzf github-import-data.tgz migration $ tree -L 1 . ├── README.md ├── github-import-data ├── github-import-data.tgz ├── mappings-data ├── requirements.txt └── src migration $ ls -1 github-import-data GH-LUCENE-1.json GH-LUCENE-2.json GH-LUCENE-3.json ... # set the PAT token to an env variable migration $ cp .env.example .env migration $ vi .env export GITHUB_PAT=<set the personal access token here> # other lines don't need to be touched # set env variables from .env migration $ source .env # setup python virtual env # not that the script was tested with python 3.9 migration $ python -V Python 3.9.13 migration $ python -m venv .venv migration $ . .venv/bin/activate (.venv) migration $ pip install -r requirements.txt (.venv) migration $ pip freeze certifi==2022.6.15 charset-normalizer==2.0.12 idna==3.3 jira2markdown==0.2.1 pyparsing==2.4.7 python-dateutil==2.8.2 requests==2.28.0 six==1.16.0 urllib3==1.26.9
To make sure everything is correctly set up, you can import one issue for a trial. This command imports only LUCENE-1 to GitHub apache/lucene
repo.
(.venv) migration $ python src/import_github_issues.py --min 1
If the command is successfully done, you'll see an issue id mapping file mapping-data/issue-map.csv
. This will look like this.
(.venv) migration $ cat mappings-data/issue-map.csv JiraKey,GitHubUrl,GitHubNumber LUCENE-1,https://github.com/apache/lucene/issues/1080,1080
Once the test is done, please delete mapping-data/issue-map.csv
file and the imported issue (only admin accounts can delete an issue) before the actual migration.
Please specify the min
option to 1 and max
option to the maximum number of the Lucene Jira issue, that will be known by then.
(.venv) migration $ nohup python src/import_github_issues.py --min 1 --max <will be known> & # would take 24 hours
The import script outputs two files. Both are important for succeeding steps, please send me back them via any channels (e.g., attach them to the Jira issue).
migration $ ls log/import_github_issues_yyyy-mm-ddTHH:MM:SS.log migration $ ls mappings-data/issue-map.csv
Thank you!