These commands should work by copy-pasting.
# clone the repo
~ $ git clone --single-branch --branch main git@github.com:apache/lucene-jira-archive.git

# move to the tool's directory
~ $ cd lucene-jira-archive/migration/

# download and unarchive the GitHub importable data (we will upload the tgz)
migration $ wget https://home.apache.org/~tomoko/github-import-data.tgz
migration $ tar xzf github-import-data.tgz
migration $ tree -L 1
.
├── README.md
├── github-import-data
├── github-import-data.tgz
├── mappings-data
├── requirements.txt
└── src
migration $ ls -1 github-import-data
GH-LUCENE-1.json
GH-LUCENE-2.json
GH-LUCENE-3.json
...
GH-LUCENE-10676.json
GH-LUCENE-10677.json

# set the GitHub PAT token to an env variable
migration $ cp .env.example .env
migration $ vi .env
export GITHUB_PAT=<set the personal access token to be used for importing here>
# other lines don't need to be touched

# set env variables from .env
migration $ source .env

# setup python virtual env
# note that the script was tested with python 3.9
migration $ python -V
Python 3.9.13
migration $ python -m venv .venv
migration $ . .venv/bin/activate
(.venv) migration $ pip install -r requirements.txt
(.venv) migration $ pip freeze
certifi==2022.6.15
charset-normalizer==2.0.12
idna==3.3
jira2markdown==0.2.1
pyparsing==2.4.7
python-dateutil==2.8.2
requests==2.28.0
six==1.16.0
urllib3==1.26.9
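If you would like to double-check the setup before importing anything, a small pre-flight check along the following lines may help. This is only a sketch: find_setup_problems is a hypothetical helper that is not part of the migration tool, and it assumes the GITHUB_PAT variable and the github-import-data directory layout shown above.

```python
import os
import re
from pathlib import Path

# Hypothetical helper for illustration; it is NOT part of the migration tool.
# File names are assumed to follow the GH-LUCENE-<n>.json pattern listed above.
FILE_PATTERN = re.compile(r"^GH-LUCENE-(\d+)\.json$")


def find_setup_problems(data_dir="github-import-data", env=None):
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = []
    # The import needs the token exported by `source .env`.
    if not env.get("GITHUB_PAT"):
        problems.append("GITHUB_PAT is not set (did you run 'source .env'?)")
    data_path = Path(data_dir)
    if not data_path.is_dir():
        problems.append(f"{data_dir} does not exist (did you unpack the tgz?)")
    elif not any(FILE_PATTERN.match(p.name) for p in data_path.iterdir()):
        problems.append(f"{data_dir} contains no GH-LUCENE-<n>.json files")
    return problems


if __name__ == "__main__":
    for problem in find_setup_problems():
        print("PROBLEM:", problem)
```

Run it from the migration directory after source .env; if it prints nothing, the token and the import data were both found.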
To make sure everything is set up correctly, you can import a single issue as a trial. This command imports only LUCENE-1 into the GitHub apache/lucene repo.
(.venv) migration $ python src/import_github_issues.py --min 1
If the command succeeds, you'll see an issue id mapping file, mappings-data/issue-map.csv. It will look like this.
(.venv) migration $ cat mappings-data/issue-map.csv
JiraKey,GitHubUrl,GitHubNumber
LUCENE-1,https://github.com/apache/lucene/issues/1080,1080
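For reference, the mapping file is a plain CSV and can be read with Python's standard csv module. The sketch below uses the three column names shown above; parse_issue_map is a hypothetical helper name, not something the tool provides.

```python
import csv
import io


def parse_issue_map(lines):
    """Map each Jira key to its GitHub issue number.

    `lines` is any iterable of CSV lines, e.g. an open file object.
    """
    reader = csv.DictReader(lines)
    return {row["JiraKey"]: int(row["GitHubNumber"]) for row in reader}


# The sample content shown above, inlined so the example is self-contained.
sample = io.StringIO(
    "JiraKey,GitHubUrl,GitHubNumber\n"
    "LUCENE-1,https://github.com/apache/lucene/issues/1080,1080\n"
)
print(parse_issue_map(sample))  # {'LUCENE-1': 1080}
```

To read the real file, pass an open file instead, for example open("mappings-data/issue-map.csv", newline="").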
Once the test is done, please delete the mappings-data/issue-map.csv file and the imported issue (only admin accounts can delete an issue) before the actual migration.
For the actual migration, please set the --min option to 1 and the --max option to the highest Lucene Jira issue number, which will be known by then.
(.venv) migration $ nohup python src/import_github_issues.py --min 1 --max <will be known> & # would take 24 hours
The import script outputs two files. Both are needed for the subsequent steps, so please send them back to us via any channel (e.g., attach them to the Jira issue).
migration $ ls log/import_github_issues_yyyy-mm-ddTHH:MM:SS.log  # log file
migration $ ls mappings-data/issue-map.csv  # Jira - GitHub issue id mapping file
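Before sending the files, you may want to see which Jira keys in the requested range have no entry in the mapping file. The following is a rough sketch under two assumptions: the CSV has a JiraKey column as in the trial run, and keys have the form LUCENE-<n>. The function missing_jira_keys is hypothetical and not part of the tool. A missing key is not necessarily an error, so treat the result as a pointer into the log file, nothing more.

```python
import csv


def missing_jira_keys(csv_lines, min_id, max_id):
    """List keys LUCENE-<min_id>..LUCENE-<max_id> absent from the mapping CSV."""
    mapped = {row["JiraKey"] for row in csv.DictReader(csv_lines)}
    return [
        f"LUCENE-{n}"
        for n in range(min_id, max_id + 1)
        if f"LUCENE-{n}" not in mapped
    ]


# Tiny inline example; URLs and numbers here are made up for illustration.
example = [
    "JiraKey,GitHubUrl,GitHubNumber",
    "LUCENE-1,https://github.com/apache/lucene/issues/1080,1080",
    "LUCENE-3,https://github.com/apache/lucene/issues/1081,1081",
]
print(missing_jira_keys(example, 1, 4))  # ['LUCENE-2', 'LUCENE-4']
```

For the real file, pass open("mappings-data/issue-map.csv", newline="") together with the same --min and --max values used for the import.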
Thank you!