Python 2 (not python 3)
Install the Usergrid Python SDK: https://github.com/jwest-apigee/usergrid-python
With Pip (requires python-pip to be installed): pip install usergrid
With Pip (requires python-pip to be installed): pip install usergrid-tools
The purpose of this document is to provide an overview of the Python Script provided in the same directory which allows you to migrate data, connections and users from one Usergrid platform / org / app to another. This can be used in the upgrade process from Usergrid 1.0 to 2.x since there is no upgrade path.
This script functions by taking source and target endpoint configurations (with credentials) and a set of command-line parameters to read data from one Usergrid instance and write to another. It is written in Python and requires Python 2.7.6+.
There are multiple processes at work in the migration to speed the process up. There is a main thread which reads entities from the API and then publishes the entities with metadata into a Python Queue which has multiple worker processes listening for work. The number of worker threads is configurable by command line parameters.
Usergrid is a Graph database and allows for connections between entities. In order for a connection to be made, both the source entity and the target entity must exist. Therefore, in order to migrate connections it is adviseable to first migrate all the data and then all the connections associated with that data.
As with any migration process there is a source and a target. The source and target have the following parameters:
When iterating a graph it is possible to get stuck in a loop. For example:
A --follows--> B B --likes--> C C --loves--> A
There are two options to prevent getting stuck in a loop:
graph_depth
option - this will limit the graph depth which will be traversed from a given entity.Redis can be used for the following:
If using Redis, version 2.8+ is needed because TTL is used with the ‘ex’ parameter.
Using this script it is not necessary to keep the same application name, org name and/or collection name as the source at the target. For example, you could migrate from /myOrg/myApp/myCollection to /org123/app456/collections789.
Example source/target configuration files:
{ "endpoint": { "api_url": "https://api.usergrid.com" }, "credentials": { "myOrg1": { "client_id": "YXA6lpg9sEaaEeONxG0g3Uz44Q", "client_secret": "ZdF66u2h3Hc7csOcsEtgewmxalB1Ygg" }, "myOrg2": { "client_id": "ZXf63p239sDaaEeONSG0g3Uz44Z", "client_secret": "ZdF66u2h3Hc7csOcsEtgewmxajsadfj32" } } }
Usergrid Org/App Data Migrator optional arguments: -h, --help show this help message and exit --log_dir LOG_DIR path to the place where logs will be written --log_level LOG_LEVEL log level - DEBUG, INFO, WARN, ERROR, CRITICAL -o ORG, --org ORG Name of the org to migrate -a APP, --app APP Name of one or more apps to include, specify none to include all apps -e INCLUDE_EDGE, --include_edge INCLUDE_EDGE Name of one or more edges/connection types to INCLUDE, specify none to include all edges --exclude_edge EXCLUDE_EDGE Name of one or more edges/connection types to EXCLUDE, specify none to include all edges --exclude_collection EXCLUDE_COLLECTION Name of one or more collections to EXCLUDE, specify none to include all collections -c COLLECTION, --collection COLLECTION Name of one or more collections to include, specify none to include all collections --use_name_for_collection USE_NAME_FOR_COLLECTION Name of one or more collections to use [name] instead of [uuid] for creating entities and edges -m {data,none,reput,credentials,graph}, --migrate {data,none,reput,credentials,graph} Specifies what to migrate: data, connections, credentials, audit or none (just iterate the apps/collections) -s SOURCE_CONFIG, --source_config SOURCE_CONFIG The path to the source endpoint/org configuration file -d TARGET_CONFIG, --target_config TARGET_CONFIG The path to the target endpoint/org configuration file --limit LIMIT The number of entities to return per query request -w ENTITY_WORKERS, --entity_workers ENTITY_WORKERS The number of worker processes to do the migration --visit_cache_ttl VISIT_CACHE_TTL The TTL of the cache of visiting nodes in the graph for connections --error_retry_sleep ERROR_RETRY_SLEEP The number of seconds to wait between retrieving after an error --page_sleep_time PAGE_SLEEP_TIME The number of seconds to wait between retrieving pages from the UsergridQueryIterator --entity_sleep_time ENTITY_SLEEP_TIME The number of seconds to wait between retrieving pages from the UsergridQueryIterator --collection_workers COLLECTION_WORKERS The number of worker processes to do the migration --queue_size_max QUEUE_SIZE_MAX The max size of entities to allow in the queue --graph_depth GRAPH_DEPTH The graph depth to traverse to copy --queue_watermark_high QUEUE_WATERMARK_HIGH The point at which publishing to the queue will PAUSE until it is at or below low watermark --min_modified MIN_MODIFIED Break when encountering a modified date before this, per collection --max_modified MAX_MODIFIED Break when encountering a modified date after this, per collection --queue_watermark_low QUEUE_WATERMARK_LOW The point at which publishing to the queue will RESUME after it has reached the high watermark --ql QL The QL to use in the filter for reading data from collections --skip_cache_read Skip reading the cache (modified timestamps and graph edges) --skip_cache_write Skip updating the cache with modified timestamps of entities and graph edges --create_apps Create apps at the target if they do not exist --nohup specifies not to use stdout for logging --graph Use GRAPH instead of Query --su_username SU_USERNAME Superuser username --su_password SU_PASSWORD Superuser Password --inbound_connections Name of the org to migrate --map_app MAP_APP Multiple allowed: A colon-separated string such as 'apples:oranges' which indicates to put data from the app named 'apples' from the source endpoint into app named 'oranges' in the target endpoint --map_collection MAP_COLLECTION One or more colon-separated string such as 'cats:dogs' which indicates to put data from collections named 'cats' from the source endpoint into a collection named 'dogs' in the target endpoint, applicable globally to all apps --map_org MAP_ORG One or more colon-separated strings such as 'red:blue' which indicates to put data from org named 'red' from the source endpoint into a collection named 'blue' in the target endpoint
Use the following command to migrate DATA AND GRAPH (no graph edges or connections between entities). If there are no graph edges (connections) then using -m graph
is not necessary. This will copy all data from all apps in the org ‘myorg’, creating apps in the target org if they do not already exist. Note that --create_apps will be required if the Apps in the target org have not been created.
$ usergrid_data_migrator -o myorg -m graph -w 4 -s mySourceConfig.json -d myTargetConfiguration.json --create_apps
Use the following command to migrate DATA ONLY (no graph edges or connections between entities). This will copy all data from all apps in the org ‘myorg’, creating apps in the target org if they do not already exist. Note that --create_apps will be required if the Apps in the target org have not been created.
$ usergrid_data_migrator -o myorg -m data -w 4 -s mySourceConfig.json -d myTargetConfiguration.json --create_apps
Use the following command to migrate CREDENTIALS for Application-level Users. Note that usergrid.sysadmin.login.allowed=true
must be set in the usergrid-deployment.properties
file on the source and target Tomcat nodes.
$ usergrid_data_migrator -o myorg -m credentails -w 4 -s mySourceConfig.json -d myTargetConfiguration.json --create_apps --su_username foo --su_password bar
This command:
$ usergrid_data_migrator -o myorg -a app1 -a app2 -m data -w 4 --map_app app1:appplication_1 --map_app app2:application_2 --map_collection pets:animals --map_org myorg:my_new_org -s mySourceConfig.json -d myTargetConfiguration.json
will do the following: