Beam Playground uses Google Cloud Platform Datastore to store examples and snippets. Redis caches catalog reads from Datastore so that the examples do not have to be enumerated on each catalog request.
Playground can use a custom namespace to store entities in Datastore, which allows several backend instances to be deployed simultaneously in the same GCP project. By default, the Playground namespace is used. A custom namespace can be selected by setting the DATASTORE_NAMESPACE environment variable.
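
The namespace selection described above can be sketched as follows. This is a minimal illustration, not the backend's actual code: the helper name `namespaceFrom` is hypothetical, and treating an empty variable the same as an unset one is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// defaultNamespace is the namespace the text above names as the default.
const defaultNamespace = "Playground"

// namespaceFrom returns the Datastore namespace to use. The getenv function
// is injected so the lookup can be exercised without touching the real
// environment. Assumption: an empty value is treated like an unset variable.
func namespaceFrom(getenv func(string) string) string {
	if ns := getenv("DATASTORE_NAMESPACE"); ns != "" {
		return ns
	}
	return defaultNamespace
}

func main() {
	// Uses the real process environment.
	fmt.Println(namespaceFrom(os.Getenv))
}
```

Running two backend instances with different DATASTORE_NAMESPACE values would then keep their entities separate within one GCP project.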
If a breaking change is made to the DB schema, it is advised to implement a migration procedure to handle the data change in Datastore. There are no formal rules for when a migration should be implemented, but in general the following should be considered:
To implement a migration, create a new Go file with a name like migration_xxx.go under the internal/db/schema/migration path. The file should contain a structure which implements the following interface:

    type Version interface {
        // GetVersion returns the version string of the schema
        GetVersion() string
        // GetDescription returns the description of the schema version
        GetDescription() string
        // InitiateData initializes the data for the schema or performs a migration
        InitiateData(args *DBArgs) error
    }

After implementing the migration logic inside the InitiateData() function, cover it with tests and add the new migration to the list of existing migrations in the setupDBStructure() function in the cmd/server/server.go file.
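
The migration steps above can be sketched as follows. This is an illustration under stated assumptions: `DBArgs` is an empty placeholder because its real fields are not shown here, `exampleMigration` and its version string are hypothetical, and `applyAll` only approximates what a function like setupDBStructure() might do with the list of migrations.

```go
package main

import "fmt"

// DBArgs is a placeholder for the real argument type, whose fields are not
// shown in this document.
type DBArgs struct{}

// Version mirrors the interface quoted above.
type Version interface {
	GetVersion() string
	GetDescription() string
	InitiateData(args *DBArgs) error
}

// exampleMigration is a hypothetical migration; name and version are illustrative.
type exampleMigration struct{}

func (exampleMigration) GetVersion() string     { return "example_version" }
func (exampleMigration) GetDescription() string { return "Illustrative schema change" }

func (exampleMigration) InitiateData(args *DBArgs) error {
	if args == nil {
		return fmt.Errorf("nil DBArgs")
	}
	// The real data transformation would go here.
	return nil
}

// applyAll runs the migrations in order and stops at the first error.
// It returns the versions that were applied successfully.
func applyAll(versions []Version, args *DBArgs) ([]string, error) {
	applied := make([]string, 0, len(versions))
	for _, v := range versions {
		if err := v.InitiateData(args); err != nil {
			return applied, fmt.Errorf("migration %s: %w", v.GetVersion(), err)
		}
		applied = append(applied, v.GetVersion())
	}
	return applied, nil
}

func main() {
	applied, err := applyAll([]Version{exampleMigration{}}, &DBArgs{})
	fmt.Println(applied, err)
}
```

Keeping the migrations in an ordered list makes it straightforward to append a new one and to test each in isolation.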
There are several entity kinds in Datastore:

| Entity kind | Description | Corresponding Go struct |
|-------------|-------------|-------------------------|
| pg_schema_versions | Schema version entity | entity.SchemaEntity |
| pg_sdks | SDK entity | entity.SDKEntity |
| pg_examples | Example entity | entity.ExampleEntity |
| pg_snippets | Snippet entity | entity.SnippetEntity |
| pg_datasets | Dataset entity | entity.DatasetEntity |
| pg_files | File entity | entity.FileEntity |
| pg_pc_objects | Precompiled object entity | entity.PrecompiledObjectEntity |

The pg_schema_versions entity kind is used to store the schema version and a description of changes.

| Field name | Description | Type |
|------------|-------------|------|
| Name/ID | Schema version | Key |
| descr | Description of changes | string |


The pg_sdks entity kind is used to store information about each supported SDK.

Field name | Description | Type |
---|---|---|
Name/ID | SDK name | Key |
defaultExample | Name of the default example for this SDK | string |

The pg_examples entity kind is used to store example catalog items.

Field name | Description | Type |
---|---|---|
Name/ID | Example ID, in the form <SDK>_<Example name> | Key |
name | Example name | string |
sdk | SDK ID | Key |
descr | Example description | string |
tags | Example tags | []string |
cats | Example categories | []string |
path | URL of the example on GitHub | string |
type | Type of the example. Possible values are PRECOMPILED_OBJECT_TYPE_UNSPECIFIED, PRECOMPILED_OBJECT_TYPE_EXAMPLE, PRECOMPILED_OBJECT_TYPE_KATA, PRECOMPILED_OBJECT_TYPE_UNIT_TEST | string |
origin | PG_EXAMPLES for Playground examples, TB_EXAMPLES for Tour of Beam examples | string |
schVer | Schema version | Key |
urlVCS | URL of the example on GitHub | string |
urlNotebook | URL of a Colab notebook which has the example code | string |
alwaysRun | If true, the frontend ignores any precompiled objects associated with the example and always runs it | bool |

The pg_snippets entity kind is used to store snippets.

Field name | Description | Type |
---|---|---|
Name/ID | Snippet ID. For shared user code, the ID is computed from a hash of the snippet content; for snippets containing example code, the ID is the same as that of the related example | Key |
ownerId | No usage of this field was found in the code | string |
sdk | SDK ID | Key |
pipeOpts | Pipeline options | string |
created | Creation time | time.Time |
lVisited | Last visit time | time.Time |
origin | PG_SNIPPETS for Playground examples, TB_SNIPPETS for Tour of Beam examples, PG_USER for snippets with code shared by users, TB_USER for snippets created by Tour of Beam users | string |
visitCount | Number of times the snippet was visited | int |
schVer | Schema version | Key |
numberOfFiles | Number of files in the snippet. Used to derive file keys. | int |
complexity | Complexity of the snippet. Possible values are COMPLEXITY_UNSPECIFIED, COMPLEXITY_BASIC, COMPLEXITY_MEDIUM, COMPLEXITY_ADVANCED | string |
persistenceKey | Used to track snippets created with Tour of Beam. When a Tour of Beam user saves a new snippet, all other snippets with the same persistenceKey are removed. | string |
datasets | Contains an array of DatasetNestedEntity objects which describe the datasets and emulators associated with the snippet | []DatasetNestedEntity |

Each DatasetNestedEntity object has the following fields:

Field name | Description | Type |
---|---|---|
config | A JSON serialized map[string]string object which contains emulator configuration | string |
dataset | Dataset ID | Key |
emulator | Emulator name. Currently only kafka is supported | string |
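
Since the config field holds a JSON-serialized map[string]string, reading it back could look roughly like the sketch below. The `DatasetNested` struct is a simplified stand-in that uses plain strings instead of Datastore keys, and the `topic` configuration key is hypothetical; the actual keys accepted by the Kafka emulator are not listed here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DatasetNested is a simplified stand-in for DatasetNestedEntity.
// Plain strings replace Datastore keys to keep the sketch self-contained.
type DatasetNested struct {
	Config   string // JSON-serialized map[string]string
	Dataset  string // dataset ID
	Emulator string // emulator name, e.g. "kafka"
}

// emulatorConfig decodes the JSON stored in the config field.
func emulatorConfig(d DatasetNested) (map[string]string, error) {
	cfg := map[string]string{}
	if err := json.Unmarshal([]byte(d.Config), &cfg); err != nil {
		return nil, fmt.Errorf("decoding config for dataset %q: %w", d.Dataset, err)
	}
	return cfg, nil
}

func main() {
	d := DatasetNested{
		Config:   `{"topic":"example_topic"}`, // hypothetical configuration key
		Dataset:  "example_dataset",
		Emulator: "kafka",
	}
	cfg, err := emulatorConfig(d)
	fmt.Println(cfg, err)
}
```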

The pg_datasets entity kind has the following fields:

Field name | Description | Type |
---|---|---|
Name/ID | Name of the dataset | Key |
path | Path to the dataset file on the runner filesystem, under the path specified in the DATASETS_PATH environment variable (/opt/playground/backend/datasets by default) | string |

The pg_files entity kind has the following fields:

Field name | Description | Type |
---|---|---|
Name/ID | This field is constructed by concatenating the snippet ID, an underscore (_), and the ordinal number of the file in the snippet. For example, if snippet SDK_JAVA_Example has numberOfFiles set to 2, there will be two pg_files entities with the keys SDK_JAVA_Example_0 and SDK_JAVA_Example_1. | Key |
content | Content of the file | string |
cntxLine | Line number on which frontend will initially focus the text editor cursor when the file is being displayed | int32 |
isMain | Whether the file is the main file in the snippet. There can only be one main file in the snippet | bool |
name | Name of the file which will be shown to the user by the frontend | string |
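
The way file keys are derived from a snippet ID and numberOfFiles can be sketched as below. The function name `fileKeys` is hypothetical; only the naming scheme comes from the table above.

```go
package main

import "fmt"

// fileKeys returns the pg_files key names for a snippet: the snippet ID,
// an underscore, and the zero-based ordinal of each file.
func fileKeys(snippetID string, numberOfFiles int) []string {
	if numberOfFiles < 0 {
		numberOfFiles = 0
	}
	keys := make([]string, 0, numberOfFiles)
	for i := 0; i < numberOfFiles; i++ {
		keys = append(keys, fmt.Sprintf("%s_%d", snippetID, i))
	}
	return keys
}

func main() {
	fmt.Println(fileKeys("SDK_JAVA_Example", 2))
	// prints [SDK_JAVA_Example_0 SDK_JAVA_Example_1]
}
```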

The pg_pc_objects entities contain precompiled (cached) outputs of examples. There are three types of precompiled objects:

- OUTPUT, containing the example's run output
- LOG, containing the example's log output
- GRAPH, containing the example's execution graph output

All of these precompiled objects share the same schema.

Field name | Description | Type |
---|---|---|
Name/ID | Key is constructed by concatenating the example's ID with the precompiled object type, e.g. SDK_GO_WordCount_OUTPUT, SDK_GO_WordCount_LOG, SDK_GO_WordCount_GRAPH | Key |
content | Saved output of the example's run | string |
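
The key construction for precompiled objects can be sketched as follows. The helper `pcObjectKey` is hypothetical, and rejecting unknown types is a choice made for this sketch rather than documented backend behaviour.

```go
package main

import "fmt"

// pcObjectTypes lists the three precompiled object types named above.
var pcObjectTypes = []string{"OUTPUT", "LOG", "GRAPH"}

// pcObjectKey builds the pg_pc_objects key name from an example ID and a
// precompiled object type. Unknown types are rejected in this sketch.
func pcObjectKey(exampleID, objType string) (string, error) {
	for _, t := range pcObjectTypes {
		if t == objType {
			return exampleID + "_" + objType, nil
		}
	}
	return "", fmt.Errorf("unknown precompiled object type %q", objType)
}

func main() {
	for _, t := range pcObjectTypes {
		key, _ := pcObjectKey("SDK_GO_WordCount", t)
		fmt.Println(key)
	}
}
```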

Indexes are defined in the index.yaml file. The file is used during deployment to create indexes in Datastore.
Playground uses Redis as a cache for the examples catalog to avoid re-enumerating all examples on each request, as temporary storage for example outputs (logs, graphs, etc.), and as a message bus to relay events such as a user request for pipeline cancellation.
Each pipeline run uses the pipeline ID as a Redis key, with the following subkeys:

| Key | Subkey | Description |
|-----|--------|-------------|
| Pipeline Id | STATUS | Pipeline status. Possible values can be found in the Status enum in api.proto. |
| Pipeline Id | RUN_OUTPUT | Pipeline run output. |
| Pipeline Id | RUN_ERROR | Pipeline run error message. |
| Pipeline Id | VALIDATION_OUTPUT | Pipeline validation step output. |
| Pipeline Id | PREPARATION_OUTPUT | Pipeline preparation step output. |
| Pipeline Id | COMPILE_OUTPUT | Pipeline compilation step output. |
| Pipeline Id | CANCELED | Used to signal that the user has requested pipeline cancellation. The runner periodically polls the cache to check whether this key has been set to true and cancels the pipeline if it has. |
| Pipeline Id | RUN_OUTPUT_INDEX | Index of the start of the run step's output. On each request for the pipeline execution logs, this value is set to the end of the returned log and is used in subsequent requests to skip the log fragment that was already sent. |
| Pipeline Id | LOGS | Pipeline execution logs. |

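
The role of the CANCELED and RUN_OUTPUT_INDEX subkeys can be illustrated with the sketch below. It uses an in-memory map as a stand-in for Redis, so no Redis client calls are shown; the type and method names are hypothetical, and applying the index to RUN_OUTPUT is an assumption about how the offset is used.

```go
package main

import "fmt"

// pipelineCache is an in-memory stand-in for Redis:
// pipeline ID -> subkey -> value.
type pipelineCache map[string]map[string]interface{}

func (c pipelineCache) set(id, subkey string, v interface{}) {
	if c[id] == nil {
		c[id] = map[string]interface{}{}
	}
	c[id][subkey] = v
}

// isCanceled reports whether the CANCELED subkey has been set to true.
// A runner could call this periodically while the pipeline executes.
func (c pipelineCache) isCanceled(id string) bool {
	canceled, ok := c[id]["CANCELED"].(bool)
	return ok && canceled
}

// nextRunOutput returns the part of RUN_OUTPUT that has not been sent yet
// and moves RUN_OUTPUT_INDEX to the end of the returned fragment.
func (c pipelineCache) nextRunOutput(id string) string {
	out, _ := c[id]["RUN_OUTPUT"].(string)
	idx, _ := c[id]["RUN_OUTPUT_INDEX"].(int)
	if idx < 0 {
		idx = 0
	}
	if idx > len(out) {
		idx = len(out)
	}
	c.set(id, "RUN_OUTPUT_INDEX", len(out))
	return out[idx:]
}

func main() {
	c := pipelineCache{}
	c.set("pipeline-1", "RUN_OUTPUT", "line 1\n")
	fmt.Print(c.nextRunOutput("pipeline-1")) // first fragment
	c.set("pipeline-1", "RUN_OUTPUT", "line 1\nline 2\n")
	fmt.Print(c.nextRunOutput("pipeline-1")) // only the new part
	c.set("pipeline-1", "CANCELED", true)
	fmt.Println(c.isCanceled("pipeline-1"))
}
```

Storing the offset next to the output lets repeated log requests return only the new fragment without the client having to track its own position.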
Additionally, there are keys used globally by the Playground:

| Key | Subkey | Description |
|-----|--------|-------------|
| EXAMPLES_CATALOG | None | Used to store a cached version of the examples catalog. |
| SDKS_CATALOG | None | Used to store a cached version of the list of supported SDKs together with the names of their default examples. |
| DEFAULT_PRECOMPILED_OBJECTS | SDK | Used to cache the metadata of a default example. |

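
The catalog caching described above follows a read-through pattern, which can be sketched as below. The map-based cache stands in for Redis, and `getCatalog` and the loader function are hypothetical names; the real backend may structure this differently.

```go
package main

import "fmt"

// catalogCache is an in-memory stand-in for the Redis cache.
type catalogCache map[string][]string

// getCatalog returns the cached catalog if present. Otherwise it calls load
// (standing in for the Datastore enumeration) and caches the result under
// the EXAMPLES_CATALOG key.
func getCatalog(cache catalogCache, load func() ([]string, error)) ([]string, error) {
	if cached, ok := cache["EXAMPLES_CATALOG"]; ok {
		return cached, nil
	}
	catalog, err := load()
	if err != nil {
		return nil, err
	}
	cache["EXAMPLES_CATALOG"] = catalog
	return catalog, nil
}

func main() {
	loads := 0
	load := func() ([]string, error) {
		loads++
		return []string{"SDK_GO_WordCount"}, nil
	}
	cache := catalogCache{}
	getCatalog(cache, load)
	getCatalog(cache, load)
	fmt.Println(loads) // prints 1
}
```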