Contributions are welcome and are greatly appreciated! Every little bit helps, and credit will always be given.
Here's a list of repositories that contain Superset-related packages:
apache-superset
Python package distributed on PyPI. This repository also includes Superset's main TypeScript/JavaScript bundles and React apps under the superset-frontend folder.
The best way to report a bug is to file an issue on GitHub. Please include:
When posting Python stack traces, please quote them using Markdown blocks.
The best way is to file an issue on GitHub:
For large features or major changes to the codebase, please create a Superset Improvement Proposal (SIP). See the template in SIP-0.
Look through the GitHub issues. Issues tagged with #bug
are open to whoever wants to implement them.
Look through the GitHub issues. Issues tagged with #feature are open to whoever wants to implement them.
Superset could always use better documentation, whether as part of the official Superset docs, in docstrings, docs/*.rst
or even on the web as blog posts or articles. See Documentation for more details.
If you are proficient in a non-English language, you can help translate text strings from Superset's UI. You can jump in to the existing language dictionaries at superset/translations/<language_code>/LC_MESSAGES/messages.po
, or even create a dictionary for a new language altogether. See Translating for more details.
There is a dedicated apache-superset
tag on StackOverflow. Please use it when asking questions.
A philosophy we would like to strongly encourage is
Before creating a PR, create an issue.
The purpose is to separate problem from possible solutions.
Bug fixes: If you're only fixing a small bug, it's fine to submit a pull request right away, but we highly recommend filing an issue detailing what you're fixing. This is helpful in case we don't accept that specific fix but want to keep track of the issue. Please keep in mind that the project maintainers reserve the right to accept or reject incoming PRs, so it is better to keep the issue and the code that fixes it separate. In some cases, project maintainers may ask you to create a separate issue from the PR before proceeding.
Refactor: For small refactors, it can be a standalone PR itself detailing what you are refactoring and why. If there are concerns, project maintainers may request you to create a #SIP
for the PR before proceeding.
Feature/Large changes: If you intend to change the public API, or make any non-trivial changes to the implementation, we require you to file a new issue as #SIP
(Superset Improvement Proposal). This lets us reach an agreement on your proposal before you put significant effort into it. You are welcome to submit a PR along with the SIP (sometimes necessary for demonstration), but we will not review/merge the code until the SIP is approved.
In general, small PRs are always easier to review than large PRs. The best practice is to break your work into smaller independent PRs and refer to the same issue. This will greatly reduce turnaround time.
If you wish to share work that is not yet ready to merge, create a Draft PR. This lets maintainers and the CI runner prioritize mature PRs.
Finally, never submit a PR that will put the master branch in a broken state. If the PR is one of several needed to complete a large feature and cannot work on its own, you can create a feature branch and merge all related PRs into it before creating a PR from the feature branch to master.
Fill in all sections of the PR template.
Title the PR with one of the following semantic prefixes (inspired by Karma):
feat (new feature)
fix (bug fix)
docs (changes to the documentation)
style (formatting, missing semicolons, etc.; no application logic change)
refactor (refactoring code)
test (adding missing tests, refactoring tests; no application logic change)
chore (updating tasks etc.; no application logic change)
perf (performance-related change)
build (build tooling, Docker configuration change)
ci (test runner, GitHub Actions workflow changes)
other (changes that don't correspond to the above; should be rare!)
Examples:
feat: export charts as ZIP files
perf(api): improve API info performance
fix(chart-api): cached-indicator always shows value is cached
Add the prefix [WIP] to the title if the PR is not ready for review (WIP = work-in-progress). We recommend creating the PR with [WIP] first and removing it once it has passed the CI tests and you have read through your code changes at least once.
If you believe your PR contributes a potentially breaking change, put a !
after the semantic prefix but before the colon in the PR title, like so: feat!: Added foo functionality to bar
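The title convention above can be checked mechanically. The following is a minimal sketch, not a tool that ships with Superset: `parse_pr_title` and `PRTitle` are hypothetical names, and placing `[WIP]` before the semantic prefix is an assumption, since the text only says to add it as a prefix to the title.

```python
import re
from typing import NamedTuple, Optional

# Semantic prefixes copied from the list above.
SEMANTIC_PREFIXES = (
    "feat", "fix", "docs", "style", "refactor", "test",
    "chore", "perf", "build", "ci", "other",
)

# Optional "[WIP]", prefix, optional "(scope)", optional "!", then ": summary".
TITLE_PATTERN = re.compile(
    r"^(?P<wip>\[WIP\]\s*)?"
    r"(?P<prefix>" + "|".join(SEMANTIC_PREFIXES) + r")"
    r"(?:\((?P<scope>[^)]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<summary>.+)$"
)


class PRTitle(NamedTuple):
    prefix: str
    scope: Optional[str]
    breaking: bool
    wip: bool
    summary: str


def parse_pr_title(title: str) -> Optional[PRTitle]:
    """Return the parsed parts of a PR title, or None if it does not match."""
    match = TITLE_PATTERN.match(title)
    if match is None:
        return None
    return PRTitle(
        prefix=match["prefix"],
        scope=match["scope"],
        breaking=match["breaking"] is not None,
        wip=match["wip"] is not None,
        summary=match["summary"],
    )
```

For instance, `parse_pr_title("feat!: Added foo functionality to bar")` reports the `feat` prefix with `breaking=True`, while a title without a semantic prefix returns `None`.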
Screenshots/GIFs: Changes to the user interface require before/after screenshots, or a GIF for interactions.
Dependencies: Be careful about adding new dependencies and avoid unnecessary ones.
For Python, include the dependency in setup.py denoting any specific restrictions, and in requirements.txt pinned to a specific version, which ensures that the application build is deterministic.
For TypeScript/JavaScript, include new libraries in package.json.
Tests: The pull request should include tests, either as doctests, unit tests, or both. Make sure to resolve all errors and test failures. See Testing for how to run tests.
Documentation: If the pull request adds functionality, the docs should be updated as part of the same PR.
CI: Reviewers will not review the code until all CI tests pass. Tests can sometimes be flaky; you can close and reopen the PR to re-run the CI tests. Please report the issue if it persists. After the CI fix has been deployed to master, please rebase your PR.
Code coverage: Please ensure that code coverage does not decrease.
Remove [WIP] when ready for review. Please note that the PR may be merged soon after it is approved, so make sure it is ready to merge and do not expect more time for post-approval edits.
If the PR is not ready for review and has been inactive for more than 30 days, we will close it due to inactivity. The author is welcome to re-open and update it.
/testenv up
Feature flags can be set for a test environment by specifying the flag name (prefixed with FEATURE_) and value after the command:
/testenv up FEATURE_<feature flag name>=true|false
/testenv up FEATURE_DASHBOARD_NATIVE_FILTERS=true
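To make the command format concrete, here is a minimal sketch of how such a comment could be parsed. It is not the code behind the actual test-environment workflow: `parse_testenv_comment` is a hypothetical helper, and accepting several space-separated flags in one comment is an assumption.

```python
from typing import Dict, Optional

COMMAND = "/testenv up"
FLAG_PREFIX = "FEATURE_"


def parse_testenv_comment(comment: str) -> Optional[Dict[str, bool]]:
    """Parse a `/testenv up` comment into a {flag name: value} mapping.

    Returns None if the comment is not a `/testenv up` command or if an
    argument does not follow the FEATURE_<name>=true|false format.
    """
    text = comment.strip()
    if text != COMMAND and not text.startswith(COMMAND + " "):
        return None
    flags: Dict[str, bool] = {}
    for argument in text[len(COMMAND):].split():
        name, separator, value = argument.partition("=")
        if not name.startswith(FLAG_PREFIX) or separator != "=":
            return None
        if value not in ("true", "false"):
            return None
        # Store the flag without its FEATURE_ prefix.
        flags[name[len(FLAG_PREFIX):]] = value == "true"
    return flags
```

With this sketch, the example comment above yields `{"DASHBOARD_NATIVE_FILTERS": True}` and a bare `/testenv up` yields an empty mapping.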
Use sentence-case capitalization for everything in the UI (except these **).
Sentence case is predominantly lowercase. Capitalize only the initial character of the first word, and other words that require capitalization, like:
Sentence case vs. title case:
Title case: “A Dog Takes a Walk in Paris”
Sentence case: “A dog takes a walk in Paris”
Why sentence case?
When writing about a UI element, use the same capitalization as used in the UI.
For example, if an input field is labeled “Name” then you refer to this as the “Name input field”. Similarly, if a button has the label “Save” in it, then it is correct to refer to the “Save button”.
Where a product page is titled “Settings”, you refer to this in writing as follows: “Edit your personal information on the Settings page”.
Often a product page will have the same title as the objects it contains. In this case, refer to the page as it appears in the UI, and the objects as common nouns:
To handle incoming issues and PRs, committers read them and flag them with labels. The labels categorize the work and help contributors spot where to take action, as contributors usually have different areas of expertise.
Triaging goals
First, add Category labels (a.k.a. hash labels). Every issue/PR must have one hash label (except spam entries). Labels that begin with # define the issue/PR type:
Label | for Issue | for PR |
---|---|---|
#bug | Bug report | Bug fix |
#code-quality | Describe problem with code, architecture or productivity | Refactor, tests, tooling |
#feature | New feature request | New feature implementation |
#refine | Propose improvement that does not provide new features and is also not a bug fix nor refactor, such as adjust padding, refine UI style. | Implementation of improvement that does not provide new features and is also not a bug fix nor refactor, such as adjust padding, refine UI style. |
#doc | Documentation | Documentation |
#question | Troubleshooting: Installation, Running locally, Ask how to do something. Can be changed to #bug later. | N/A |
#SIP | Superset Improvement Proposal | N/A |
#ASF | Tasks related to Apache Software Foundation policy | Tasks related to Apache Software Foundation policy |
Then add other types of labels as appropriate.
Dot labels begin with . and describe the details of the issue/PR, such as .ui, .js, .install, .backend, etc. Each issue/PR can have zero or more dot labels.
Need labels have the pattern need:xxx and describe the work required to progress, such as need:rebase, need:update, need:screenshot.
Risk labels have the pattern risk:xxx and describe the potential risk of adopting the work, such as risk:db-migration. The intention is to better understand the impact and create awareness for PRs that need more rigorous testing.
Status labels describe the status (abandoned, wontfix, cant-reproduce, etc.). Issues/PRs that are rejected or closed without completion should have one or more status labels.
Version labels have the pattern vx.x, such as v0.28. Version labels on issues describe the version the bug was reported on. Version labels on PRs describe the first release that will include the PR.
Committers may also update the title to reflect the issue/PR content if the author-provided title is not descriptive enough.
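The label families above follow simple naming patterns, so a label can be sorted into its family by inspection. The sketch below is only an illustration under stated assumptions: `classify_label` is a hypothetical helper, the family names it returns are made up for this example, and the status set lists only the status labels mentioned in this guide.

```python
import re

# Hash labels from the table above.
HASH_LABELS = {
    "#bug", "#code-quality", "#feature", "#refine",
    "#doc", "#question", "#SIP", "#ASF",
}
# Only the status labels named in this guide; the real list may be longer.
STATUS_LABELS = {"abandoned", "wontfix", "cant-reproduce", "inactive"}


def classify_label(label: str) -> str:
    """Return the family a triage label belongs to, or "other"."""
    if label in HASH_LABELS:
        return "category"
    if label.startswith("."):
        return "descriptive"
    if label.startswith("need:"):
        return "need"
    if label.startswith("risk:"):
        return "risk"
    if label in STATUS_LABELS:
        return "status"
    if re.fullmatch(r"v\d+\.\d+", label):
        return "version"
    return "other"


def count_category_labels(labels: list) -> int:
    """Count hash labels, e.g. to check that an issue/PR carries one."""
    return sum(1 for label in labels if classify_label(label) == "category")
```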
If the PR passes CI tests and does not have any need: labels, it is ready for review; add the label review and/or design-review.
If an issue/PR has been inactive for >=30 days, it will be closed. If it does not have any status label, add inactive
.
When creating a PR, if you're aiming to have it included in a specific release, please tag it with the version label. For example, to have a PR considered for inclusion in Superset 1.1 use the label v1.1
.
Please report security vulnerabilities to private@superset.apache.org.
In the event a community member discovers a security flaw in Superset, it is important to follow the Apache Security Guidelines and release a fix as quickly as possible before public disclosure. Reporting security vulnerabilities through the usual GitHub Issues channel is not ideal as it will publicize the flaw before a fix can be applied.
Reverting changes that are causing issues in the master branch is a normal and expected part of the development process. In an open source community, the ramifications of a change cannot always be fully understood. With that in mind, here are some considerations to keep in mind when considering a revert:
Should you decide that reverting is desirable, it is the responsibility of the Contributor performing the revert to:
First, fork the repository on GitHub, then clone it. You can clone the main repository directly, but you won't be able to send pull requests.
git clone git@github.com:your-username/superset.git
cd superset
The latest documentation and tutorial are available at https://superset.apache.org/.
The site is written using the Gatsby framework and docz for the documentation subsection. Find out more about it in docs/README.md
If you're adding new images to the documentation, you'll notice that the images referenced in the rst, e.g.
.. image:: _static/images/tutorial/tutorial_01_sources_database.png
aren't actually stored in that directory. Instead, you should add and commit images (and any other static assets) to the superset-frontend/images directory. When the docs are deployed to https://superset.apache.org/, images are copied from there to the _static/images directory, just like they're referenced in the docs.
For example, the image referenced above actually lives in superset-frontend/images/tutorial
. Since the image is moved during the documentation build process, the docs reference the image in _static/images/tutorial
instead.
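The relationship between the two locations can be written down as a small path mapping. This is a sketch of the rule described above, not part of the documentation build; `source_path_for` is a hypothetical helper name.

```python
from pathlib import PurePosixPath

# Where the docs reference images, and where they are actually committed.
DOCS_IMAGE_ROOT = PurePosixPath("_static/images")
SOURCE_IMAGE_ROOT = PurePosixPath("superset-frontend/images")


def source_path_for(docs_reference: str) -> PurePosixPath:
    """Map an image path as referenced in the docs to its committed location.

    Raises ValueError if the reference is not under _static/images.
    """
    relative = PurePosixPath(docs_reference).relative_to(DOCS_IMAGE_ROOT)
    return SOURCE_IMAGE_ROOT / relative


print(source_path_for("_static/images/tutorial/tutorial_01_sources_database.png"))
# superset-frontend/images/tutorial/tutorial_01_sources_database.png
```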
Make sure your machine meets the OS dependencies before following these steps.
You also need to install MySQL or MariaDB.
Ensure that you are using Python version 3.7 or 3.8, then proceed with:
```bash
# Create a virtual environment and activate it (recommended)
python3 -m venv venv # setup a python3 virtualenv
source venv/bin/activate

# Install external dependencies
pip install -r requirements/testing.txt

# Install Superset in editable (development) mode
pip install -e .

# Create an admin user in your metadata database (use `admin` as username to be able to load the examples)
superset fab create-admin

# Initialize the database
superset db upgrade

# Create default roles and permissions
superset init

# Load some data to play with.
# Note: you MUST have previously created an admin user with the username `admin` for this command to work.
superset load-examples

# Start the Flask dev web server from inside your virtualenv.
# Note that your page may not have CSS at this point.
# See instructions below how to build the front-end assets.
FLASK_ENV=development superset run -p 8088 --with-threads --reload --debugger
```

Or you can install via our Makefile

```bash
# Create a virtual environment and activate it (recommended)
$ python3 -m venv venv # setup a python3 virtualenv
$ source venv/bin/activate

# install pip packages + pre-commit
$ make install

# Install superset pip packages and setup env only
$ make superset

# Setup pre-commit only
$ make pre-commit
```
Note: the FLASK_APP env var should not need to be set, as it's currently controlled via .flaskenv. However, if needed, it should be set to superset.app:create_app()
If you have made changes to the FAB-managed templates, which are not built the same way as the newer, React-powered front-end assets, you need to start the app without the --with-threads argument, like so: FLASK_ENV=development superset run -p 8088 --reload --debugger
If you add a new requirement or update an existing requirement (per the install_requires
section in setup.py
) you must recompile (freeze) the Python dependencies to ensure that for CI, testing, etc. the build is deterministic. This can be achieved via,
$ python3 -m venv venv
$ source venv/bin/activate
$ python3 -m pip install -r requirements/integration.txt
$ pip-compile-multi --no-upgrade
This feature is only available on Python 3. When debugging your application, you can have the server logs sent directly to the browser console using the ConsoleLog package. You need to mutate the app, by adding the following to your config.py
or superset_config.py
:
from console_log import ConsoleLog

def FLASK_APP_MUTATOR(app):
    app.wsgi_app = ConsoleLog(app.wsgi_app, app.logger)
Then make sure you run your WSGI server using the right worker type:
FLASK_ENV=development gunicorn "superset.app:create_app()" -k "geventwebsocket.gunicorn.workers.GeventWebSocketWorker" -b 127.0.0.1:8088 --reload
You can log anything to the browser console, including objects:
from superset import app
app.logger.error('An exception occurred!')
app.logger.info(form_data)
Frontend assets (TypeScript, JavaScript, CSS, and images) must be compiled in order to properly display the web UI. The superset-frontend
directory contains all NPM-managed frontend assets. Note that for some legacy pages there are additional frontend assets bundled with Flask-Appbuilder (e.g. jQuery and bootstrap). These are not managed by NPM and may be phased out in the future.
First, be sure you are using recent versions of Node.js and npm. We recommend using nvm to manage your node environment:
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.37.0/install.sh | bash
cd superset-frontend
nvm install --lts
nvm use --lts
Or, if you use zsh (the default shell on macOS starting with Catalina), try:
sh -c "$(curl -fsSL https://raw.githubusercontent.com/nvm-sh/nvm/v0.37.0/install.sh)"
For those interested, you may also try out avn to automatically switch to the node version that is required to run Superset frontend.
We have upgraded our package-lock.json to use lockfileVersion: 2 from npm 7, so please make sure you have installed npm 7, too:
npm install -g npm@7
Install third-party dependencies listed in package.json
via:
# From the root of the repository
cd superset-frontend

# Install dependencies from `package-lock.json`
npm ci
There are three types of assets you can build:
npm run build: the production assets, CSS/JS minified and optimized
npm run dev-server: local development assets, with sourcemaps and hot refresh support
npm run build-instrumented: instrumented application code for collecting code coverage from Cypress tests
The dev server by default starts at http://localhost:9000 and proxies the backend requests to http://localhost:8088. It's possible to change these settings:
# Start the dev server at http://localhost:9000
npm run dev-server

# Run the dev server on a non-default port
npm run dev-server -- --devserverPort=9001

# Proxy backend requests to a Flask server running on a non-default port
npm run dev-server -- --supersetPort=8081

# Proxy to a remote backend but serve local assets
npm run dev-server -- --superset=https://superset-dev.example.com
The --superset= option is useful if you want to debug a production issue or have to set up Superset behind a firewall. It allows you to run the Flask server in another environment while still building assets locally for the best developer experience.
Alternatively, there are other NPM commands you may find useful:
npm run build-dev: build assets in development mode.
npm run dev: build dev assets in watch mode; assets are automatically rebuilt when a file changes.
See docs here
Use npm in the prescribed way, making sure that superset-frontend/package-lock.json
is updated according to npm
-prescribed best practices.
Superset supports a server-wide feature flag system, which eases the incremental development of features. To add a new feature flag, simply modify superset_config.py
with something like the following:
FEATURE_FLAGS = {
    'SCOPED_FILTER': True,
}
If you want to use the same flag in the client code, also add it to the FeatureFlag TypeScript enum in @superset-ui/core. For example,
export enum FeatureFlag {
  SCOPED_FILTER = "SCOPED_FILTER",
}
superset/config.py contains DEFAULT_FEATURE_FLAGS, which will be overwritten by those specified under FEATURE_FLAGS in superset_config.py. For example, DEFAULT_FEATURE_FLAGS = { 'FOO': True, 'BAR': False } in superset/config.py and FEATURE_FLAGS = { 'BAR': True, 'BAZ': True } in superset_config.py will result in combined feature flags of { 'FOO': True, 'BAR': True, 'BAZ': True }.
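The override behaviour in this example matches an ordinary dictionary merge in which later values win. The snippet below is a sketch of that semantics only; `combine_feature_flags` is a hypothetical helper and is not claimed to be how Superset combines the two settings internally.

```python
from typing import Dict

# Values taken from the example above.
DEFAULT_FEATURE_FLAGS: Dict[str, bool] = {"FOO": True, "BAR": False}
FEATURE_FLAGS: Dict[str, bool] = {"BAR": True, "BAZ": True}


def combine_feature_flags(
    defaults: Dict[str, bool], overrides: Dict[str, bool]
) -> Dict[str, bool]:
    """Return the defaults with every overridden or added flag applied."""
    # Keys from `overrides` replace keys of the same name in `defaults`.
    return {**defaults, **overrides}


combined = combine_feature_flags(DEFAULT_FEATURE_FLAGS, FEATURE_FLAGS)
print(combined)  # {'FOO': True, 'BAR': True, 'BAZ': True}
```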
The current status of the usability of each flag (stable vs testing, etc) can be found in RESOURCES/FEATURE_FLAGS.md
.
Superset uses Git pre-commit hooks courtesy of pre-commit. To install run the following:
pip3 install -r requirements/integration.txt
pre-commit install
A series of checks will now run when you make a git commit.
Alternatively it is possible to run pre-commit via tox:
tox -e pre-commit
Or by running pre-commit manually:
pre-commit run --all-files
Lint the project with:
# for python
tox -e pylint

Alternatively, you can use pre-commit (mentioned above) for python linting

The Python code is auto-formatted using [Black](https://github.com/python/black) which is configured as a pre-commit hook. There are also numerous [editor integrations](https://black.readthedocs.io/en/stable/editor_integration.html)

# for frontend
cd superset-frontend
npm ci
npm run lint
Parameters in the config.py
(which are accessible via the Flask app.config dictionary) are assumed to always be defined and thus should be accessed directly via,
blueprints = app.config["BLUEPRINTS"]
rather than,
blueprints = app.config.get("BLUEPRINTS")
or similar, as the latter will cause typing issues. The former is of type List[Callable] whereas the latter is of type Optional[List[Callable]].
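The typing difference can be seen with a plain typed dictionary standing in for the config. This is a minimal sketch under that assumption: `config` below is not the real Flask `app.config` object, and `count_blueprints` is a hypothetical helper added only to show the extra `None` handling.

```python
from typing import Any, Callable, Dict, List, Optional

Blueprints = List[Callable[..., Any]]

# Stand-in for the Flask app.config dictionary (an assumption of this sketch).
config: Dict[str, Blueprints] = {"BLUEPRINTS": []}

# Direct access: a type checker infers Blueprints, and a missing key
# fails loudly with KeyError instead of silently producing None.
blueprints: Blueprints = config["BLUEPRINTS"]

# .get access: the inferred type is Optional[Blueprints], so every caller
# has to deal with None before using the value.
maybe_blueprints: Optional[Blueprints] = config.get("BLUEPRINTS")


def count_blueprints(value: Optional[Blueprints]) -> int:
    """Show the None check that direct access makes unnecessary."""
    return 0 if value is None else len(value)
```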
To ensure clarity, consistency, and readability, all new functions should use type hints and include a docstring.
Note per PEP-484 no syntax for listing explicitly raised exceptions is proposed and thus the recommendation is to put this information in a docstring, i.e.,
import math
from typing import Union


def sqrt(x: Union[float, int]) -> Union[float, int]:
    """
    Return the square root of x.

    :param x: A number
    :returns: The square root of the given number
    :raises ValueError: If the number is negative
    """

    return math.sqrt(x)
TypeScript is fully supported and is the recommended language for writing all new frontend components. When modifying existing functions/components, migrating to TypeScript is appreciated, but not required. Examples of migrating functions/components to TypeScript can be found in #9162 and #9180.
All Python tests are carried out in tox, a standardized testing framework. All Python tests can be run with any of the tox environments, via,
tox -e <environment>
For example,
tox -e py38
Alternatively, you can run all tests in a single file via,
tox -e <environment> -- tests/test_file.py
or for a specific test via,
tox -e <environment> -- tests/test_file.py::TestClassName::test_method_name
Note that the test environment uses a temporary directory for defining the SQLite databases which will be cleared each time before the group of test commands are invoked.
There is also a utility script included in the Superset codebase to run python tests. The readme can be found here
To run all tests for example, run this script from the root directory:
scripts/tests/run.sh
We use Jest and Enzyme to test TypeScript/JavaScript. Tests can be run with:
cd superset-frontend
npm run test
To run a single test file:
npm run test -- path/to/file.js
We use Cypress for integration tests. Tests can be run with tox -e cypress. To open Cypress and explore tests, first set up and run the test server:
export SUPERSET_CONFIG=tests.integration_tests.superset_test_config
export SUPERSET_TESTENV=true
export ENABLE_REACT_CRUD_VIEWS=true
export CYPRESS_BASE_URL="http://localhost:8081"
superset db upgrade
superset load_test_users
superset load-examples --load-test-data
superset init
superset run --port 8081
Run Cypress tests:
cd superset-frontend
npm run build-instrumented

cd cypress-base
npm install

# run tests via headless Chrome browser (requires Chrome 64+)
npm run cypress-run-chrome

# run tests from a specific file
npm run cypress-run-chrome -- --spec cypress/integration/explore/link.test.js

# run specific file with video capture
npm run cypress-run-chrome -- --spec cypress/integration/dashboard/index.test.js --config video=true

# to open the cypress ui
npm run cypress-debug

# to point cypress to a url other than the default (http://localhost:8088) set the environment variable before running the script
# e.g., CYPRESS_BASE_URL="http://localhost:9000"
CYPRESS_BASE_URL=<your url> npm run cypress open
See superset-frontend/cypress_build.sh
.
As an alternative you can use docker-compose environment for testing:
Make sure you have added the line below to your /etc/hosts file: 127.0.0.1 db
If you have already launched the Docker environment, use the following command to ensure a fresh database instance: docker-compose down -v
Launch environment:
CYPRESS_CONFIG=true docker-compose up
It will serve backend and frontend on port 8088.
Run Cypress tests:
cd cypress-base
npm install
npm run cypress open
Follow these instructions to debug the Flask app running inside a docker container.
First add the following to the ./docker-compose.yaml file
superset:
    env_file: docker/.env
    image: *superset-image
    container_name: superset_app
    command: ["/app/docker/docker-bootstrap.sh", "app"]
    restart: unless-stopped
+   cap_add:
+     - SYS_PTRACE
    ports:
      - 8088:8088
+     - 5678:5678
    user: "root"
    depends_on: *superset-depends-on
    volumes: *superset-volumes
    environment:
      CYPRESS_CONFIG: "${CYPRESS_CONFIG}"
Start Superset as usual
docker-compose up
Install the required libraries and packages to the docker container
Enter the superset_app container
docker exec -it superset_app /bin/bash
root@39ce8cf9d6ab:/app#
Run the following commands inside the container
apt update
apt install -y gdb
apt install -y net-tools
pip install debugpy
Find the PID for the Flask process. Make sure to use the first PID. The Flask app will re-spawn a sub-process every time you change any of the Python code, so it's important to use the first PID.
ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 14:09 ?        00:00:00 bash /app/docker/docker-bootstrap.sh app
root         6     1  4 14:09 ?        00:00:04 /usr/local/bin/python /usr/bin/flask run -p 8088 --with-threads --reload --debugger --host=0.0.0.0
root        10     6  7 14:09 ?        00:00:07 /usr/local/bin/python /usr/bin/flask run -p 8088 --with-threads --reload --debugger --host=0.0.0.0
Inject debugpy into the running Flask process. In this case PID 6.
python3 -m debugpy --listen 0.0.0.0:5678 --pid 6
Verify that debugpy is listening on port 5678
netstat -tunap
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:5678            0.0.0.0:*               LISTEN      462/python
tcp        0      0 0.0.0.0:8088            0.0.0.0:*               LISTEN      6/python
You are now ready to attach a debugger to the process. Using VSCode you can configure a launch configuration file .vscode/launch.json like so.
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Attach to Superset App in Docker Container",
      "type": "python",
      "request": "attach",
      "connect": {
        "host": "127.0.0.1",
        "port": 5678
      },
      "pathMappings": [
        {
          "localRoot": "${workspaceFolder}",
          "remoteRoot": "/app"
        }
      ]
    }
  ]
}
VSCode will not stop on breakpoints right away. We've attached to PID 6, but it does not yet know of any sub-processes. To wake up the debugger, you need to modify a Python file. This will trigger Flask to reload the code and create a new sub-process. This new sub-process will be detected by VSCode and breakpoints will be activated.
To debug Flask running in a pod inside a Kubernetes cluster, you'll need to make sure the pod runs as root and is granted the SYS_PTRACE capability. These settings should not be used in production environments.
securityContext:
  capabilities:
    add: ["SYS_PTRACE"]
See [set capabilities for a container](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-capabilities-for-a-container) for more details.
Once the pod is running as root and has the SYS_PTRACE capability it will be able to debug the Flask app.
You can follow the same instructions as in the docker-compose setup. Enter the pod and install the required library and packages: gdb, netstat and debugpy.
Often in a Kubernetes environment, nodes are not addressable from outside the cluster. VSCode will thus be unable to remotely connect to port 5678 on a Kubernetes node. To make this possible, you need to create a tunnel that forwards port 5678 to your local machine.
kubectl port-forward pod/superset-<some random id> 5678:5678
You can now launch your VSCode debugger with the same config as above. VSCode will connect to 127.0.0.1:5678, which is forwarded by kubectl to your remote Kubernetes pod.
Superset includes a Storybook to preview the layout/styling of various Superset components, and variations thereof. To open and view the Storybook:
cd superset-frontend
npm run storybook
When contributing new React components to Superset, please try to add a Story alongside the component's jsx/tsx
file.
We use Babel to translate Superset. In Python files, we import the magic _
function using:
from flask_babel import lazy_gettext as _
then wrap our translatable strings with it, e.g. _('Translate me')
. During extraction, string literals passed to _
will be added to the generated .po
file for each language for later translation.
At runtime, the _
function will return the translation of the given string for the current language, or the given string itself if no translation is available.
In TypeScript/JavaScript, the technique is similar: we import t
(simple translation), tn
(translation containing a number).
import { t, tn } from "@superset-ui/translation";
Add the LANGUAGES
variable to your superset_config.py
. Having more than one option inside will add a language selection dropdown to the UI on the right side of the navigation bar.
LANGUAGES = {
    'en': {'flag': 'us', 'name': 'English'},
    'fr': {'flag': 'fr', 'name': 'French'},
    'zh': {'flag': 'cn', 'name': 'Chinese'},
}
pybabel extract -F superset/translations/babel.cfg -o superset/translations/messages.pot -k _ -k __ -k t -k tn -k tct .
This will update the template file superset/translations/messages.pot
with current application strings. Do not forget to update this file with the appropriate license information.
pybabel update -i superset/translations/messages.pot -d superset/translations --ignore-obsolete
This will update language files with the new extracted strings.
You can then translate the strings gathered in the files located under superset/translations, where there's one per language. You can use Poedit to translate the po file more conveniently. There are some tutorials in the wiki.
In the case of JS translation, we need to convert the PO file into a JSON file, and we need the global download of the npm package po2json.
npm install -g po2json
To convert all PO files to formatted JSON files you can use the po2json.sh
script.
./scripts/po2json.sh
If you get errors running po2json
, you might be running the Ubuntu package with the same name, rather than the Node.js package (they have a different format for the arguments). If there is a conflict, you may need to update your PATH
environment variable or fully qualify the executable path (e.g. /usr/local/bin/po2json
instead of po2json
). If you get a lot of [null,***]
in messages.json
, just delete all the null,
. For example, "year":["年"]
is correct while "year":[null,"年"]
is incorrect.
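Deleting the nulls by hand is tedious for large files, so the cleanup can be scripted. The following is a minimal sketch, assuming the unwanted `null` is always the first element of a list as in the example above; `strip_leading_nulls` is a hypothetical helper and not part of the po2json.sh script.

```python
import json
from typing import Any


def strip_leading_nulls(node: Any) -> Any:
    """Recursively drop a leading null from every list in parsed JSON."""
    if isinstance(node, dict):
        return {key: strip_leading_nulls(value) for key, value in node.items()}
    if isinstance(node, list):
        items = [strip_leading_nulls(item) for item in node]
        # [null, "年"] becomes ["年"]; lists without a leading null are kept.
        return items[1:] if items and items[0] is None else items
    return node


raw = '{"year": [null, "年"], "month": ["月"]}'
cleaned = strip_leading_nulls(json.loads(raw))
print(json.dumps(cleaned, ensure_ascii=False))
# {"year": ["年"], "month": ["月"]}
```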
For the translations to take effect we need to compile translation catalogs into binary MO files.
pybabel compile -d superset/translations
To create a dictionary for a new language, run the following, where LANGUAGE_CODE
is replaced with the language code for your target language, e.g. es
(see Flask AppBuilder i18n documentation for more details):
pip install -r superset/translations/requirements.txt
pybabel init -i superset/translations/messages.pot -d superset/translations -l LANGUAGE_CODE
Then, extract strings for the new language.
Create Models and Views for the datasource, add them under superset folder, like a new my_models.py with models for cluster, datasources, columns and metrics and my_views.py with clustermodelview and datasourcemodelview.
Create DB migration files for the new models
In config.py, specify this variable to register the datasource models and the module they come from:
For example:
ADDITIONAL_MODULE_DS_MAP = {'superset.my_models': ['MyDatasource', 'MyOtherDatasource']}
This means it'll register MyDatasource and MyOtherDatasource in superset.my_models module in the source registry.
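A mapping of module paths to class names like this one can be turned into actual classes with `importlib`. The sketch below only illustrates that general idea and is not Superset's source registry: `resolve_module_map` is a hypothetical helper, and the example map uses standard-library classes so that it runs without Superset installed.

```python
import importlib
from typing import Dict, List


def resolve_module_map(module_map: Dict[str, List[str]]) -> Dict[str, type]:
    """Import each module and look up the classes named for it."""
    registry: Dict[str, type] = {}
    for module_name, class_names in module_map.items():
        module = importlib.import_module(module_name)
        for class_name in class_names:
            # getattr raises AttributeError if the class does not exist.
            registry[class_name] = getattr(module, class_name)
    return registry


# Same shape as ADDITIONAL_MODULE_DS_MAP, with stand-in modules and classes.
EXAMPLE_MAP = {"collections": ["OrderedDict", "Counter"]}
registry = resolve_module_map(EXAMPLE_MAP)
print(sorted(registry))  # ['Counter', 'OrderedDict']
```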
To edit the frontend code for visualizations, you will have to check out a copy of apache-superset/superset-ui:
git clone https://github.com/apache-superset/superset-ui.git
cd superset-ui
yarn
yarn build
Then use npm link
to create symlinks of the plugins/superset-ui packages you want to edit in superset-frontend/node_modules
:
# Since npm 7, you have to install plugin dependencies separately, too
cd ../../superset-ui/plugins/[PLUGIN NAME] && npm install --legacy-peer-deps

cd superset/superset-frontend
npm link ../../superset-ui/plugins/[PLUGIN NAME]

# Or to link all core superset-ui and plugin packages:
# npm link ../../superset-ui/{packages,plugins}/*

# Start developing
npm run dev-server
When superset-ui
packages are linked with npm link
, the dev server will automatically load a package's source code from its /src
directory, instead of the built modules in lib/
or esm/
.
Note that every time you do npm install
, you will lose the symlink(s) and may have to run npm link
again.
The topic of authoring new plugins, whether you'd like to contribute them back or not, has been well documented in the So, You Want to Build a Superset Viz Plugin... blog post.
To contribute a plugin to Superset-UI, your plugin must meet the following criteria:
plugin-chart-whatever
and a package name of @superset-ui/plugin-chart-whatever
README.md file
Submissions will be considered for submission (or removal) on a case-by-case basis.
Alter the model you want to change. This example will add a Column
Annotations model.
Generate the migration file
superset db migrate -m 'add_metadata_column_to_annotation_model'
This will generate a file in migrations/versions/{SHA}_this_will_be_in_the_migration_filename.py.
Upgrade the DB
superset db upgrade
The output should look like this:
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
INFO [alembic.runtime.migration] Running upgrade 1a1d627ebd8e -> 40a0a483dd12, add_metadata_column_to_annotation_model.py
Add column to view
Since there is a new column, we need to add it to the AppBuilder Model view.
Test the migration's down method
superset db downgrade
The output should look like this:
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
INFO [alembic.runtime.migration] Running downgrade 40a0a483dd12 -> 1a1d627ebd8e, add_metadata_column_to_annotation_model.py
When two DB migrations collide, you'll get an error message like this one:
alembic.util.exc.CommandError: Multiple head revisions are present for given argument 'head'; please specify a specific target revision, '<branchname>@head' to narrow to a specific head, or 'heads' for all heads
To fix it:
Get the migration heads
superset db heads
This should list two or more migration hashes. E.g.
1412ec1e5a7b (head)
67da9ef1ef9c (head)
Pick one of them as the parent revision, open the script for the other revision and update Revises and down_revision to the new parent revision. E.g.:
--- a/67da9ef1ef9c_add_hide_left_bar_to_tabstate.py
+++ b/67da9ef1ef9c_add_hide_left_bar_to_tabstate.py
@@ -17,14 +17,14 @@
 """add hide_left_bar to tabstate

 Revision ID: 67da9ef1ef9c
-Revises: c501b7c653a3
+Revises: 1412ec1e5a7b
 Create Date: 2021-02-22 11:22:10.156942

 """

 # revision identifiers, used by Alembic.
 revision = "67da9ef1ef9c"
-down_revision = "c501b7c653a3"
+down_revision = "1412ec1e5a7b"

 import sqlalchemy as sa
 from alembic import op
Alternatively, you may also run superset db merge to create a migration script just for merging the heads.
superset db merge {HASH1} {HASH2}
Upgrade the DB to the new checkpoint
superset db upgrade
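To see why Alembic reports multiple heads, it can help to model the revision graph as a mapping from each revision to its down_revision. The sketch below is plain Python rather than Alembic's API; find_heads is a hypothetical helper, and the hashes are the ones from the example above.

```python
from typing import Dict, List, Optional


def find_heads(down_revisions: Dict[str, Optional[str]]) -> List[str]:
    """Return revisions that no other revision lists as its down_revision."""
    parents = {parent for parent in down_revisions.values() if parent is not None}
    return sorted(rev for rev in down_revisions if rev not in parents)


# Two migrations were both created on top of c501b7c653a3.
graph: Dict[str, Optional[str]] = {
    "c501b7c653a3": None,  # treated as the root for this illustration only
    "1412ec1e5a7b": "c501b7c653a3",
    "67da9ef1ef9c": "c501b7c653a3",
}
heads_before = find_heads(graph)

# Re-parent one head onto the other, as in the diff above.
graph["67da9ef1ef9c"] = "1412ec1e5a7b"
heads_after = find_heads(graph)

print(heads_before)
print(heads_after)
```

Before the re-parenting there are two heads; afterwards only 67da9ef1ef9c remains, so an upgrade to head is unambiguous again.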
It's possible to configure a local database to operate in async mode, to work on async-related features.
To do this, you'll need to:
Add an additional database entry. We recommend you copy the connection string from the database labeled main, and then enable SQL Lab and the features you want to use. Don't forget to check the Async box.
Configure a results backend. Here's a local FileSystemCache example, which is not recommended for production but is perfect for testing (it stores the cache in /tmp):
from cachelib.file import FileSystemCache
RESULTS_BACKEND = FileSystemCache('/tmp/sqllab')
Start up a celery worker
celery --app=superset.tasks.celery_app:app worker -Ofair
Note that:
celery worker process for the changes to be reflected.
sqlite database using the SQLAlchemy experimental broker. OK for testing, but not recommended in production.
It's possible to configure database queries for charts to operate in async mode. This is especially useful for dashboards with many charts that may otherwise be affected by browser connection limits. To enable async queries for dashboards and Explore, the following dependencies are required:
CACHE_CONFIG and DATA_CACHE_CONFIG config settings
The following configuration settings are available for async queries (see config.py for default values):
GLOBAL_ASYNC_QUERIES (feature flag) - enable or disable async query operation
GLOBAL_ASYNC_QUERIES_REDIS_CONFIG - Redis connection info
GLOBAL_ASYNC_QUERIES_REDIS_STREAM_PREFIX - the prefix used with Redis Streams
GLOBAL_ASYNC_QUERIES_REDIS_STREAM_LIMIT - the maximum number of events for each user-specific event stream (FIFO eviction)
GLOBAL_ASYNC_QUERIES_REDIS_STREAM_LIMIT_FIREHOSE - the maximum number of events for all users (FIFO eviction)
GLOBAL_ASYNC_QUERIES_JWT_COOKIE_NAME - the async query feature uses a JWT cookie for authentication; this setting is the cookie's name
GLOBAL_ASYNC_QUERIES_JWT_COOKIE_SECURE - JWT cookie secure option
GLOBAL_ASYNC_QUERIES_JWT_COOKIE_DOMAIN - JWT cookie domain option (see docs for set_cookie)
GLOBAL_ASYNC_QUERIES_JWT_SECRET - JWTs use a secret key to sign and validate the contents. This value should be at least 32 bytes and have sufficient randomness for proper security
GLOBAL_ASYNC_QUERIES_TRANSPORT - available options: “polling” (HTTP, default), “ws” (WebSocket, requires running superset-websocket server)
GLOBAL_ASYNC_QUERIES_POLLING_DELAY - the time (in ms) between polling requests
More information on the async query feature can be found in SIP-39.
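A minimal sketch of how these settings might look in a local superset_config.py follows. All values are placeholders chosen for illustration, not recommendations or confirmed defaults (check config.py for those). Enabling the flag through a FEATURE_FLAGS dict, the Redis dictionary keys, and the ASYNC_QUERIES_JWT_SECRET environment variable name are assumptions.

```python
import os

# Assumption: feature flags are toggled through a FEATURE_FLAGS dict.
FEATURE_FLAGS = {"GLOBAL_ASYNC_QUERIES": True}

# Placeholder connection info for a local Redis instance (keys are assumed).
GLOBAL_ASYNC_QUERIES_REDIS_CONFIG = {"host": "127.0.0.1", "port": 6379, "db": 0}

GLOBAL_ASYNC_QUERIES_REDIS_STREAM_PREFIX = "async-events-"  # placeholder
GLOBAL_ASYNC_QUERIES_REDIS_STREAM_LIMIT = 1000  # per-user stream, placeholder
GLOBAL_ASYNC_QUERIES_REDIS_STREAM_LIMIT_FIREHOSE = 1000000  # all users, placeholder

GLOBAL_ASYNC_QUERIES_JWT_COOKIE_NAME = "async-token"  # placeholder
GLOBAL_ASYNC_QUERIES_JWT_COOKIE_SECURE = False  # plain HTTP on localhost only
GLOBAL_ASYNC_QUERIES_JWT_COOKIE_DOMAIN = None

# Read the secret from the environment so every process shares the same value.
# The variable name is hypothetical; the fallback is for local testing only.
GLOBAL_ASYNC_QUERIES_JWT_SECRET = os.environ.get(
    "ASYNC_QUERIES_JWT_SECRET",
    "local-dev-only-secret-change-me-0123456789",
)
if len(GLOBAL_ASYNC_QUERIES_JWT_SECRET.encode("utf-8")) < 32:
    raise ValueError("GLOBAL_ASYNC_QUERIES_JWT_SECRET should be at least 32 bytes")

GLOBAL_ASYNC_QUERIES_TRANSPORT = "polling"  # or "ws" with superset-websocket
GLOBAL_ASYNC_QUERIES_POLLING_DELAY = 500  # milliseconds, placeholder
```

The secret is read from the environment rather than generated at import time because a value generated per process would differ between the web server and the workers.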
Chart parameters are stored as a JSON-encoded string in the slices.params column and are often referenced throughout the code as form-data. Currently the form-data is neither versioned nor typed and thus is somewhat free-form. In the future there may be merit in using something like JSON Schema to both annotate and validate the JSON object, in addition to using a Mypy TypedDict (introduced in Python 3.8) for typing the form-data in the backend. This section serves as a potential primer for that work.
The following tables provide a non-exhaustive list of the various fields which can be present in the JSON object, grouped by the Explorer pane sections. These values were obtained by extracting the distinct fields from a legacy deployment consisting of tens of thousands of charts, and thus some fields may be missing whilst others may be deprecated.
Note that not all fields are correctly categorized. The fields vary based on visualization type and may appear in different sections depending on the type. Verified deprecated columns may indicate a missing migration and/or prior migrations which were unsuccessful, and thus future work may be required to clean up the form-data.
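As a primer for the typing idea mentioned above, here is a minimal sketch of a TypedDict covering a handful of the fields listed in the tables below. The FormData name, the chosen subset of fields, and the load_form_data helper are illustrative assumptions, not an existing Superset type.

```python
import json
from typing import Any, Dict, List, TypedDict, Union


class FormData(TypedDict, total=False):
    """Partial, hypothetical typing of chart form-data; every key is optional."""

    datasource: str  # "<datasource_id>__<datasource_type>"
    viz_type: str
    time_range: str
    groupby: List[str]
    metrics: List[Union[str, Dict[str, Any]]]  # metric names or ad-hoc metric objects
    row_limit: int
    order_asc: bool


def load_form_data(params: str) -> FormData:
    """Decode a JSON-encoded params string such as the one stored in slices.params."""
    data = json.loads(params)
    if not isinstance(data, dict):
        raise ValueError("form-data must be a JSON object")
    # TypedDict is a static annotation only; nothing is validated at runtime here.
    return data  # type: ignore[return-value]


form_data = load_form_data(
    '{"datasource": "3__table", "viz_type": "table", "row_limit": 100}'
)
print(form_data["viz_type"], form_data["row_limit"])
```

Because a TypedDict is only checked statically (for example by Mypy), runtime validation would still need something like JSON Schema.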
Field | Type | Notes |
---|---|---|
database_name | string | Deprecated? |
datasource | string | <datasource_id>__<datasource_type> |
datasource_id | string | Deprecated? See datasource |
datasource_name | string | Deprecated? |
datasource_type | string | Deprecated? See datasource |
viz_type | string | The Visualization Type widget |
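The composite datasource value can be split back into its two parts. This is a minimal sketch that splits on the first double underscore; split_datasource is a hypothetical helper name, and the id is left as a string to match the table above.

```python
from typing import Tuple


def split_datasource(datasource: str) -> Tuple[str, str]:
    """Split "<datasource_id>__<datasource_type>" into (id, type)."""
    datasource_id, separator, datasource_type = datasource.partition("__")
    if not separator or not datasource_id or not datasource_type:
        raise ValueError(f"unexpected datasource value: {datasource!r}")
    return datasource_id, datasource_type


datasource_id, datasource_type = split_datasource("3__table")
print(datasource_id, datasource_type)
```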
Field | Type | Notes |
---|---|---|
druid_time_origin | string | The Druid Origin widget |
granularity | string | The Druid Time Granularity widget |
granularity_sqla | string | The SQLA Time Column widget |
time_grain_sqla | string | The SQLA Time Grain widget |
time_range | string | The Time range widget |
Field | Type | Notes |
---|---|---|
metrics | array(string) | See Query section |
order_asc | - | See Query section |
row_limit | - | See Query section |
timeseries_limit_metric | - | See Query section |
Field | Type | Notes |
---|---|---|
order_by_cols | array(string) | The Ordering widget |
row_limit | - | See Query section |
Field | Type | Notes |
---|---|---|
metric | - | The Left Axis Metric widget. See Query section |
y_axis_format | - | See Y Axis section |
Field | Type | Notes |
---|---|---|
metric_2 | - | The Right Axis Metric widget. See Query section |
Field | Type | Notes |
---|---|---|
adhoc_filters | array(object) | The Filters widget |
extra_filters | array(object) | Another pathway to the Filters widget. It is generally used to pass dashboard filter parameters to a chart. It can also be used to append additional filters, on an ad-hoc basis, to a chart that has been saved with its own filters if the chart is being used as a standalone widget. For implementation examples see: utils test.py. For insight into how Superset processes the contents of this parameter see: exploreUtils/index.js |
columns | array(string) | The Breakdowns widget |
groupby | array(string) | The Group by or Series widget |
limit | number | The Series Limit widget |
metric, metric_2, metrics, percent_metrics, secondary_metric, size, x, y | string, object, array(string), array(object) | The metric(s) depending on the visualization type |
order_asc | boolean | The Sort Descending widget |
row_limit | number | The Row limit widget |
timeseries_limit_metric | object | The Sort By widget |
The metric (or equivalent) and timeseries_limit_metric fields are all composed of either metric names or the JSON representation of the AdhocMetric TypeScript type. The adhoc_filters field is composed of the JSON representation of the AdhocFilter TypeScript type (which can comprise columns or metrics depending on whether it is a WHERE or HAVING clause). The all_columns, all_columns_x, columns, groupby, and order_by_cols fields all represent column names.
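Since a metric entry may be either a saved metric name or an ad-hoc metric object, code that consumes form-data may need to branch on the type. The sketch below shows one way to do that. The metric_label helper is hypothetical, and the keys on the ad-hoc object (including "label") are assumptions about the AdhocMetric shape.

```python
from typing import Any, Dict, List, Union

Metric = Union[str, Dict[str, Any]]


def metric_label(metric: Metric) -> str:
    """Return a display label for a saved metric name or an ad-hoc metric object."""
    if isinstance(metric, str):
        return metric  # saved metric, referenced by name
    label = metric.get("label")  # assumed key on the ad-hoc metric object
    if not isinstance(label, str) or not label:
        raise ValueError("ad-hoc metric has no usable label")
    return label


# The ad-hoc object's keys are illustrative, not a confirmed schema.
metrics: List[Metric] = [
    "count",
    {"expressionType": "SQL", "sqlExpression": "SUM(num)", "label": "Total"},
]
labels = [metric_label(m) for m in metrics]
print(labels)
```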
Field | Type | Notes |
---|---|---|
color_picker | object | The Fixed Color widget |
label_colors | object | The Color Scheme widget |
normalized | boolean | The Normalized widget |
Field | Type | Notes |
---|---|---|
y_axis_2_label | N/A | Deprecated? |
y_axis_format | string | The Y Axis Format widget |
y_axis_zero | N/A | Deprecated? |
Note that y_axis_format is defined under various sections for some charts.
Field | Type | Notes |
---|---|---|
color_scheme | string | |
Field | Type | Notes |
---|---|---|
add_to_dash | N/A | |
code | N/A | |
collapsed_fieldsets | N/A | |
comparison type | N/A | |
country_fieldtype | N/A | |
default_filters | N/A | |
entity | N/A | |
expanded_slices | N/A | |
filter_immune_slice_fields | N/A | |
filter_immune_slices | N/A | |
flt_col_0 | N/A | |
flt_col_1 | N/A | |
flt_eq_0 | N/A | |
flt_eq_1 | N/A | |
flt_op_0 | N/A | |
flt_op_1 | N/A | |
goto_dash | N/A | |
import_time | N/A | |
label | N/A | |
linear_color_scheme | N/A | |
new_dashboard_name | N/A | |
new_slice_name | N/A | |
num_period_compare | N/A | |
period_ratio_type | N/A | |
perm | N/A | |
rdo_save | N/A | |
refresh_frequency | N/A | |
remote_id | N/A | |
resample_fillmethod | N/A | |
resample_how | N/A | |
rose_area_proportion | N/A | |
save_to_dashboard_id | N/A | |
schema | N/A | |
series | N/A | |
show_bubbles | N/A | |
slice_name | N/A | |
timed_refresh_immune_slices | N/A | |
userid | N/A | |