| # LLM Context Guide for Apache Superset |
| |
| Apache Superset is a data visualization platform with Flask/Python backend and React/TypeScript frontend. |
| |
| ## ⚠️ CRITICAL: Ongoing Refactors (What NOT to Do) |
| |
| **These migrations are actively happening - avoid deprecated patterns:** |
| |
| ### Frontend Modernization |
| - **NO `any` types** - Use proper TypeScript types |
| - **NO JavaScript files** - Convert to TypeScript (.ts/.tsx) |
| - **Use @superset-ui/core** - Don't import Ant Design directly, prefer Ant Design component wrappers from @superset-ui/core/components |
| - **Use antd theming tokens** - Prefer antd tokens over legacy theming tokens |
| - **Avoid custom css and styles** - Follow antd best practices and avoid styling and custom CSS whenever possible |
| |
| ### Testing Strategy Migration |
| - **Prefer unit tests** over integration tests |
| - **Prefer integration tests** over end-to-end tests |
| - **Use Playwright for E2E tests** - Migrating from Cypress |
| - **Cypress is deprecated** - Will be removed once migration is completed |
| - **Use Jest + React Testing Library** for component testing |
| - **Use `test()` instead of `describe()`** - Follow [avoid nesting when testing](https://kentcdodds.com/blog/avoid-nesting-when-youre-testing) principles |
| |
| ### Backend Type Safety |
| - **Add type hints** - All new Python code needs proper typing |
| - **MyPy compliance** - Run `pre-commit run mypy` to validate |
| - **SQLAlchemy typing** - Use proper model annotations |
| |
| ### UUID Migration |
| - **Prefer UUIDs over auto-incrementing IDs** - New models should use UUID primary keys |
| - **External API exposure** - Use UUIDs in public APIs instead of internal integer IDs |
| - **Existing models** - Add UUID fields alongside integer IDs for gradual migration |
| |
| ## Key Directories |
| |
| ``` |
| superset/ |
| ├── superset/ # Python backend (Flask, SQLAlchemy) |
| │ ├── views/api/ # REST API endpoints |
| │ ├── models/ # Database models |
| │ └── connectors/ # Database connections |
| ├── superset-frontend/src/ # React TypeScript frontend |
| │ ├── components/ # Reusable components |
| │ ├── explore/ # Chart builder |
| │ ├── dashboard/ # Dashboard interface |
| │ └── SqlLab/ # SQL editor |
| ├── superset-frontend/packages/ |
| │ └── superset-ui-core/ # UI component library (USE THIS) |
| ├── tests/ # Python/integration tests |
| ├── docs/ # Documentation (UPDATE FOR CHANGES) |
| └── UPDATING.md # Breaking changes log |
| ``` |
| |
| ## Code Standards |
| |
| ### TypeScript Frontend |
| - **Avoid `any` types** - Use proper TypeScript, reuse existing types |
| - **Functional components** with hooks |
| - **@superset-ui/core** for UI components (not direct antd) |
| - **Jest** for testing (NO Enzyme) |
| - **Redux** for global state where it exists, hooks for local |
| |
| ### Python Backend |
| - **Type hints required** for all new code |
| - **MyPy compliant** - run `pre-commit run mypy` |
| - **SQLAlchemy models** with proper typing |
| - **pytest** for testing |
| |
| ### Apache License Headers |
| - **New files require ASF license headers** - When creating new code files, include the standard Apache Software Foundation license header |
| - **LLM instruction files are excluded** - Files like LLMS.md, CLAUDE.md, etc. are in `.rat-excludes` to avoid header token overhead |
| |
| ### Code Comments |
| - **Avoid time-specific language** - Don't use words like "now", "currently", "today" in code comments as they become outdated |
| - **Write timeless comments** - Comments should remain accurate regardless of when they're read |
| |
| ## Documentation Requirements |
| |
| - **docs/**: Update for any user-facing changes |
| - **UPDATING.md**: Add breaking changes here |
| - **Docstrings**: Required for new functions/classes |
| |
| ## Architecture Patterns |
| |
| ### Security & Features |
| - **RBAC**: Role-based access via Flask-AppBuilder |
| - **Feature flags**: Control feature rollouts |
| - **Row-level security**: SQL-based data access control |
| |
| ## Test Utilities |
| |
| ### Python Test Helpers |
| - **`SupersetTestCase`** - Base class in `tests/integration_tests/base_tests.py` |
| - **`@with_config`** - Config mocking decorator |
| - **`@with_feature_flags`** - Feature flag testing |
| - **`login_as()`, `login_as_admin()`** - Authentication helpers |
| - **`create_dashboard()`, `create_slice()`** - Data setup utilities |
| |
| ### TypeScript Test Helpers |
| - **`superset-frontend/spec/helpers/testing-library.tsx`** - Custom render() with providers |
| - **`createWrapper()`** - Redux/Router/Theme wrapper |
| - **`selectOption()`** - Select component helper |
| - **React Testing Library** - NO Enzyme (removed) |
| |
| ### Test Database Patterns |
| - **Mock patterns**: Use `MagicMock()` for config objects, avoid `AsyncMock` for synchronous code |
| - **API tests**: Update expected columns when adding new model fields |
| |
| ### Running Tests |
| ```bash |
| # Frontend |
| npm run test # All tests |
| npm run test -- filename.test.tsx # Single file |
| |
| # E2E Tests (Playwright - NEW) |
| npm run playwright:test # All Playwright tests |
| npm run playwright:ui # Interactive UI mode |
| npm run playwright:headed # See browser during tests |
| npx playwright test tests/auth/login.spec.ts # Single file |
| npm run playwright:debug tests/auth/login.spec.ts # Debug specific file |
| |
| # E2E Tests (Cypress - DEPRECATED) |
| cd superset-frontend/cypress-base |
| npm run cypress-run-chrome # All Cypress tests (headless) |
| npm run cypress-debug # Interactive Cypress UI |
| |
| # Backend |
| pytest # All tests |
| pytest tests/unit_tests/specific_test.py # Single file |
| pytest tests/unit_tests/ # Directory |
| |
| # If pytest fails with database/setup issues, ask the user to run test environment setup |
| ``` |
| |
| ## Environment Validation |
| |
| **Quick Setup Check (run this first):** |
| |
| ```bash |
| # Verify Superset is running |
| curl -f http://localhost:8088/health || echo "❌ Setup required - see https://superset.apache.org/docs/contributing/development#working-with-llms" |
| ``` |
| |
| **If health checks fail:** |
| "It appears you aren't set up properly. Please refer to the [Working with LLMs](https://superset.apache.org/docs/contributing/development#working-with-llms) section in the development docs for setup instructions." |
| |
| **Key Project Files:** |
| - `superset-frontend/package.json` - Frontend build scripts (`npm run dev` on port 9000, `npm run test`, `npm run lint`) |
| - `pyproject.toml` - Python tooling (ruff, mypy configs) |
| - `requirements/` folder - Python dependencies (base.txt, development.txt) |
| |
| ## SQLAlchemy Query Best Practices |
| - **Use negation operator**: `~Model.field` instead of `== False` to avoid ruff E712 errors |
| - **Example**: `~Model.is_active` instead of `Model.is_active == False` |
| |
| ## Pull Request Guidelines |
| |
| **When creating pull requests:** |
| |
| 1. **Read the current PR template**: Always check `.github/PULL_REQUEST_TEMPLATE.md` for the latest format |
| 2. **Use the template sections**: Include all sections from the template (SUMMARY, BEFORE/AFTER, TESTING INSTRUCTIONS, ADDITIONAL INFORMATION) |
| 3. **Follow PR title conventions**: Use [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) |
| - Format: `type(scope): description` |
| - Example: `fix(dashboard): load charts correctly` |
| - Types: `fix`, `feat`, `docs`, `style`, `refactor`, `perf`, `test`, `chore` |
| |
| **Important**: Always reference the actual template file at `.github/PULL_REQUEST_TEMPLATE.md` instead of using cached content, as the template may be updated over time. |
| |
| ## Pre-commit Validation |
| |
| **Use pre-commit hooks for quality validation:** |
| |
| ```bash |
| # Install hooks |
| pre-commit install |
| |
| # IMPORTANT: Stage your changes first! |
| git add . # Pre-commit only checks staged files |
| |
| # Quick validation (faster than --all-files) |
| pre-commit run # Staged files only |
| pre-commit run mypy # Python type checking |
| pre-commit run prettier # Code formatting |
| pre-commit run eslint # Frontend linting |
| ``` |
| |
| **Important pre-commit usage notes:** |
| - **Stage files first**: Run `git add .` before `pre-commit run` to check only changed files (much faster) |
| - **Virtual environment**: Activate your Python virtual environment before running pre-commit |
| ```bash |
| # Common virtual environment locations (yours may differ): |
| source .venv/bin/activate # if using .venv |
| source venv/bin/activate # if using venv |
| source ~/venvs/superset/bin/activate # if using a central location |
| ``` |
| If you get a "command not found" error, ask the user which virtual environment to activate |
| - **Auto-fixes**: Some hooks auto-fix issues (e.g., trailing whitespace). Re-run after fixes are applied |
| |
| ## Common File Patterns |
| |
| ### API Structure |
| - **`/api.py`** - REST endpoints with decorators and OpenAPI docstrings |
| - **`/schemas.py`** - Marshmallow validation schemas for OpenAPI spec |
| - **`/commands/`** - Business logic classes with @transaction() decorators |
| - **`/models/`** - SQLAlchemy database models |
| - **OpenAPI docs**: Auto-generated at `/swagger/v1` from docstrings and schemas |
| |
| ### Migration Files |
| - **Location**: `superset/migrations/versions/` |
| - **Naming**: `YYYY-MM-DD_HH-MM_hash_description.py` |
| - **Utilities**: Use helpers from `superset.migrations.shared.utils` for database compatibility |
| - **Pattern**: Import utilities instead of raw SQLAlchemy operations |
| |
| ## Platform-Specific Instructions |
| |
| - **[CLAUDE.md](CLAUDE.md)** - For Claude/Anthropic tools |
| - **[.github/copilot-instructions.md](.github/copilot-instructions.md)** - For GitHub Copilot |
| - **[GEMINI.md](GEMINI.md)** - For Google Gemini tools |
| - **[GPT.md](GPT.md)** - For OpenAI/ChatGPT tools |
| - **[.cursor/rules/dev-standard.mdc](.cursor/rules/dev-standard.mdc)** - For Cursor editor |
| |
| --- |
| |
| **LLM Note**: This codebase is actively modernizing toward full TypeScript and type safety. Always run `pre-commit run` to validate changes. Follow the ongoing refactors section to avoid deprecated patterns. |