This file is for agents working on the datafusion-python project (developing, testing, reviewing). If you need to use the DataFusion DataFrame API (write queries, build expressions, understand available functions), see the user-facing skill at SKILL.md.
This project uses AI agent skills stored in .ai/skills/. Each skill is a directory containing a SKILL.md file with instructions for performing a specific task.
Skills follow the Agent Skills open standard. Each skill directory contains:
SKILL.md: The skill definition with YAML frontmatter (name, description, argument-hint) and detailed instructions.
Every pull request must follow the template in .github/pull_request_template.md. The description must include these sections:
Closes #NNN.
api change label.
Always run pre-commit checks before committing. The hooks are defined in .pre-commit-config.yaml and run automatically on git commit if pre-commit is installed as a git hook. To run all hooks manually:
pre-commit run --all-files
Fix any failures before committing.
Every Python function must include a docstring with usage examples.
step=dfn.lit(3)) so readers can immediately see which parameter is being demonstrated.
list_sort aliasing array_sort) only need a one-line description and a See Also reference to the primary function. They do not need their own examples.
When adding or updating an aggregate or window function, ensure the corresponding site documentation is kept in sync:
docs/source/user-guide/common-operations/aggregations.rst: add new aggregate functions to the “Aggregate Functions” list and include usage examples if appropriate.
docs/source/user-guide/common-operations/windows.rst: add new window functions to the “Available Functions” list and include usage examples if appropriate.
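The docstring rules described earlier (usage examples on every function, keyword syntax for the parameter being demonstrated, and a short See Also entry for aliases) can be sketched in plain Python. This is only an illustration of the shape: gen_range and gen_series are hypothetical names, not part of datafusion-python, and the real project's examples would build DataFusion expressions instead of Python lists.

```python
import doctest


def gen_range(start: int, stop: int, step: int = 1) -> list[int]:
    """Return the integers from ``start`` up to, but not including, ``stop``.

    Args:
        start: First value in the result.
        stop: Exclusive upper bound.
        step: Distance between consecutive values.

    Examples:
        >>> gen_range(0, 5)
        [0, 1, 2, 3, 4]

        The optional argument is passed by keyword so the reader can see
        which parameter is being demonstrated:

        >>> gen_range(0, 10, step=3)
        [0, 3, 6, 9]
    """
    return list(range(start, stop, step))


def gen_series(start: int, stop: int, step: int = 1) -> list[int]:
    """Return the integers from ``start`` up to, but not including, ``stop``.

    See Also:
        This is an alias for :py:func:`gen_range`.
    """
    # Hypothetical alias: it delegates to the primary function and carries
    # no examples of its own, only a pointer to where they live.
    return gen_range(start, stop, step=step)


if __name__ == "__main__":
    # Execute the docstring examples; a result with failed=0 means they pass.
    print(doctest.testmod())
```

Writing the examples in doctest form means they can be executed as tests, so a docstring that drifts out of date shows up as a failure rather than as misleading documentation.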