)]}'
{
  "log": [
    {
      "commit": "13b2c47b0d5e348cea24b9264e87fd67666c56f1",
      "tree": "14579478ae62f6a4e95898402a17c5392dd7722b",
      "parents": [
        "c657dad97349c1113e843e3e15bb41f865e65a97"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Sun May 03 10:52:10 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Sun May 03 10:52:10 2026 -0400"
      },
      "message": "Update user documentation for AI agent skill usage (#1505)\n\n* docs: publish SKILL.md on the docs site via myst include\n\nAdds a new `skill` page that embeds the repo-root `SKILL.md` through the\nmyst `{include}` directive, so the agent-facing guide lives on the\npublished docs site without duplication. The page is wired into the\nUser Guide toctree. Implements PR 4a of the plan in #1394.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: publish llms.txt at docs site root\n\nAdds `docs/source/llms.txt` in llmstxt.org schema: a short description\nplus categorized links to the agent skill, user guide pages, DataFrame\nAPI reference, and example queries. `html_extra_path` in `conf.py`\ncopies it verbatim to the published site root so it resolves at\n`https://datafusion.apache.org/python/llms.txt`. Implements PR 4b of\nthe plan in #1394.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: add write-dataframe-code contributor skill\n\nAdds `.ai/skills/write-dataframe-code/SKILL.md`, a contributor-facing\nskill for agents working on this repo. It layers on top of the\nuser-facing repo-root SKILL.md with:\n\n- a TPC-H pattern index mapping idiomatic API usages to the query file\n  that demonstrates them,\n- an ad-hoc plan-comparison workflow for checking DataFrame translations\n  against a reference SQL query via `optimized_logical_plan()`, and\n- the project-specific docstring and aggregate/window documentation\n  conventions that CLAUDE.md already enforces for contributors.\n\nImplements PR 4c of the plan in #1394.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: add audit-skill-md skill\n\nAdds `.ai/skills/audit-skill-md/SKILL.md`, a contributor skill that\ncross-references the repo-root `SKILL.md` against the current public\nPython API (functions module, DataFrame, Expr, SessionContext, and\npackage-root re-exports). Reports two classes of drift:\n\n- new APIs exposed by the Python surface that are not yet covered in\n  the user-facing guide, and\n- stale mentions in the guide that no longer exist in the public API.\n\nThe skill is diff-only — it produces a report the user reviews before\nany edit to `SKILL.md`. Complements `check-upstream/`, which audits in\nthe opposite direction (upstream Rust features not yet exposed).\n\nImplements PR 4d of the plan in #1394.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: enrich RST pages with demos relocated from TPC-H rewrite\n\nMoves the illustrative patterns that #1504 removed from the TPC-H\nexamples into the common-operations docs, where they serve as\npattern-focused teaching material without cluttering the TPC-H\ntranslations:\n\n- expressions.rst gains a \"Testing membership in a list\" section\n  comparing `|`-compound filters, `in_list`, and `array_position` +\n  `make_array`, plus a \"Conditional expressions\" section contrasting\n  switched and searched `case`.\n- udf-and-udfa.rst gains a \"When not to use a UDF\" subsection\n  showing the compound-OR predicate that replaces a Python-side UDF\n  for disjunctive bucket filters (the Q19 case).\n- aggregations.rst gains a \"Building per-group arrays\" subsection\n  covering `array_agg(filter\u003d..., distinct\u003dTrue)` with\n  `array_length`/`array_element` for the single-value-per-group\n  pattern (the Q21 case).\n- Adds `examples/array-operations.py`, a runnable end-to-end\n  walkthrough of the membership and array_agg patterns.\n\nImplements PR 4e of the plan in #1394.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: wire new contributor skills and plan-comparison diagnostic into AGENTS.md\n\n- List the three contributor skills (`check-upstream`,\n  `write-dataframe-code`, `audit-skill-md`) under the Skills section so\n  agents know what tools they have before starting work.\n- Document the plan-comparison diagnostic workflow (comparing\n  `ctx.sql(...).optimized_logical_plan()` against a DataFrame\u0027s\n  `optimized_logical_plan()` via `LogicalPlan.__eq__`) for translating\n  SQL queries to DataFrame form. Points at the full write-up in the\n  `write-dataframe-code` skill rather than duplicating it.\n\n`CLAUDE.md` is a symlink to `AGENTS.md`, so the change lands in both.\n\nImplements PR 4f of the plan in #1394.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: rename aggregations.rst demo df to orders_df to avoid clobbering state\n\nThe \"Building per-group arrays\" block added in the previous commit\nreassigned `df` and `ctx` mid-page, which then broke the\nGrouping Sets examples further down that share the Pokemon `df`\nbinding (`col_type_1` etc. no longer resolved). Rename the demo\nDataFrame to `orders_df` and drop the redundant `ctx \u003d SessionContext()`\nso the shared state from the top of the page stays intact.\n\nVerified with `sphinx-build -W --keep-going` against the full docs\ntree.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: replace raw SKILL.md include with a human-written AI-assistants page\n\nThe previous approach embedded the repo-root `SKILL.md` on the docs\nsite via a myst `{include}`. That file is written for agents -- dense,\nskill-formatted, and not suited to a human browsing the User Guide. It\nalso relied on a fragile `:start-line:` offset to strip YAML\nfrontmatter.\n\nReplace it with `docs/source/ai-coding-assistants.md`, a short\nhuman-readable page that mirrors the README section added in #1503:\nwhat the skill is, how to install it via `npx skills` or a manual\npointer, and what kinds of things it covers. `SKILL.md` stays at the\nrepo root as the single source of truth; agents fetch the raw GitHub\nURL directly.\n\n`llms.txt` is updated to point its Agent Guide entry at\n`raw.githubusercontent.com/.../SKILL.md` and to include the new\nhuman-readable page as a secondary link. The User Guide toctree now\nreferences `ai-coding-assistants` in place of the removed `skill`\nstub.\n\nVerified with `sphinx-build -W --keep-going` against the full docs\ntree.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: drop redundant assistants list in ai-coding-assistants intro\n\nThe introduction and the \"Installing the skill\" section both enumerated\nthe same set of supported assistants. Drop the intro copy; the list\nthat matters is next to `npx skills add`, where it answers \"what does\nthis command actually configure?\"\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: convert ai-coding-assistants page from markdown to rst, shorten title\n\nEvery other page in `docs/source/user-guide` and the top-level\n`docs/source` is written in reStructuredText; the lone `.md` page was\nan inconsistency. Rewrite in rst so the ASF header matches the rest of\nthe tree, cross-references can use `:py:func:` roles if we ever add\nany, and myst is no longer required just to render this one page.\n\nAlso shorten the page title from \"Using DataFusion with AI Coding\nAssistants\" to \"Using AI Coding Assistants\" -- it already sits under\nthe DataFusion user guide so the product name is redundant.\n\nVerified with `sphinx-build -W --keep-going`.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: drop audit-skill-md skill\n\nThe skill as written pushed for every public method to be mentioned\nin `SKILL.md`, which is the wrong goal. `SKILL.md` is a distilled\nagent guide of idiomatic patterns and pitfalls, not an API reference\n-- autoapi-generated docs and module docstrings already provide full\nper-method coverage. An audit pressing for 100% method coverage would\nbloat the skill file into a stale copy of that reference.\n\nThe two checks with actual value (stale mentions in `SKILL.md`, and\ndrift between `functions.__all__` and the categorized function list)\nare small enough to be ad-hoc greps at release time and do not\nwarrant a dedicated skill.\n\nAlso remove references to the skill from `AGENTS.md` and the\n`write-dataframe-code` skill\u0027s \"Related\" section.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: drop write-dataframe-code skill\n\nA separate PR covers the same contributor-facing material (TPC-H\npattern index, plan-comparison workflow, docstring conventions),\nso this skill is redundant. Remove the skill directory and the\ncorresponding references in `AGENTS.md`, including the\nplan-comparison section that pointed at it.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: show Parquet pushdown plan diff in \"When not to use a UDF\"\n\nThe previous version of the section asserted that a UDF predicate\nblocks optimizer rewrites but did not show evidence. Replace the two\n`code-block` examples with an executable walkthrough that writes a\nsmall Parquet file, runs the same filter two ways, and prints the\nphysical plan for each.\n\nThe native-expression plan renders with three annotations on the\n`DataSourceExec` node that the UDF plan does not have:\n\n- `predicate\u003dbrand@1 \u003d A AND qty@2 \u003e\u003d 150` pushed into the scan\n- `pruning_predicate\u003d... brand_min@0 \u003c\u003d A AND ... qty_max@4 \u003e\u003d 150`\n  for row-group pruning via Parquet footer min/max stats\n- `required_guarantees\u003d[brand in (A)]` for bloom-filter / dictionary\n  skipping\n\nThe UDF form keeps only `predicate\u003dbrand_qty_filter(...)`: the scan\nhas to materialize every row group and call the Python callback.\n\nThe disjunctive-OR rewrite (previously the main example) stays at the\nend as the idiomatic alternative for multi-bucket filters.\n\nVerified with `sphinx-build -W --keep-going`.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: rework \"subsets within a group\" aggregation example\n\nRename the section from \"Building per-group arrays\" to \"Comparing subsets\nwithin a group\" so the heading matches the content. Rewrite the intro to\nlead with the problem (compare full group vs filtered subset), reframe\nthe worked example around partially failed orders, and replace the\ntrailing bullet list with a one-line walkthrough of the result.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: clarify \"When not to use a UDF\" intro\n\nRewrite the opening of the section to make three things clearer: the\ncontrast is with native DataFusion expressions (not Python in general),\nsome predicates genuinely feel easier to write as a Python loop and that\ntension is worth acknowledging, and predicate pushdown is a table-provider\nmechanism rather than a Parquet-only feature. Parquet stays as the\nconcrete demo.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: move ai-coding-assistants under user-guide/\n\nThe page was sitting at the top level of docs/source/ while every other\npage in the USER GUIDE toctree lives under docs/source/user-guide/.\nMove the file, update the toctree entry, and update the absolute URL\nin llms.txt to match the new path.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: replace AGENTS.md skill list with discovery instructions\n\nA static skill list in AGENTS.md goes stale as new skills are added\n(it already missed the make-pythonic skill that was merged separately).\nReplace the enumerated list with a pointer telling agents to list\n.ai/skills/ and read each SKILL.md frontmatter, so the catalog never\nhas to be hand-maintained.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: fix broken llms.txt link and stale otherwise xref\n\n- ai-coding-assistants.rst: use absolute https://datafusion.apache.org/python/llms.txt URL; the relative `llms.txt` resolved to /python/user-guide/llms.txt and 404\u0027d because html_extra_path publishes the file at the site root.\n- expressions.rst: drop the broken `:py:meth:~datafusion.expr.Expr.otherwise` xref (otherwise lives on CaseBuilder, not Expr) and spell the recommended replacement as `f.when(f.in_list(...), value).otherwise(default)`.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: update SKILL.md path after move to skills/datafusion_python/\n\nUpstream #1519 moved the root `SKILL.md` to `skills/datafusion_python/SKILL.md`\nso that consumers can install the skill without cloning the whole repo. Update\nall repo-internal links and external GitHub URLs in the docs site, README,\nAGENTS.md, and the package docstring to point at the new location.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "c657dad97349c1113e843e3e15bb41f865e65a97",
      "tree": "b2ee2e39947e7431734591fc11fdf8ad8a4049a9",
      "parents": [
        "e0284c6e788b6fc893495ed929b9badef1cf925c"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Wed Apr 29 07:35:21 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Apr 29 07:35:21 2026 -0400"
      },
      "message": "Move public skills to a directory to avoid downloading the whole repo (#1519)\n\n* Move to skill directory\n\n* Avoid moved skill with test"
    },
    {
      "commit": "e0284c6e788b6fc893495ed929b9badef1cf925c",
      "tree": "fa186f6dc053173f57633a247c3e59b6037fe262",
      "parents": [
        "03577163a057f791b19f30ce5130464a4a1c78a4"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Fri Apr 24 13:09:24 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Apr 24 13:09:24 2026 -0400"
      },
      "message": "feat: add AI skill to find and improve the Pythonic interface to functions (#1484)\n\n* feat: accept native Python types in function arguments instead of requiring lit()\n\nUpdate 47 functions in functions.py to accept native Python types (int, float,\nstr) for arguments that are contextually literals, eliminating verbose lit()\nwrapping. For example, users can now write split_part(col(\"a\"), \",\", 2) instead\nof split_part(col(\"a\"), lit(\",\"), lit(2)). All changes are backward compatible.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* fix: update alias function signatures to match pythonic primary functions\n\nUpdate instr and position (aliases of strpos) to accept Expr | str for\nthe substring parameter, matching the updated primary function signature.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: update make-pythonic skill to require alias type hint updates\n\nAlias functions that delegate to a primary function must have their type\nhints updated to match, even though coercion logic is only added to the\nprimary. Added a new Step 3 to the implementation workflow for this.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* fix: address review feedback on pythonic skill and function signatures\n\nUpdate SKILL.md to prevent three classes of issues: clarify that float\nalready accepts int per PEP 484 (avoiding redundant int | float that\nfails ruff PYI041), add backward-compat rule for Category B so existing\nExpr params aren\u0027t removed, and add guidance for inline coercion with\nmany optional nullable params instead of local helpers.\n\nReplace regexp_instr\u0027s _to_raw() helper with inline coercion matching\nthe pattern used throughout the rest of the file.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* refactor: add coerce_to_expr helpers and replace inline coercion patterns\n\nIntroduce coerce_to_expr() and coerce_to_expr_or_none() in expr.py as the\ncomplement to ensure_expr() — where ensure_expr rejects non-Expr values,\nthese helpers wrap them via Expr.literal(). Replaces ~60 inline isinstance\nchecks in functions.py with single-line helper calls, and updates the\nmake-pythonic skill to document the new pattern.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: add aggregate function literal detection to make-pythonic skill\n\nAdd Technique 1a to detect literal-only arguments in aggregate functions.\nUnlike scalar UDFs which enforce literals in invoke_with_args(), aggregate\nfunctions enforce them in accumulator() via get_scalar_value(),\nvalidate_percentile_expr(), or downcast_ref::\u003cLiteral\u003e(). Without this\ntechnique, the skill would incorrectly classify arguments like\napprox_percentile_cont\u0027s percentile as Category A (Expr | float) when they\nshould be Category B (float only). Updates the decision flow to branch on\nscalar vs aggregate before checking for literal enforcement.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: add window function literal detection to make-pythonic skill\n\nAdd Technique 1b to detect literal-only arguments in window functions.\nWindow functions enforce literals in partition_evaluator() via\nget_scalar_value_from_args() / downcast_ref::\u003cLiteral\u003e(), not in\ninvoke_with_args() (scalar) or accumulator() (aggregate). Updates the\ndecision flow to branch on scalar vs aggregate vs window.\n\nKnown window functions with literal-only arguments: ntile (n), lead/lag\n(offset, default_value), nth_value (n).\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* fix: use explicit None checks, widen numeric type hints, and add tests\n\nReplace 7 fragile truthiness checks (x.expr if x else None) with\nexplicit is not None checks to prevent silent None when zero-valued\nliterals are passed. Widen log/power/pow type hints to Expr | int | float\nwith noqa: PYI041 for clarity. Add unit tests for coerce_to_expr helpers\nand integration tests for pythonic calling conventions.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* chore: suppress FBT003 in tests and remove redundant noqa comments\n\nAdd FBT003 (boolean positional value) to the per-file-ignores for\npython/tests/* in pyproject.toml, and remove the 6 now-redundant\ninline noqa: FBT003 comments across test_expr.py and test_context.py.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* docs: replace static function lists with discovery instructions in skill\n\nReplace hardcoded \"Known aggregate/window functions with literal-only\narguments\" lists with instructions to discover them dynamically by\nsearching the upstream crate source. Keeps a few examples as validation\nanchors so the agent knows its search is working correctly.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* fix: make interrupt test reliable on Python 3.11\n\nPyThreadState_SetAsyncExc only delivers exceptions when the thread is\nexecuting Python bytecode, not while in native (Rust/C) code. The\nprevious test had two issues causing flakiness on Python 3.11:\n\n1. The interrupt fired before df.collect() entered the UDF, while the\n   thread was still in native code where async exceptions are ignored.\n2. time.sleep(2.0) is a single C call where async exceptions are not\n   checked — they\u0027re only checked between bytecode instructions.\n\nFix by adding a threading.Event so the interrupt waits until the UDF is\nactually executing Python code, and by sleeping in small increments so\nthe eval loop has opportunities to check for pending exceptions.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "03577163a057f791b19f30ce5130464a4a1c78a4",
      "tree": "2b02d946a60edddb9df94a0a9438b3e734acaf39",
      "parents": [
        "c8bb9f7d3876de97141d204740a6b99d5facd10f"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Fri Apr 24 11:47:06 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Apr 24 11:47:06 2026 -0400"
      },
      "message": "tpch examples: rewrite queries idiomatically and embed reference SQL (#1504)\n\n* tpch examples: add reference SQL to each query, fix Q20\n\n- Append the canonical TPC-H reference SQL (from benchmarks/tpch/queries/)\n  to each q01..q22 module docstring so readers can compare the DataFrame\n  translation against the SQL at a glance.\n- Fix Q20: `df \u003d df.filter(col(\"ps_availqty\") \u003e lit(0.5) * col(\"total_sold\"))`\n  was missing the assignment so the filter was dropped from the pipeline.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* tpch examples: rewrite non-idiomatic queries in idiomatic DataFrame form\n\nRewrite the seven TPC-H example queries that did not demonstrate the\nidiomatic DataFrame pattern. The remaining queries (Q02/Q11/Q15/Q17/Q22,\nwhich use window functions in place of correlated subqueries) already are\nidiomatic and are left unchanged.\n\n- Q04: replace `.aggregate([col(\"l_orderkey\")], [])` with\n  `.select(\"l_orderkey\").distinct()`, which is the natural way to express\n  \"reduce to one row per order\" on a DataFrame.\n- Q07: remove the CASE-as-filter on `n_name` and use\n  `F.in_list(col(\"n_name\"), [nation_1, nation_2])` instead. Drops a\n  comment block that admitted the filter form was simpler.\n- Q08: rewrite the switched CASE `F.case(...).when(lit(False), ...)` as a\n  searched `F.when(col(...).is_not_null(), ...).otherwise(...)`. That\n  mirrors the reference SQL\u0027s `case when ... then ... else 0 end` shape.\n- Q12: replace `array_position(make_array(...), col)` with\n  `F.in_list(col(\"l_shipmode\"), [...])`. Same semantics, without routing\n  through array construction / array search.\n- Q19: remove the pyarrow UDF that re-implemented a disjunctive predicate\n  in Python. Build the same predicate in DataFusion by OR-combining one\n  `in_list` + range-filter expression per brand. Keeps the per-brand\nconstants in the existing `items_of_interest` dict.\n- Q20: use `F.starts_with` instead of an explicit substring slice. Replace\n  the inner-join + `select(...).distinct()` tail with a semi join against\n  a precomputed set of excess-quantity suppliers so the supplier columns\n  are preserved without deduplication after the fact.\n- Q21: replace the `array_agg` / `array_length` / `array_element` pipeline\n  with two semi joins. One semi join keeps orders with more than one\n  distinct supplier (stand-in for the reference SQL\u0027s `exists` subquery),\n  the other keeps orders with exactly one late supplier (stand-in for the\n  `not exists` subquery).\n\nAll 22 answer-file comparisons and 22 plan-comparison diagnostics still\npass (`pytest examples/tpch/_tests.py`: 44 passed).\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* tpch examples: align reference SQL constants with DataFrame queries\n\nThe reference SQL embedded in each q01..q22 module docstring was carried\nover verbatim from ``benchmarks/tpch/queries/`` and uses a different set\nof TPC-H substitution parameters than the DataFrame examples\n(answer-file-validated at scale factor 1). Update each reference SQL to\nuse the substitution parameters the DataFrame uses, so both expressions\ndescribe the same query and would produce the same results against the\nsame data.\n\nConstants aligned:\n\n- Q01: ``90 days`` cutoff (DataFrame ``DAYS_BEFORE_FINAL \u003d 90``).\n- Q02: ``p_size \u003d 15``, ``p_type like \u0027%BRASS\u0027``, ``r_name \u003d \u0027EUROPE\u0027``.\n- Q04: base date ``1993-07-01`` (``3 month`` interval preserved per the\n  \"quarter of a year\" wording).\n- Q05: ``r_name \u003d \u0027ASIA\u0027``.\n- Q06: ``l_discount between 0.06 - 0.01 and 0.06 + 0.01``.\n- Q07: nations ``\u0027FRANCE\u0027`` / ``\u0027GERMANY\u0027``.\n- Q08: ``r_name \u003d \u0027AMERICA\u0027``, ``p_type \u003d \u0027ECONOMY ANODIZED STEEL\u0027``,\n  inner-case ``nation \u003d \u0027BRAZIL\u0027``.\n- Q09: ``p_name like \u0027%green%\u0027``.\n- Q10: base date ``1993-10-01`` (``3 month`` interval preserved).\n- Q11: ``n_name \u003d \u0027GERMANY\u0027``.\n- Q12: ship modes ``(\u0027MAIL\u0027, \u0027SHIP\u0027)``, base date ``1994-01-01``.\n- Q13: ``o_comment not like \u0027%special%requests%\u0027``.\n- Q14: base date ``1995-09-01``.\n- Q15: base date ``1996-01-01``.\n- Q16: ``p_brand \u003c\u003e \u0027Brand#45\u0027``, ``p_type not like \u0027MEDIUM POLISHED%\u0027``,\n  sizes ``(49, 14, 23, 45, 19, 3, 36, 9)``.\n- Q17: ``p_brand \u003d \u0027Brand#23\u0027``, ``p_container \u003d \u0027MED BOX\u0027``.\n- Q18: ``sum(l_quantity) \u003e 300``.\n- Q19: brands ``Brand#12`` / ``Brand#23`` / ``Brand#34`` with the matching\n  minimum quantities (1, 10, 20).\n- Q20: ``p_name like \u0027forest%\u0027``, base date ``1994-01-01``,\n  ``n_name \u003d \u0027CANADA\u0027``.\n- Q21: ``n_name \u003d \u0027SAUDI ARABIA\u0027``.\n- Q22: country codes ``(\u002713\u0027, \u002731\u0027, \u002723\u0027, \u002729\u0027, \u002730\u0027, \u002718\u0027, \u002717\u0027)``.\n\nInterval units (month / year) are preserved where the problem-statement\ntext reads \"given quarter\", \"given year\", \"given month\". Q01 keeps the\nliteral \"days\" unit because the TPC-H problem statement itself describes\nthe cutoff in days.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* tpch examples: apply SKILL.md idioms across all 22 queries\n\nSweep every q01..q22 example for idiomatic DataFrame style as described in\nthe repo-root SKILL.md:\n\n- ``col(\"x\") \u003d\u003d \"s\"`` in place of ``col(\"x\") \u003d\u003d lit(\"s\")`` on comparison\n  right-hand sides (auto-wrap applies).\n- Plain-name strings in ``select``/``aggregate``/``sort`` group/sort key\n  lists when the key is a bare column.\n- Drop redundant ``how\u003d\"inner\"`` and single-element ``left_on``/``right_on``\n  list wrapping on equi-joins.\n- Collapse chained ``.filter(a).filter(b)`` runs into ``.filter(a, b)``\n  and chained ``.with_column`` runs into ``.with_columns(a\u003d..., b\u003d...)``.\n- ``df.sort_by(...)`` or plain-name ``df.sort(...)`` when no null-placement\n  override is needed.\n- ``F.count_star()`` in place of ``F.count(col(\"x\"))`` whenever the SQL\n  reads ``count(*)``.\n- ``F.starts_with(col, lit(prefix))`` and ``~F.starts_with(...)`` in place\n  of substring-prefix equality/inequality tricks.\n- ``F.in_list(col, [lit(...)])`` in place of ``~F.array_position(...).\n  is_null()`` and in place of disjunctions of equality comparisons.\n- Searched ``F.when(cond, x).otherwise(y)`` in place of switched\n  ``F.case(bool_expr).when(lit(True/False), x).end()`` forms.\n- Semi-joins as the DataFrame form of ``EXISTS`` (Q04); anti-joins as\n  ``NOT EXISTS`` (Q22 was already using this idiom).\n- Whole-frame window aggregates as the DataFrame stand-in for a SQL\n  scalar subquery (Q11/Q15/Q17/Q22).\n\nIndividual query fixes of note:\n\n- Q16 — add the secondary sort keys (``p_brand``, ``p_type``, ``p_size``)\n  that the TPC-H spec requires but the original DataFrame omitted.\n- Q22 — drop a stray ``df.show()`` mid-pipeline; replace the 0-based\nsubstring slice with ``F.left(col(\"c_phone\"), lit(2))``.\n- Q14 — rewrite the promo/non-promo factor split as a searched CASE inside\n  ``F.sum(...)`` so the DataFrame expression matches the reference SQL\n  shape exactly.\n\nAll 22 answer-file comparisons still pass at scale factor 1.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* tpch examples: more idiomatic aggregate FILTER, string funcs, date handling\n\nAdditional sweep of the TPC-H DataFrame examples informed by comparing\nagainst a fresh set of SKILL.md-only generations under\n``examples/tpch/agentic_queries/``:\n\n- Q02: ``F.ends_with(col(\"p_type\"), lit(TYPE_OF_INTEREST))`` in place of\n  ``F.strpos(col, lit) \u003e 0``. The reference SQL is ``p_type like \u0027%BRASS\u0027``,\n  which is an ends_with check, not contains. ``F.strpos \u003e 0`` returned the\n  correct rows on TPC-H data by coincidence but is semantically wrong.\n- Q09: ``F.contains(col(\"p_name\"), lit(part_color))`` in place of\n  ``F.strpos(col, lit) \u003e 0``. The SQL is ``p_name like \u0027%green%\u0027``.\n- Q08, Q12, Q14: use the ``filter`` keyword on ``F.sum`` / ``F.count`` —\n  the DataFrame form of SQL ``sum(...) FILTER (WHERE ...)`` — instead of\n  wrapping the aggregate input in ``F.when(cond, x).otherwise(0)``. Q08\n  also reorganises to inner-join the supplier\u0027s nation onto the regional\n  sales, which removes the previous left-join + ``F.when(is_not_null, ...)``\n  dance.\n- Q15: compute the grand maximum revenue as a separate scalar aggregate\n  and ``join_on(...)`` on equality, instead of the whole-frame window\n  ``F.max`` + filter shape. Simpler plan, same result.\n- Q16: ``F.regexp_like(col, pattern)`` in place of\n  ``F.regexp_match(col, pattern).is_not_null()``.\n- Q04, Q05, Q06, Q07, Q08, Q10, Q12, Q14, Q15, Q20: store both the start\n  and the end of the date window as plain ``datetime.date`` objects and\n  compare with ``lit(end_date)``, instead of carrying the start date +\n  ``pa.month_day_nano_interval`` and adding them at query-build time.\n  Drops unused ``pyarrow`` imports from the files that no longer need\n  Arrow scalars.\n\nAll 22 answer-file comparisons still pass at scale factor 1.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "c8bb9f7d3876de97141d204740a6b99d5facd10f",
      "tree": "e8b2602165c4dc1e7208d56b3fc45f5d6332573e",
      "parents": [
        "8741d30cd812e4668f3f9187b56f12ce2de0d6e7"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Fri Apr 24 07:57:11 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Apr 24 07:57:11 2026 -0400"
      },
      "message": "docs: add README section for AI coding assistants (#1503)\n\nPoints users to the repo-root SKILL.md via the npx skills registry or a\nmanual AGENTS.md / CLAUDE.md pointer. Implements PR 1c of the plan in #1394.\n\nCo-authored-by: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "8741d30cd812e4668f3f9187b56f12ce2de0d6e7",
      "tree": "d7fa7ec580a8ce057c76934d8a7ef01f3e710ce2",
      "parents": [
        "8a5d783c7e418bfbbd95e48a2d9cacafea6162c7"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Thu Apr 23 22:01:01 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Apr 23 22:01:01 2026 -0400"
      },
      "message": "docs: enrich module docstrings and add doctest examples (#1498)\n\n* Enrich module docstrings and add doctest examples\n\nExpands the module docstrings for `functions.py`, `dataframe.py`,\n`expr.py`, and `context.py` so each module opens with a concept summary,\ncross-references to related APIs, and a small executable example.\n\nAdds doctest examples to the high-traffic `DataFrame` methods that\npreviously lacked them: `select`, `aggregate`, `sort`, `limit`, `join`,\nand `union`. Optional parameters are demonstrated with keyword syntax,\nand examples reuse the same input data across variants so the effect of\neach option is easy to see.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Use distinct group sums in aggregate docstring example\n\nChange the score data from [1, 2, 3] to [1, 2, 5] so the grouped\nresult produces [3, 5] instead of [3, 3], removing ambiguity about\nwhich total belongs to which team.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Align module-docstring examples with SKILL.md idioms\n\nDrop the redundant lit() in the dataframe.py module-docstring filter\nexample and use a plain string group key in the aggregate() doctest, so\nboth examples model the style SKILL.md recommends. Also document the\nsort(\"a\") string form and sort_by() shortcut in SKILL.md\u0027s sorting\nsection.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "8a5d783c7e418bfbbd95e48a2d9cacafea6162c7",
      "tree": "08f42ac0dedff8563163d297b8c6d13c95d97eba",
      "parents": [
        "40309978c920bd123a4c7b764a2ddfdb97758607"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Thu Apr 23 19:05:06 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Apr 23 19:05:06 2026 -0400"
      },
      "message": "Skills require the header to be the first thing in the file which conflicts with the RAT check. Make an exception for this file. (#1501)"
    },
    {
      "commit": "40309978c920bd123a4c7b764a2ddfdb97758607",
      "tree": "c70a137ffbb797b52787dac6f858e97af8de001f",
      "parents": [
        "60d8b5dbb5e409cd9ce7692972420e955b8a802e"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Thu Apr 23 18:28:55 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Apr 23 18:28:55 2026 -0400"
      },
      "message": "Add SKILL.md and enrich package docstring (#1497)\n\n* Add AGENTS.md and enrich __init__.py module docstring\n\nAdd python/datafusion/AGENTS.md as a comprehensive DataFrame API guide\nfor AI agents and users. It ships with pip automatically (Maturin includes\neverything under python-source \u003d \"python\"). Covers core abstractions,\nimport conventions, data loading, all DataFrame operations, expression\nbuilding, a SQL-to-DataFrame reference table, common pitfalls, idiomatic\npatterns, and a categorized function index.\n\nEnrich the __init__.py module docstring from 2 lines to a full overview\nwith core abstractions, a quick-start example, and a pointer to AGENTS.md.\n\nCloses #1394 (PR 1a)\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Clarify audience of root vs package AGENTS.md\n\nThe root AGENTS.md (symlinked as CLAUDE.md) is for contributors working\non the project. Add a pointer to python/datafusion/AGENTS.md which is\nthe user-facing DataFrame API guide shipped with the package. Also add\nthe Apache license header to the package AGENTS.md.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add PR template and pre-commit check guidance to AGENTS.md\n\nDocument that all PRs must follow .github/pull_request_template.md and\nthat pre-commit hooks must pass before committing. 
List all configured\nhooks (actionlint, ruff, ruff-format, cargo fmt, cargo clippy, codespell,\nuv-lock) and the command to run them manually.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Remove duplicated hook list from AGENTS.md\n\nLet the hooks be discoverable from .pre-commit-config.yaml rather than\nmaintaining a separate list that can drift.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Fix AGENTS.md: Arrow C Data Interface, aggregate filter, fluent example\n\n- Clarify that DataFusion works with any Arrow C Data Interface\n  implementation, not just PyArrow.\n- Show the filter keyword argument on aggregate functions (the idiomatic\n  HAVING equivalent) instead of the post-aggregate .filter() pattern.\n- Update the SQL reference table to show FILTER (WHERE ...) syntax.\n- Remove the now-incorrect \"Aggregate then filter for HAVING\" pitfall.\n- Add .collect() to the fluent chaining example so the result is clearly\n  materialized.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Update agents file after working through the first tpc-h query using only the text description\n\n* Add feedback from working through each of the TPC-H queries\n\n* Address Copilot review feedback on AGENTS.md\n\n- Wrap CASE/WHEN method-chain examples in parentheses and assign to a\n  variable so they are valid Python as shown (Copilot #1, #2).\n- Fix INTERSECT/EXCEPT mapping: the default distinct\u003dFalse corresponds to\n  INTERSECT ALL / EXCEPT ALL, not the distinct forms. Updated both the\n  Set Operations section and the SQL reference table to show both the\n  ALL and distinct variants (Copilot #4).\n- Change write_parquet / write_csv / write_json examples to file-style\n  paths (output.parquet, etc.) to match the convention used in existing\n  tests and examples. 
Note that a directory path is also valid for\n  partitioned output (Copilot #5).\n\nVerified INTERSECT/EXCEPT semantics with a script:\n  df1.intersect(df2)                -\u003e [1, 1, 2]  (\u003d INTERSECT ALL)\n  df1.intersect(df2, distinct\u003dTrue) -\u003e [1, 2]     (\u003d INTERSECT)\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Use short-form comparisons in AGENTS.md examples\n\nDrop lit() on the RHS of comparison operators since Expr auto-wraps raw\nPython values, matching the style the guide recommends (Copilot #3, #6).\n\nUpdates examples in the Aggregation, CASE/WHEN, SQL reference table,\nCommon Pitfalls, Fluent Chaining, and Variables-as-CTEs sections, plus\nthe __init__.py quick-start snippet. Prose explanations of the rule\n(which cite the long form as the thing to avoid) are left unchanged.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Move user guide from python/datafusion/AGENTS.md to SKILL.md\n\nThe in-wheel AGENTS.md was not a real distribution channel -- no shipping\nagent walks site-packages for AGENTS.md files. Moving to SKILL.md at the\nrepo root, with YAML frontmatter, lets the skill ecosystems (npx skills,\nClaude Code plugin marketplaces, community aggregators) discover it.\n\nUpdate the pointers in the contributor AGENTS.md and the __init__.py\nmodule docstring accordingly. 
The docstring now references the GitHub\nURL since the file no longer ships with the wheel.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Address review feedback: doctest, streaming, date/timestamp\n\n- Convert the __init__.py quick-start block to doctest format so it is\n  picked up by `pytest --doctest-modules` (already the project default),\n  preventing silent rot.\n- Extract streaming into its own SKILL.md subsection with guidance on\n  when to prefer execute_stream() over collect(), sync and async\n  iteration, and execute_stream_partitioned() for per-partition streams.\n- Generalize the date-arithmetic rule from Date32 to both Date32 and\n  Date64 (both reject Duration at any precision, both accept\n  month_day_nano_interval), and note that Timestamp columns differ and\n  do accept Duration.\n- Document the PyArrow-inherited type mapping returned by\n  to_pydict()/to_pylist(), including the nanosecond fallback to\n  pandas.Timestamp / pandas.Timedelta and the to_pandas() footgun where\n  date columns come back as an object dtype.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Distinguish user guide from agent reference in module docstring\n\nThe docstring pointed readers at SKILL.md as a \"comprehensive guide,\" but\nSKILL.md is written in a dense, skill-oriented format for agents — humans\nare better served by the online user guide. Put the online docs first as\nthe primary reference and label the SKILL.md link as the agent reference.\n\nCo-Authored-By: Claude Opus 4.7 (1M context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "60d8b5dbb5e409cd9ce7692972420e955b8a802e",
      "tree": "f2364ca93f408363a0929fe25e103226f63ac1d1",
      "parents": [
        "2715a32e939d17222c18e8adacf85ee45da464b9"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Apr 14 03:31:01 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Apr 14 03:31:01 2026 -0400"
      },
      "message": "Fix error on show() with an explain plan (#1492)"
    },
    {
      "commit": "2715a32e939d17222c18e8adacf85ee45da464b9",
      "tree": "91262ce078f88250bfbdd6424445f043859fc2ca",
      "parents": [
        "398980d1edbb8ad6d9744236f2dfe0c6ab4b4665"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Apr 14 03:27:00 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Apr 14 03:27:00 2026 -0400"
      },
      "message": "chore: update release documentation (#1494)\n\n* Update release documentation\n\n* Minor change to workflow because release start at 1"
    },
    {
      "commit": "398980d1edbb8ad6d9744236f2dfe0c6ab4b4665",
      "tree": "17cd377ee141d16fdfee93979a0db7286fc6921f",
      "parents": [
        "8a7efead43cff8dc7515e27e53da7545100e25a7"
      ],
      "author": {
        "name": "Zeel Desai",
        "email": "72783325+zeel2104@users.noreply.github.com",
        "time": "Mon Apr 13 09:24:56 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Apr 13 09:24:56 2026 -0400"
      },
      "message": "Support None comparisons for null expressions (#1489)\n\n* Support None comparisons for null expressions\n\n* Fold None comparison coverage into relational expr test"
    },
    {
      "commit": "8a7efead43cff8dc7515e27e53da7545100e25a7",
      "tree": "64ed2a5076a48e7f737e18b046d3cd43ac04aeb6",
      "parents": [
        "00b24572c98a257f06ff026a90c07634a86204d4"
      ],
      "author": {
        "name": "Shreyesh",
        "email": "shreyesh.arangath@gmail.com",
        "time": "Mon Apr 13 03:34:35 2026 -0700"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Apr 13 06:34:35 2026 -0400"
      },
      "message": "Add Python bindings for accessing ExecutionMetrics (#1381)\n\n* feat: add Python bindings for accessing ExecutionMetrics\n\n* test: imporve tests\n\n* first round of reviews\n\n* plan caching\n\n* address some concerns\n\n* merge and address comments\n\n* fix Ci issues\n\n* attempt to fix lint\n\n* fix build\n\n* fix docstring\n\n* address some more comments\n\n---------\n\nCo-authored-by: ShreyeshArangath \u003cshryeyesh.arangath@gmail.com\u003e"
    },
    {
      "commit": "00b24572c98a257f06ff026a90c07634a86204d4",
      "tree": "6e4c531777ff78afe3c9c507d2397517ec59a8b6",
      "parents": [
        "1be838bb47f04bcf4d1a0f65e3e6958aa9366f3f"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Mon Apr 13 06:33:29 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Apr 13 06:33:29 2026 -0400"
      },
      "message": "ci: disable symbol export on Windows verification (#1486)\n\n* Set rust flags on windows release verification\n\n* Forward flag to linker\n\n* Switch to msvc rust toolchain\n\n* Revert \"Switch to msvc rust toolchain\"\n\nThis reverts commit 9879fc7dbe066098445b9600087e665435b58f8a."
    },
    {
      "commit": "1be838bb47f04bcf4d1a0f65e3e6958aa9366f3f",
      "tree": "947f2e392faa9002020930092d7d1cee9dde83a2",
      "parents": [
        "3585c11eed778810e3317c56c2c25a8cdc29be5b"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Sun Apr 12 21:24:39 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Sun Apr 12 21:24:39 2026 -0400"
      },
      "message": "Release 53.0.0 (#1491)\n\n* Update version number and changelog\n\n* minor: set version number on dependency to publish to crates.io\n\n* taplo fmt"
    },
    {
      "commit": "3585c11eed778810e3317c56c2c25a8cdc29be5b",
      "tree": "3b4c9265fa211bdd0cdfbe8f2dbe2d345bdcf83a",
      "parents": [
        "ecd14c10aff67169f2bfe1b7f86ff07621088dd0"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Thu Apr 09 07:38:59 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Apr 09 07:38:59 2026 -0400"
      },
      "message": "minor: remove deprecated interfaces (#1481)\n\n* udf module has been deprecated since DF47. html_formatter module has been deprecated since DF48.\n\n* database has been deprecated since DF48\n\n* select_columns has been deprecated since DF43\n\n* unnest_column has been deprecated since DF42\n\n* display_name has been deprecated since DF42\n\n* window() has been deprecated since DF50\n\n* serde functions have been deprecated since DF42\n\n* from_arrow_table and tables have been deprecated since DF42\n\n* RuntimeConfig has been deprecated since DF44\n\n* Update user documentation to remove deprecated function\n\n* update tpch examples for latest function uses\n\n* Remove unnecessary options in example\n\n* update rendering for the most recent dataframe_formatter instead of the deprecated html_formatter"
    },
    {
      "commit": "ecd14c10aff67169f2bfe1b7f86ff07621088dd0",
      "tree": "03d3e29a3a4a7933cff0ffe591c9fa042b9c48e2",
      "parents": [
        "aa3b1948c3a49d14395093287a6e93354229c539"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Wed Apr 08 11:11:48 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Apr 08 11:11:48 2026 -0400"
      },
      "message": "Add missing SessionContext utility methods (#1475)\n\n* Add missing SessionContext utility methods\n\nExpose upstream DataFusion v53 utility methods: session_start_time,\nenable_ident_normalization, parse_sql_expr, execute_logical_plan,\nrefresh_catalogs, remove_optimizer_rule, and table_provider. The\nadd_optimizer_rule and add_analyzer_rule methods are omitted as the\nOptimizerRule and AnalyzerRule traits are not yet exposed to Python.\nCloses #1459.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Raise KeyError from table_provider for consistency with table()\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add docstring examples for new SessionContext utility methods\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* update docstring\n\n* Address PR review feedback for SessionContext utility methods\n\n- Improve docstring examples to show actual output instead of asserts\n- Use doctest +SKIP for non-deterministic session_start_time output\n- Fix table_provider error mapping: outer async error is now RuntimeError\n- Strengthen tests: validate RFC 3339 with fromisoformat, test both\n  optimizer rule removal paths, exact string match for parse_sql_expr,\n  verify enable_ident_normalization with dynamic state change\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Fix test_session_start_time failure on Python 3.10\n\ndatetime.fromisoformat() only supports up to 6 fractional-second\ndigits (microseconds) on Python 3.10. Truncate nanosecond precision\nbefore parsing.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "aa3b1948c3a49d14395093287a6e93354229c539",
      "tree": "8e2124d5f4029861b1fb54297f257294d742718b",
      "parents": [
        "46f9ab8fcad03913234ce29e5075644c1ecdb9b7"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Wed Apr 08 09:22:28 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Apr 08 09:22:28 2026 -0400"
      },
      "message": "Add missing registration methods (#1474)\n\n* Add missing SessionContext read/register methods for Arrow IPC and batches\n\nAdd read_arrow, read_empty, register_arrow, and register_batch methods to\nSessionContext, exposing upstream DataFusion v53 functionality. The write_*\nmethods and read_batch/read_batches are already covered by DataFrame.write_*\nand SessionContext.from_arrow respectively. Closes #1458.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Remove redundant read_empty Rust binding, make Python read_empty an alias for empty_table\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add pathlib.Path and empty batch tests for Arrow IPC and register_batch\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Make test_read_empty more robust with length and num_rows checks\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add examples to docstrings for new register/read methods\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Empty table actually returns record batch of length one but there are no columns\n\n* Add optional argument examples to register_arrow and read_arrow docstrings\n\nDemonstrate schema\u003d and file_extension\u003d keyword arguments in the\ndocstring examples for register_arrow and read_arrow, following project\nguidelines for optional parameter documentation.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Simplify read_empty docstring to use alias pattern\n\nFollow the same See Also alias convention used in functions.py since\nread_empty is a simple alias for empty_table.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Remove shared ctx from doctest namespace, use inline SessionContext\n\nAvoid shared SessionContext state across doctests by having 
each\ndocstring example create its own ctx instance, matching the pattern\nused throughout the rest of the codebase.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Remove redundant import pyarrow as pa from docstrings\n\nThe pa alias is already provided by the doctest namespace in\nconftest.py, so inline imports are unnecessary.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "46f9ab8fcad03913234ce29e5075644c1ecdb9b7",
      "tree": "008f7dcfb9c5d7ed35fb5e3bbfdab371eb6aa865",
      "parents": [
        "52932128d353e417ddae2c5ff3f14135cb806f7e"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Apr 07 15:03:38 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Apr 07 15:03:38 2026 -0400"
      },
      "message": "Add missing deregister methods to SessionContext (#1473)\n\n* Add deregister methods to SessionContext for UDFs and object stores\n\nExpose upstream DataFusion deregister methods (deregister_udf, deregister_udaf,\nderegister_udwf, deregister_udtf, deregister_object_store) in both the Rust\nPyO3 bindings and Python wrappers, closing the gap identified in #1457.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Fix deregister tests to expect ValueError instead of RuntimeError\n\nDataFusion raises ValueError for planning errors when a deregistered\nfunction is used in a query.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Replace .unwrap() with proper error propagation in object store methods\n\nUrl::parse() can fail on invalid input. Use .map_err() to convert\nthe error into a Python exception instead of panicking.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Minor move of import statement\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "52932128d353e417ddae2c5ff3f14135cb806f7e",
      "tree": "b88ea32d565a6b1708831f95a42034c7028a50e4",
      "parents": [
        "898d73de20346bba7241907bb18cba47da53e9a9"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Apr 07 14:58:09 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Apr 07 14:58:09 2026 -0400"
      },
      "message": "Add missing Dataframe functions (#1472)\n\n* Add missing DataFrame methods for set operations and query\n\nExpose upstream DataFusion DataFrame methods that were not yet\navailable in the Python API. Closes #1455.\n\nSet operations:\n- except_distinct: set difference with deduplication\n- intersect_distinct: set intersection with deduplication\n- union_by_name: union matching columns by name instead of position\n- union_by_name_distinct: union by name with deduplication\n\nQuery:\n- distinct_on: deduplicate rows based on specific columns\n- sort_by: sort by expressions with ascending order and nulls last\n\nNote: show_limit is already covered by the existing show(num) method.\nexplain_with_options and with_param_values are deferred as they require\nexposing additional types (ExplainOption, ParamValues).\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add ExplainFormat enum and format option to DataFrame.explain()\n\nExtend the existing explain() method with an optional format parameter\ninstead of adding a separate explain_with_options() method. 
This keeps\nthe API simple while exposing all upstream ExplainOption functionality.\n\nAvailable formats: indent (default), tree, pgjson, graphviz.\n\nThe ExplainFormat enum is exported from the top-level datafusion module.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add DataFrame.window() and unnest recursion options\n\nExpose remaining DataFrame methods from upstream DataFusion.\nCloses #1456.\n\n- window(*exprs): apply window function expressions and append results\n  as new columns\n- unnest_column/unnest_columns: add optional recursions parameter for\n  controlling unnest depth via (input_column, output_column, depth)\n  tuples\n\nNote: drop_columns is already exposed as the existing drop() method.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Update docstring\n\nCo-authored-by: Copilot \u003c175728472+Copilot@users.noreply.github.com\u003e\n\n* Improve docstrings and test robustness for new DataFrame methods\n\nClarify except_distinct/intersect_distinct docstrings, add deterministic\nsort to test_window, add sort_by ascending verification test, and add\nsmoke tests for PGJSON and GRAPHVIZ explain formats.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Consolidate new DataFrame tests into parametrized tests\n\nCombine set operation tests (except_distinct, intersect_distinct,\nunion_by_name, union_by_name_distinct) into a single parametrized\ntest_set_operations_distinct. 
Merge sort_by tests and convert\nexplain format tests to parametrized form.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add doctest examples to new DataFrame method docstrings\n\nAdd \u003e\u003e\u003e style usage examples for window, explain, except_distinct,\nintersect_distinct, union_by_name, union_by_name_distinct, distinct_on,\nsort_by, and unnest_columns to match existing docstring conventions.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Improve error messages, tests, and API hygiene from PR review\n\n- Provide actionable error message for invalid explain format strings\n- Remove recursions param from deprecated unnest_column (use unnest_columns)\n- Add null-handling test case for sort_by to verify nulls-last behavior\n- Add format-specific assertions to explain tests (TREE, PGJSON, GRAPHVIZ)\n- Add deep recursion test for unnest_columns with depth \u003e 1\n- Add multi-expression window test to verify variadic *exprs\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Consolidate window and unnest tests into parametrized tests\n\nCombine test_window and test_window_multiple_expressions into a single\nparametrized test. 
Merge unnest recursion tests into one parametrized\ntest covering basic, explicit depth 1, and deep recursion cases.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Address PR review feedback for DataFrame operations\n\n- Use upstream parse error for explain format instead of hardcoded options\n- Fix sort_by to use column name resolution consistent with sort()\n- Use ExplainFormat enum members directly in tests instead of string lookup\n- Merge union_by_name_distinct into union_by_name(distinct\u003dFalse) for a\n  more Pythonic API\n- Update check-upstream skill to note union_by_name_distinct coverage\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add DataFrame.column(), col(), and find_qualified_columns() methods\n\nExpose upstream find_qualified_columns to resolve unqualified column\nnames into fully qualified column expressions. This is especially\nuseful for disambiguating columns after joins.\n\n- find_qualified_columns(*names) on Rust side calls upstream directly\n- DataFrame.column(name) and col(name) alias on Python side\n- Update join and join_on docstrings to reference DataFrame.col()\n- Add \"Disambiguating Columns with DataFrame.col()\" section to joins docs\n- Add tests for qualified column resolution, ambiguity, and join usage\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Merge union_by_name and union_by_name_distinct into a single method with distinct flag\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* converting into a python dict loses a column when the names are identical\n\n* Consolidate except_all/except_distinct and intersect/intersect_distinct into single methods with distinct flag\n\nFollows the same pattern as union(distinct\u003d) and union_by_name(distinct\u003d).\nAlso deprecates union_distinct() in favor of union(distinct\u003dTrue).\n\nCo-Authored-By: Claude Opus 4.6 (1M 
context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\nCo-authored-by: Copilot \u003c175728472+Copilot@users.noreply.github.com\u003e"
    },
    {
      "commit": "898d73de20346bba7241907bb18cba47da53e9a9",
      "tree": "2a462b3bbae08b97d637c90a91c9786236553f1b",
      "parents": [
        "d07fdb3ef7d211920f40d0106fa50161c0bf20ce"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Apr 07 09:01:36 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Apr 07 09:01:36 2026 -0400"
      },
      "message": "Add missing aggregate functions (#1471)\n\n* Add missing aggregate functions: grouping, percentile_cont, var_population\n\nExpose upstream DataFusion aggregate functions that were not yet\navailable in the Python API. Closes #1454.\n\n- grouping: returns grouping set membership indicator (rewritten by\n  the ResolveGroupingFunction analyzer rule before physical planning)\n- percentile_cont: computes exact percentile using continuous\n  interpolation (unlike approx_percentile_cont which uses t-digest)\n- var_population: alias for var_pop\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Fix grouping() distinct parameter type for API consistency\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Improve aggregate function tests and docstrings per review feedback\n\nAdd docstring example to grouping(), parametrize percentile_cont tests,\nand add multi-column grouping test case.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add GroupingSet.rollup, .cube, and .grouping_sets factory methods\n\nExpose ROLLUP, CUBE, and GROUPING SETS via the DataFrame API by adding\nstatic methods on GroupingSet that construct the corresponding Expr\nvariants. Update grouping() docstring and tests to use the new API.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Remove _GroupingSetInternal alias, use expr_internal.GroupingSet directly\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Parametrize grouping set tests for rollup and cube\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add grouping sets documentation and note grouping() alias limitation\n\nAdd user documentation for GroupingSet.rollup, .cube, and\n.grouping_sets with Pokemon dataset examples. 
Document the upstream\nalias limitation (apache/datafusion#21411) in both the grouping()\ndocstring and the aggregation user guide.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add grouping sets note to DataFrame.aggregate() docstring\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Address PR review feedback: add quantile_cont alias and simplify examples\n\n- Add quantile_cont as alias for percentile_cont (matches upstream)\n- Replace pa.concat_arrays batch pattern with collect_column() in docstrings\n- Add percentile_cont, quantile_cont, var_population to docs function list\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Accept string column names in GroupingSet factory methods\n\nGroupingSet.rollup(), .cube(), and .grouping_sets() now accept both\nExpr objects and string column names, consistent with DataFrame.aggregate().\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add agent instructions to keep aggregation/window docs in sync\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* dfn is already available globally\n\n* Remove unnecessary import on doctest\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "d07fdb3ef7d211920f40d0106fa50161c0bf20ce",
      "tree": "84d6cf140850de88bd310be10347a2f351960437",
      "parents": [
        "99bc9602dd077c924685f1fc6e54e6feb3429302"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Mon Apr 06 08:54:30 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Apr 06 08:54:30 2026 -0400"
      },
      "message": "Add missing scalar functions  (#1470)\n\n* Add missing scalar functions: get_field, union_extract, union_tag, arrow_metadata, version, row\n\nExpose upstream DataFusion scalar functions that were not yet available\nin the Python API. Closes #1453.\n\n- get_field: extracts a field from a struct or map by name\n- union_extract: extracts a value from a union type by field name\n- union_tag: returns the active field name of a union type\n- arrow_metadata: returns Arrow field metadata (all or by key)\n- version: returns the DataFusion version string\n- row: alias for the struct constructor\n\nNote: arrow_try_cast was listed in the issue but does not exist in\nDataFusion 53, so it is not included.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add tests for new scalar functions\n\nTests for get_field, arrow_metadata, version, row, union_tag, and\nunion_extract.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Accept str for field name and type parameters in scalar functions\n\nAllow arrow_cast, get_field, and union_extract to accept plain str\narguments instead of requiring Expr wrappers. Also improve\narrow_metadata test coverage and fix parameter shadowing.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Accept str for key parameter in arrow_metadata for consistency\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add doctest examples and fix docstring style for new scalar functions\n\nReplace Args/Returns sections with doctest Examples blocks for\narrow_metadata, get_field, union_extract, union_tag, and version to\nmatch existing codebase conventions. Simplify row to alias-style\ndocstring with See Also reference. 
Document that arrow_cast accepts\nboth str and Expr for data_type.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Support pyarrow DataType in arrow_cast\n\nAllow arrow_cast to accept a pyarrow DataType in addition to str and\nExpr. The DataType is converted to its string representation before\nbeing passed to DataFusion. Adds test coverage for the new input type.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Document bracket syntax shorthand in get_field docstring\n\nNote that expr[\"field\"] is a convenient alternative when the field\nname is a static string, and get_field is needed for dynamic\nexpressions. Add a second doctest example showing the bracket syntax.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Fix arrow_cast with pyarrow DataType by delegating to Expr.cast\n\nUse the existing Rust-side PyArrowType\u003cDataType\u003e conversion via\nExpr.cast() instead of str() which produces pyarrow type names\nthat DataFusion does not recognize.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Clarify when to use arrow_cast vs Expr.cast in docstring\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "99bc9602dd077c924685f1fc6e54e6feb3429302",
      "tree": "ccaa7c9a5fb8012dd2c005de2997b35ece6f0382",
      "parents": [
        "ff15648c5dca6b41d3f6146c6c36c97e605f8561"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Mon Apr 06 07:47:13 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Apr 06 07:47:13 2026 -0400"
      },
      "message": "Add missing array functions (#1468)\n\n* Add missing array/list functions and aliases (#1452)\n\nAdd new array functions from upstream DataFusion v53: array_any_value,\narray_distance, array_max, array_min, array_reverse, arrays_zip,\nstring_to_array, and gen_series. Add corresponding list_* aliases and\nmissing list_* aliases for existing functions (list_empty, list_pop_back,\nlist_pop_front, list_has, list_has_all, list_has_any). Also add\narray_contains/list_contains as aliases for array_has, generate_series\nas alias for gen_series, and string_to_list as alias for string_to_array.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add unit tests for new array/list functions and aliases\n\nTests cover all functions and aliases added in the previous commit:\narray_any_value, array_distance, array_max, array_min, array_reverse,\narrays_zip, string_to_array, gen_series, generate_series,\narray_contains, list_contains, list_empty, list_pop_back,\nlist_pop_front, list_has, list_has_all, list_has_any, and list_*\naliases for the new functions.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Improve array function APIs: optional params, better naming, restore comment\n\n- Make null_string optional in string_to_array/string_to_list\n- Make step optional in gen_series/generate_series\n- Rename second_array to element in array_contains/list_has/list_contains\n- Restore # Window Functions section comment in __all__\n- Add tests for optional parameter variants\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Consolidate array/list function tests using pytest parametrize\n\nReduce 26 individual tests to 14 test functions with parametrized\ncases, eliminating boilerplate while maintaining full coverage.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Move list alias tests into existing test_array_functions 
parametrize block\n\nMerge standalone tests for list_empty, list_pop_back, list_pop_front,\nlist_has, array_contains, list_contains, list_has_all, and list_has_any\ninto the existing parametrized test_array_functions block alongside\ntheir array_* counterparts.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Merge test_array_any_value into parametrized test_any_value_aliases\n\nUse the richer multi-row dataset (including all-nulls case) for both\narray_any_value and list_any_value via the parametrized test.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add arrays_overlap and list_overlap as aliases for array_has_any\n\nThese aliases match the upstream DataFusion SQL-level aliases, completing\nthe set of missing array functions from issue #1452.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add docstring examples for optional params in string_to_array and gen_series\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Update AGENTS file to demonstrate preferred method of documenting python functions\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "ff15648c5dca6b41d3f6146c6c36c97e605f8561",
      "tree": "9ef8154af5770402b050e0f06705ce10370401be",
      "parents": [
        "8a35caea9ed01492742738f161fa5b4459d69402"
      ],
      "author": {
        "name": "Nuno Faria",
        "email": "nunofpfaria@gmail.com",
        "time": "Sun Apr 05 13:29:32 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Sun Apr 05 08:29:32 2026 -0400"
      },
      "message": "minor: Fix pytest instructions in README (#1477)"
    },
    {
      "commit": "8a35caea9ed01492742738f161fa5b4459d69402",
      "tree": "580704c9bf189ddbf4a5f3caa327a9161b86c883",
      "parents": [
        "16feeb136737ae45fac39f7a82cca2d88fd6224b"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Sat Apr 04 12:20:31 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Sat Apr 04 12:20:31 2026 -0400"
      },
      "message": "Add missing map functions (#1461)\n\n* Add map functions (make_map, map_keys, map_values, map_extract, map_entries, element_at)\n\nCloses #1448\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add unit tests for map functions\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Remove redundant pyo3 element_at function\n\nelement_at is already a Python-only alias for map_extract,\nso the Rust binding is unnecessary.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Change make_map to accept a Python dictionary\n\nmake_map now takes a dict for the common case and also supports\nseparate keys/values lists for column expressions. Non-Expr keys\nand values are automatically converted to literals.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Make map the primary function with make_map as alias\n\nmap() now supports three calling conventions matching upstream:\n- map({\"a\": 1, \"b\": 2}) — from a Python dictionary\n- map([keys], [values]) — two lists that get zipped\n- map(k1, v1, k2, v2, ...) 
— variadic key-value pairs\n\nNon-Expr keys and values are automatically converted to literals.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Improve map function docstrings\n\n- Add examples for all three map() calling conventions\n- Use clearer descriptions instead of jargon (no \"zipped\" or \"variadic\")\n- Break map_keys/map_values/map_extract/map_entries examples into\n  two steps: create the map column first, then call the function\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Remove map() in favor of make_map(), fix docstrings, add validation\n\n- Remove map() function that shadowed Python builtin; make_map() is now\n  the sole entry point for creating map expressions\n- Fix map_extract/element_at docstrings: missing keys return [None],\n  not an empty list (matches actual upstream behavior)\n- Add length validation for the two-list calling convention\n- Update all tests and docstring examples accordingly\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Consolidate map function tests into parametrized groups\n\nReduce boilerplate by combining make_map construction tests and map\naccessor function tests into two @pytest.mark.parametrize groups.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Docstring update\n\nCo-authored-by: Nuno Faria \u003cnunofpfaria@gmail.com\u003e\n\n* Docstring update\n\nCo-authored-by: Nuno Faria \u003cnunofpfaria@gmail.com\u003e\n\n* Simplify test for readability\n\nCo-authored-by: Nuno Faria \u003cnunofpfaria@gmail.com\u003e\n\n* Simplify test for readability\n\nCo-authored-by: Nuno Faria \u003cnunofpfaria@gmail.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\nCo-authored-by: Nuno Faria \u003cnunofpfaria@gmail.com\u003e"
    },
    {
      "commit": "16feeb136737ae45fac39f7a82cca2d88fd6224b",
      "tree": "10fb40bbeab421b23d25bade5da9985dc359c0fe",
      "parents": [
        "0b6ea95a3d304a774bbe512bb70fbca332aa5426"
      ],
      "author": {
        "name": "Kevin Liu",
        "email": "kevinjqliu@users.noreply.github.com",
        "time": "Fri Apr 03 12:47:31 2026 -0700"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Apr 03 15:47:31 2026 -0400"
      },
      "message": "Reduce peak memory usage during release builds to fix OOM on manylinux runners (#1445)\n\n* adjust swap to 8gb\n\n* modify profile.release"
    },
    {
      "commit": "0b6ea95a3d304a774bbe512bb70fbca332aa5426",
      "tree": "8c1e2c3ae2ea105c730d84dc47c1dfe3f720c8de",
      "parents": [
        "645d261ce3bc0b3b610c8d82422042b3e573e793"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Fri Apr 03 15:43:28 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Apr 03 15:43:28 2026 -0400"
      },
      "message": "Add missing conditional functions (#1464)\n\n* Add missing conditional functions: greatest, least, nvl2, ifnull (#1449)\n\nExpose four conditional functions from upstream DataFusion that were\nnot yet available in the Python bindings.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add unit tests for greatest, least, nvl2, and ifnull functions\n\nTests cover multiple data types (integers, strings), null handling\n(all-null, partial-null), multiple arguments, and ifnull/nvl equivalence.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Use standard alias docstring pattern for ifnull\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* remove unused df fixture and fix parameter shadowing\n\n* Refactor conditional function tests into parametrized test suite\n\nReplace separate test functions for coalesce, greatest, least, nvl,\nnvl2, ifnull with a single parametrized test using a shared fixture.\nAdds coverage for nvl, nullif (previously untested), datetime and\nboolean types, literal fallbacks, and variadic calls.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "645d261ce3bc0b3b610c8d82422042b3e573e793",
      "tree": "4f05826255e2f5fe8d2367e8137c7057d9dd50a9",
      "parents": [
        "be8dd9d08fd284cf1747a2c1b965d9c95fff117c"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Fri Apr 03 13:51:43 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Apr 03 13:51:43 2026 -0400"
      },
      "message": "Add missing string function `contains` (#1465)\n\n* Add missing `contains` string function\n\nExpose the upstream DataFusion `contains(string, search_str)` function\nwhich returns true if search_str is found within string (case-sensitive).\n\nNote: the other functions from #1450 (instr, position, substring_index)\nalready exist — instr and position are aliases for strpos, and\nsubstring_index is exposed as substr_index.\n\nCloses #1450\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add unit test for contains string function\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Update python/datafusion/functions.py\n\nCo-authored-by: Nuno Faria \u003cnunofpfaria@gmail.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\nCo-authored-by: Nuno Faria \u003cnunofpfaria@gmail.com\u003e"
    },
    {
      "commit": "be8dd9d08fd284cf1747a2c1b965d9c95fff117c",
      "tree": "3c5f66c1cfc4a2631f8255aa1b31928c766ae2d3",
      "parents": [
        "0113a6ee55cc61f9ebd897ae8cfc9213f560e468"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Fri Apr 03 09:37:00 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Apr 03 09:37:00 2026 -0400"
      },
      "message": "Add AI skill to check current repository against upstream APIs (#1460)\n\n* Initial commit for skill to check upstream repo\n\n* Add instructions on using the check-upstream skill\n\n* Add FFI type coverage and implementation pattern to check-upstream skill\n\nDocument the full FFI type pipeline (Rust PyO3 wrapper → Protocol type →\nPython wrapper → ABC base class → exports → example) and catalog which\nupstream datafusion-ffi types are supported, which have been evaluated as\nnot needing direct exposure, and how to check for new gaps.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Update check-upstream skill to include FFI types as a checkable area\n\nAdd \"ffi types\" to the argument-hint and description so users can invoke\nthe skill with `/check-upstream ffi types`. Also add pipeline verification\nstep to ensure each supported FFI type has the full end-to-end chain\n(PyO3 wrapper, Protocol, Python wrapper with type hints, ABC, exports).\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Move FFI Types section alongside other areas to check\n\nSection 7 (FFI Types) was incorrectly placed after the Output Format and\nImplementation Pattern sections. 
Move it to sit after Section 6\n(SessionContext Methods), consistent with the other checkable areas.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Replace static FFI type list with dynamic discovery instruction\n\nThe supported FFI types list would go stale as new types are added.\nReplace it with a grep instruction to discover them at check time,\nkeeping only the \"evaluated and not requiring exposure\" list which\ncaptures rationale not derivable from code.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Make Python API the source of truth for upstream coverage checks\n\nFunctions exposed in Python (e.g., as aliases of other Rust bindings)\nwere being falsely reported as missing because they lacked a dedicated\n#[pyfunction] in Rust. The user-facing API is the Python layer, so\ncoverage should be measured there.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add exclusion list for DataFrame methods already covered by Python API\n\nshow_limit is covered by DataFrame.show() and with_param_values is\ncovered by SessionContext.sql(param_values\u003d...), so neither needs\nseparate exposure.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Move skills to .ai/skills/ for tool-agnostic discoverability\n\nMoves the canonical skill definitions from .claude/skills/ to .ai/skills/\nand replaces .claude/skills with a symlink, so Claude Code still discovers\nthem while other AI agents can find them in a tool-neutral location.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add AGENTS.md for tool-agnostic agent instructions with CLAUDE.md symlink\n\nAGENTS.md points agents to .ai/skills/ for skill discovery. 
CLAUDE.md\nsymlinks to it so Claude Code picks it up as project instructions.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Make README upstream coverage section tool-agnostic\n\nRemove Claude Code references and update skill path from .claude/skills/\nto .ai/skills/ to match the new tool-neutral directory structure.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add GitHub issue lookup step to check-upstream skill\n\nWhen gaps are identified, search open issues at\napache/datafusion-python before reporting. Existing issues are\nlinked in the report rather than duplicated.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Require Python test coverage in issues created by check-upstream skill\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add license text\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "0113a6ee55cc61f9ebd897ae8cfc9213f560e468",
      "tree": "c6fcb67b7f267985b667c1794dae89647233a4fe",
      "parents": [
        "24994099e41a4e933f883557e2bce1a963bac0ea"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Thu Apr 02 17:47:47 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Apr 02 17:47:47 2026 -0400"
      },
      "message": "Add missing datetime functions (#1467)\n\n* Add missing datetime functions: make_time, current_timestamp, date_format\n\nCloses #1451. Adds make_time Rust binding and Python wrapper, and adds\ncurrent_timestamp (alias for now) and date_format (alias for to_char)\nPython functions.\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n* Add unit tests for make_time, current_timestamp, and date_format\n\nCo-Authored-By: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e\n\n---------\n\nCo-authored-by: Claude Opus 4.6 (1M context) \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "24994099e41a4e933f883557e2bce1a963bac0ea",
      "tree": "1ffac2a348ed161d1843b85caf710a8e35f94156",
      "parents": [
        "73a9d53a37f6ce864b68dda1b07e92a0fed8c8ba"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Mar 31 14:09:16 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Mar 31 14:09:16 2026 -0400"
      },
      "message": "ci: update codespell paths (#1469)\n\n* Update path so it works well with pre-commit\n\n* Prefix path with asterisk so we get matching in both CI and pre-commit\n\n* Update paths for codespell"
    },
    {
      "commit": "73a9d53a37f6ce864b68dda1b07e92a0fed8c8ba",
      "tree": "677778d5c94e1fd3017e3900a7e5cb93e38bd0c1",
      "parents": [
        "5be412b6a691a57bb2246e6726751fe9e8916035"
      ],
      "author": {
        "name": "Kevin Liu",
        "email": "kevinjqliu@users.noreply.github.com",
        "time": "Tue Mar 31 01:57:32 2026 -0700"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Mar 31 16:57:32 2026 +0800"
      },
      "message": "CI: Add CodeQL workflow for GitHub Actions security scanning (#1408)\n\n* CI: Add CodeQL workflow for GitHub Actions security scanning\n\n* Update .github/workflows/codeql.yml"
    },
    {
      "commit": "5be412b6a691a57bb2246e6726751fe9e8916035",
      "tree": "57e8625b67701e99b914a7761d80b350b6cc4b73",
      "parents": [
        "ad8d41f2b5faff9a35aeeb340a24480c8ccb6eff"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Sun Mar 29 18:34:48 2026 -0400"
      },
      "committer": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Sun Mar 29 18:34:48 2026 -0400"
      },
      "message": "Pin rust toolchain to apache allowlist sha\n"
    },
    {
      "commit": "ad8d41f2b5faff9a35aeeb340a24480c8ccb6eff",
      "tree": "7049ed5f3f8e65f7fba2ca600c4174163812f182",
      "parents": [
        "acd9a8dcdd1015497835ae3c9a49e4bf5961d719"
      ],
      "author": {
        "name": "Daniel Mesejo",
        "email": "mesejoleon@gmail.com",
        "time": "Sat Mar 28 14:35:32 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Sat Mar 28 09:35:32 2026 -0400"
      },
      "message": "chore: enforce uv lockfile consistency in CI and pre-commit (#1398)\n\n* chore: enforce uv lockfile consistency in CI and pre-commit\n\n  Add --locked flag to uv sync in CI to fail if uv.lock is out of sync,\n  and add the uv-lock pre-commit hook to automatically keep uv.lock\n  up to date when pyproject.toml changes.\n\n* chore: add missing --locked calls"
    },
    {
      "commit": "acd9a8dcdd1015497835ae3c9a49e4bf5961d719",
      "tree": "e80663b744d7e39ded240e2f168bd6dfee532828",
      "parents": [
        "8c6a481b43b322a80990ff6d793d1a921218f567"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Fri Mar 27 14:19:18 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Mar 27 14:19:18 2026 -0400"
      },
      "message": "Complete doc string examples for functions.py (#1435)\n\n* Verify all non-alias functions have doc string\n\n* MNove all alias for statements to see also blocks and confirm no examples\n\n* Fix google doc style for all examples\n\n* Remove builtins use\n\n* Add coverage for optional filter\n\n* Cover optional argument examples for window and value functions\n\n* Cover optional arguments for scalar functions\n\n* Cover array and aggregation functions\n\n* Make examples different\n\n* Make format more consistent\n\n* Remove duplicated df definition"
    },
    {
      "commit": "8c6a481b43b322a80990ff6d793d1a921218f567",
      "tree": "4a7db74b096cf185a52343fcc47ed56e27a922bb",
      "parents": [
        "4b215724565cec4257ed9dfa25271c5481c9f7b4"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Fri Mar 27 12:44:54 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Mar 27 12:44:54 2026 -0400"
      },
      "message": "chore: update dependencies (#1447)\n\n* Cargo lock update\n\n* Update download-artifact\n\n* Update upload-artifact to v7\n\n* Update cargo toml to latest deps"
    },
    {
      "commit": "4b215724565cec4257ed9dfa25271c5481c9f7b4",
      "tree": "250328694f80cceabdbf7c6ab5be2027f16c7110",
      "parents": [
        "75d07ce706fcbda423ad90222aa3dacccb7a5766"
      ],
      "author": {
        "name": "Topias Pyykkönen",
        "email": "43851547+toppyy@users.noreply.github.com",
        "time": "Fri Mar 27 17:17:44 2026 +0200"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Mar 27 11:17:44 2026 -0400"
      },
      "message": "Add a working, more complete example of using a catalog (docs) (#1427)\n\n* Add a working, more complete example of using a catalog\n\n* the default schema is \u0027public\u0027, not \u0027default\u0027\n\n* in-memory table instead of imaginary csv for standalone example\n\n* typo fix\n\nCo-authored-by: Kevin Liu \u003ckevinjqliu@users.noreply.github.com\u003e\n\n* minor c string fix after merge\n\n---------\n\nCo-authored-by: Kevin Liu \u003ckevinjqliu@users.noreply.github.com\u003e\nCo-authored-by: Tim Saucer \u003ctimsaucer@gmail.com\u003e"
    },
    {
      "commit": "75d07ce706fcbda423ad90222aa3dacccb7a5766",
      "tree": "a4da6f4876c6b8b0a31f680feae70ab31400ebf8",
      "parents": [
        "207fc16d62e2f64b687798741b33964aad9b5b7e"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Fri Mar 27 10:29:23 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Mar 27 10:29:23 2026 -0400"
      },
      "message": "Implement configuration extension support (#1391)\n\n* Implement config options\n\n* Update examples and tests\n\n* pyo3 update\n\n* Add docstring\n\n* rat\n\n* Update examples/datafusion-ffi-example/python/tests/_test_config.py\n\nCo-authored-by: Copilot \u003c175728472+Copilot@users.noreply.github.com\u003e\n\n* Update crates/core/src/context.rs\n\nCo-authored-by: Copilot \u003c175728472+Copilot@users.noreply.github.com\u003e\n\n* Update crates/core/src/context.rs\n\nCo-authored-by: Copilot \u003c175728472+Copilot@users.noreply.github.com\u003e\n\n---------\n\nCo-authored-by: Copilot \u003c175728472+Copilot@users.noreply.github.com\u003e"
    },
    {
      "commit": "207fc16d62e2f64b687798741b33964aad9b5b7e",
      "tree": "a7e1f3ea5a16ad6676727265f495b5fa433eec53",
      "parents": [
        "6cea061abeca55bbe1a53e3c07ad62145d3ac809"
      ],
      "author": {
        "name": "Thomas Tanon",
        "email": "thomas@pellissier-tanon.fr",
        "time": "Fri Mar 27 14:39:20 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Mar 27 09:39:20 2026 -0400"
      },
      "message": "Remove validate_pycapsule (#1426)\n\nThe Bound\u003c\u0027_, PyCapsule\u003e::pointer_checked does the same validation and is already used across the codebase"
    },
    {
      "commit": "6cea061abeca55bbe1a53e3c07ad62145d3ac809",
      "tree": "505297528a70f192ce2ba72290e68d6e98343cee",
      "parents": [
        "876646d67771261cfd9a57c721bece0d95b9740c"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Fri Mar 27 09:29:09 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Mar 27 09:29:09 2026 -0400"
      },
      "message": "Update remaining existing examples to make testable/standalone executable (#1437)\n\n* Move example to doctestable examples for context.py\n\n* Add more standard dafusion namespaces to reduce clutter\n\n* Update project to use ruff compatible with pre-commit version\n\n* Resolve ruff errors for newer version but just ignore them\n\n* Convert dataframe examples to doctestable. Found bug in dropping A\n\n* Move expr.py to doctestable examples\n\n* Move user_defined.py to doctestable examples"
    },
    {
      "commit": "876646d67771261cfd9a57c721bece0d95b9740c",
      "tree": "28928f5553b6ff4fb9dd0c8a83c10586a0e4989c",
      "parents": [
        "e09c93bbe5c7d78c3752adc9158f3ff012d0c4cd"
      ],
      "author": {
        "name": "Kevin Liu",
        "email": "kevinjqliu@users.noreply.github.com",
        "time": "Fri Mar 27 06:08:57 2026 -0700"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Mar 27 09:08:57 2026 -0400"
      },
      "message": "docs: clarify DataFusion 52 FFI session-parameter requirement for provider hooks (#1439)\n\n* mention new session arg\n\n* flow better\n\n* smaller change"
    },
    {
      "commit": "e09c93bbe5c7d78c3752adc9158f3ff012d0c4cd",
      "tree": "df5e0198eb5a5f77fe1a0345ed8711b23756fd8f",
      "parents": [
        "1397c5d6444e370a0feee69231fb8bc92c778d5f"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Thu Mar 26 15:45:05 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Mar 26 15:45:05 2026 -0400"
      },
      "message": "ci: add swap during build, use tpchgen-cli (#1443)\n\n* restrict number of rustc jobs during build stage\n\n* temporarily run release build on PR\n\n* Change optimization setting for substrait\n\n* Add swap during release build\n\n* Remove temporary checks to build in PR\n\n* Try using tpchgen-cli for test files. commit answers\n\n* taplo fmt\n\n* do not run rat on data files\n\n* ci needs ./ in path\n\n* add no-project to uv run\n\n* Temporary debug lines to figure out what is happening in CI\n\n* filter null value during aggregation instead now that https://github.com/apache/datafusion/issues/21011 is closed"
    },
    {
      "commit": "1397c5d6444e370a0feee69231fb8bc92c778d5f",
      "tree": "7c839c97aa7ab159547864b8362d5cfa07d0e550",
      "parents": [
        "0c33524dc05091cf0bd5b510417e5b3e2ee48922"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Mar 24 08:05:42 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Mar 24 08:05:42 2026 -0400"
      },
      "message": "bump datafusion to release version (#1441)"
    },
    {
      "commit": "0c33524dc05091cf0bd5b510417e5b3e2ee48922",
      "tree": "43df4600cd3ebc4879c13af1f8b7acaa01a35abf",
      "parents": [
        "85a3595444e7946dc4eaa166cb4843bee2bf2f07"
      ],
      "author": {
        "name": "Kevin Liu",
        "email": "kevinjqliu@users.noreply.github.com",
        "time": "Tue Mar 24 05:04:39 2026 -0700"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Mar 24 08:04:39 2026 -0400"
      },
      "message": "pin setup-uv (#1438)"
    },
    {
      "commit": "85a3595444e7946dc4eaa166cb4843bee2bf2f07",
      "tree": "760d65a3c1f20e59d88f434f99f4d3f64fec6f20",
      "parents": [
        "4e51fa8935799343c973e9cd306f42d278620d42"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Wed Mar 18 22:06:31 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Mar 19 10:06:31 2026 +0800"
      },
      "message": "Add docstring examples for Aggregate window functions (#1418)\n\n* Add docstring examples for Aggregate window functions\n\nAdd example usage to docstrings for Aggregate window functions to improve documentation.\n\nCo-Authored-By: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e\n\n* Remove for example for example docstring\n\n* Actually remove all for example calls in favor of docstrings\n\n* Remove builtins\n\n* Make google docstyle\n\n* Fix bad merge leading to duplicate xample\n\n---------\n\nCo-authored-by: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "4e51fa8935799343c973e9cd306f42d278620d42",
      "tree": "a2099d91f544fcec9a10f04887d5065233b986e8",
      "parents": [
        "3c5013dd57369c55aaf5a463797b73f1d65f3d8a"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Wed Mar 18 02:01:44 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Mar 18 14:01:44 2026 +0800"
      },
      "message": "Add docstring examples for Scalar string functions (#1423)\n\n* Add docstring examples for Scalar string functions\n\nAdd example usage to docstrings for Scalar string functions to improve documentation.\n\nCo-Authored-By: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e\n\n* Remove examples for aliases\n\n---------\n\nCo-authored-by: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "3c5013dd57369c55aaf5a463797b73f1d65f3d8a",
      "tree": "a173d047281e680037b047b2a9ab956b3405728a",
      "parents": [
        "74b32214fb2c9a06f72cd0495b19fee5d5a3047b"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Wed Mar 18 01:58:30 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Mar 18 13:58:30 2026 +0800"
      },
      "message": "Add docstring examples for Scalar array/list functions (#1420)\n\n* Add docstring examples for Scalar array/list functions\n\nAdd example usage to docstrings for Scalar array/list functions to improve documentation.\n\nCo-Authored-By: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e\n\n* Remove examples from all aliases, maybe we should just remove the aliases for simple api surface\n\n---------\n\nCo-authored-by: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "74b32214fb2c9a06f72cd0495b19fee5d5a3047b",
      "tree": "2fa3274158c7854408e6ebe03c4db5a8151a3ed0",
      "parents": [
        "f01f30c6332e40208e9f943a163a66e3d2781d08"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Wed Mar 18 01:52:23 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Mar 18 13:52:23 2026 +0800"
      },
      "message": "Add docstring examples for Aggregate statistical and regression functions (#1417)\n\n* Add docstring examples for Aggregate statistical and regression functions\n\nAdd example usage to docstrings for Aggregate statistical and regression functions to improve documentation.\n\nCo-Authored-By: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e\n\n* Simplify covar\n\n* Make sure everything is google doc style\n\n---------\n\nCo-authored-by: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "f01f30c6332e40208e9f943a163a66e3d2781d08",
      "tree": "989ad5ef60fa594b9b8b10d5158c9e8437a46845",
      "parents": [
        "3dfd6ee5d9ba7de0896f195cef5bc16b4d5f0dd0"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Wed Mar 18 01:51:06 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Mar 18 13:51:06 2026 +0800"
      },
      "message": "Add docstring examples for Scalar temporal functions (#1424)\n\n* Add docstring examples for Scalar temporal functions\n\nAdd example usage to docstrings for Scalar temporal functions to improve documentation.\n\nCo-Authored-By: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e\n\n* Remove examples for aliases\n\n* Fix claude\u0027s attempt to cheat with sql\n\n* Make examples follow google docstyle\n\n---------\n\nCo-authored-by: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "3dfd6ee5d9ba7de0896f195cef5bc16b4d5f0dd0",
      "tree": "2c95a68808117dc3fba60254907b5609ed21335c",
      "parents": [
        "93f4c34bf5a4afae2547d5ccb677143d1833ebf0"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Mar 17 15:58:34 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Mar 17 15:58:34 2026 -0400"
      },
      "message": "Fix CI errors on main (#1432)\n\n* Do not run check for patches on main, just release candidates\n\n* It is not necessary to pull submodules. It\u0027s only slowing down CI"
    },
    {
      "commit": "93f4c34bf5a4afae2547d5ccb677143d1833ebf0",
      "tree": "b791f414441679efa788d187f1aaadc63c08820e",
      "parents": [
        "e524121c8a68171d1031db0487ec13a547871c42"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Tue Mar 17 02:14:42 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Mar 17 14:14:42 2026 +0800"
      },
      "message": "Add docstring examples for Aggregate basic and bitwise/boolean functions (#1416)\n\n* Add docstring examples for Aggregate basic and bitwise/boolean functions\n\nAdd example usage to docstrings for Aggregate basic and bitwise/boolean functions to improve documentation.\n\nCo-Authored-By: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e\n\n* Add tighter bound on approx_distinct for small sizes\n\n---------\n\nCo-authored-by: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "e524121c8a68171d1031db0487ec13a547871c42",
      "tree": "ff2519ced5b962699ba0797ec8c206a39b101b28",
      "parents": [
        "b9a958e3893a9a208d67aac314a9ede97b370679"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Tue Mar 17 02:13:40 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Mar 17 14:13:40 2026 +0800"
      },
      "message": "Add docstring examples for Common utility functions (#1419)\n\n* Add docstring examples for Common utility functions\n\nAdd example usage to docstrings for Common utility functions to improve documentation.\n\nCo-Authored-By: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e\n\n* Don\u0027t add examples for aliases\n\n* Parameters back to args\n\n* Examples to google doc style\n\n---------\n\nCo-authored-by: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "b9a958e3893a9a208d67aac314a9ede97b370679",
      "tree": "a1fa23f725a95389e826fe6a79752677410a3168",
      "parents": [
        "89751b552e8c5388e9cc994acadf1de5b896422f"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Tue Mar 17 02:13:18 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Mar 17 14:13:18 2026 +0800"
      },
      "message": "Add docstring examples for Scalar math functions (#1421)\n\n* Add docstring examples for Scalar math functions\n\nAdd example usage to docstrings for Scalar math functions to improve documentation.\n\nCo-Authored-By: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e\n\n* Fix copy past error on name\n\n* Remove example from alias\n\n* Examples google docstyle\n\n---------\n\nCo-authored-by: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "89751b552e8c5388e9cc994acadf1de5b896422f",
      "tree": "10ff116c84173f252c84b521ec9f5852aa2ae84c",
      "parents": [
        "21990b0bb01599fb67dbd8686c907e5f810aace3"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Tue Mar 17 02:12:28 2026 -0400"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Mar 17 14:12:28 2026 +0800"
      },
      "message": "Add docstring examples for Scalar regex, crypto, struct and other (#1422)\n\n* Add docstring examples for Scalar regex, crypto, struct and other functions\n\nAdd example usage to docstrings for Scalar regex, crypto, struct and other functions to improve documentation.\n\nCo-Authored-By: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e\n\n* Fix typo\n\n* Fix docstring already broken that I added an example to\n\n* Add sha outputs\n\n* clarify struct results\n\n* Examples should follow google docstyle\n\n---------\n\nCo-authored-by: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "21990b0bb01599fb67dbd8686c907e5f810aace3",
      "tree": "b70f5fec8ab315add514b001dbc1e26146d8270b",
      "parents": [
        "9af1681f203ec2a21b64371a1dc6361641ffb2f9"
      ],
      "author": {
        "name": "Paul J. Davis",
        "email": "paul.joseph.davis@gmail.com",
        "time": "Mon Mar 16 07:07:50 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Mar 16 08:07:50 2026 -0400"
      },
      "message": "feat: Add FFI_TableProviderFactory support (#1396)\n\n* feat: Add FFI_TableProviderFactory support\n\nThis wraps the new FFI_TableProviderFactory APIs in datafusion-ffi.\n\n* Address PR comments\n\n* Add support for Python based TableProviderFactory\n\nThis adds the ability to register Python based TableProviderFactory\ninstances to the SessionContext.\n\n* Correction after rebase\n\n---------\n\nCo-authored-by: Tim Saucer \u003ctimsaucer@gmail.com\u003e"
    },
    {
      "commit": "9af1681f203ec2a21b64371a1dc6361641ffb2f9",
      "tree": "8dbd8e0b20021110c5e1a3f75ab67e1dea47c643",
      "parents": [
        "1160d5a91d586927dc6e466829965770c3fa299a"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Mon Mar 16 12:05:00 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Mar 16 07:05:00 2026 -0400"
      },
      "message": "Create workspace with core and util crates (#1414)\n\n* Break up the repository into a workspace with three crates\n\n* We have a workspace cargo lock file now so this is not needed\n\n* Cleanup\n\n* These files should be redundant because of the build.rs file\n\n* More moving around of utils to clean up\n\n* Add note on how to run FFI example tests\n\n* Add back in dep removed during rebase\n\n* taplo fmt\n\n* Since we have a workspace we know the example version is in sync so we do not need this test\n\n* Add description, homepage, and repository to Cargo.toml\n\nCo-authored-by: Kevin Liu \u003ckevinjqliu@users.noreply.github.com\u003e\n\n* Add description, homepage, and repository to Cargo.toml\n\nCo-authored-by: Kevin Liu \u003ckevinjqliu@users.noreply.github.com\u003e\n\n* Add description, homepage, and repository to Cargo.toml\n\nCo-authored-by: Kevin Liu \u003ckevinjqliu@users.noreply.github.com\u003e\n\n* Removed unused include\n\nCo-authored-by: Kevin Liu \u003ckevinjqliu@users.noreply.github.com\u003e\n\n---------\n\nCo-authored-by: Kevin Liu \u003ckevinjqliu@users.noreply.github.com\u003e"
    },
    {
      "commit": "1160d5a91d586927dc6e466829965770c3fa299a",
      "tree": "fe0c2acef8aa576601071750c7e28741cd3d8616",
      "parents": [
        "d322b7b7bfd527370f03854717661488737c9f8b"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Wed Mar 11 12:01:18 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Mar 11 07:01:18 2026 -0400"
      },
      "message": "Add docstring examples for Scalar trigonometric functions (#1411)\n\n* Add docstring examples for Scalar trigonometric functions\n\nAdd example usage to docstrings for Scalar trigonometric functions to improve documentation.\n\nCo-Authored-By: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e\n\n* Remove weird artifact\n\n* Move conftest so it doesn\u0027t get packaged in release\n\n---------\n\nCo-authored-by: Claude Opus 4.6 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "d322b7b7bfd527370f03854717661488737c9f8b",
      "tree": "37023b01b33266caf5cc6064f2a6d56adc7cb74c",
      "parents": [
        "f914fc854a54ba133ee8eb0c3cb0e9845a5d7f7f"
      ],
      "author": {
        "name": "Daniel Mesejo",
        "email": "mesejoleon@gmail.com",
        "time": "Mon Mar 09 07:52:47 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Mar 09 14:52:47 2026 +0800"
      },
      "message": "feat: feat: add to_time, to_local_time, to_date functions (#1387)\n\n* feat: add to_time, to_local_time, to_date, to_char functions\n\nAdditionally fix conditional on formatters (since it is *args it cannot be None)\nRefactor name to avoid possible collision with f.\n\n* address comments in PR\n\n* chore: add tests for today"
    },
    {
      "commit": "f914fc854a54ba133ee8eb0c3cb0e9845a5d7f7f",
      "tree": "e6dc27e290f558ca3478c18f50120a02996d0e30",
      "parents": [
        "8ef2cd75d984758b3ae2db43629666da1a7bee19"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Fri Mar 06 21:12:56 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Mar 06 21:12:56 2026 -0500"
      },
      "message": "Catch warnings in FFI unit tests (#1410)\n\n* Failed pytest in ffi crate when warnings are generated\n\n* Bump DF53 version"
    },
    {
      "commit": "8ef2cd75d984758b3ae2db43629666da1a7bee19",
      "tree": "bb9069da7f788023205ad827b3d87f4a9d492d92",
      "parents": [
        "231ed2b1d375fefe9aa01cdc8ae41c620c772f76"
      ],
      "author": {
        "name": "Nuno Faria",
        "email": "nunofpfaria@gmail.com",
        "time": "Fri Mar 06 16:11:42 2026 +0000"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Mar 06 11:11:42 2026 -0500"
      },
      "message": "Upgrade to DataFusion 53 (#1402)\n\n* Upgrade to DataFusion 53\n\n* Fix fmt\n\n* Fix fmt\n\n* Fix docs\n\n* Bump datafusion rev to 53.0.0\n\n* Bump ffi example datafusion commit to the same as main repo\n\n---------\n\nCo-authored-by: Tim Saucer \u003ctimsaucer@gmail.com\u003e"
    },
    {
      "commit": "231ed2b1d375fefe9aa01cdc8ae41c620c772f76",
      "tree": "f4fe40a7e1e912afb92cd6b9530a1c782ec856e5",
      "parents": [
        "7b630ee893d6b81ae7a0f2c35b77dab723567b13"
      ],
      "author": {
        "name": "Nick",
        "email": "24689722+ntjohnson1@users.noreply.github.com",
        "time": "Thu Mar 05 10:42:20 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Mar 05 10:42:20 2026 -0500"
      },
      "message": "Enable doc tests in local and CI testing (#1409)\n\n* Turn on doctests\n\n* Fix existing doc examples\n\n* Remove stale referenece to rust-toolchain removed in #1383, surpised pre-commit didn\u0027t flag for anyone else"
    },
    {
      "commit": "7b630ee893d6b81ae7a0f2c35b77dab723567b13",
      "tree": "60130a9a39f2678310f80785c80ff5d8924ddb12",
      "parents": [
        "0c1499cddea5fa20c13728b0c2726aea4fbd1b08"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Wed Mar 04 10:24:32 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Mar 04 10:24:32 2026 -0500"
      },
      "message": "Add check for crates.io patches to CI (#1407)\n\n"
    },
    {
      "commit": "0c1499cddea5fa20c13728b0c2726aea4fbd1b08",
      "tree": "33c8d3d3a3a71aa5e3b5fd318dd49a9bc6159f1d",
      "parents": [
        "e42775c2fcfe8929df0874414ba2bcd6bbea174c"
      ],
      "author": {
        "name": "Kevin Liu",
        "email": "kevinjqliu@users.noreply.github.com",
        "time": "Mon Mar 02 08:21:37 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Mar 02 08:21:37 2026 -0500"
      },
      "message": "fix: satisfy rustfmt check in lib.rs re-exports (#1406)\n\n"
    },
    {
      "commit": "e42775c2fcfe8929df0874414ba2bcd6bbea174c",
      "tree": "dc00636ab2c095427c71c2ed6191101be8476d47",
      "parents": [
        "57a50faebb93365a56f337e53120ca215c03774b"
      ],
      "author": {
        "name": "dario curreri",
        "email": "48800335+dariocurr@users.noreply.github.com",
        "time": "Thu Feb 26 15:13:38 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Feb 26 09:13:38 2026 -0500"
      },
      "message": "ci: update pre-commit hooks, fix linting, and refresh dependencies (#1385)\n\n* ci: update pre-commit hooks and fix linting issues\n\n* Update Ruff version in pre-commit configuration to v0.15.1.\n* Add noqa comments to suppress specific linting warnings in various files.\n* Update regex patterns in test cases for better matching.\n\n* style: correct indentation in GitHub Actions workflow file\n\n* Adjusted indentation for the enable-cache option in the test.yml workflow file to ensure proper YAML formatting.\n\n* refactor: reorder imports in indexed_field.rs for clarity\n\n* Adjusted the order of imports in indexed_field.rs to improve readability and maintain consistency with project conventions.\n\n* build: update dependencies in Cargo.toml and Cargo.lock\n\n* Bump versions of several dependencies including tokio, pyo3-log, prost, uuid, and log to their latest releases.\n* Update Cargo.lock to reflect the changes in dependency versions.\n\n* style: format pyproject.toml for consistency\n\n* Adjusted formatting in pyproject.toml for improved readability by aligning lists and ensuring consistent indentation.\n* Updated dependencies and configuration settings for better organization.\n\n* style: remove noqa comments for import statements\n\n* Cleaned up import statements in multiple files by removing unnecessary noqa comments, enhancing code readability and maintaining consistency across the codebase.\n\n* style: simplify formatting in pyproject.toml\n\n* Streamlined list formatting in pyproject.toml for improved readability by removing unnecessary line breaks and ensuring consistent structure across sections.\n* No functional changes were made; the focus was solely on code style and organization."
    },
    {
      "commit": "57a50faebb93365a56f337e53120ca215c03774b",
      "tree": "f7ca24cd93ebf0be125113a2f62abd2fa532a613",
      "parents": [
        "22086650c8df8b7bc382130c9560f76762dbe6c0"
      ],
      "author": {
        "name": "Kevin Liu",
        "email": "kevinjqliu@users.noreply.github.com",
        "time": "Wed Feb 25 16:34:55 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 25 16:34:55 2026 -0500"
      },
      "message": "Allow running \"verify release candidate\" github workflow on Windows (#1392)\n\n* run for windows\n\n* readme"
    },
    {
      "commit": "22086650c8df8b7bc382130c9560f76762dbe6c0",
      "tree": "4a6818b9a4f57808858d63a81ffc4de3453fa942",
      "parents": [
        "44a3eb353960e96d16013139d30bb588b7c901db"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Mon Feb 23 08:36:12 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Feb 23 08:36:12 2026 -0500"
      },
      "message": "Add workflow to verify release candidate on multiple systems (#1388)\n\n* add workflow\n\n* add protoc\n\n* update coverage\n\n* upgrade\n\n* newline\n\n* add a note about manual trigger\n\n* add a section to release about manually running the matrix\n\n* more details\n\n---------\n\nCo-authored-by: Kevin Liu \u003ckevin.jq.liu@gmail.com\u003e"
    },
    {
      "commit": "44a3eb353960e96d16013139d30bb588b7c901db",
      "tree": "2311dfd754396d4a4b6a3a5847ac0af55b0b5773",
      "parents": [
        "d87c6e8049c165158071460f4550546fdc5c42c6"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Mon Feb 23 07:29:03 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Feb 23 07:29:03 2026 -0500"
      },
      "message": "Merge release 52.0.0 into main (#1389)\n\n* Update version number to 52.0.0\n\n* Update changelog for 52.0.0"
    },
    {
      "commit": "d87c6e8049c165158071460f4550546fdc5c42c6",
      "tree": "c67ea2a49f12fd9f05e4dd33c8399f602d1429d2",
      "parents": [
        "4a75b0f370cf82d7892bbffae0d854e8d707540f"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Thu Feb 19 07:27:53 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Feb 19 07:27:53 2026 -0500"
      },
      "message": "chore: bump Python version for RAT checking (#1386)\n\n* Bump python version to correct error in rat checking\n\n* Add missing license files"
    },
    {
      "commit": "4a75b0f370cf82d7892bbffae0d854e8d707540f",
      "tree": "66ac7d2d2ea9b97a2f4c85dab943cdff53cc34ba",
      "parents": [
        "675e41ed988360fd0758639e3fa52a2536282ebd"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Wed Feb 18 13:22:40 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 18 13:22:40 2026 -0500"
      },
      "message": "minor: update cargo dependencies (#1383)\n\n* Update Cargo.lock files\n\n* Remove CI steps that are no longer required"
    },
    {
      "commit": "675e41ed988360fd0758639e3fa52a2536282ebd",
      "tree": "df4fbc29613cabc74d283bdac6582effa846c3ff",
      "parents": [
        "3481904fe9c770ffc218cc043d47d35e860148a6"
      ],
      "author": {
        "name": "Daniel Mesejo",
        "email": "mesejoleon@gmail.com",
        "time": "Wed Feb 18 17:38:59 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 18 11:38:59 2026 -0500"
      },
      "message": "feat: add regexp_instr function (#1382)\n\n* feat: add regexp_instr function\n\nThe current implementation of regexp_instr in Datafusion, does not support\nendoption. Hence None is passed in the implementation of the function\nexposing it to Python.\n\n* chore: add test for all optional arguments\n\n* fix: make start truly optional in regexp_count"
    },
    {
      "commit": "3481904fe9c770ffc218cc043d47d35e860148a6",
      "tree": "94eb6e04f3b121ae20a4d6c021046bb101da6245",
      "parents": [
        "4cd56743e50d6b33c3d21152ec4ae87d5fe3faf4"
      ],
      "author": {
        "name": "kosiew",
        "email": "kosiew@gmail.com",
        "time": "Thu Feb 19 00:15:06 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 18 11:15:06 2026 -0500"
      },
      "message": "Fix Python UDAF list-of-timestamps return by enforcing list-valued scalars and caching PyArrow types (#1347)\n\n* Implement UDAF improvements for list type handling\n\nStore UDAF return type in Rust accumulator and wrap\npyarrow Array/ChunkedArray returns into list scalars\nfor list-like return types. Add a UDAF test to return\na list of timestamps via a pyarrow array, validating\nthe aggregate output for correctness.\n\n* Document UDAF list-valued scalar returns\n\nAdd documented list-valued scalar returns for UDAF\naccumulators, including an example with pa.scalar and a note\nabout unsupported pyarrow.Array returns from evaluate().\nAlso, introduce a UDAF FAQ entry detailing list-returning\npatterns and required return_type/state_type definitions.\n\n* Fix pyarrow calls and improve type handling in RustAccumulator\n\n* Refactor RustAccumulator to support pyarrow array types and improve type checking for list types\n\n* Fixed PyO3 type mismatch by cloning Array/ChunkedArray types before unbinding and binding fresh copies when checking array-likeness, eliminating the Bound reference error\n\n* Add timezone information to datetime objects in test_udaf_list_timestamp_return\n\n* clippy fix\n\n* Refactor RustAccumulator and utility functions for improved type handling and conversion from Python objects to Arrow types\n\n* Enhance PyArrow integration by refining type handling and conversion in RustAccumulator and utility functions\n\n* Fix array data binding in py_obj_to_scalar_value function\n\n* Implement single point for scalar conversion from python objects\n\n* Add unit tests and simplify python wrapper for literal\n\n* Add nanoarrow and arro3-core to dev dependencies. Sort the dependencies alphabetically.\n\n* Refactor common code into helper function so we do not duplicate it.\n\n* Update import path to access Scalar type\n\n* Add test for generic python objects that support the C interface\n\n* Update unit test to pass back either pyarrow array or array wrapped as scalar\n\n* Update tests to pass back raw python values or pyarrow scalar\n\n* Expand on user documentation for how to return list arrays\n\n* More user documentation\n\n---------\n\nCo-authored-by: Tim Saucer \u003ctimsaucer@gmail.com\u003e"
    },
    {
      "commit": "4cd56743e50d6b33c3d21152ec4ae87d5fe3faf4",
      "tree": "b20eef999ef7e88e8a141f2343741a766beb8992",
      "parents": [
        "024c86b16f4da4f2ec957b00f7ea37d00bdc759a"
      ],
      "author": {
        "name": "Dhanashri Prathamesh Iranna",
        "email": "110318083+Prathamesh9284@users.noreply.github.com",
        "time": "Sat Feb 14 05:19:28 2026 +0530"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Feb 13 18:49:28 2026 -0500"
      },
      "message": "feat: add support for generating JSON formatted substrait plan (#1376)\n\n* chore: add new dependencies to pyproject.toml\n\n* chore: rename from_json to parse_json\n\n* fix: missin import\n\n* Fix call to internal function. Drive by update to dquality on logical plan. Switch unit test to focus on json parsing and not byte serialization.\n\n* fix: resolve clippy redundant closure lint in substrait.rs\n\n---------\n\nCo-authored-by: Tim Saucer \u003ctimsaucer@gmail.com\u003e"
    },
    {
      "commit": "024c86b16f4da4f2ec957b00f7ea37d00bdc759a",
      "tree": "bfb6c45fb8283f53554ac8b1bc2545161b166f0d",
      "parents": [
        "16f98ffaa1a82ef631405650ab89a1f0247aad7c"
      ],
      "author": {
        "name": "Adisa Mubarak (AdMub)",
        "email": "99817240+AdMub@users.noreply.github.com",
        "time": "Thu Feb 12 20:28:03 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Feb 12 14:28:03 2026 -0500"
      },
      "message": "docs: Clarify first_value usage in select vs aggregate (#1348)\n\n* docs: Add warning to first_value about usage in select vs aggregate\n\nClarifies that aggregate functions like first_value must be used within .aggregate() and not .select(). Closes #1300.\n\n* chore: remove temporary reproduction script\n\n* Update all aggregate functions to have an example usage that is correct\n\n---------\n\nCo-authored-by: Tim Saucer \u003ctimsaucer@gmail.com\u003e"
    },
    {
      "commit": "16f98ffaa1a82ef631405650ab89a1f0247aad7c",
      "tree": "856cfe9d4a9d707b758688d02fe7221abd3a6b2a",
      "parents": [
        "08a8dc04aabc46b10b6eaa72038b788f0863d2de"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Wed Feb 11 15:33:46 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 11 15:33:46 2026 -0500"
      },
      "message": "chore: update rust 2024 edition (#1371)\n\n* Cargo fmt on rust 2024\n\n* Update example to rust2024"
    },
    {
      "commit": "08a8dc04aabc46b10b6eaa72038b788f0863d2de",
      "tree": "b87c149f0c5ea9547679c51b0e8c84ae36d5fb5a",
      "parents": [
        "95b4a00f7a315aa4d463566ad9096242da086211"
      ],
      "author": {
        "name": "Daniel Mesejo",
        "email": "mesejoleon@gmail.com",
        "time": "Wed Feb 11 19:30:45 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 11 13:30:45 2026 -0500"
      },
      "message": "fix: mangled errors (#1377)\n\ncloses #1226"
    },
    {
      "commit": "95b4a00f7a315aa4d463566ad9096242da086211",
      "tree": "4b19abf56104ae96d1ea64379bdbdc38a8c40655",
      "parents": [
        "3f89704f9d8885521a2d9f2f4c8e78d4a67e9b2a"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Wed Feb 11 11:36:56 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 11 11:36:56 2026 -0500"
      },
      "message": "Remove the FFI test wheel from the distribution artifact (#1378)\n\n"
    },
    {
      "commit": "3f89704f9d8885521a2d9f2f4c8e78d4a67e9b2a",
      "tree": "167fa3d581033f0a5cc31101e4819d4375d2168d",
      "parents": [
        "8fc943629b93b342ef67bb8aea0aa581615a374d"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Wed Feb 11 07:34:53 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 11 07:34:53 2026 -0500"
      },
      "message": "Build in debug mode for PRs (#1375)\n\n* First draft of running debug mode for PRs and release mode for main \u0026 releases\n\n* Update paths\n\n* Change install command for taplo\n\n* install protoc\n\n* taplo fmt\n\n* Working through CI build issues\n\n* More CI issues\n\n* do not build taplo, just download it\n\n* Try only running clippy when we can reuse build artifacts\n\n* Try removing unnecessary installs during build\n\n* Don\u0027t build cargo-license\n\n* Add back in uv sync so we can run maturin\n\n* minor: name casing\n\n* Fix path for wheels\n\n* More CI updates, but expect pytest to fail until we switch to downloading the wheel artifacts from build stage\n\n* Correct error in yml file. Rename to match other file extensions\n\n* Download wheel from build stage for testing\n\n* For CI tests move into test directory to avoid picking up pyproject.toml file\n\n* Do not upload artifacts not used in testing during debug builds\n\n* Do not attempt to use local python path for tests\n\n* Bump manylinux version\n\n* Do not run release flow for branches named branch-*. Only run it for pushes to main and release or candidate tags.\n\n* Build FFI test code in build stage so we only build it once\n\n* We need both wheels to be in the dist folder and the maturin action is erasing the other wheel\n\n* We now have two wheels that need to be installed instead of just one\n\n* Make a minor change to restart CI\n\n* tests will need both wheels also\n\n* Update .github/workflows/build.yml\n\nCo-authored-by: Martin Grigorov \u003cmartin-g@users.noreply.github.com\u003e\n\n* mac has protoc system installed\n\n* Update .github/workflows/test.yml\n\nCo-authored-by: Copilot \u003c175728472+Copilot@users.noreply.github.com\u003e\n\n---------\n\nCo-authored-by: Martin Grigorov \u003cmartin-g@users.noreply.github.com\u003e\nCo-authored-by: Copilot \u003c175728472+Copilot@users.noreply.github.com\u003e"
    },
    {
      "commit": "8fc943629b93b342ef67bb8aea0aa581615a374d",
      "tree": "d36455790f8cd62834944fe38190e3811f778613",
      "parents": [
        "b555df524acdce90121b6b7de9dd5f78189bd04e"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Mon Feb 09 09:17:26 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Feb 09 09:17:26 2026 -0500"
      },
      "message": "feat: add CatalogProviderList support (#1363)\n\n* Implement catalog provider list\n\n* Flush out python side and add unit test\n\n* Add FFI test for catalog provider list\n\n* Update type hints\n\n* Update unit test to add a different type of catalog to the catalog list"
    },
    {
      "commit": "b555df524acdce90121b6b7de9dd5f78189bd04e",
      "tree": "a4fd468501d8feaca5d6b5639d16d288b177623a",
      "parents": [
        "1da08a259d6844c8dff8c4c8c9ba1874c86f8894"
      ],
      "author": {
        "name": "Marko Milenković",
        "email": "milenkovicm@users.noreply.github.com",
        "time": "Sun Feb 08 12:21:48 2026 +0000"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Sun Feb 08 07:21:48 2026 -0500"
      },
      "message": "chore: add confirmation before tarball is released (#1372)\n\n"
    },
    {
      "commit": "1da08a259d6844c8dff8c4c8c9ba1874c86f8894",
      "tree": "c9a0f58ada4a3422e50080d1ebe3ec1f57b1b944",
      "parents": [
        "ee62f7a26d178995a497c081625b8f5ccc5a82fd"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Thu Feb 05 16:48:04 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Feb 05 16:48:04 2026 -0500"
      },
      "message": "Implement all CSV reader options (#1361)\n\n* Implement all CSV options with a builder pattern\n\n* Remove unused clippy warning\n\n* Add additional tests for csv read options"
    },
    {
      "commit": "ee62f7a26d178995a497c081625b8f5ccc5a82fd",
      "tree": "1308ed7703debf715d45f9d51e11cf339cc39239",
      "parents": [
        "adec2dcd785daa16a02f8029c1067c1d7594d67c"
      ],
      "author": {
        "name": "kosiew",
        "email": "kosiew@gmail.com",
        "time": "Thu Feb 05 20:09:25 2026 +0800"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Feb 05 07:09:25 2026 -0500"
      },
      "message": "Enforce DataFrame display memory limits with `max_rows` + `min_rows` constraint (deprecate `repr_rows`) (#1367)\n\n* Update DataFrameHtmlFormatter to enforce min_rows_display constraint and adjust default values\n\n* Refactor DataFrame formatter to replace repr_rows with max_rows and update related validations\n\n* Add validation for formatter parameters and deprecate repr_rows alias\n\n* Add boundary condition tests for HTML formatter memory limits and resolve max_rows logic\n\n* Remove repr_rows handling in max_rows resolution in Rust\n\n* Refactor whitespace in parameter validation and update test for HTML formatter memory limits\n\n* ruff fix\n\n* Rename min_rows_display to min_rows in formatter configuration and update related tests\n\n* Refactor function parameter handling and documentation\n\nRemoved type annotations and redundant default values\nfrom parameter names. Enhanced descriptions for clarity\nand added context for usage. Fixed formatting for the\ndocumentation sections to improve readability.\n\n* Update HTML formatter memory boundary tests for large datasets\n\n* Enhance memory boundary tests in HTML formatter for large datasets\n\n* Add fixture for multi-batch DataFrame and test early stream termination with memory limits\n\n* Add backward compatibility tests for deprecated formatter attributes\n\n* ruff fix\n\n* Remove deprecation timeline comments from HTML formatter backward compatibility test"
    },
    {
      "commit": "adec2dcd785daa16a02f8029c1067c1d7594d67c",
      "tree": "468533b5940856d8230e4d7ef7ffa87101370ec3",
      "parents": [
        "f72e549694e3287b459495cfd7affb3b132589b3"
      ],
      "author": {
        "name": "Antoine Beyeler",
        "email": "49431240+abey79@users.noreply.github.com",
        "time": "Wed Feb 04 20:37:29 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 04 14:37:29 2026 -0500"
      },
      "message": "Improve displayed error by using `DataFusionError`\u0027s `Display` trait (#1370)\n\n* Use  instead of  for\n\n* update test"
    },
    {
      "commit": "f72e549694e3287b459495cfd7affb3b132589b3",
      "tree": "3bdfd2530421e3392a5b1aec53b004657a42de5d",
      "parents": [
        "015dd76f9fdc8fe74cce1c87fa348b20698903fd"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Wed Feb 04 08:54:57 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 04 08:54:57 2026 -0500"
      },
      "message": "Cargo update to prepare for DF52 release (#1368)\n\n"
    },
    {
      "commit": "015dd76f9fdc8fe74cce1c87fa348b20698903fd",
      "tree": "38cf4257a4d1241ff3c2d2bdd7b0ad176e78acba",
      "parents": [
        "32272765a4a6215befd75c6cd1af22d09e170808"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Wed Feb 04 08:54:45 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 04 08:54:45 2026 -0500"
      },
      "message": "Pass Field information back and forth when using scalar UDFs (#1299)\n\n* Pass Field information back and forth when using scalar UDFs\n\n* Add ArrowArrayExportable class and use it to create pyarrow arrays for python UDFs\n\n* Minor user documentation update\n\n* Update naming from type to field where appropriate\n\n* Add unit test to check field inputs\n\n* Update docstring\n\n* Add text to user documentation on passing field information for scalar UDFs\n\n* Minor change requested in code review\n\n* Make type hints match outer"
    },
    {
      "commit": "32272765a4a6215befd75c6cd1af22d09e170808",
      "tree": "bb6a988183b0badc233efb3099875c3187d08855",
      "parents": [
        "2465e19c390861163f024164df0c6987ccb0fec5"
      ],
      "author": {
        "name": "dependabot[bot]",
        "email": "49699333+dependabot[bot]@users.noreply.github.com",
        "time": "Wed Feb 04 07:31:49 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 04 07:31:49 2026 -0500"
      },
      "message": "build(deps): bump actions/cache from 4 to 5 (#1323)\n\nBumps [actions/cache](https://github.com/actions/cache) from 4 to 5.\n- [Release notes](https://github.com/actions/cache/releases)\n- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)\n- [Commits](https://github.com/actions/cache/compare/v4...v5)\n\n---\nupdated-dependencies:\n- dependency-name: actions/cache\n  dependency-version: \u00275\u0027\n  dependency-type: direct:production\n  update-type: version-update:semver-major\n...\n\nSigned-off-by: dependabot[bot] \u003csupport@github.com\u003e\nCo-authored-by: dependabot[bot] \u003c49699333+dependabot[bot]@users.noreply.github.com\u003e"
    },
    {
      "commit": "2465e19c390861163f024164df0c6987ccb0fec5",
      "tree": "d670ba35d95b01b47c53bfd97a29c9553aac39b5",
      "parents": [
        "f6cc7a2a9f7e08aafd1e6c700f130b4cff733533"
      ],
      "author": {
        "name": "dependabot[bot]",
        "email": "49699333+dependabot[bot]@users.noreply.github.com",
        "time": "Wed Feb 04 07:31:25 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 04 07:31:25 2026 -0500"
      },
      "message": "build(deps): bump actions/upload-artifact from 4 to 6 (#1322)\n\nBumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 6.\n- [Release notes](https://github.com/actions/upload-artifact/releases)\n- [Commits](https://github.com/actions/upload-artifact/compare/v4...v6)\n\n---\nupdated-dependencies:\n- dependency-name: actions/upload-artifact\n  dependency-version: \u00276\u0027\n  dependency-type: direct:production\n  update-type: version-update:semver-major\n...\n\nSigned-off-by: dependabot[bot] \u003csupport@github.com\u003e\nCo-authored-by: dependabot[bot] \u003c49699333+dependabot[bot]@users.noreply.github.com\u003e"
    },
    {
      "commit": "f6cc7a2a9f7e08aafd1e6c700f130b4cff733533",
      "tree": "8846a67bea8b984ac83bc8f9d9bbb48d3fe6bfe0",
      "parents": [
        "ada3dcdb8da0fe14cf4e61fe874485e937468f29"
      ],
      "author": {
        "name": "dependabot[bot]",
        "email": "49699333+dependabot[bot]@users.noreply.github.com",
        "time": "Wed Feb 04 07:31:00 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 04 07:31:00 2026 -0500"
      },
      "message": "build(deps): bump actions/download-artifact from 5 to 7 (#1321)\n\nBumps [actions/download-artifact](https://github.com/actions/download-artifact) from 5 to 7.\n- [Release notes](https://github.com/actions/download-artifact/releases)\n- [Commits](https://github.com/actions/download-artifact/compare/v5...v7)\n\n---\nupdated-dependencies:\n- dependency-name: actions/download-artifact\n  dependency-version: \u00277\u0027\n  dependency-type: direct:production\n  update-type: version-update:semver-major\n...\n\nSigned-off-by: dependabot[bot] \u003csupport@github.com\u003e\nCo-authored-by: dependabot[bot] \u003c49699333+dependabot[bot]@users.noreply.github.com\u003e"
    },
    {
      "commit": "ada3dcdb8da0fe14cf4e61fe874485e937468f29",
      "tree": "8dde9b70bb0988ce8e9b643fa230bcd0ef3e5ec1",
      "parents": [
        "e7f5867f506e4d14b4993606782580d19af2a347"
      ],
      "author": {
        "name": "dependabot[bot]",
        "email": "49699333+dependabot[bot]@users.noreply.github.com",
        "time": "Wed Feb 04 07:30:29 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Wed Feb 04 07:30:29 2026 -0500"
      },
      "message": "build(deps): bump actions/checkout from 5 to 6 (#1310)\n\nBumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.\n- [Release notes](https://github.com/actions/checkout/releases)\n- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)\n- [Commits](https://github.com/actions/checkout/compare/v5...v6)\n\n---\nupdated-dependencies:\n- dependency-name: actions/checkout\n  dependency-version: \u00276\u0027\n  dependency-type: direct:production\n  update-type: version-update:semver-major\n...\n\nSigned-off-by: dependabot[bot] \u003csupport@github.com\u003e\nCo-authored-by: dependabot[bot] \u003c49699333+dependabot[bot]@users.noreply.github.com\u003e"
    },
    {
      "commit": "e7f5867f506e4d14b4993606782580d19af2a347",
      "tree": "834f5250993ddcdff30b2b26fd21f992b53ad82c",
      "parents": [
        "eaa3f79b998fc433e930fc74ba648372d59e6ace"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Mon Feb 02 09:27:54 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Feb 02 09:27:54 2026 -0500"
      },
      "message": "Prepare for DF52 release (#1337)\n\n* Prepare for DF52 release\n\n* Update datafusion to point to release version\n\n* round now respects decimal\n\n* Update pathlib\n\n* Update documentation\n\n* Commit copilot suggestions\n\n* Pass codec capsule to table providers\n\n* Simplify codec passing around\n\n* Update signatures for FFI passing of logical codec around\n\n* Update upgrade guide for new method signature\n\n* Add more unit tests to cover FFI providers\n\n* Missing run command"
    },
    {
      "commit": "eaa3f79b998fc433e930fc74ba648372d59e6ace",
      "tree": "226c3d505b65bff67dae6bb30622266351183916",
      "parents": [
        "7aff3635c93d5897d470642928c39c86e7851931"
      ],
      "author": {
        "name": "Mimoune",
        "email": "djouallah@users.noreply.github.com",
        "time": "Sat Jan 31 01:32:45 2026 +1000"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Jan 30 10:32:45 2026 -0500"
      },
      "message": "Add use_fabric_endpoint parameter to MicrosoftAzure class (#1357)\n\nThis change adds support for the use_fabric_endpoint parameter to the\nMicrosoftAzure object store class, enabling connections to Microsoft\nFabric OneLake storage.\n\nThe parameter allows users to specify that they want to use Data Lake\nStorage Gen2 endpoints (dfs.fabric.microsoft.com) instead of the default\nAzure Blob Storage endpoints (blob.core.windows.net), which is required\nfor OneLake/Fabric storage access.\n\nImplementation follows the same pattern as existing boolean parameters\n(use_emulator, allow_http) by:\n- Adding the parameter to the PyO3 signature macro\n- Adding it as Option\u003cbool\u003e to the function parameters\n- Conditionally calling with_use_fabric_endpoint() on the builder\n\nFixes #1356\n\nCo-authored-by: Claude Sonnet 4.5 \u003cnoreply@anthropic.com\u003e"
    },
    {
      "commit": "7aff3635c93d5897d470642928c39c86e7851931",
      "tree": "6edbcdb8415e5c893274706af419bb806b09075f",
      "parents": [
        "a9229673bbb4c5706386ee83179d4741777c577d"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Mon Jan 12 07:22:52 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Jan 12 07:22:52 2026 -0500"
      },
      "message": "Use explicit timer in unit test (#1338)\n\n* Use an explicit wait in a dataframe query during testing to check for keyboard interrupts\n\n* Add interrupt check when spawning futures\n\n* Update unit test to do four variantions of fast/slow queries and interrupt either collect or stream"
    },
    {
      "commit": "a9229673bbb4c5706386ee83179d4741777c577d",
      "tree": "4ca97b2c5624cf9a06a1c9a5c34f9363bd0cc343",
      "parents": [
        "1df6db27d95d99ddb51136a60abd05a33ce375ad"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Fri Jan 09 08:24:42 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Fri Jan 09 08:24:42 2026 -0500"
      },
      "message": "Release 51.0.0 (#1333)\n\n* Update changelog and version number\n\n* Update cargo lock\n\n* Update cargo lock file and corresponding change to unit test due to arrow update"
    },
    {
      "commit": "1df6db27d95d99ddb51136a60abd05a33ce375ad",
      "tree": "678207b8aa188ed1f930ef0e7ed19188a2129cca",
      "parents": [
        "3a4ae6d8ed43fa5a82725a37961b626ff884fd96"
      ],
      "author": {
        "name": "Nuno Faria",
        "email": "nunofpfaria@gmail.com",
        "time": "Mon Jan 05 15:02:38 2026 +0000"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Jan 05 10:02:38 2026 -0500"
      },
      "message": "fix: Inconsistent schemas when converting to pyarrow (#1315)\n\n* Fix inconsistent schemas when converting to pyarrow\n\n* Add extra tests\n\n* Change deprecated type\n\n---------\n\nCo-authored-by: Tim Saucer \u003ctimsaucer@gmail.com\u003e"
    },
    {
      "commit": "3a4ae6d8ed43fa5a82725a37961b626ff884fd96",
      "tree": "00b9fde9edc1d94f5ca25f461bf865ce7ed60683",
      "parents": [
        "474e9e67bfda3074dc435d3d653de538b0cafa7b"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Mon Jan 05 08:25:45 2026 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Jan 05 08:25:45 2026 -0500"
      },
      "message": "Do not convert pyarrow scalar values to plain python types when passing as `lit` (#1319)\n\n* Add unit tests for pyarrow scalar round trips\n\n* Do not convert to bare python object for lit conversion from pyarrow scalar"
    },
    {
      "commit": "474e9e67bfda3074dc435d3d653de538b0cafa7b",
      "tree": "13194644a6568aa89abba7f8a9282658be88b47e",
      "parents": [
        "fcd70567dedc580416c2931cc7f25e3960704ace"
      ],
      "author": {
        "name": "Daniel Mesejo",
        "email": "mesejoleon@gmail.com",
        "time": "Mon Jan 05 14:24:09 2026 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Mon Jan 05 08:24:09 2026 -0500"
      },
      "message": "fix: use coalesce instead of drop_duplicate_keys for join (#1318)\n\n* fix: use coalesce instead of drop_duplicate_keys for join\n\ncloses #1305\n\n* fix(docs): join usage"
    },
    {
      "commit": "fcd70567dedc580416c2931cc7f25e3960704ace",
      "tree": "2275b9c1ff139f21af0634cec8c9b3f09e04126d",
      "parents": [
        "6864d8013faa5f222063df61e5b6ff5368c4d252"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Dec 30 08:47:07 2025 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Dec 30 08:47:07 2025 -0500"
      },
      "message": "Update build workflow link (#1330)\n\n* Update link\n\n* fix extension"
    },
    {
      "commit": "6864d8013faa5f222063df61e5b6ff5368c4d252",
      "tree": "535125e393b5e2c4c57cabed468a36dab4efc7b8",
      "parents": [
        "db3c6a0bc0abe28fd6b61daf65c94f192b4f3611"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Dec 23 13:28:56 2025 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Dec 23 13:28:56 2025 -0500"
      },
      "message": "Minor build errors (#1325)\n\n"
    },
    {
      "commit": "db3c6a0bc0abe28fd6b61daf65c94f192b4f3611",
      "tree": "69975bba8a7b05aacb3a3c9c4f93a29204e94cd2",
      "parents": [
        "c141dd3aa8eaf499f6cf3b81dfc278b4f3a5981a"
      ],
      "author": {
        "name": "Nuno Faria",
        "email": "nunofpfaria@gmail.com",
        "time": "Tue Dec 23 12:15:48 2025 +0000"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Dec 23 07:15:48 2025 -0500"
      },
      "message": "Upgrade to Datafusion 51 (#1311)\n\n* Upgrade to Datafusion 51\n\n* Fix clippy\n\n* Upgrade to Datafusion 51\n\n* Fix clippy\n\n* Refactor test_arrow_c_stream_interrupted to handle exceptions in a separate thread\n\n* Improve exception handling in wait_for_future\n\nUpdated wait_for_future to surface pending Python exceptions by\nexecuting bytecode during signal checks, ensuring that asynchronous\ninterrupts are processed promptly. Enhanced PartitionedDataFrameStreamReader\nto cancel remaining partition streams on projection errors or Python\ninterrupts, allowing for clean iteration stops. Added regression tests\nto validate interrupted Arrow C stream reads and improve timing\nfor RecordBatchReader.read_all cancellations.\n\n* Refactor signal checking in future collection to simplify error handling\n\n* Simplify KeyboardInterrupt check in test_arrow_c_stream_interrupted\n\n* rm test_record_batch_reader_interrupt_exits_quickly\n\n* Refactor test_arrow_c_stream_interrupted to improve exception handling and readability\n\n* Improve exception handling in test_arrow_c_stream_interrupted to check for KeyboardInterrupt\n\n* Add comment - handle KeyboardInterrupt more effectively\n\n* Remove unnecessary stream cancellation on error in PartitionedDataFrameStreamReader\n\n* Add jupyter notebook for test\n\n* Revert \"Add jupyter notebook for test\"\n\nThis reverts commit 784929d51e38124cd97a5d054fd6265fbd543343.\n\n* Simplify error handling in PartitionedDataFrameStreamReader\n\n* Fix ruff warnings\n\n* Fix ruff warnings\n\n* Update src/utils.rs\n\nCo-authored-by: Tim Saucer \u003ctimsaucer@gmail.com\u003e\n\n* Fix clippy\n\n---------\n\nCo-authored-by: Siew Kam Onn \u003ckosiew@gmail.com\u003e\nCo-authored-by: Tim Saucer \u003ctimsaucer@gmail.com\u003e"
    },
    {
      "commit": "c141dd3aa8eaf499f6cf3b81dfc278b4f3a5981a",
      "tree": "680fc55279fe24f8f8aed0d3cf0feac15e8c80e3",
      "parents": [
        "276dc6a5b0b6a7ce946f3e61c8dabe59e9e0a2ec"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Sun Dec 07 17:25:12 2025 +0100"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Sun Dec 07 17:25:12 2025 +0100"
      },
      "message": "Feat/parameterized sql queries (#964)\n\n* Intermediate work on parameterizing queries\n\n* Reworking to do token parsing of sql query instead of string manipulation\n\n* Switching to explicit param_values or named parameters that will perform string replacement via parsed tokens\n\n* Add additional unit tests for parameterized queries\n\n* merge conflict\n\n* license text\n\n* Add documentation\n\n* cargo clippy and fmt\n\n* Need at least pyarrow 16 now\n\n* add type hints\n\n* Minor docstring update"
    },
    {
      "commit": "276dc6a5b0b6a7ce946f3e61c8dabe59e9e0a2ec",
      "tree": "331e971c372f11fe965ecf4585ef9fc747bf8e76",
      "parents": [
        "f1b3029db6443c20e6f4c5d2ed0cc7b4217ce256"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Tue Nov 18 15:09:12 2025 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Tue Nov 18 15:09:12 2025 -0500"
      },
      "message": "chore: apply cargo fmt with import organization (#1303)\n\n* Apply nightly format to organize imports consistently\n\n* Set matrix for cargo fmt\n\n* Revert \"Set matrix for cargo fmt\"\n\nThis reverts commit 85119050e3da16d72cc5a7e4ebaa2e4a494dbcc4.\n\n* Instead of creating a large matrix just add one workflow for nightly fmt\n\n* Intentionally cause cargo fmt to fail in nightly\n\n* Apply nightly fmt"
    },
    {
      "commit": "f1b3029db6443c20e6f4c5d2ed0cc7b4217ce256",
      "tree": "5ada95e04cdd2da8361a3747abb1ca5f08e4063c",
      "parents": [
        "89d8930625c7a903fb0e5f1d4f5310341e0b4fea"
      ],
      "author": {
        "name": "Tim Saucer",
        "email": "timsaucer@gmail.com",
        "time": "Thu Nov 13 11:33:30 2025 -0500"
      },
      "committer": {
        "name": "GitHub",
        "email": "noreply@github.com",
        "time": "Thu Nov 13 11:33:30 2025 -0500"
      },
      "message": "Add function collect_column to dataframe (#1302)\n\n"
    }
  ],
  "next": "89d8930625c7a903fb0e5f1d4f5310341e0b4fea"
}
