apache / spark / HEAD
4418d6e
[SPARK-53562][PYTHON] Limit Arrow batch sizes in `applyInArrow` and `applyInPandas`
by Ruifeng Zheng
· 2 hours ago
master
b005f56
[SPARK-53689][BUILD][FOLLOW-UP] Check if RELEASE_VERSION is already set properly
by Hyukjin Kwon
· 5 hours ago
3f2c623
[SPARK-53689][BUILD][FOLLOW-UP] Respect RELEASE_VERSION environment variable if already defined
by Hyukjin Kwon
· 7 hours ago
b1bbf02
[SPARK-53688][PYTHON][INFRA][TESTS] Increase the timeout and skip UDF type tests in Python-only on MacOS
by Ruifeng Zheng
· 9 hours ago
de7ba3b
[SPARK-53631][CORE] Optimize memory and perf on SHS bootstrap
by Cheng Pan
· 9 hours ago
06f7ad2
[SPARK-53682][INFRA] Refresh `spark-rm` Docker Image with `jammy-20250819`
by Dongjoon Hyun
· 10 hours ago
b4592c4
[SPARK-53678][SQL] Fix NPE when subclass of ColumnVector is created with null DataType
by manuzhang
· 10 hours ago
8ac0382
[SPARK-53689][BUILD] Respect RELEASE_VERSION environment variable if already defined
by Hyukjin Kwon
· 10 hours ago
7dfd9e2
[SPARK-53681][BUILD][TESTS] Upgrade `snowflake-jdbc` to 3.26.1
by Dongjoon Hyun
· 12 hours ago
661d611
[SPARK-53645][PS] Implement `skipna` parameter for ps.DataFrame `any()`
by Peter Nguyen
· 13 hours ago
fa9e787
[SPARK-53425][PYTHON][TESTS] Add more table argument tests for Arrow Python UDTFs
by Allison Wang
· 13 hours ago
2a9999f
[SPARK-53671][PYTHON] Exclude 0-args from `@udf` eval type inference
by Ruifeng Zheng
· 13 hours ago
e95f12b
[SPARK-53633][SQL] Reuse InputStream in vectorized Parquet reader
by Cheng Pan
· 15 hours ago
1841dd2
[SPARK-53516][CORE][TESTS][FOLLOWUP] Fix compilation errors in `SparkPipelinesSuite`
by yangjie01
· 19 hours ago
c6cea73
[SPARK-53591][SDP] Simplify Pipeline Spec Pattern Glob Matching
by Jacky Wang
· 21 hours ago
a13187c
[SPARK-53676][PYTHON][TESTS] Skip UDF type check with numpy 1.x
by Ruifeng Zheng
· 22 hours ago
b6993cb
[SPARK-53516][SDP] Fix `spark.api.mode` arg process in SparkPipelines
by Cheng Pan
· 22 hours ago
0e42b95
[SPARK-53673][CONNECT][TESTS] Fix a flaky test failure in `SparkSessionE2ESuite - interrupt tag` caused by the usage of `ForkJoinPool`
by Kousuke Saruta
· 24 hours ago
fdcd140
[SPARK-53629][SQL] Implement type widening for MERGE INTO WITH SCHEMA EVOLUTION
by Szehon Ho
· 30 hours ago
cf30da2
[SPARK-47110][INFRA] Reenble AmmoniteTest tests in Maven builds
by Kousuke Saruta
· 32 hours ago
dde895c
[SPARK-53651][SDP] Add support for persistent views in pipelines
by Sandy Ryza
· 35 hours ago
33196fe
[SPARK-53643][DOCS] Add Arrow UDF to debugging and user guide
by Xinrong Meng
· 35 hours ago
1e7169e
[SPARK-53660][SQL][TESTS] Add unit test for Metadata equality check
by Yicong-Huang
· 2 days ago
a2adc43
[SPARK-53668][BUILD] Add `--enable-native-access=ALL-UNNAMED` to `build/sbt`
by Dongjoon Hyun
· 2 days ago
69031c9
[SPARK-53661][BUILD][TESTS] Upgrade `bouncycastle` to 1.82
by Dongjoon Hyun
· 2 days ago
c37ab6e
[SPARK-53653][DOC] Update `rexml` gem version to 3.4.4
by Bjørn Jørgensen
· 2 days ago
f9aa4c9
[SPARK-53655][SQL][TESTS] Fix the intention of 'read parquet footers in parallel' test
by Kent Yao
· 2 days ago
ed2692f
[SPARK-53654][SQL][PYTHON] Support `seed` in function `uuid`
by Ruifeng Zheng
· 2 days ago
984e16b
[SPARK-53657][PYTHON][TESTS] Enable doctests for `GroupedData.agg`
by Ruifeng Zheng
· 2 days ago
f48de10
[SPARK-53592][PYTHON][TESTS][FOLLOW-UP] Remove unused config in the parity test
by Ruifeng Zheng
· 2 days ago
36ed5ee
[SPARK-53429][PYTHON] Support Direct Passthrough Partitioning in the PySpark Dataframe API
by Shujing Yang
· 2 days ago
71c67b0
[SPARK-53641][DOCS] Add PARTITION BY support in Arrow Python UDTF docs
by Allison Wang
· 2 days ago
686d844
[SPARK-53592][PYTHON] Make `@udf` support vectorized UDF
by Ruifeng Zheng
· 4 days ago
589141e
[SPARK-53233][SQL][FOLLOWUP] Add compatibility class/object for org.apache.spark.sql.execution.streaming
by Wenchen Fan
· 5 days ago
4f10262
[SPARK-53623][SQL] improve reading large table properties performance
by Yesheng Ma
· 5 days ago
db13a38
[SPARK-53578][CONNECT] Simplify data type handling in LiteralValueProtoConverter
by Yihong He
· 6 days ago
a8bb8b0
[SPARK-53625][SS] Propagate metadata columns through projections to address ApplyCharTypePadding incompatibility
by Livia Zhu
· 6 days ago
552effc
[SPARK-53637][BUILD] Demote bcprov-jdk18on to test scope
by Cheng Pan
· 6 days ago
fb46424
[SPARK-53626][DOCS] Add invalid mixed-type operations to ANSI migration guide
by Xinrong Meng
· 6 days ago
2639792
[SPARK-53523][SQL][FOLLOWUP] Udpate scaladocs and add tests in ProcedureSuite
by Cheng Pan
· 6 days ago
49a3c13
[SPARK-53632][PYTHON][DOCS][TESTS] Reenable doctest for `DataFrame.pandas_api`
by Ruifeng Zheng
· 6 days ago
4b93d4c
[SPARK-53598][SQL] Check the existence of numParts before reading large table property
by Cheng Pan
· 6 days ago
1795306
[SPARK-53323][PYTHON][CONNECT] Enable Spark Connect tests for df.asTable() in Arrow UDTF
by Shujing Yang
· 6 days ago
1ec647e
[SPARK-53630][PYTHON][DOCS][TESTS] Reenable doctest for `Dataframe.freqItems`
by Ruifeng Zheng
· 6 days ago
b87f6d1
[MINOR][TESTS] Restore classic-only python tests
by Ruifeng Zheng
· 6 days ago
9d23f2f
[SPARK-53355][PYTHON][SQL] fix numpy 1.x repr in type tests
by Ben Hurdelhey
· 6 days ago
551e7f2
[SPARK-53619][PYTHON][DOCS][TESTS] Enable doctests for toArrow/toPandas/mapInArrow/mapInPandas
by Ruifeng Zheng
· 6 days ago
3080e61
[SPARK-53387][PYTHON] Add support for Arrow UDTFs with PARTITION BY
by Allison Wang
· 7 days ago
010d36f
[SPARK-53507][CONNECT] Add breaking change info to errors
by imarkowitz
· 7 days ago
f490471
[SPARK-52659][SQL] Misleading modulo error message in ansi mode
by 공성재
· 7 days ago
3990b0f
[SPARK-53620][CORE] SparkSubmit should print stacktrace when exitFn is called
by Cheng Pan
· 7 days ago
87a71fa
[SPARK-53438][CONNECT][SQL] Use CatalystConverter in LiteralExpressionProtoConverter
by Yihong He
· 7 days ago
79a9283
[SPARK-53372][SDP] SDP End to End Testing Suite
by Jacky Wang
· 7 days ago
e08c15b
[SPARK-53606][DOCS] Fix MapInPandas/MapInArrow examples with barrier
by Ruifeng Zheng
· 7 days ago
f57a473
[SPARK-53546][TESTS][FOLLOW-UP] Fix nested array schema evolution and style for InMemoryBaseTable
by Szehon Ho
· 7 days ago
33e40b7
[SPARK-52991][SQL][FOLLOW-UP] Revise `MergeIntoTable` to use `lazy val` and add a new test
by Szehon Ho
· 7 days ago
067f295
[SPARK-53604][INFRA] Temporarily increase PySpark job execution time to 150 minutes
by Ruifeng Zheng
· 7 days ago
aaf9308
[MINOR][DOCS] Update the `See Also` section for time functions
by Ruifeng Zheng
· 7 days ago
5d1ad61
[SPARK-53603][BUILD] Upgrade Checkstyle to 11.0.1
by Dongjoon Hyun
· 7 days ago
3d87de3
[SPARK-53594][PYTHON] Make arrow UDF respect user-specified eval type
by Ruifeng Zheng
· 7 days ago
53a330b
[SPARK-53602][PYTHON] Profile dump improvement and profiler doc fix
by Xinrong Meng
· 8 days ago
78aba00
[SPARK-53581][CORE] Fix potential thread-safety issue for mapTaskIds.add()
by Yi Wu
· 8 days ago
4aa934e
[SPARK-53599][BUILD] Upgrade `Netty` to 4.1.127.Final
by yangjie01
· 8 days ago
54e69ba
[SPARK-53601][INFRA] Use Java 25 instead of 25-ea
by Dongjoon Hyun
· 8 days ago
7781ec2
[SPARK-53600][SQL] Revise `SessionHolder` last access time log message
by Dongjoon Hyun
· 8 days ago
56fa1e9
[MINOR][PYTHON][DOCS] Correct the examples of `toPandas` and `toArrow`
by Ruifeng Zheng
· 8 days ago
86d310b
[SPARK-53559][SQL][CATALYST] Fix HLL sketch updates to use raw collation key bytes
by Chris Boumalhab
· 8 days ago
f03c644
Fix: SparkML-connect can't load SparkML (legacy mode) saved model
by Weichen Xu
· 8 days ago
dbbb54d
[SPARK-53582][SQL] Extend `isExtractable` so it can be applied on `UnresolvedExtractValue`
by Mihailo Timotic
· 8 days ago
24139be
[MINOR][PYTHON][DOCS] Update the doctests to check the default column names
by Ruifeng Zheng
· 8 days ago
fa2f7d5
[SPARK-53584][PYTHON] Improve process_column_param validation and column parameter docstring
by Xinrong Meng
· 8 days ago
e3dce42
[SPARK-53590][BUILD] Add `huaweicloud-provided` profile
by Dongjoon Hyun
· 9 days ago
6c9e750
[SPARK-53586][K8S][BUILD] Upgrade `kubernetes-client` to 7.4.0
by Dongjoon Hyun
· 9 days ago
10b27f3
[SPARK-53577][DOCS] Fix Scaladoc source links for java sources
by Kent Yao
· 9 days ago
8c422f9
[SPARK-53113][SQL] Support the time type by try_make_timestamp()
by Uros Bojanic
· 9 days ago
2844705
[SPARK-53127][SQL] Enable LIMIT ALL to override recursion row limit
by pavle-martinovic_data
· 9 days ago
10c634f
[SPARK-53572][SQL] Avoid throwing from ExtractValue.isExtractable
by Vladimir Golubev
· 9 days ago
1a86e97
[SPARK-53574] Fix AnalysisContext being wiped during nested plan resolution
by Andy
· 9 days ago
40993e6
[SPARK-53361][SS][1/2] Optimizing JVM–Python Communication in TWS by Grouping Multiple Keys into One Arrow Batch
by zeruibao
· 9 days ago
fbdad29
[SPARK-53546][SQL][TESTS] Fix InMemoryDataSource to return default value or null for new fields
by Szehon Ho
· 9 days ago
ce94cb3
[SPARK-53552][SQL] Optimize substr SQL function
by WanKun
· 9 days ago
9bd844b
[SPARK-53558][SQL] Show fully qualified table name including the catalog name in the exception message when the table is not found
by Ganesha S
· 9 days ago
19ca63f
[SPARK-53553][CONNECT] Fix handling of null values in LiteralValueProtoConverter
by Yihong He
· 9 days ago
bacd343
[SPARK-53182][PYTHON][DOCS] Fix broken and missing links in PySpark DataFrames user guide
by Jonny Comes
· 10 days ago
dbd765d
[SPARK-53544][PYTHON] Support complex types on observations
by Takuya Ueshin
· 10 days ago
8c49165
[SPARK-43579][PYTHON] optim: Cache the converter between Arrow and pandas for reuse
by Peter Nguyen
· 10 days ago
e4d60e9
[SPARK-53563][PS] Optimize: sql_processor by avoiding inefficient string concatenation
by Peter Nguyen
· 10 days ago
bc93c25
[SPARK-53568][CONNECT][PYTHON] Fix several small bugs in Spark Connect Python client error handling logic
by Alex Khakhlyuk
· 10 days ago
54dee4a
[SPARK-53537][CORE] Adding Support for Parsing CONTINUE HANDLER
by Teodor Djelic
· 11 days ago
f92601c
[SPARK-53444][SQL] Rework execute immediate
by Serge Rielau
· 11 days ago
04a40ab
[SPARK-53157][CORE] Decouple driver and executor polling intervals
by ForVic
· 12 days ago
4441fa1
[SPARK-53529][PYTHON][CONNECT] Fix `pyspark` connect client to support IPv6
by wangguangxin.cn
· 12 days ago
b8657a9
[SPARK-53523][SQL] Named parameters respect `spark.sql.caseSensitive`
by Cheng Pan
· 12 days ago
c0acf45
[SPARK-53550][SQL][FOLLOWUP] Union output partitioning should compare canonicalized attributes
by Liang-Chi Hsieh
· 12 days ago
8178cb0
[SPARK-53491][SS] Fix exponential formatting of inputRowsPerSecond and processedRowsPerSecond in progress metrics JSON
by jayant.sharma
· 12 days ago
1dab449
[SPARK-53413][SQL] Shuffle cleanup for commands
by Karuppayya Rajendran
· 12 days ago
07d987a
[MINOR][CORE][TESTS] Fix typo in MasterDecommissionSuite class name
by Yuri Niitsuma
· 12 days ago
faa1aaa
[SPARK-53498][SDP] Correctly Reference `pyspark/pipelines/cli.py` from `spark-pipelines` Binary
by anishm-db
· 12 days ago
2962650
[SPARK-52426][SQL][FOLLOWUP] Fix spark-sql repl w/ RedirectConsolePlugin
by Kent Yao
· 12 days ago
f465eca
[SPARK-53557][INFRA] Reduce automated vote email deadline from 4 days to 73 hours
by Dongjoon Hyun
· 12 days ago