---
layout: post
title: Spark Release 3.0.3
categories: []
tags: []
status: publish
type: post
published: true
meta:
  _edit_last: '4'
  _wpas_done_all: '1'
---
Spark 3.0.3 is a maintenance release containing stability fixes. This release is based on the branch-3.0 maintenance branch of Spark. We strongly recommend that all 3.0 users upgrade to this stable release.
Notable changes
- [SPARK-34421]: Custom functions can't be used in temporary views with CTEs
- [SPARK-34545]: PySpark Python UDF return inconsistent results when applying 2 UDFs with different return type to 2 columns together
- [SPARK-34719]: Fail if the view query has duplicated column names
- [SPARK-35463]: Skip checking checksum on a system that doesn't have shasum
- [SPARK-32924]: Web UI sort on duration is wrong
- [SPARK-33482]: V2 Datasources that extend FileScan preclude exchange reuse
- [SPARK-33504]: The application log in the Spark history server contains sensitive attributes such as password that should be redacted instead of shown in plain text
- [SPARK-34424]: HiveOrcHadoopFsRelationSuite fails with seed 610710213676
- [SPARK-34556]: Checking duplicate static partition columns doesn't respect case sensitive conf
- [SPARK-34596]: NewInstance.doGenCode should not throw malformed class name error
- [SPARK-34763]: col(), $"" and df("name") should handle quoted column names properly
- [SPARK-34794]: Nested higher-order functions broken in DSL
- [SPARK-34798]: Fix incorrect join condition
- [SPARK-34876]: Non-nullable aggregates can return NULL in a correlated subquery
- [SPARK-34897]: Support reconcile schemas based on index after nested column pruning
- [SPARK-34909]: conv() does not convert negative inputs to unsigned correctly
- [SPARK-34922]: Use better CBO cost function
- [SPARK-34963]: Nested column pruning fails to extract case-insensitive struct field from array
- [SPARK-34970]: Redact map-type options in the output of explain()
- [SPARK-35080]: Correlated subqueries with equality predicates can return wrong results
- [SPARK-35096]: foreachBatch throws ArrayIndexOutOfBoundsException if schema is case Insensitive
- [SPARK-35106]: HadoopMapReduceCommitProtocol performs bad rename when dynamic partition overwrite is used
- [SPARK-35227]: Replace Bintray with the new repository service for the spark-packages resolver in SparkSubmit
- [SPARK-35296]: Dataset.observe fails with an assertion
- [SPARK-35482]: case sensitive block manager port key should be used in BasicExecutorFeatureStep
- [SPARK-35493]: spark.blockManager.port does not work for driver pod
- [SPARK-35659]: Avoid write null to StateStore
- [SPARK-35673]: Spark fails on unrecognized hint in subquery
- [SPARK-35679]: Overflow on converting valid Timestamp to Microseconds
- [SPARK-34697]: Allow DESCRIBE FUNCTION and SHOW FUNCTIONS explain about || (string concatenation operator)
- [SPARK-34772]: RebaseDateTime loadRebaseRecords should use Spark classloader instead of context
- [SPARK-35127]: When we switch between different stage-detail pages, the entry item in the newly-opened page may be blank
- [SPARK-35168]: mapred.reduce.tasks should be shuffle.partitions not adaptive.coalescePartitions.initialPartitionNum
- [SPARK-35566]: Fix number of output rows for StateStoreRestoreExec
- [SPARK-35714]: Bug fix for deadlock during the executor shutdown
- [SPARK-34534]: New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness
- [SPARK-34939]: Throw fetch failure exception when unable to deserialize broadcasted map statuses
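To make the SPARK-34909 entry above more concrete, here is a minimal sketch of what "converting negative inputs to unsigned" can mean for a base-conversion function. It assumes the Hive-style convention in which a negative input is reinterpreted as an unsigned 64-bit value before it is rendered in the target base. The function name `conv_unsigned` is hypothetical, and this is plain Python for illustration, not Spark's implementation.

```python
# Hypothetical helper illustrating unsigned 64-bit base conversion.
# Assumption: negative inputs wrap around as two's-complement 64-bit values.

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
MASK_64 = 0xFFFFFFFFFFFFFFFF


def conv_unsigned(num_str: str, from_base: int, to_base: int) -> str:
    """Convert num_str from from_base to to_base, treating it as unsigned 64-bit."""
    if not (2 <= from_base <= 36 and 2 <= to_base <= 36):
        raise ValueError("bases must be between 2 and 36")

    # int() accepts a leading '-' sign; the mask reinterprets a negative
    # value as its unsigned 64-bit two's-complement equivalent.
    value = int(num_str, from_base) & MASK_64
    if value == 0:
        return "0"

    out = []
    while value:
        value, remainder = divmod(value, to_base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


print(conv_unsigned("-1", 10, 16))   # FFFFFFFFFFFFFFFF
print(conv_unsigned("255", 10, 2))   # 11111111
```

Under this convention, `-1` maps to the largest 64-bit unsigned value rather than to a signed hexadecimal string such as `-1`.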
Dependency Changes
Although this is a maintenance release, we still upgraded some dependencies:
- [SPARK-35210]: Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue
Known issues
- [SPARK-34529]: spark.read.csv throws the exception "'lineSep' can contain only 1 character" when parsing Windows line endings (CR LF)
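Until this issue is resolved, one possible workaround is to rewrite CR LF line endings as LF before the file reaches the CSV reader, so that a single-character line separator suffices. This is an assumption on our part, not something the JIRA ticket prescribes. The sketch below is plain Python, the helper name `normalize_crlf` is hypothetical, and it assumes that no quoted field intentionally contains a CR LF sequence.

```python
import io
from typing import BinaryIO


def normalize_crlf(src: BinaryIO, dst: BinaryIO) -> int:
    """Copy src to dst, rewriting CR LF line endings as LF.

    Returns the number of lines that were rewritten.
    """
    rewritten = 0
    # Iterating a binary stream yields chunks that end at b"\n",
    # so the input is processed one line at a time.
    for line in src:
        if line.endswith(b"\r\n"):
            line = line[:-2] + b"\n"
            rewritten += 1
        dst.write(line)
    return rewritten


# Usage with in-memory streams; real files opened in "rb"/"wb" mode work the same way.
source = io.BytesIO(b"id,name\r\n1,spark\r\n")
target = io.BytesIO()
count = normalize_crlf(source, target)
print(count, target.getvalue())
```

The normalized file can then be read with a one-character line separator, or with no explicit `lineSep` option at all.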
You can consult JIRA for the detailed changes.
We would like to acknowledge all community members for contributing patches to this release.