releases/_posts/2021-06-23-spark-release-3-0-3.md

layout: post
title: Spark Release 3.0.3
categories: []
tags: []
status: publish
type: post
published: true
meta:
  _edit_last: '4'
  _wpas_done_all: '1'

Spark 3.0.3 is a maintenance release containing stability fixes. This release is based on the branch-3.0 maintenance branch of Spark. We strongly recommend that all 3.0 users upgrade to this stable release.

Notable changes

[SPARK-34421]: Custom functions can't be used in temporary views with CTEs
[SPARK-34545]: PySpark Python UDFs return inconsistent results when applying 2 UDFs with different return types to 2 columns together
[SPARK-34719]: Fail if the view query has duplicated column names
[SPARK-35463]: Skip checking checksum on a system that doesn't have shasum
[SPARK-32924]: Web UI sort on duration is wrong
[SPARK-33482]: V2 Datasources that extend FileScan preclude exchange reuse
[SPARK-33504]: The application log in the Spark history server contains sensitive attributes such as passwords that should be redacted instead of shown in plain text
[SPARK-34424]: HiveOrcHadoopFsRelationSuite fails with seed 610710213676
[SPARK-34556]: Checking duplicate static partition columns doesn't respect case sensitive conf
[SPARK-34596]: NewInstance.doGenCode should not throw malformed class name error
[SPARK-34763]: col(), $"" and df("name") should handle quoted column names properly
[SPARK-34794]: Nested higher-order functions broken in DSL
[SPARK-34798]: Fix incorrect join condition
[SPARK-34876]: Non-nullable aggregates can return NULL in a correlated subquery
[SPARK-34897]: Support reconcile schemas based on index after nested column pruning
[SPARK-34909]: conv() does not convert negative inputs to unsigned correctly
[SPARK-34922]: Use better CBO cost function
[SPARK-34963]: Nested column pruning fails to extract case-insensitive struct field from array
[SPARK-34970]: Redact map-type options in the output of explain()
[SPARK-35080]: Correlated subqueries with equality predicates can return wrong results
[SPARK-35096]: foreachBatch throws ArrayIndexOutOfBoundsException if schema is case insensitive
[SPARK-35106]: HadoopMapReduceCommitProtocol performs bad rename when dynamic partition overwrite is used
[SPARK-35227]: Replace Bintray with the new repository service for the spark-packages resolver in SparkSubmit
[SPARK-35296]: Dataset.observe fails with an assertion
[SPARK-35482]: case sensitive block manager port key should be used in BasicExecutorFeatureStep
[SPARK-35493]: spark.blockManager.port does not work for driver pod
[SPARK-35659]: Avoid writing null to StateStore
[SPARK-35673]: Spark fails on unrecognized hint in subquery
[SPARK-35679]: Overflow on converting valid Timestamp to Microseconds
[SPARK-34697]: Allow DESCRIBE FUNCTION and SHOW FUNCTIONS to explain || (string concatenation operator)
[SPARK-34772]: RebaseDateTime loadRebaseRecords should use Spark classloader instead of context
[SPARK-35127]: When we switch between different stage-detail pages, the entry item in the newly-opened page may be blank
[SPARK-35168]: mapred.reduce.tasks should be shuffle.partitions not adaptive.coalescePartitions.initialPartitionNum
[SPARK-35566]: Fix number of output rows for StateStoreRestoreExec
[SPARK-35714]: Bug fix for deadlock during the executor shutdown
[SPARK-34534]: New protocol FetchShuffleBlocks in OneForOneBlockFetcher leads to data loss or correctness issues
[SPARK-34939]: Throw fetch failure exception when unable to deserialize broadcasted map statuses
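
For readers unfamiliar with what SPARK-34909 in the list above refers to, the following is a minimal sketch of what "converting a negative input to unsigned" means, assuming 64-bit two's-complement semantics. The helper name `to_unsigned_base` is hypothetical, and the code is plain Python for illustration only; it is not Spark's implementation of conv().

```python
MASK_64 = (1 << 64) - 1  # all 64 bits set

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_unsigned_base(value: int, to_base: int) -> str:
    """Render a signed 64-bit integer as an unsigned number in to_base.

    Hypothetical helper for illustration; assumes 2 <= to_base <= 36.
    """
    if not 2 <= to_base <= 36:
        raise ValueError("to_base must be between 2 and 36")
    if not -(1 << 63) <= value < (1 << 63):
        raise ValueError("value must fit in a signed 64-bit integer")

    # Reinterpret the two's-complement bit pattern as an unsigned value.
    unsigned = value & MASK_64
    if unsigned == 0:
        return "0"

    # Standard repeated-division base conversion, least significant digit first.
    out = []
    while unsigned:
        unsigned, rem = divmod(unsigned, to_base)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


print(to_unsigned_base(-1, 16))  # FFFFFFFFFFFFFFFF
print(to_unsigned_base(255, 16))  # FF
```

Under these assumptions, -1 maps to the largest unsigned 64-bit value rather than to a signed string such as "-1".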

Dependency Changes

While this is a maintenance release, we did still upgrade some dependencies. They are:

[SPARK-35210]: Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue

Known issues

[SPARK-34529]: spark.read.csv throws the exception "'lineSep' can contain only 1 character" when parsing Windows line endings (CR LF)
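
One generic way to sidestep a two-character line separator is to normalize CR LF to LF before the file reaches a CSV reader. The sketch below is an assumption on our part, not a workaround documented in the JIRA ticket; the function name `normalize_line_endings` and the file paths are hypothetical, and the code uses only the Python standard library.

```python
def normalize_line_endings(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, replacing every CR LF pair with LF.

    Hypothetical helper. It reads the whole file into memory, so it suits
    small files only, and it also rewrites CR LF pairs that appear inside
    quoted CSV fields.
    """
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(data.replace(b"\r\n", b"\n"))


if __name__ == "__main__":
    import os
    import tempfile

    # Build a small Windows-style CSV file and normalize it.
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "windows.csv")
    dst = os.path.join(workdir, "unix.csv")
    with open(src, "wb") as f:
        f.write(b"id,name\r\n1,spark\r\n")
    normalize_line_endings(src, dst)
    with open(dst, "rb") as f:
        print(f.read())
```

The normalized file can then be read with a single-character line separator. Whether this is acceptable depends on whether CR LF sequences inside field values must be preserved.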

You can consult JIRA for the detailed changes.

We would like to acknowledge all community members for contributing patches to this release.