Spark Release 3.3.4
Spark 3.3.4 is the last maintenance release containing security and correctness fixes. This release is based on the branch-3.3 maintenance branch of Spark. We strongly recommend that all 3.3 users upgrade to this stable release.
Notable changes
- [SPARK-43327]: Trigger committer.setupJob before plan execute in FileFormatWriter#write
- [SPARK-43393]: Address sequence expression overflow bug
- [SPARK-44547]: Ignore fallback storage for cached RDD migration
- [SPARK-44581]: Fix the bug that ShutdownHookManager gets wrong UGI from SecurityManager of ApplicationMaster
- [SPARK-44725]: Document spark.network.timeoutInterval
- [SPARK-44805]: getBytes/getShorts/getInts/etc. should work in a column vector that has a dictionary
- [SPARK-44857]: Fix getBaseURI error in Spark Worker LogPage UI buttons
- [SPARK-44871]: Fix percentile_disc behaviour
- [SPARK-44920]: Use await() instead of awaitUninterruptibly() in TransportClientFactory.createClient()
- [SPARK-44925]: K8s default service token file should not be materialized into token
- [SPARK-44935]: Fix RELEASE file to have the correct information in Docker images if exists
- [SPARK-44937]: Mark connection as timedOut in TransportClient.close
- [SPARK-44973]: Fix ArrayIndexOutOfBoundsException in conv()
- [SPARK-44990]: Reduce the frequency of get spark.sql.legacy.nullValueWrittenAsQuotedEmptyStringCsv
- [SPARK-45057]: Avoid acquire read lock when keepReadLock is false
- [SPARK-45079]: Fix an internal error from percentile_approx() on NULL accuracy
- [SPARK-45100]: Fix an internal error from reflect() on NULL class and method
- [SPARK-45187]: Fix WorkerPage to use the same pattern for logPage urls
- [SPARK-45227]: Fix a subtle thread-safety issue with CoarseGrainedExecutorBackend
- [SPARK-45389]: Correct MetaException matching rule on getting partition metadata
- [SPARK-45430]: Fix for FramelessOffsetWindowFunction when IGNORE NULLS and offset > rowCount
- [SPARK-45508]: Add “--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED” so Platform can access Cleaner on Java 9+
- [SPARK-45580]: Handle case where a nested subquery becomes an existence join
- [SPARK-45670]: SparkSubmit does not support --total-executor-cores when deploying on K8s
- [SPARK-45749]: Fix Spark History Server to sort Duration column properly
- [SPARK-45920]: group by ordinal should be idempotent
- [SPARK-46006]: YarnAllocator miss clean targetNumExecutorsPerResourceProfileId after YarnSchedulerBackend call stop
- [SPARK-46012]: EventLogFileReader should not read rolling logs if app status file is missing
- [SPARK-46029]: Escape the single quote, _ and % for DS V2 pushdown
- [SPARK-46092]: Don’t push down Parquet row group filters that overflow
- [SPARK-46095]: Document REST API for Spark Standalone Cluster
- [SPARK-46239]: Hide Jetty info
- [SPARK-46286]: Document spark.io.compression.zstd.bufferPool.enabled
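As background for SPARK-46029 in the list above, the sketch below shows why a single quote, _ and % need escaping when a string value is turned into a SQL LIKE pattern: left unescaped, _ and % act as wildcards and a single quote ends the string literal. This is a generic illustration, not Spark's implementation. The function names, the backslash escape character, and the quote-doubling convention are assumptions chosen for the example.

```python
def escape_like_literal(text: str, escape_char: str = "\\") -> str:
    """Escape characters that are special inside a SQL LIKE string literal.

    Hypothetical helper for illustration only. Assumes the target dialect
    accepts an ESCAPE clause and doubles single quotes inside literals.
    """
    out = []
    for ch in text:
        if ch in ("_", "%", escape_char):
            # Prefix LIKE wildcards (and the escape char itself) so they match literally.
            out.append(escape_char + ch)
        elif ch == "'":
            # Double the quote so it does not terminate the string literal.
            out.append("''")
        else:
            out.append(ch)
    return "".join(out)


def starts_with_predicate(column: str, prefix: str) -> str:
    """Build a 'column starts with prefix' predicate as a LIKE expression."""
    # The trailing % is the only intended wildcard; everything in prefix is literal.
    pattern = escape_like_literal(prefix) + "%"
    return f"{column} LIKE '{pattern}' ESCAPE '\\'"


if __name__ == "__main__":
    print(starts_with_predicate("name", "a_b"))
    print(starts_with_predicate("name", "it's 50%"))
```

Without the escaping step, a prefix such as a_b would also match values like axb, so a filter pushed down to the data source could return different rows than the same filter evaluated by the engine.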
Dependency Changes
Although this is a maintenance release, we did still upgrade some dependencies in this release.
You can consult JIRA for the detailed changes.
We would like to acknowledge all community members for contributing patches to this release.