commit | f77a8338ac8094c9cf15e4619f023a30139bff42 | [log] [tgz] |
---|---|---|
author | Nga Chung <17833879+ngachung@users.noreply.github.com> | Mon Jun 13 10:37:33 2022 -0700 |
committer | GitHub <noreply@github.com> | Mon Jun 13 10:37:33 2022 -0700 |
tree | df1924731d09500bb4bcb12234a565b72d6a8c3a | |
parent | 998d124d69d6ec5c9ff1671be692d7002273449b [diff] | |
parent | 20fe50e2ebe5b9b4d3b8e266be20ee5265aadab3 [diff] | |
Merge pull request #4 from wphyojpl/master

breaking: multiple new features and performance enhancement attempts
From https://stackoverflow.com/questions/38487667/overwrite-specific-partitions-in-spark-dataframe-write-method?noredirect=1&lq=1 :

> Finally! This is now a feature in Spark 2.3.0: SPARK-20236
>
> To use it, you need to set the spark.sql.sources.partitionOverwriteMode setting to dynamic, the dataset needs to be partitioned, and the write mode overwrite. Example:

See also https://stackoverflow.com/questions/50006526/overwrite-only-some-partitions-in-a-partitioned-spark-dataset :

```scala
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
data.toDF()
  .write
  .mode("overwrite")
  .format("parquet")
  .partitionBy("date", "name")
  .save("s3://path/to/somewhere")
```
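To make the behavior concrete, here is a minimal pure-Python sketch (not Spark code; the table-as-dict model and function name are illustrative assumptions) of what `partitionOverwriteMode=dynamic` means: an overwrite write replaces only the partitions present in the incoming data, while untouched partitions survive. In the default `static` mode, the overwrite would clear every partition first.

```python
def overwrite_partitions(table, new_rows, partition_key):
    """Sketch of dynamic partition overwrite.

    table: dict mapping partition value -> list of row dicts (stands in
           for a partitioned dataset on disk).
    new_rows: incoming rows, each carrying partition_key.
    """
    # Group the incoming rows by their partition value.
    incoming = {}
    for row in new_rows:
        incoming.setdefault(row[partition_key], []).append(row)

    # "static" overwrite mode would do table.clear() here, wiping all
    # existing partitions. "dynamic" mode skips that step and rewrites
    # only the partitions that appear in the incoming data.
    for part, rows in incoming.items():
        table[part] = rows
    return table


# Two existing date partitions; the write touches only 2022-06-13.
table = {
    "2022-06-12": [{"date": "2022-06-12", "value": 1}],
    "2022-06-13": [{"date": "2022-06-13", "value": 2}],
}
overwrite_partitions(table, [{"date": "2022-06-13", "value": 9}], "date")
```

After the call, the `2022-06-12` partition is untouched and only `2022-06-13` has been replaced, which is exactly the gain over a full-table overwrite.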