Following upgrade to Apache Spark 3.5 in Dataverse, we noticed some tables stopped syncing to our external data warehouse.
Spark jobs in Synapse Analytics are reporting the following errors:
Caused by: org.apache.spark.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.WRITE_ANCIENT_DATETIME] You may get a different result due to the upgrading to Spark >= 3.0:
writing dates before 1582-10-15 or timestamps before 1900-01-01T00:00:00Z
into Parquet files can be dangerous, as the files may be read by Spark 2.x
or legacy versions of Hive later, which uses a legacy hybrid calendar that
is different from Spark 3.0+'s Proleptic Gregorian calendar. See more
details in SPARK-31404. You can set "spark.sql.parquet.datetimeRebaseModeInWrite" to "LEGACY" to rebase the
datetime values w.r.t. the calendar difference during writing, to get maximum
interoperability. Or set the config to "CORRECTED" to write the datetime
values as it is, if you are sure that the written files will only be read by
Spark 3.0+ or other systems that use Proleptic Gregorian calendar.