Migration Guide: Spark Core

2020-01-21

Upgrading from Core 2.4 to 3.0

  • The org.apache.spark.ExecutorPlugin interface and its related configuration have been replaced with org.apache.spark.api.plugin.SparkPlugin, which adds new functionality. Plugins that use the old interface must be modified to implement the new interfaces. See the Monitoring guide for more details.

  • The deprecated method TaskContext.isRunningLocally has been removed. Local execution had already been removed, so the method always returned false.

  • The deprecated methods shuffleBytesWritten, shuffleWriteTime, and shuffleRecordsWritten in ShuffleWriteMetrics have been removed. Use bytesWritten, writeTime, and recordsWritten, respectively, instead.

  • The deprecated method AccumulableInfo.apply has been removed because creating AccumulableInfo is disallowed.

  • Event log files are now written in UTF-8 encoding, and the Spark History Server replays event log files as UTF-8. Previously, Spark wrote event log files in the default charset of the driver JVM process, so a Spark 2.x History Server is needed to read old event log files if their encoding is incompatible.

  • A new protocol for fetching shuffle blocks is used. It is recommended that external shuffle services be upgraded when running Spark 3.0 applications. Old external shuffle services can still be used by setting the configuration spark.shuffle.useOldFetchProtocol to true. Otherwise, Spark may fail with errors such as IllegalArgumentException: Unexpected message type: <number>.
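
The event log encoding change noted in the list above can be illustrated at the byte level. The following is a minimal Python sketch, not Spark code: the sample log line, the application name, and the choice of ISO-8859-1 as the old driver JVM default charset are all assumptions made for illustration. It shows why a log written in a non-UTF-8 charset may not be read back correctly as UTF-8, and why logs containing only ASCII characters are unaffected.

```python
# Hypothetical event log line. Real Spark event logs are JSON events,
# but this specific content is made up for illustration.
line = '{"Event":"SparkListenerApplicationStart","App Name":"café-etl"}'

# Assumption: the Spark 2.x driver JVM used ISO-8859-1 as its default charset.
legacy_bytes = line.encode("iso-8859-1")


def read_as_utf8(data: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for malformed sequences."""
    return data.decode("utf-8", errors="replace")


# Reading the legacy bytes as UTF-8 does not reproduce the original text:
# the byte for "é" is not valid UTF-8 on its own and becomes U+FFFD.
decoded = read_as_utf8(legacy_bytes)
print("legacy log round-trips:", decoded == line)

# A log written as UTF-8 is read back unchanged.
print("UTF-8 log round-trips:", read_as_utf8(line.encode("utf-8")) == line)

# ASCII-only content has identical bytes in both encodings,
# so such logs are compatible either way.
ascii_line = '{"Event":"SparkListenerApplicationEnd"}'
print("ASCII bytes identical:",
      ascii_line.encode("iso-8859-1") == ascii_line.encode("utf-8"))
```

Only logs that contain characters outside the range shared by the old default charset and UTF-8 are affected, which matches the "in case of incompatible encoding" condition stated above. How Spark itself handles malformed bytes is not covered by this sketch.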