How do I Add JAR Files to a Spark Job with Spark-Submit and How Does the Classpath Work?

Barbara Streisand
Release: 2024-11-11 04:34:02

Adding JAR Files to a Spark Job with Spark-Submit

ClassPath Effects

spark.driver.extraClassPath (or its command-line alias --driver-class-path) sets extra classpath entries for the driver, while spark.executor.extraClassPath sets them for the executors on the worker nodes. For a JAR to be on both classpaths, specify it in both settings.

Separation Character

Classpath entries are joined with the JVM path separator of the operating system (the list passed to --jars is always comma-separated):

  • Linux: Colon (:)
  • Windows: Semicolon (;)
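A classpath value can be assembled in a script rather than typed by hand. The following is a minimal sketch, not part of Spark: join_classpath is a hypothetical helper, and the uname patterns used to detect a Windows shell (Git Bash, MSYS, Cygwin) are an assumption about the environment.

```shell
#!/bin/sh
# join_classpath SEP ENTRY...
# Hypothetical helper: joins the entries with the given separator.
join_classpath() {
  sep=$1
  shift
  result=""
  for entry in "$@"; do
    if [ -z "$result" ]; then
      result=$entry
    else
      result="$result$sep$entry"
    fi
  done
  printf '%s\n' "$result"
}

# Assumption: Windows-hosted shells report MINGW*, MSYS* or CYGWIN*.
case "$(uname -s)" in
  MINGW*|MSYS*|CYGWIN*) CP_SEP=';' ;;
  *)                    CP_SEP=':' ;;
esac

# Prints the value to pass to --driver-class-path or extraClassPath.
join_classpath "$CP_SEP" additional1.jar additional2.jar
```

The result can be captured with command substitution, for example CP=$(join_classpath "$CP_SEP" additional1.jar additional2.jar), and then passed as --driver-class-path "$CP".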

File Distribution

In client mode, the driver serves the files to the worker nodes over HTTP. In cluster mode, the files must instead be made available to the workers through HDFS or other shared storage.

URI Types

Accepted URI schemes include:

  • file: - Served by the driver's HTTP server
  • hdfs:, http:, https:, ftp: - Fetch files directly
  • local: - Assumes files exist on each worker node
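The schemes can be mixed within one --jars list. The sketch below only prints the command (a dry run), so it can be tried without a cluster. All paths and the third JAR name are hypothetical placeholders, and MyClass and main-application.jar are the same stand-ins used elsewhere in this article.

```shell
#!/bin/sh
# Hypothetical JAR locations, one per scheme family.
DRIVER_LOCAL_JAR="file:/opt/app/libs/additional1.jar"      # served by the driver's HTTP server
SHARED_JAR="hdfs:///libs/additional2.jar"                  # fetched directly from HDFS
PREINSTALLED_JAR="local:/opt/spark-extra/additional3.jar"  # must already exist on each worker

# --jars takes a comma-separated list.
JARS="$DRIVER_LOCAL_JAR,$SHARED_JAR,$PREINSTALLED_JAR"

# Dry run: print the command instead of executing it.
# Remove the leading `echo` to actually submit.
echo spark-submit --jars "$JARS" --class MyClass main-application.jar
```

Using local: avoids network transfer entirely, at the cost of having to install the JAR on every worker beforehand.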

Affected Options

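  • --jars (or SparkContext.addJar): Ships the listed JARs (comma-separated) to the cluster. The Spark documentation states that the --jars list is included in the driver and executor classpaths; the extraClassPath settings below add entries explicitly.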
  • --conf spark.driver.extraClassPath: Prepends entries to the driver classpath.
  • --conf spark.driver.extraLibraryPath: Sets the native library search path used when launching the driver JVM.
  • --conf spark.executor.extraClassPath: Prepends entries to the executor classpath on the workers.
  • --conf spark.executor.extraLibraryPath: Sets the native library search path used when launching executor JVMs.

Priority

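Properties set directly on the SparkConf take the highest precedence, followed by flags passed to spark-submit, and then values in spark-defaults.conf. In client mode, the driver classpath cannot be changed through SparkConf because the driver JVM has already started by then, so use --driver-class-path or the properties file instead.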
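As a rough illustration of the ordering that Spark documents (SparkConf first, then spark-submit flags, then the defaults file), the sketch below writes a small properties file and overrides one of its values with --conf. The file location and both JAR paths are hypothetical, and the command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical defaults file standing in for conf/spark-defaults.conf.
DEFAULTS_FILE="${TMPDIR:-/tmp}/spark-defaults-example.conf"
cat > "$DEFAULTS_FILE" <<'EOF'
spark.executor.extraClassPath  /opt/libs/from-defaults.jar
EOF

# The --conf flag outranks the properties file, so executors would get
# /opt/libs/from-flag.jar unless the application sets the same property
# on its SparkConf, which ranks highest.
# Dry run: remove the leading `echo` to actually submit.
echo spark-submit \
  --properties-file "$DEFAULTS_FILE" \
  --conf spark.executor.extraClassPath=/opt/libs/from-flag.jar \
  --class MyClass main-application.jar
```

Keeping shared defaults in a file and overriding individual values per submission with --conf keeps the command line short.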

For Simplicity

In client mode, one can use the following to add JARs for both driver and workers:

spark-submit --jars additional1.jar,additional2.jar \
  --driver-class-path additional1.jar:additional2.jar \
  --conf spark.executor.extraClassPath=additional1.jar:additional2.jar \
  --class MyClass main-application.jar

In cluster mode, however, ensure JARs are accessible through a shared storage system.
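A cluster-mode submission might look like the following sketch. The hdfs:///libs directory is a hypothetical location that every node can read, cluster_submit_cmd is an illustrative helper rather than anything Spark provides, and the --master option is omitted because it depends on the cluster manager. The function only prints the command.

```shell
#!/bin/sh
# cluster_submit_cmd LIB_DIR
# Hypothetical helper: prints a cluster-mode spark-submit command whose
# dependency JARs and application JAR all live under LIB_DIR.
cluster_submit_cmd() {
  lib_dir=$1
  echo spark-submit --deploy-mode cluster \
    --jars "$lib_dir/additional1.jar,$lib_dir/additional2.jar" \
    --class MyClass "$lib_dir/main-application.jar"
}

# Dry run with a hypothetical shared directory on HDFS.
cluster_submit_cmd "hdfs:///libs"
```

To run the printed command, pipe the output to sh or drop the echo inside the function once the JARs have been uploaded to the shared location.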

source:php.cn