livy-user mailing list archives

From "Rabe, Jens" <jens.r...@iwes.fraunhofer.de>
Subject RE: Submitting a PySpark batch job ignores jars sent with it
Date Mon, 08 Oct 2018 10:40:15 GMT
Please disregard, I used an obsolete version of the jar which did indeed not have the classes
in...

From: Rabe, Jens <jens.rabe@iwes.fraunhofer.de>
Sent: Monday, October 8, 2018 12:31
To: user@livy.incubator.apache.org
Subject: Submitting a PySpark batch job ignores jars sent with it

Hello,

I defined a custom format to read data into Spark. This works when used from Scala Spark or,
e.g., from Zeppelin, and also with PySpark.

I am now trying to use this from Livy. I POST something like this to http://mylivy:8998/batches:

{
  "file":"/path/to/myjob.py",
  "args":["foo", "bar"],
  "jars":"/path/to/myformat-assembly.jar"
}
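
For reference, a request like the one above could be built from Python roughly as follows. This is only a sketch using the standard library. The helper name build_batch_request is made up for illustration, and "jars" is passed as a list because the Livy REST documentation describes that field as a list of strings (the single-string form above is what the original message showed).

```python
import json
import urllib.request

# Endpoint taken from the message above.
LIVY_URL = "http://mylivy:8998/batches"


def build_batch_request(file, args, jars, url=LIVY_URL):
    """Build (but do not send) a POST request that creates a Livy batch.

    Hypothetical helper for illustration; not part of Livy itself.
    """
    payload = {
        "file": file,
        "args": list(args),
        # Assumption: "jars" is sent as a JSON array of paths.
        "jars": list(jars),
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_batch_request(
    "/path/to/myjob.py",
    ["foo", "bar"],
    ["/path/to/myformat-assembly.jar"],
)
# To actually submit the batch: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to print or inspect the JSON body before anything reaches the server.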

In the log I see the jar gets loaded and added:

    "2018-10-08 12:23:28 INFO  SparkContext:54 - Added JAR file:///path/to/myformat-assembly.jar at spark://172.30.10.10:45613/jars/myformat-assembly.jar with timestamp 1538994208755"

But my PySpark job doesn't find the format:

        "Traceback (most recent call last):",
        "  File \"/path/to/myjob.py\", line 13, in <module>",
        "    data = spark.read.format(\"my.custom.format\").load(path)",
        "  File \"/opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py\", line 166, in load",
        "  File \"/opt/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py\", line 1257, in __call__",
        "  File \"/opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py\", line 63, in deco",
        "  File \"/opt/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py\", line 328, in get_return_value",
        "py4j.protocol.Py4JJavaError: An error occurred while calling o29.load.",
        ": java.lang.ClassNotFoundException: Failed to find data source: my.custom.format. Please find packages at http://spark.apache.org/third-party-projects.html",
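
One way to narrow down a "Failed to find data source" error is to check whether the jar being shipped actually contains the data source classes. The sketch below rests on two facts about Spark's lookup: it tries the given name and the name plus ".DefaultSource" as class names, and it discovers short names through the service file META-INF/services/org.apache.spark.sql.sources.DataSourceRegister. The function name inspect_jar is hypothetical.

```python
import zipfile

# Service file Spark's ServiceLoader reads to discover registered data sources.
SERVICE_FILE = "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister"


def inspect_jar(jar_path, format_name):
    """Return (matching class entries, providers listed in the service file).

    Hypothetical helper for illustration; it only reads the jar's entries.
    """
    prefix = format_name.replace(".", "/")
    with zipfile.ZipFile(jar_path) as jar:
        names = set(jar.namelist())
        # Class-name candidates: the name itself and <name>.DefaultSource.
        candidates = [prefix + ".class", prefix + "/DefaultSource.class"]
        classes = [entry for entry in candidates if entry in names]
        providers = []
        if SERVICE_FILE in names:
            text = jar.read(SERVICE_FILE).decode("utf-8")
            # Keep non-empty lines that are not comments.
            providers = [
                line.strip()
                for line in text.splitlines()
                if line.strip() and not line.strip().startswith("#")
            ]
    return classes, providers
```

Calling inspect_jar("/path/to/myformat-assembly.jar", "my.custom.format") and getting two empty lists back would suggest the jar itself lacks the format, rather than Livy failing to ship it.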

When opening a session (which loads the same library jar) and sending the respective command,
it fails as well.

However, I just added a simple object to this library, and calling it works (e.g. via
sc._jvm.somepackage.Foo.bar()).

What am I missing?
