livy-user mailing list archives

From "Rabe, Jens" <>
Subject RE: Submitting a PySpark batch job ignores jars sent with it
Date Mon, 08 Oct 2018 10:40:15 GMT
Please disregard. I used an obsolete version of the jar, which did indeed not contain the classes.
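For anyone hitting the same error, a quick check along these lines can show whether the jar being shipped actually contains the data source classes. This is only a sketch: the helper name `jar_has_data_source` is made up for illustration, and it assumes the format is referenced by its package or class name (as with "my.custom.format" in the original message), in which case Spark tries that name and then that name plus ".DefaultSource". It does not cover formats registered under a short name through the service loader.

```python
import zipfile


def jar_has_data_source(jar_path, provider):
    """Return True if the jar holds a class Spark could load for `provider`.

    Hypothetical helper: looks for <provider>.class and
    <provider>/DefaultSource.class among the jar entries.
    """
    base = provider.replace(".", "/")
    candidates = {base + ".class", base + "/DefaultSource.class"}
    with zipfile.ZipFile(jar_path) as jar:
        names = set(jar.namelist())
    # A non-empty intersection means at least one candidate class is present.
    return bool(candidates & names)
```

Calling `jar_has_data_source("/path/to/myformat-assembly.jar", "my.custom.format")` against the jar that is actually submitted (not the one in the build directory) would have caught the stale assembly here.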

From: Rabe, Jens <>
Sent: Monday, October 8, 2018 12:31
Subject: Submitting a PySpark batch job ignores jars sent with it


I defined a custom format to read data into Spark. This works when used from Scala Spark or,
e.g., from Zeppelin, and also with PySpark.

I am now trying to use this from Livy. I post something like this to http://mylivy:8998/batches:

  "args":["foo", "bar"],
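The request body is cut off in the archive, so the following is not the original payload, only a sketch of what such a submission might look like. It assumes the standard Livy batch fields `file`, `jars` and `args`; the script path `/path/to/job.py` is a placeholder, and the helper name `build_batch_request` is made up for illustration.

```python
import json
import urllib.request


def build_batch_request(livy_url, file, jars, args):
    """Build (but do not send) a POST request for Livy's /batches endpoint."""
    payload = {
        "file": file,        # the PySpark script to run
        "jars": list(jars),  # extra jars to ship with the job
        "args": list(args),  # command line arguments for the script
    }
    return urllib.request.Request(
        livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; only the host, jar name and args come from this message.
req = build_batch_request(
    "http://mylivy:8998",
    "/path/to/job.py",
    ["/path/to/myformat-assembly.jar"],
    ["foo", "bar"],
)
# urllib.request.urlopen(req) would actually submit the batch.
```

Building the request separately from sending it makes it easy to print the JSON body and confirm that the jar path in it is the one intended.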

In the log I see the jar gets loaded and added:

    "2018-10-08 12:23:28 INFO  SparkContext:54 - Added JAR file:/// path/to/myformat-assembly.jar at spark:// myformat-assembly.jar with timestamp 1538994208755"

But my PySpark job doesn't find the format:

        "Traceback (most recent call last):",
        "  File \"/path/to/ \", line 13, in <module>",
        "    data =\"my.custom.format\").load(path)",
        "  File \"/opt/spark/python/lib/\", line 166, in load",
        "  File \"/opt/spark/python/lib/\", line 1257, in __call__",
        "  File \"/opt/spark/python/lib/\", line 63, in deco",
        "  File \"/opt/spark/python/lib/\", line 328, in get_return_value",
        "py4j.protocol.Py4JJavaError: An error occurred while calling o29.load.",
        ": java.lang.ClassNotFoundException: Failed to find data source: my.custom.format. Please find packages at",

When opening a session (which loads the same library jar) and sending the respective command,
it fails as well.

However, I just added a simple object into this library, and calling this works (like using

What am I missing?
