livy-user mailing list archives

From "Decker, Seth Andrew" <>
Subject new user to livy/spark, basic use case questions
Date Wed, 16 May 2018 14:16:56 GMT

I'm new to Livy and Spark and have a question about how to properly use it.

I want to use Spark both for interactive scripting and for passing it data/parameters to run a set of predefined algorithms/applications. I'm looking at using Livy to interface with Spark over REST, but I'm not sure whether I can handle things the way I want (or whether that's the intended way to use them). I can pass a Python script into Spark through Livy, which
is great.
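Concretely, the REST flow I have in mind looks something like this — a minimal sketch, assuming Livy's default port 8998 and an illustrative code snippet (I'm only building the JSON payloads here, not actually sending them):

```python
import json

# Payload to create an interactive PySpark session (POST /sessions).
# "kind": "pyspark" requests a Python interpreter; "spark" (Scala)
# and "sparkr" are the other interactive kinds.
session_payload = {"kind": "pyspark"}

# Payload to run a snippet of Python in that session
# (POST /sessions/{id}/statements). The statement's result can then
# be polled back over HTTP as JSON.
statement_payload = {"code": "sc.parallelize(range(100)).sum()"}

# Against a running Livy server these would be sent with any HTTP
# client, e.g.:
#   requests.post("http://localhost:8998/sessions",
#                 data=json.dumps(session_payload),
#                 headers={"Content-Type": "application/json"})
print(json.dumps(session_payload))
print(json.dumps(statement_payload))
```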

Is the intended way to get results to save them to HDFS/a database/some other data store and then read those
results back in? I noticed that with the Java/Python clients you can create your job through Livy
and get the results back in the Livy HTTP response, which seems much simpler, but I have
concerns about using that path.

My first issue is that I don't necessarily need the client to know the job. I'd rather the job be
stored in HDFS on the Hadoop cluster, with Livy simply telling Spark to run it with a given set of parameters/inputs.
Is this doable with the client, or should I just stick with the HTTP API?
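From the docs, the batch endpoint (POST /batches) seems to match this: "file" can point at a script already on HDFS, and "args" carries the parameters, so the submitting client never holds the job code itself. A sketch — the path and arguments below are made-up placeholders:

```python
import json

# Batch submission payload (POST /batches). "file" references a script
# stored on the cluster's HDFS; the caller only supplies parameters
# via "args". Path and arguments are hypothetical.
batch_payload = {
    "file": "hdfs:///jobs/my_algorithm.py",  # hypothetical HDFS path
    "args": ["--input", "hdfs:///data/in", "--iterations", "10"],
}

print(json.dumps(batch_payload))
```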

If the previous is possible, is it also possible to run a Python script in Spark via
the Java client? From my sleuthing on the GitHub page, it looks like you upload/run/submit
jars from Java and .py files from Python, and I'd probably have use cases for running both (such
as TensorFlow .py scripts, or custom Java code). Is there a way to run both from the
same client?
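As far as I can tell from the REST docs, the same /batches endpoint covers both cases — for a jar you add "className" to name the main class, and for a .py you just point "file" at the script — so one HTTP client could drive both. A sketch with made-up names:

```python
import json

# Jar-based batch: "file" is the jar, "className" the main class.
jar_batch = {
    "file": "hdfs:///jobs/custom-job.jar",  # hypothetical jar path
    "className": "com.example.CustomJob",   # hypothetical main class
    "args": ["--mode", "full"],
}

# Python-based batch: same endpoint, "file" is the .py script itself
# and no className is needed.
py_batch = {
    "file": "hdfs:///jobs/tensorflow_job.py",  # hypothetical script path
    "args": ["--epochs", "5"],
}

print(json.dumps(jar_batch))
print(json.dumps(py_batch))
```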

Seth Decker
