livy-user mailing list archives

From Saisai Shao <>
Subject Re: Some questions about cached data in Livy
Date Wed, 11 Jul 2018 12:17:05 GMT
Hi Wandong,

Livy's shared object mechanism is mainly used to share objects between
different Livy jobs, primarily through the Job API. For example, if Job A
creates an object Foo that Job B needs to access, the user can store Foo
in the JobContext under a given name, and Job B can then retrieve the
object by that name.
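A rough sketch of that pattern (the names `JobContext`, `set_shared_object`, and `get_shared_object` mirror the setSharedObject/getSharedObject methods of Livy's Java Job API; the in-memory registry below is only a stand-in model for illustration, not Livy's implementation):

```python
# Minimal model of Livy's shared-object mechanism: the session context
# holds a name -> object registry, and jobs running in the same session
# hand objects to each other by name.

class JobContext:
    """Stand-in for Livy's JobContext (which exposes setSharedObject /
    getSharedObject in the real Job API)."""
    def __init__(self):
        self._shared = {}          # objects shared across jobs in one session

    def set_shared_object(self, name, obj):
        self._shared[name] = obj   # Job A publishes an object under a name

    def get_shared_object(self, name):
        return self._shared[name]  # Job B retrieves it by the same name


def job_a(ctx):
    # Hypothetical object to share; in Livy this would be any Java object.
    foo = {"model": "Foo", "version": 1}
    ctx.set_shared_object("foo", foo)


def job_b(ctx):
    # Runs later in the same session and looks the object up by name.
    return ctx.get_shared_object("foo")


ctx = JobContext()   # one context per interactive session
job_a(ctx)
result = job_b(ctx)
print(result)        # -> {'model': 'Foo', 'version': 1}
```

The key point is that the handoff happens inside one interactive session: the name is only meaningful to jobs submitted against the same JobContext.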

This is different from Spark's cache mechanism. What you mentioned above
(the temp table) is Spark's own table cache mechanism, which is unrelated
to Livy.

Wandong Wu <> wrote on Wed, Jul 11, 2018 at 5:46 PM:

> Dear Sir or Madam:
>       I am a Livy beginner. I use Livy because, within an interactive
> session, different Spark jobs can share cached RDDs or DataFrames.
>       Suppose I read some parquet files and create a table called “TmpTable”,
> and the following queries use this table. Does that mean this table has been
> cached?
>       If it is cached, where is the table cached: in Livy or in the
> Spark cluster?
>       Spark also supports a cache function. Suppose I read some parquet files,
> create a table called “TmpTable2”, and add this code: sql_ctx.cacheTable(
> *'tmpTable2'*).
>       The next query using this table will cache it in the Spark
> cluster, and the following queries can then use the cached table.
>       What is the difference between caching in Livy and caching in the
> Spark cluster?
> Thanks!
> Yours
> Wandong
