phoenix-user mailing list archives

From Josh Mahonin <jmaho...@interset.com>
Subject Re: REG: Using Sequences in Phoenix Data Frame
Date Mon, 17 Aug 2015 14:19:10 GMT
Hi Satya,

I don't believe sequences are supported by the broader Phoenix map-reduce
integration, which the phoenix-spark module uses under the hood.

One workaround that would give you sequential IDs is to use the
'zipWithIndex' method on the underlying Spark RDD, followed by a small
'map()' operation to unpack and reorganize the resulting tuples before
saving to Phoenix.
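
A rough sketch of that pattern in Scala (the table name, column names,
input path, and ZooKeeper URL below are all placeholders; note that the
IDs come from 'zipWithIndex', not from the Phoenix sequence, so they won't
be coordinated with sequence values generated elsewhere):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.phoenix.spark._  // adds saveToPhoenix to RDDs via implicits

val sc = new SparkContext(new SparkConf().setAppName("bulk-load-with-ids"))

// Hypothetical input: one record per line
val rows = sc.textFile("hdfs:///path/to/input")

// zipWithIndex pairs each element with a 0-based Long index;
// the map() unpacks the (value, index) tuple into the column
// order the Phoenix table expects, shifting the index to start at 1
val withIds = rows.zipWithIndex()
  .map { case (value, index) => (index + 1, value) }

// Hypothetical table: OUTPUT_TABLE(ID BIGINT PRIMARY KEY, COL1 VARCHAR)
withIds.saveToPhoenix(
  "OUTPUT_TABLE",
  Seq("ID", "COL1"),
  zkUrl = Some("zkhost:2181")
)
```

One thing to keep in mind: 'zipWithIndex' triggers a Spark job to count
the partitions, so there's a little extra cost on very large RDDs.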

Good luck!

Josh

On Sat, Aug 15, 2015 at 10:02 AM, Ns G <nsgnsg84@gmail.com> wrote:

> Hi All,
>
> I hope that someone will reply to this email as all my previous emails
> have been unanswered.
>
> I have 10-20 Million records in file and I want to insert it through
> Phoenix-Spark.
> The table's primary key is generated by a sequence, so every time an
> upsert is done, a new sequence ID gets generated.
>
> Now I want to implement this in Spark, and more precisely using data
> frames. Since RDDs are immutable, how can I add a sequence to the rows
> in a dataframe?
>
> Thanks for any help or direction or suggestion.
>
> Satya
>
