phoenix-user mailing list archives

From Vamsi Krishna <vamsi.attl...@gmail.com>
Subject Re: Does phoenix CsvBulkLoadTool write to WAL/Memstore
Date Wed, 16 Mar 2016 14:33:45 GMT
Thanks Gabriel & Ravi.

I have a data processing job written in Spark-Scala.
It joins data from 2 CSV files, transforms the result, and finally loads
the transformed data into a Phoenix table using the Phoenix-Spark plugin.
Since the Phoenix-Spark plugin goes through the regular HBase write path
(writes to WAL), I'm thinking of option 2 to reduce the job execution time.

*Option 2:* Do the data transformation in Spark, write the transformed data
to a CSV file, and use the Phoenix CsvBulkLoadTool to load it into the
Phoenix table.
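
For illustration only, here is a minimal Python sketch of the two steps in
Option 2: serialising transformed rows as CSV and assembling the
CsvBulkLoadTool invocation. The helper names, the sample rows, the table
name EXAMPLE, the input path and the jar name are all hypothetical
placeholders. In the real job Spark itself would write the CSV output to
HDFS; the standard csv module only stands in for that step here.

```python
import csv
import io
import shlex


def rows_to_csv(rows):
    """Serialise rows to CSV text, quoting fields that contain the delimiter."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def bulk_load_command(client_jar, table, input_path, zookeeper=None):
    """Build the argument list for a CsvBulkLoadTool run (nothing is executed)."""
    cmd = [
        "hadoop", "jar", client_jar,
        "org.apache.phoenix.mapreduce.CsvBulkLoadTool",
        "--table", table,
        "--input", input_path,
    ]
    if zookeeper:
        # Optional ZooKeeper quorum override
        cmd += ["--zookeeper", zookeeper]
    return cmd


# Hypothetical transformed rows; a field containing a comma gets quoted.
rows = [(1, "alice", "NY"), (2, "bob, jr.", "CA")]
csv_text = rows_to_csv(rows)

# Placeholder jar name, table and path; substitute real values.
cmd = bulk_load_command("phoenix-<version>-client.jar", "EXAMPLE", "/data/example.csv")
print(shlex.join(cmd))
```

The command is only printed, not run, so the sketch can be tried without
a cluster. The field order in the CSV has to match the column order of
the target table unless the columns are specified explicitly.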

Has anyone tried this kind of exercise? Any thoughts?

Thanks,
Vamsi Attluri

On Tue, Mar 15, 2016 at 9:40 PM Ravi Kiran <maghamravikiran@gmail.com>
wrote:

> Hi Vamsi,
>    The upserts through the Phoenix-Spark plugin definitely go through the WAL.
>
>
> On Tue, Mar 15, 2016 at 5:56 AM, Gabriel Reid <gabriel.reid@gmail.com>
> wrote:
>
>> Hi Vamsi,
>>
>> I can't answer your question about the Phoenix-Spark plugin (although
>> I'm sure that someone else here can).
>>
>> However, I can tell you that the CsvBulkLoadTool does not write to the
>> WAL or to the Memstore. It simply writes HFiles and then hands those
>> HFiles over to HBase, so the memstore and WAL are never
>> touched/affected by this.
>>
>> - Gabriel
>>
>>
>> On Tue, Mar 15, 2016 at 1:41 PM, Vamsi Krishna <vamsi.attluri@gmail.com>
>> wrote:
>> > Team,
>> >
>> > Does the Phoenix CsvBulkLoadTool write to HBase WAL/Memstore?
>> >
>> > Phoenix-Spark plugin:
>> > Does the saveToPhoenix method on RDD[Tuple] write to HBase WAL/Memstore?
>> >
>> > Thanks,
>> > Vamsi Attluri
>> > --
>> > Vamsi Attluri
>>
>
> --
Vamsi Attluri
