Please keep communication on the mailing list.
Remember that you can execute partial-row upserts with Phoenix. As long
as you can generate the primary key from each stream, you don't need to
do anything special in Kafka Streams. You can just submit five UPSERTs
(one per stream), and the Phoenix table will eventually hold the fully
aggregated row once all five have arrived.
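To make the idea concrete, here is a minimal sketch (plain Python, no cluster needed) of the partial-row upsert semantics described above: each stream writes only its own columns for a given primary key, and the row accumulates the union of all columns. The column names (`clicks`, `views`, etc.) are hypothetical, purely for illustration.

```python
def upsert(table, pk, **cols):
    """Apply a partial-row upsert: merge only the supplied columns
    into the row for the given primary key, leaving others intact."""
    table.setdefault(pk, {}).update(cols)

events = {}  # stands in for the Phoenix table, keyed by primary key

# Five streams each emit a partial row for the same id, in any order.
upsert(events, "id-1", clicks=3)
upsert(events, "id-1", views=10)
upsert(events, "id-1", purchases=1)
upsert(events, "id-1", region="EU")
upsert(events, "id-1", last_seen="2018-04-16")

# Once all five streams have written, the row is fully aggregated.
print(events["id-1"])
```

In Phoenix itself, each of those calls would be an `UPSERT INTO events (id, clicks) VALUES (?, ?)`-style statement that names only the columns that stream owns; unmentioned columns in the row are left untouched.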
On 4/16/18 1:30 PM, Rabin Banerjee wrote:
> Actually, I haven't finalised anything; I'm just looking at different
> options.
>
> Basically, I want to join 5 streams to create a denormalized stream.
> The problem is that if Stream 1's output for the current window is
> keys 1,2,3,4,5, the other streams may already have emitted those keys
> in an earlier window, so I cannot join them with Kafka Streams without
> maintaining the whole state for all the streams. I need to collect the
> keys 1,2,3,4,5 from all the streams and generate a combined record as
> close to real time as possible.
>
>
> On Mon, Apr 16, 2018 at 9:04 PM, Josh Elser <elserj@apache.org
> <mailto:elserj@apache.org>> wrote:
>
> Short-answer: no.
>
> You're going to be much better off denormalizing your five tables
> into one table and eliminating the need for this JOIN.
>
> What made you decide to want to use Phoenix in the first place?
>
>
> On 4/16/18 6:04 AM, Rabin Banerjee wrote:
>
> HI all,
>
> I am new to Phoenix. If I have to join 5 huge tables that are all
> keyed on the same id (i.e. one id column is common to all of them),
> is there any optimization that would make this join faster, given
> that all the data for a particular key across all 5 tables will
> reside on the same region server?
>
> To explain a bit more: suppose we have 5 streams, all sharing a
> common id we can join on, being stored in 5 different HBase tables.
> We want to join them with Phoenix, but we don't want a cross-region
> shuffle, since we already know the key is common to all 5 tables.
>
>
> Thanks //
>
>