flume-user mailing list archives

From Buntu Dev <buntu...@gmail.com>
Subject Real-time events sessionization and more
Date Tue, 17 Mar 2015 20:25:17 GMT
We have a Kafka -> Flume -> Kite Dataset sink pipeline configured to write
to a Hive-backed dataset. One of our main requirements is to sessionize the
data and run funnel analysis on it.

We currently handle this with Impala/Hive, but it's quite slow, and since
we want the reports to be updated frequently, it doesn't seem to scale
well.

Is there a way to intercept the Flume events, sessionize them, and
concatenate the events before writing to the dataset? If so, how would one
go about holding state for each user session, etc.?
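
To make the question concrete, the kind of per-user state I have in mind
looks roughly like the sketch below. It is plain Python and does not use any
Flume API; the class name Sessionizer, its methods, and the 30-minute
inactivity gap are all hypothetical, chosen only to illustrate the idea of
buffering events per user and emitting them once the user goes idle.

```python
class Sessionizer:
    """Hypothetical in-memory session buffer keyed by user id.

    A session is closed when the gap between two consecutive events of the
    same user exceeds gap_seconds (an assumed inactivity timeout).
    """

    def __init__(self, gap_seconds=1800):
        self.gap_seconds = gap_seconds
        self._open = {}  # user_id -> (last_seen_timestamp, [events])

    def add(self, user_id, ts, event):
        """Buffer one event.

        Returns the user's previous session (a list of events) if this event
        arrives after the inactivity gap and therefore starts a new session,
        otherwise None.
        """
        closed = None
        state = self._open.get(user_id)
        if state is not None and ts - state[0] > self.gap_seconds:
            closed = state[1]  # the old session is complete
            state = None
        events = [] if state is None else state[1]
        events.append(event)
        self._open[user_id] = (ts, events)
        return closed

    def flush(self, now):
        """Close and return all sessions idle for longer than the gap."""
        expired = {
            user: events
            for user, (last_seen, events) in self._open.items()
            if now - last_seen > self.gap_seconds
        }
        for user in expired:
            del self._open[user]
        return expired
```

In this sketch the state lives only in memory, so it would be lost on a
restart and would not be shared across agents, which is part of what I am
unsure how to handle.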

Thanks!
