What's the best-known way to use Flume with Hive? I'm curious to see what everyone is using. My requirements are fairly standard:
- Currently writing logs to HDFS from several production servers.
- Need to preprocess the logs before loading them into Hive.
- Need a way to merge the files generated by Flume.
I see that there is a Flume+Hive sink plugin, but I did not find much information about how it is used in practice. Alternatively, I could write a custom sink or a custom decorator to do the preprocessing, and then run an hourly cron job to load the data from HDFS into Hive.
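
To make the cron option concrete, here is a rough sketch of what the hourly job might look like. It only builds a HiveQL `LOAD DATA INPATH` statement for the previous hour and the `hive -e` command line to run it. The table name `weblogs`, the HDFS root `/flume/processed`, the `dt`/`hr` partition columns, and the `<root>/<date>/<hour>` directory layout are all assumptions for illustration, not something Flume produces by default.

```python
from datetime import datetime, timedelta
import shlex


def previous_hour(now):
    """Return the start of the hour before `now` (the hour a cron run should load)."""
    return now.replace(minute=0, second=0, microsecond=0) - timedelta(hours=1)


def hourly_load_statement(table, hdfs_root, hour):
    """Build a HiveQL LOAD DATA statement for one hour of preprocessed logs.

    Assumes (hypothetically) that files land under <hdfs_root>/<YYYY-MM-DD>/<HH>
    and that the table is partitioned by string columns dt and hr.
    """
    dt = hour.strftime("%Y-%m-%d")
    hr = hour.strftime("%H")
    src = f"{hdfs_root}/{dt}/{hr}"
    return (
        f"LOAD DATA INPATH '{src}' INTO TABLE {table} "
        f"PARTITION (dt='{dt}', hr='{hr}')"
    )


def hive_command(statement):
    """Return the argv list that runs one statement through the Hive CLI."""
    return ["hive", "-e", statement]


if __name__ == "__main__":
    # Hypothetical table name and HDFS root; adjust to the real layout.
    stmt = hourly_load_statement(
        "weblogs", "/flume/processed", previous_hour(datetime.now())
    )
    # Print the command instead of executing it, so the sketch runs anywhere.
    print(" ".join(shlex.quote(arg) for arg in hive_command(stmt)))
```

On a host with the Hive CLI, the printed command could be run from an hourly crontab entry. Note that `LOAD DATA INPATH` only moves the files into the partition; it does not merge them, so the small-files problem would still need a separate step.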