flume-user mailing list archives

From iain wright <iainw...@gmail.com>
Subject Re: Flume events to Spark sink ?
Date Tue, 01 Sep 2015 18:07:22 GMT
I was able to get approach #2 from this guide working for a Flume sink ->
Spark Streaming:
http://spark.apache.org/docs/latest/streaming-flume-integration.html

We ended up not moving forward with this approach and will instead look at
having Spark read from Kafka.

If you want to give approach #2 from above a shot, below is the relevant
part of the Ansible playbook, which should get you past a couple of issues
we ran into. I don't have the Flume config lying around, but there wasn't
anything too tricky about it.

Note that this sink is not production ready and does not produce any
metrics output.

# SPARK - No longer used, kept for reference
 - name: Get spark sink dependencies
   get_url: url="http://search.maven.org/remotecontent?filepath=org/apache/spark/spark-streaming-flume-sink_2.10/1.3.1/spark-streaming-flume-sink_2.10-1.3.1.jar" dest={{install_directory}}/lib/spark-streaming-flume-sink_2.10-1.3.1.jar

 - name: Get scala dependency for spark sink
   get_url: url="http://search.maven.org/remotecontent?filepath=org/scala-lang/scala-library/2.10.4/scala-library-2.10.4.jar" dest={{install_directory}}/lib/scala-library-2.10.4.jar

 - name: Get spark, needed for guava at org/apache/spark-project/util/guava
   get_url: url="http://d3kbcqa49mib13.cloudfront.net/spark-1.3.1-bin-hadoop2.6.tgz" dest=/tmp/spark-1.3.1-bin-hadoop2.6.tgz

 - name: Extract spark
   command: tar -xvzf /tmp/spark-1.3.1-bin-hadoop2.6.tgz -C /tmp/ creates=/tmp/spark-1.3.1-bin-hadoop2.6

 - name: Copy spark libs to flume/lib
   shell: rsync -ci /tmp/spark-1.3.1-bin-hadoop2.6/lib/spark-assembly-1.3.1-hadoop2.6.0.jar {{install_directory}}/lib/spark-assembly-1.3.1-hadoop2.6.0.jar
   register: rsync_result
   changed_when: "rsync_result.stdout != ''"

 - name: Get latest avro, needed for spark sink to not break
   get_url: url="http://archive.apache.org/dist/avro/avro-1.7.7/java/{{item}}" dest={{install_directory}}/lib/{{item}}
   with_items:
   - avro-1.7.7.jar
   - avro-ipc-1.7.7.jar

 - name: Make sure old avro jars don't exist
   file: path={{install_directory}}/lib/{{item}} state=absent
   with_items:
   - avro-1.7.3.jar
   - avro-ipc-1.7.3.jar
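
For anyone who needs a starting point on the Flume side, the sink stanza
described in the Spark integration guide linked above looks roughly like the
following. This is only a sketch, not our actual config: the agent name
(agent), the channel name (memoryChannel), the hostname and the port are all
placeholders you would need to adapt.

```
# Hypothetical agent and channel names; adjust to match your own agent.
agent.sinks = spark
agent.sinks.spark.type = org.apache.spark.streaming.flume.sink.SparkSink
# Placeholder: hostname of the machine running this Flume agent
agent.sinks.spark.hostname = flume-host.example.com
# Placeholder: port the sink listens on for connections from Spark
agent.sinks.spark.port = 9999
agent.sinks.spark.channel = memoryChannel
```

The Spark Streaming job then polls that hostname and port for events, so the
sink jars from the playbook above need to be in flume/lib before the agent
starts.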

HTH,

-- 
Iain Wright


On Tue, Sep 1, 2015 at 9:51 AM, Sutanu Das <sd2302@att.com> wrote:

> Hi Team,
>
> Is there any way to stream Flume events to a Spark sink?
>
> Is there any way to stream Flume events to a Storm sink?
>
> If anyone has successfully accomplished this, we would love to hear about
> the high-level configs….
>
> Thanks!