flume-user mailing list archives

From Sutanu Das <sd2...@att.com>
Subject RE: Need Urgent Help (please) with HTTP Source/JSON Handling
Date Fri, 04 Sep 2015 18:04:04 GMT
Huge THANKS Hari.

I just did this per your recommendation and the docs, and it worked! I can now see the body
data in the HDFS file. Yay!

curl -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '[{"headers": {"a":"b", "c":"d"}, "body": "jonathan_sutaun_body"}]' http://localhost:8889


Question:

Is it possible to get the header contents/values into the HDFS file as well, alongside the body contents?
For that, do we need to write our own custom interceptor?

Our developer says he is passing JSON data from a Python script as dict (key, value) pairs,
and we need both the header and body contents in HDFS.

Please advise!
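
One possible client-side workaround, sketched here under assumptions rather than as a confirmed Flume answer: since only the body shows up in the HDFS file in this setup, the Python script could fold the header dict into the body string before posting. The function name `event_with_headers_in_body` is hypothetical, and the sample values are the ones from the curl test above.

```python
import json


def event_with_headers_in_body(headers, body):
    """Build one event whose body also carries a copy of its headers.

    Hypothetical helper: the event body becomes a JSON string that holds
    both the header dict and the original body text.
    """
    # Coerce header keys and values to strings before serializing.
    str_headers = {str(k): str(v) for k, v in headers.items()}
    combined = json.dumps({"headers": str_headers, "body": body}, sort_keys=True)
    return {"headers": str_headers, "body": combined}


# The HTTP source is sent a JSON array of events, even for a single event.
payload = json.dumps(
    [event_with_headers_in_body({"a": "b", "c": "d"}, "jonathan_sutaun_body")]
)
print(payload)
```

The printed string could then be sent as the POST data in place of the literal JSON in the curl command. Whether an interceptor or a different sink setting would be a better fit is left open here.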

From: Hari Shreedharan [mailto:hshreedharan@cloudera.com]
Sent: Friday, September 04, 2015 12:43 PM
To: user@flume.apache.org
Subject: Re: Need Urgent Help (please) with HTTP Source/JSON Handling

The JSONHandler requires the data to be in a specific format: https://flume.apache.org/releases/content/1.5.0/apidocs/org/apache/flume/source/http/JSONHandler.html
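
As a rough illustration of that format: the handler expects a JSON array of events, each with a "headers" map and a "body" string. The sketch below builds such a payload from Python dicts. The helper names `make_event` and `make_payload` are hypothetical, and coercing header values to strings is an assumption made to keep the headers a plain string-to-string map.

```python
import json


def make_event(headers, body):
    """Return one event dict with a "headers" map and a "body" string."""
    # Assumption: headers are sent as string keys and string values.
    return {
        "headers": {str(k): str(v) for k, v in headers.items()},
        "body": str(body),
    }


def make_payload(events):
    """Serialize (headers, body) pairs as a JSON array of events."""
    # The payload is always a list, even when there is only one event.
    return json.dumps([make_event(h, b) for h, b in events])


payload = make_payload([({"a": "b", "c": "d"}, "jonathan_sutaun_body")])
print(payload)
```

A payload such as [{"id":100}] has no "body" field, which would presumably explain why the resulting HDFS files contained no data.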


Thanks,
Hari

On Fri, Sep 4, 2015 at 10:38 AM, Sutanu Das <sd2302@att.com> wrote:
Dear Community,

We are trying to send HTTP/JSON messages. Flume reports no errors, but all the files in HDFS are empty
(no data seen). We are passing events as JSON strings, yet when we look at the files in HDFS, we
see no data.

Is there an HDFS “sink” parameter to show JSON data in HDFS?

We are testing this with a simple command like this:

curl -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '[{"id":100}]' http://localhost:8889

We are passing JSON as a string, and the HDFS file is created, yet no data is seen inside it.


Please HELP, please!




Here is our config:

ale.sources = source1
ale.channels = channel1
ale.sinks =  sink1

# Define the source
ale.sources.source1.type = http
#ale.sources.source1.handler = org.apache.flume.source.http.JSONHandler
ale.sources.source1.port = 8889
ale.sources.source1.bind = 0.0.0.0

# Define the channel 1
ale.channels.channel1.type = memory
ale.channels.channel1.capacity = 10000000
ale.channels.channel1.transactionCapacity = 10000000

# Define the HDFS sink
ale.sinks.sink1.type = hdfs
ale.sinks.sink1.channel = channel1
ale.sinks.sink1.hdfs.path = hdfs://ham-dal-d001.corp.wayport.net:8020/prod/hadoop/smallsite/flume_ingest_ale2_hak_dev/station/%Y/%m/%d/%H
#ale.sinks.sink1.hdfs.fileType = DataStream
#ale.sinks.sink1.hdfs.writeFormat = Text
ale.sinks.sink1.hdfs.filePrefix = Ale_2_topology_http_json_raw
ale.sinks.sink1.hdfs.useLocalTimeStamp = true
ale.sinks.sink1.hdfs.round = true
ale.sinks.sink1.hdfs.roundValue = 1
ale.sinks.sink1.hdfs.roundUnit = hour
ale.sinks.sink1.hdfs.callTimeout = 10000000000
