flume-user mailing list archives

From Vikram Kulkarni <vikulka...@expedia.com>
Subject Re: Unable to setup HDFS sink
Date Mon, 14 Jan 2013 07:41:54 GMT
Got that. Was just reading about that and other options.
Thanks again!

From: Nitin Pawar <nitinpawar432@gmail.com>
Reply-To: "user@flume.apache.org" <user@flume.apache.org>
Date: Sunday, January 13, 2013 11:39 PM
To: "user@flume.apache.org" <user@flume.apache.org>
Subject: Re: Unable to setup HDFS sink

That might be the default HDFS sink rollover time.

You can always configure it the way you want:
1) Number of events in each file
2) How much time passes before the file gets rolled over
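
For illustration, the two options above correspond to the HDFS sink's roll properties in the Flume User Guide. This is a minimal sketch, assuming the agent and sink names (agent1, hdfssink) from the config quoted further down in this thread; the values are arbitrary examples, not recommendations. The documented default for hdfs.rollInterval is 30 seconds, which matches the gap between the 'Creating' and 'Renaming' log lines.

```properties
# Roll the current file after 60 seconds (0 disables time-based rolling)
agent1.sinks.hdfssink.hdfs.rollInterval = 60
# Roll the current file after 1000 events (0 disables count-based rolling)
agent1.sinks.hdfssink.hdfs.rollCount = 1000
# Roll the current file after about 1 MB (0 disables size-based rolling)
agent1.sinks.hdfssink.hdfs.rollSize = 1048576
```

Whichever limit is reached first triggers the roll, so set the limits you do not want to 0.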

On Mon, Jan 14, 2013 at 1:06 PM, Vikram Kulkarni <vikulkarni@expedia.com> wrote:
Thanks for your prompt replies. I had switched my core-site.xml and am now using port 8020.
That worked. However, once I send the event to the Flume source, it is correctly output
to the console, but the following messages appear in the log:
2013-01-13 23:28:20,178 (hdfs-hdfssink-call-runner-0) [INFO - org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:208)]
Creating hdfs://localhost:8020/usr/FlumeData.1358148499961.tmp
2013-01-13 23:28:50,237 (hdfs-hdfssink-roll-timer-0) [INFO - org.apache.flume.sink.hdfs.BucketWriter.renameBucket(BucketWriter.java:427)]
Renaming hdfs://localhost:8020/usr/FlumeData.1358148499961.tmp to hdfs://localhost:8020/usr/FlumeData.1358148499961

Notice the time difference between the 'Creating...' and 'Renaming...' lines. Is a gap of
about 30 seconds normal?

Then when I actually go to the dfs file system I do find the FlumeData.1358148499961 file
as expected.


From: Nitin Pawar <nitinpawar432@gmail.com>
Reply-To: "user@flume.apache.org" <user@flume.apache.org>
Date: Sunday, January 13, 2013 11:07 PM
To: "user@flume.apache.org" <user@flume.apache.org>
Subject: Re: Unable to setup HDFS sink

It's a JobTracker URI.

There should be a setting in your hdfs-site.xml and core-site.xml which looks like

You need to use that value.
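
A minimal sketch of what using that value could look like in the Flume config from the original question, assuming the NameNode address in core-site.xml is hdfs://localhost:8020 (the port reported as working elsewhere in this thread). Port 50030 serves the JobTracker web page, so it cannot be used in hdfs.path. The /flume/events directory is carried over from the original config, and hdfs.fileType is the spelling given in the Flume User Guide.

```properties
# Point the sink at the NameNode address from core-site.xml (assumed here
# to be hdfs://localhost:8020), not at the JobTracker web UI port 50030
agent1.sinks.hdfssink.type = hdfs
agent1.sinks.hdfssink.hdfs.path = hdfs://localhost:8020/flume/events
# Documented property name is hdfs.fileType (no dot before "Type")
agent1.sinks.hdfssink.hdfs.fileType = DataStream
```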

On Jan 14, 2013 12:34 PM, "Vikram Kulkarni" <vikulkarni@expedia.com> wrote:
I was able to write using the same hdfs conf from a different sink.
Also, I can open the MapRed administration page successfully at
http://localhost:50030/jobtracker.jsp, so that should indicate that the
hdfs path below is valid, right? Is there any other way to check?


On 1/13/13 10:57 PM, "Alexander Alten-Lorenz" <wget.null@gmail.com> wrote:

>Check your HDFS cluster; it's not responding on localhost/<>
>- Alex
>On Jan 14, 2013, at 7:43 AM, Vikram Kulkarni <vikulkarni@expedia.com> wrote:
>> I am trying to set up an HDFS sink for HTTPSource, but I get the
>>following exception when I try to send a simple JSON event. I am also
>>using a logger sink and I can clearly see the event output to the
>>console window, but it fails to write to HDFS. In a separate conf file
>>I have also successfully written to an HDFS sink.
>> Thanks,
>> Vikram
>> Exception:
>> [WARN -
>> HDFS IO error
>> java.io.IOException: Call to localhost/<> failed on local exception: java.io.EOFException
>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:1144)
>> My conf file is as follows:
>> # flume-httphdfs.conf: A single-node Flume with HTTP source and HDFS sink configuration
>> # Name the components on this agent
>> agent1.sources = r1
>> agent1.channels = c1
>> # Describe/configure the source
>> agent1.sources.r1.type = org.apache.flume.source.http.HTTPSource
>> agent1.sources.r1.port = 5140
>> agent1.sources.r1.handler = org.apache.flume.source.http.JSONHandler
>> agent1.sources.r1.handler.nickname = random props
>> # Describe the sink
>> agent1.sinks = logsink hdfssink
>> agent1.sinks.logsink.type = logger
>> agent1.sinks.hdfssink.type = hdfs
>> agent1.sinks.hdfssink.hdfs.path = hdfs://localhost:50030/flume/events
>> agent1.sinks.hdfssink.hdfs.file.Type = DataStream
>> # Use a channel which buffers events in memory
>> agent1.channels.c1.type = memory
>> agent1.channels.c1.capacity = 1000
>> agent1.channels.c1.transactionCapacity = 100
>> # Bind the source and sink to the channel
>> agent1.sources.r1.channels = c1
>> agent1.sinks.logsink.channel = c1
>> agent1.sinks.hdfssink.channel = c1
>Alexander Alten-Lorenz
>German Hadoop LinkedIn Group: http://goo.gl/N8pCF

Nitin Pawar
