In my experience with Flume's S3 sink, these 404 errors do not indicate a real problem. I have been using Flume to write to S3 for the last 6 months, and I see these errors all the time without any data loss. At the beginning I was worried and posted on this mailing list asking for clarification, but no definitive conclusion was reached. Now I tend to believe they are just incorrectly reported.

Shuang

On Tue, Nov 22, 2011 at 9:45 AM, Mark Lewandowski <mark.e.lewandowski@gmail.com> wrote:
I wrote to this list with a similar problem last week.  I started noticing the 404s after upgrading to flume 0.9.4.  The weird part is, most of the requests to S3 are 404-ing, but not all.  Eric Sammer suggested it might be due to data inconsistency on the S3 side, but I'm not sure I believe that.  I also posted a question about this on the S3 forum, but haven't heard back yet.

-Mark


On Tue, Nov 22, 2011 at 8:26 AM, Alexander C.H. Lorenz <wget.null@googlemail.com> wrote:
I'm not sure. In your log I see a lot of 404 responses instead of 200 (which means that some buckets could not be loaded) from an S3 instance (org.jets3t.service.impl.rest.httpclient.RestS3Service). All the warnings concern the same file (the one ending in 0025), and in the end Flume gives up. To me it looks like an S3 problem when writing the file (collectorSink("s3n://hooklogic-data-east/flume/%Y/%m/%d/%H","syslog")).

best,
 Alex


On Tue, Nov 22, 2011 at 5:14 PM, Jonathan Meed <jmeed@umich.edu> wrote:
Sorry to seem a little dense. So if I understand this correctly, the Flume collector is having issues connecting to the Flume master and is therefore erroring? Both the Flume collector in question and the master are on the same physical machine, and none of the other Flume nodes on different machines are showing any errors, which is peculiar. Any suggestions on how to fix this?


Jonathan Meed
University of Michigan
School of Engineering Class of 2013



On Tue, Nov 22, 2011 at 11:07 AM, Alexander C.H. Lorenz <wget.null@googlemail.com> wrote:
That exception is thrown when the master RPC isn't reachable:

- Alex

On Tue, Nov 22, 2011 at 4:58 PM, Jonathan Meed <jmeed@umich.edu> wrote:
It appears that S3 is working. I can see new flume events getting added in my S3 bucket.

Jonathan Meed
University of Michigan
School of Engineering Class of 2013



On Tue, Nov 22, 2011 at 10:55 AM, Alexander C.H. Lorenz <wget.null@googlemail.com> wrote:
Yes, it looks like an issue with your S3 instance. Is it running and available?

- alex


On Tue, Nov 22, 2011 at 4:52 PM, Jonathan Meed <jmeed@umich.edu> wrote:
Hi,

I am actually using a multi-sink for HDFS and S3. Could that be the issue? The config I used is below.


config [beacon_flume02use, autoCollectorSource, [collectorSink("hdfs://107.20.248.101/user/flume/beaconlog","syslog" , 30000), collectorSink("s3n://hooklogic-data-east/flume/%Y/%m/%d/%H","syslog") ], exec, config, beacon_flume01use, autoCollectorSource, [collectorSink("hdfs://ip-address/user/flume/beaconlog","syslog" , 30000), collectorSink("s3n://east/flume/%Y/%m/%d/%H","syslog") ]]
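Note on the S3 sink path in the config above: the %Y, %m, %d and %H escapes are filled in from each event's timestamp, so every hour gets its own directory. The Python sketch below only mimics that expansion with strftime to show the kind of path that results. It is not Flume code; the helper name expand_bucket_path and the sample timestamp are made up for illustration, and time zone handling is ignored.

```python
from datetime import datetime


def expand_bucket_path(template: str, ts: datetime) -> str:
    """Mimic the date escapes in a sink path (hypothetical helper, not Flume's API)."""
    # Only the four escapes that appear in the config above are handled.
    for esc in ("%Y", "%m", "%d", "%H"):
        template = template.replace(esc, ts.strftime(esc))
    return template


template = "s3n://hooklogic-data-east/flume/%Y/%m/%d/%H"
sample_ts = datetime(2011, 11, 20, 3, 15)  # assumed event time, for illustration only
print(expand_bucket_path(template, sample_ts))
```

With the assumed timestamp this yields a path under flume/2011/11/20/03, which has the same shape as the object key that appears in the 404 warning quoted further down the thread.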
 

Thanks

Jonathan Meed
University of Michigan
School of Engineering Class of 2013



On Tue, Nov 22, 2011 at 10:50 AM, Alexander C.H. Lorenz <wget.null@googlemail.com> wrote:
Hi,

Does this run on Amazon's S3?

org.jets3t.service.impl.rest.httpclient.RestS3Service: Response '/flume%2F2011%2F11%2F20%2F03%2Fsyslog20111121-193628975-0500.2960659853113025.00000025_%24folder%24' - Unexpected response code 404, expected 200
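Note on reading the warning above: the object key is percent-encoded. Decoding it shows that the request was for a key ending in _$folder$, which is the suffix Hadoop's s3n filesystem uses for directory placeholder objects. A 404 on such a lookup may simply mean that the placeholder does not exist, which would fit the reports elsewhere in this thread that data still arrives in the bucket. That reading is an interpretation and was not confirmed on the list. The sketch below is a small Python helper for decoding such lines; the function name parse_warning and the regular expression are assumptions written for this one log format.

```python
import re
from urllib.parse import unquote

# The warning exactly as quoted in the message above.
LOG_LINE = (
    "org.jets3t.service.impl.rest.httpclient.RestS3Service: Response "
    "'/flume%2F2011%2F11%2F20%2F03%2Fsyslog20111121-193628975-0500."
    "2960659853113025.00000025_%24folder%24' - Unexpected response code 404, "
    "expected 200"
)

# Assumed pattern, derived only from the single line quoted in this thread.
PATTERN = re.compile(
    r"Response '(?P<key>[^']+)' - Unexpected response code (?P<got>\d+), "
    r"expected (?P<want>\d+)"
)


def parse_warning(line):
    """Return the decoded key and status codes, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    key = unquote(match.group("key"))  # %2F becomes "/", %24 becomes "$"
    return {
        "key": key,
        "got": int(match.group("got")),
        "want": int(match.group("want")),
        # True when the request was for a directory placeholder object.
        "is_folder_marker": key.endswith("_$folder$"),
    }


info = parse_warning(LOG_LINE)
print(info["key"])
print(info["is_folder_marker"])
```

Running the same helper over a whole collector log and checking is_folder_marker is one way to see whether every 404 is for a placeholder key or whether some are for real data files.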

Check if the trackers are running.

- alex


On Tue, Nov 22, 2011 at 4:45 PM, Jonathan Meed <jmeed@umich.edu> wrote:
Hi,

I had a Flume cluster operating pretty well over the last few weeks. In the past two days, 1 of my 2 Flume collectors started erroring every few minutes. After a restart it works fine, moving files for only a few minutes before stopping with the same error. Here's a link to the weblog. Any help would be greatly appreciated.


Jonathan Meed
University of Michigan
College of Engineering Class of 2013




--
Alexander Lorenz

Think of the environment: please don't print this email unless you really need to.





