flume-user mailing list archives

From Mark Lewandowski <mark.e.lewandow...@gmail.com>
Subject Re: Flume collectors started crashing regularly all of a sudden
Date Tue, 22 Nov 2011 17:45:59 GMT
I wrote to this list with a similar problem last week.  I started noticing
the 404s after upgrading to flume 0.9.4.  The weird part is, most of the
requests to S3 are 404-ing, but not all.  Eric Sammer suggested it might be
due to data inconsistency on the S3 side, but I'm not sure I believe that.
I also posted a question about this on the S3 forum, but haven't heard back
yet.

-Mark

On Tue, Nov 22, 2011 at 8:26 AM, Alexander C.H. Lorenz <
wget.null@googlemail.com> wrote:

> I'm not sure. In your log I see a lot of 404 responses instead of 200 (which
> means that some buckets could not be loaded) from an S3 instance
> (org.jets3t.service.impl.rest.httpclient.RestS3Service). All the warnings
> concern the same file (0025 at the end), and eventually flume gives up.
> To me it looks like an S3 problem writing the file
> (s3n://hooklogic-data-east/flume/%Y/%m/%d/%H","syslog")
>
> best,
>  Alex
>
>
> On Tue, Nov 22, 2011 at 5:14 PM, Jonathan Meed <jmeed@umich.edu> wrote:
>
>> Sorry to seem a little dense. So, if I understand this correctly, the flume
>> collector is having issues connecting to the flume master and is therefore
>> erroring? Both the flume collector in question and the master are on the
>> same physical machine, and none of the other flume nodes on different
>> machines are showing any errors, which is peculiar. Any suggestions on how
>> to fix this?
>>
>>
>> Jonathan Meed
>> University of Michigan
>> School of Engineering Class of 2013
>> jmeed@umich.edu
>> 917-880-7974
>>
>>
>>
>> On Tue, Nov 22, 2011 at 11:07 AM, Alexander C.H. Lorenz <
>> wget.null@googlemail.com> wrote:
>>
>>> That exception is raised when the master RPC isn't reachable:
>>>
>>> https://svn.apache.org/repos/asf/incubator/flume/trunk/flume-core/src/main/java/com/cloudera/flume/handlers/endtoend/CollectorAckListener.java
>>>
>>> - Alex
>>>
>>> On Tue, Nov 22, 2011 at 4:58 PM, Jonathan Meed <jmeed@umich.edu> wrote:
>>>
>>>> It appears that S3 is working. I can see new flume events getting added
>>>> in my S3 bucket.
>>>>
>>>> Jonathan Meed
>>>> University of Michigan
>>>> School of Engineering Class of 2013
>>>> jmeed@umich.edu
>>>> 917-880-7974
>>>>
>>>>
>>>>
>>>> On Tue, Nov 22, 2011 at 10:55 AM, Alexander C.H. Lorenz <
>>>> wget.null@googlemail.com> wrote:
>>>>
>>>>> Yes, it looks like an issue with your S3 instance. Is it running and
>>>>> available?
>>>>>
>>>>> - alex
>>>>>
>>>>>
>>>>> On Tue, Nov 22, 2011 at 4:52 PM, Jonathan Meed <jmeed@umich.edu> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am actually using a multi-sink for HDFS and S3. Could that be the
>>>>>> issue? The config I used is below.
>>>>>>
>>>>>>
>>>>>> config [beacon_flume02use, autoCollectorSource,
>>>>>> [collectorSink("hdfs://107.20.248.101/user/flume/beaconlog","syslog", 30000),
>>>>>> collectorSink("s3n://hooklogic-data-east/flume/%Y/%m/%d/%H","syslog") ],
>>>>>> exec, config, beacon_flume01use, autoCollectorSource,
>>>>>> [collectorSink("hdfs://ip-address/user/flume/beaconlog","syslog", 30000),
>>>>>> collectorSink("s3n://east/flume/%Y/%m/%d/%H","syslog") ]]
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Jonathan Meed
>>>>>> University of Michigan
>>>>>> School of Engineering Class of 2013
>>>>>> jmeed@umich.edu
>>>>>>  917-880-7974
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 22, 2011 at 10:50 AM, Alexander C.H. Lorenz <
>>>>>> wget.null@googlemail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Does this run on Amazon's S3?
>>>>>>>
>>>>>>> org.jets3t.service.impl.rest.httpclient.RestS3Service: Response
>>>>>>> '/flume%2F2011%2F11%2F20%2F03%2Fsyslog20111121-193628975-0500.2960659853113025.00000025_%24folder%24'
>>>>>>> - Unexpected response code 404, expected 200
>>>>>>>
>>>>>>> Check if the trackers are running.
>>>>>>>
>>>>>>> - alex
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 22, 2011 at 4:45 PM, Jonathan Meed <jmeed@umich.edu> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I had a flume cluster operating pretty well over the last few
>>>>>>>> weeks. In the past two days, 1 of my 2 flume collectors started
>>>>>>>> erroring every few minutes. After a restart it would work fine,
>>>>>>>> moving files for only a few minutes before stopping with the same
>>>>>>>> error. Here's a link to the weblog. Any help would be greatly
>>>>>>>> appreciated.
>>>>>>>>
>>>>>>>> http://pastebin.com/fspsNdYC
>>>>>>>>
>>>>>>>> Jonathan Meed
>>>>>>>> University of Michigan
>>>>>>>> College of Engineering Class of 2013
>>>>>>>> jmeed@umich.edu
>>>>>>>> 917-880-7974
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Alexander Lorenz
>>>>>>> http://mapredit.blogspot.com
>>>>>>>
>>>>>>> *P **Think of the environment: please don't print this email unless
>>>>>>> you really need to.*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
>
>
>
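
The request path in the jets3t warning quoted earlier in this thread is percent-encoded, which makes it hard to see which object the 404 refers to. A minimal sketch for decoding it follows. The helper name `decode_key` is hypothetical, and reading the `_$folder$` suffix as a directory placeholder written by Hadoop's s3n filesystem is an assumption that nobody in the thread confirms. If that assumption holds, a 404 on such a key may only mean that the placeholder does not exist, not that the data write itself failed.

```python
from urllib.parse import unquote

# Request path copied verbatim from the RestS3Service warning in this thread.
RAW_PATH = (
    "/flume%2F2011%2F11%2F20%2F03%2F"
    "syslog20111121-193628975-0500.2960659853113025.00000025_%24folder%24"
)

# Assumption: s3n marks "directories" with placeholder objects using this suffix.
FOLDER_SUFFIX = "_$folder$"


def decode_key(path):
    """Return (decoded_key, looks_like_folder_marker) for an encoded request path."""
    key = unquote(path).lstrip("/")
    return key, key.endswith(FOLDER_SUFFIX)


key, is_marker = decode_key(RAW_PATH)
print(key)        # the decoded object key
print(is_marker)  # True when the key ends with the folder-marker suffix
```

Running the same helper over every 404 line in the collector log would show whether all of the failing requests are placeholder lookups or whether some of them target real data files.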
