flume-user mailing list archives

From Jeff Alfeld <jalf...@gmail.com>
Subject Re: Flume filechannel fails to initialize
Date Fri, 13 Nov 2015 15:53:30 GMT
I agree that in an ideal situation upgrading the infrastructure or adding
nodes would be the way to go, but in the real world we sometimes have to
work within a set of limitations that are outside our control.

Jeff

On Fri, Nov 13, 2015 at 4:33 AM Ahmed Vila <avila@devlogic.eu> wrote:

> Hi Hari,
>
> I think it's actually more than enough. Given the purpose and architecture
> of Flume-ng, it feels wrong to allow it to become an all-around queue.
> The more you need on a single node, the more vulnerable your system becomes
> to a full-stop situation. That's an obvious sign that your infrastructure
> architecture is flawed to begin with.
>
> Instead, invest in infrastructure, add redundant Flume instances, and do
> round-robin load balancing.
>
>
> On Thu, Nov 12, 2015 at 8:19 PM, Hari Shreedharan <
> hshreedharan@cloudera.com> wrote:
>
>> So there are a couple of issues related to int overflows. Basically, the
>> checkpoint file is mmap-ed, so indexing is on an integer, and since we read
>> 16 bytes per event, the total number of events can be about 2 billion / 16
>> or so (give or take), so your channel capacity needs to be below that. I
>> have not looked at the exact numbers, but this is an approximate range. If
>> this is something that concerns you, please file a jira. I wanted to get to
>> this at some point, but didn’t see the urgency.
>>
>> Thanks,
>> Hari Shreedharan
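
As a rough illustration of the arithmetic described above, the following Python sketch computes the approximate ceiling on channel capacity under the stated assumptions: a signed 32-bit index and 16 bytes per event. Both figures are the approximations given in the message, not values checked against the Flume source, and the constant and function names are hypothetical.

```python
# Rough upper bound on file channel capacity, using the approximate figures
# from the message above. These are assumptions, not verified Flume internals.

INT_MAX = 2**31 - 1      # largest signed 32-bit index (Java's Integer.MAX_VALUE)
BYTES_PER_EVENT = 16     # approximate per-event checkpoint size quoted above


def approx_max_capacity(index_limit: int = INT_MAX,
                        bytes_per_event: int = BYTES_PER_EVENT) -> int:
    """Return the largest event count whose byte offsets fit in the index."""
    return index_limit // bytes_per_event


def fits_in_checkpoint(capacity: int) -> bool:
    """True if `capacity` events stay within the approximate bound."""
    return capacity <= approx_max_capacity()


if __name__ == "__main__":
    print(approx_max_capacity())
    print(fits_in_checkpoint(100_000_000))
```

With these figures the bound comes out to roughly 134 million events. Because the inputs are approximate, the result is a ballpark rather than an exact limit.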
>>
>> On Nov 12, 2015, at 8:39 AM, Jeff Alfeld <jalfeld@gmail.com> wrote:
>>
>> Now that the channels are working again, it raises the question of why
>> this occurred. If there is a theoretical limit to a filechannel size outside
>> of disk space limitations, what is that limit?
>>
>> Jeff
>>
>> On Thu, Nov 12, 2015 at 10:23 AM Jeff Alfeld <jalfeld@gmail.com> wrote:
>>
>>> Thanks for the assist. Clearing the directories once more and lowering
>>> the capacity of the channel has allowed the service to start successfully
>>> on this server.
>>>
>>> Jeff
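
Since lowering the capacity is what allowed the service to start here, it can help to list the configured capacities before restarting an agent. A minimal sketch in Python, assuming keys of the form <agent>.channels.<name>.capacity as in the config quoted later in this thread. The function names and the limit passed in are hypothetical and not part of Flume.

```python
def channel_capacities(properties_text: str) -> dict:
    """Map each channel name to its configured capacity."""
    capacities = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        # Expect keys shaped like <agent>.channels.<channel>.capacity
        if len(parts) == 4 and parts[1] == "channels" and parts[3] == "capacity":
            capacities[parts[2]] = int(value)
    return capacities


def oversized_channels(properties_text: str, limit: int) -> list:
    """Return the sorted names of channels whose capacity exceeds `limit`."""
    found = channel_capacities(properties_text)
    return sorted(name for name, cap in found.items() if cap > limit)


# Channel settings copied from the config quoted in this thread.
SAMPLE = """
agent.channels.bluecoat-channel.type = file
agent.channels.bluecoat-channel.capacity = 100000000
agent.channels.fs-channel.type = file
agent.channels.fs-channel.capacity = 100000000
"""

if __name__ == "__main__":
    # The limit is chosen by the caller; 1,000,000 here is only an example.
    print(oversized_channels(SAMPLE, limit=1_000_000))
```

The parser is deliberately simple: it does not handle line continuations or check the channel type, so it is only suitable for a quick sanity check.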
>>>
>>> On Thu, Nov 12, 2015 at 10:03 AM Ahmed Vila <avila@devlogic.eu> wrote:
>>>
>>>> The configured channel capacity (100000000) seems excessive to me. Try
>>>> lowering it.
>>>> Please check that you have at least 512MB of free space on the device
>>>> where you're storing channel data and checkpoint.
>>>>
>>>> To me, it looks like it tries to replay the channel log but encounters
>>>> an EOF. Please make sure that there are no hidden files in there.
>>>> Removing the settings for the data and checkpoint dirs might be the best
>>>> thing to try first, so that it creates ~/.flume/file-channel/checkpoint
>>>> and ~/.flume/file-channel/data.
>>>>
>>>> Finally, you might want to try setting use-fast-replay or even
>>>> use-log-replay-v1 to true.
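
The first two checks suggested above (free space on the device, and no hidden files in the channel directories) can be scripted. A minimal sketch in Python: the 512 MB threshold comes from the message, while the function name and the fields of the returned dictionary are hypothetical.

```python
import os
import shutil

MIN_FREE_BYTES = 512 * 1024 * 1024  # 512 MB, the figure suggested above


def inspect_channel_dir(path: str, min_free_bytes: int = MIN_FREE_BYTES) -> dict:
    """Report free space on the device holding `path` and hidden entries in it."""
    free = shutil.disk_usage(path).free
    # os.listdir never returns "." or "..", so any dot-prefixed name is a real entry.
    hidden = sorted(name for name in os.listdir(path) if name.startswith("."))
    return {
        "path": path,
        "free_bytes": free,
        "enough_space": free >= min_free_bytes,
        "hidden_entries": hidden,
    }


if __name__ == "__main__":
    import sys

    # Pass the checkpoint and data directories as command-line arguments.
    for directory in sys.argv[1:]:
        print(inspect_channel_dir(directory))
```

Run it against both the checkpoint directory and each data directory, since they may live on different devices.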
>>>>
>>>>
>>>> On Tue, Nov 10, 2015 at 5:38 PM, Jeff Alfeld <jalfeld@gmail.com> wrote:
>>>>
>>>>> I am having an issue on a server that I am standing up to forward log
>>>>> data from a spooling directory to our hadoop cluster. I am receiving
>>>>> the following errors when flume is starting up:
>>>>>
>>>>> 10 Nov 2015 16:13:25,751 INFO  [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:145)  - Starting Channel bluecoat-channel
>>>>> 10 Nov 2015 16:13:25,751 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.start:269)  - Starting FileChannel bluecoat-channel { dataDirs: [/Dropbox/flume_tmp/bluecoat-channel/data] }...
>>>>> 10 Nov 2015 16:13:25,751 INFO  [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:145)  - Starting Channel fs-channel
>>>>> 10 Nov 2015 16:13:25,751 INFO  [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.FileChannel.start:269)  - Starting FileChannel fs-channel { dataDirs: [/Dropbox/flume_tmp/fs-channel/data] }...
>>>>> 10 Nov 2015 16:13:25,778 INFO  [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.Log.<init>:336)  - Encryption is not enabled
>>>>> 10 Nov 2015 16:13:25,778 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.<init>:336)  - Encryption is not enabled
>>>>> 10 Nov 2015 16:13:25,779 INFO  [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.Log.replay:382)  - Replay started
>>>>> 10 Nov 2015 16:13:25,779 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.replay:382)  - Replay started
>>>>> 10 Nov 2015 16:13:25,780 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.replay:394)  - Found NextFileID 0, from []
>>>>> 10 Nov 2015 16:13:25,780 INFO  [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.Log.replay:394)  - Found NextFileID 0, from []
>>>>> 10 Nov 2015 16:13:25,784 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.replay:492)  - Failed to initialize Log on [channel=bluecoat-channel]
>>>>> java.io.EOFException
>>>>>     at java.io.RandomAccessFile.readInt(RandomAccessFile.java:827)
>>>>>     at java.io.RandomAccessFile.readLong(RandomAccessFile.java:860)
>>>>>     at org.apache.flume.channel.file.EventQueueBackingStoreFactory.get(EventQueueBackingStoreFactory.java:80)
>>>>>     at org.apache.flume.channel.file.Log.replay(Log.java:426)
>>>>>     at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:290)
>>>>>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>> 10 Nov 2015 16:13:25,786 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.start:301)  - Failed to start the file channel [channel=bluecoat-channel]
>>>>> java.io.EOFException
>>>>>     at java.io.RandomAccessFile.readInt(RandomAccessFile.java:827)
>>>>>     at java.io.RandomAccessFile.readLong(RandomAccessFile.java:860)
>>>>>     at org.apache.flume.channel.file.EventQueueBackingStoreFactory.get(EventQueueBackingStoreFactory.java:80)
>>>>>     at org.apache.flume.channel.file.Log.replay(Log.java:426)
>>>>>     at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:290)
>>>>>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>> 10 Nov 2015 16:13:25,784 ERROR [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.Log.replay:492)  - Failed to initialize Log on [channel=fs-channel]
>>>>> java.io.EOFException
>>>>>     at java.io.RandomAccessFile.readInt(RandomAccessFile.java:827)
>>>>>     at java.io.RandomAccessFile.readLong(RandomAccessFile.java:860)
>>>>>     at org.apache.flume.channel.file.EventQueueBackingStoreFactory.get(EventQueueBackingStoreFactory.java:80)
>>>>>     at org.apache.flume.channel.file.Log.replay(Log.java:426)
>>>>>     at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:290)
>>>>>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>> 10 Nov 2015 16:13:25,787 ERROR [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.FileChannel.start:301)  - Failed to start the file channel [channel=fs-channel]
>>>>> java.io.EOFException
>>>>>     at java.io.RandomAccessFile.readInt(RandomAccessFile.java:827)
>>>>>     at java.io.RandomAccessFile.readLong(RandomAccessFile.java:860)
>>>>>     at org.apache.flume.channel.file.EventQueueBackingStoreFactory.get(EventQueueBackingStoreFactory.java:80)
>>>>>     at org.apache.flume.channel.file.Log.replay(Log.java:426)
>>>>>     at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:290)
>>>>>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>> Any suggestions on why this is occurring? I have tried stopping the
>>>>> service and clearing the contents of the data and checkpoint directories
>>>>> with no change. I have also verified that the flume daemon user account
>>>>> has full permissions to the checkpoint and data directories.
>>>>>
>>>>> Below is the config that I am currently trying to use:
>>>>>
>>>>>
>>>>> #global
>>>>> agent.sources = bluecoat-src fs-src
>>>>> agent.channels = bluecoat-channel fs-channel
>>>>> agent.sinks = bc-avro fs-avro
>>>>>
>>>>>
>>>>> #kc bluecoat logs
>>>>> agent.sources.bluecoat-src.type = spooldir
>>>>> agent.sources.bluecoat-src.channels = bluecoat-channel
>>>>> agent.sources.bluecoat-src.spoolDir = /Dropbox/flume
>>>>> agent.sources.bluecoat-src.basenameHeader = true
>>>>> agent.sources.bluecoat-src.basenameHeaderKey = basename
>>>>> agent.sources.bluecoat-src.deserializer = line
>>>>> agent.sources.bluecoat-src.deserializer.maxLineLength = 32000
>>>>> agent.sources.bluecoat-src.deletePolicy = immediate
>>>>> agent.sources.bluecoat-src.decodeErrorPolicy = IGNORE
>>>>> agent.sources.bluecoat-src.maxBackoff = 10000
>>>>>
>>>>> agent.channels.bluecoat-channel.type = file
>>>>> agent.channels.bluecoat-channel.capacity = 100000000
>>>>> agent.channels.bluecoat-channel.checkpointDir = /Dropbox/flume_tmp/bluecoat-channel/checkpoint
>>>>> agent.channels.bluecoat-channel.dataDirs = /Dropbox/flume_tmp/bluecoat-channel/data
>>>>>
>>>>> agent.sinks.bc-avro.type = avro
>>>>> agent.sinks.bc-avro.channel = bluecoat-channel
>>>>> agent.sinks.bc-avro.hostname = {destination server address}
>>>>> agent.sinks.bc-avro.port = 4141
>>>>> agent.sinks.bc-avro.batch-size = 250
>>>>> agent.sinks.bc-avro.compression-type = deflate
>>>>> agent.sinks.bc-avro.compression-level = 9
>>>>>
>>>>>
>>>>> #kc fs logs
>>>>> agent.sources.fs-src.type = spooldir
>>>>> agent.sources.fs-src.channels = fs-channel
>>>>> agent.sources.fs-src.spoolDir = /Dropbox/fs
>>>>> agent.sources.fs-src.deserializer = line
>>>>> agent.sources.fs-src.deserializer.maxLineLength = 32000
>>>>> agent.sources.fs-src.deletePolicy = immediate
>>>>> agent.sources.fs-src.decodeErrorPolicy = IGNORE
>>>>> agent.sources.fs-src.maxBackoff = 10000
>>>>>
>>>>> agent.channels.fs-channel.type = file
>>>>> agent.channels.fs-channel.capacity = 100000000
>>>>> agent.channels.fs-channel.checkpointDir = /Dropbox/flume_tmp/fs-channel/checkpoint
>>>>> agent.channels.fs-channel.dataDirs = /Dropbox/flume_tmp/fs-channel/data
>>>>>
>>>>> agent.sinks.fs-avro.type = avro
>>>>> agent.sinks.fs-avro.channel = fs-channel
>>>>> agent.sinks.fs-avro.hostname = {destination server address}
>>>>> agent.sinks.fs-avro.port = 4145
>>>>> agent.sinks.fs-avro.batch-size = 250
>>>>> agent.sinks.fs-avro.compression-type = deflate
>>>>> agent.sinks.fs-avro.compression-level = 9
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Best regards,
>>>> Ahmed Vila | Senior software developer
>>>> DevLogic | Sarajevo | Bosnia and Herzegovina
>>>>
>>>> Office : +387 33 942 123
>>>> Mobile: +387 62 139 348
>>>>
>>>> Website: www.devlogic.eu
>>>> E-mail   : avila@devlogic.eu
>>>> ---------------------------------------------------------------------
>>>> This e-mail and any attachment is for authorised use by the intended
>>>> recipient(s) only. This email contains confidential information. It should
>>>> not be copied, disclosed to, retained or used by, any party other than the
>>>> intended recipient. Any unauthorised distribution, dissemination or copying
>>>> of this E-mail or its attachments, and/or any use of any information
>>>> contained in them, is strictly prohibited and may be illegal. If you are
>>>> not an intended recipient then please promptly delete this e-mail and any
>>>> attachment and all copies and inform the sender directly via email. Any
>>>> emails that you send to us may be monitored by systems or persons other
>>>> than the named communicant for the purposes of ascertaining whether the
>>>> communication complies with the law and company policies.
>>>
>>>
>>
>
>
> --
>
> Best regards,
> Ahmed Vila | Senior software developer
> DevLogic | Sarajevo | Bosnia and Herzegovina
>
> Office : +387 33 942 123
> Mobile: +387 62 139 348
>
> Website: www.devlogic.eu
> E-mail   : avila@devlogic.eu
