flume-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Brock Noland <br...@cloudera.com>
Subject Re: .SpoolingFileLineReader warning....
Date Tue, 20 Nov 2012 17:01:10 GMT
Are /var/log and /mnt/flume on the same file system? mv across file
systems is a cp and then delete.

On Tue, Nov 20, 2012 at 10:59 AM, Dan Young <danoyoung@gmail.com> wrote:
> Coolio, thank you Brock.
>
> Did a quick test with a mv vs cp.....this is one test, repeated a few
> times....the smaller file (~60M) seemed to work fine w/ a mv, but a larger
> file we're seeing the same behavior....
>
>
> Start with no logs in the SpoolingDirectory:
>
> ls -lrt /mnt/flume/clickstream
>
> /mnt/flume/clickstream:
> total 0
>
>
>
> Review the logs that will be rotated via logrotate.d into the respective
> directory; /mnt/flume/clickstream
>
> ls -lrt /var/log/clickstream
> /var/log/clickstream:
> total 64112
> -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.log
>
>
>
> Review logrotate config in /etc/logrotate.d. Note here, I changed from cp -p
> to a mv.....
>
> /var/log/clickstream/clickstream.log
> {
>   missingok
>   rotate 3
>   compress
>   delaycompress
>   copytruncate
>   notifempty
>   size 50M
>   dateext
>   dateformat -%Y-%m-%d-%s
>   create 666 ubuntu ubuntu
>   postrotate
>   mv $1 /mnt/flume/clickstream/ 2>&1
>   endscript
> }
>
>
> I run logrotate.d/clickstream.POST with the ~60MB file, and everything
> looked fine....now I try a ~190MB file....
>
>
>
> ls -lrt /var/log/clickstream
> /var/log/clickstream:
> total 192336
> -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:42 clickstream.log
>
>
> Run logrotate.d/clickstream.POST, and we see the WARNING in the FLume log.
>
> ....
> ....
> 20 Nov 2012 16:45:07,117 WARN  [pool-13-thread-1]
> (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328)  -
> Could not find file:
> /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906
> java.io.FileNotFoundException:
> /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 (Permission
> denied)
> at java.io.FileInputStream.open(Native Method)
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> at java.io.FileReader.<init>(FileReader.java:72)
> at
> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322)
> at
> org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172)
> at
> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> 20 Nov 2012 16:45:13,174 INFO  [hdfs-c1s3-call-runner-5]
> (org.apache.flume.sink.hdfs.BucketWriter.doOpen:205)  - Creating
> s3n://X:Y@my-bucket/clicks/2012/11/clicks-2012-11-20-16-40-10.145.184.200-.1353429913104.gz.tmp
> ....
> ....
>
> But it seems that FLume did process the log.....
>
> ls -lrt /mnt/flume/clickstream/
> total 256704
> -rw-rw-r-- 1 ubuntu ubuntu  65648336 Nov 20 16:25
> clickstream.log-2012-11-20-1353428715.COMPLETED
> -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:45
> clickstream.log-2012-11-20-1353429906.COMPLETED
>
>
> Regards,
>
> Dano
>
>
>
>
>
> On Tue, Nov 20, 2012 at 9:21 AM, Brock Noland <brock@cloudera.com> wrote:
>>
>> Yeah I think that makes sense, I have created a JIRA for this
>>
>> https://issues.apache.org/jira/browse/FLUME-1733
>>
>> Brock
>>
>> On Tue, Nov 20, 2012 at 9:02 AM, Dan Young <danoyoung@gmail.com> wrote:
>> > Hey Brock,
>> >
>> > I can do some more testing on my side with smaller files as well as
>> > doing a
>> > mv vs a cp . I do believe that a slight delay would be helpful since
>> > people
>> > will be moving/copying large files around.
>> >
>> > Regards ,
>> >
>> > Dano
>> >
>> > On Nov 20, 2012 5:26 AM, "Brock Noland" <brock@cloudera.com> wrote:
>> >>
>> >> Thinking about this more, I think it's probably going to be quite
>> >> common for people to cp large files into the spooling directory.
>> >> Patrick, what do you think about waiting until the mtime is say 1
>> >> second old?
>> >>
>> >> Brock
>> >>
>> >> On Mon, Nov 19, 2012 at 5:29 PM, Brock Noland <brock@cloudera.com>
>> >> wrote:
>> >> > My guess is that the file does not have the correct permissions while
>> >> > being copied.
>> >> >
>> >> > [noland@localhost cp-test]$ cp -p test-0 test-1 & sleep 0.1; ls
-al
>> >> > test*
>> >> > [1] 18780
>> >> > -rw-rw-r-- 1 noland noland 1048576000 Nov 19 17:25 test-0
>> >> > -rw------- 1 noland noland   52334592 Nov 19 17:27 test-1
>> >> >
>> >> >
>> >> > For large files, it probably makes sense to copy the file in as .file
>> >> > and then rename it to file.
>> >> >
>> >> > Brock
>> >> >
>> >> > On Mon, Nov 19, 2012 at 5:04 PM, Patrick Wendell <pwendell@gmail.com>
>> >> > wrote:
>> >> >> The spooling source gets a directory listing, then reads each file,
>> >> >> then
>> >> >> renames it to X.COMPLETED. Is it possible some other process deleted
>> >> >> that
>> >> >> file between when Flume listed the directory and when it tried
to
>> >> >> open
>> >> >> the
>> >> >> file? Otherwise, I'm confused why the file would not be present
in
>> >> >> the
>> >> >> listing you give here.
>> >> >>
>> >> >>
>> >> >> On Mon, Nov 19, 2012 at 6:03 PM, Patrick Wendell
>> >> >> <pwendell@gmail.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> Hey Dan,
>> >> >>>
>> >> >>> You say that it seems like Flume has already processed the
log...
>> >> >>> why
>> >> >>> do
>> >> >>> you think that?
>> >> >>>
>> >> >>> When you listed the directory contents I don't see the original
or
>> >> >>> the
>> >> >>> COMPLETED version of the file that Flume is complaining about:
>> >> >>>
>> >> >>> /clickstream.log-2012-11-17-1353163623
>> >> >>>
>> >> >>> doesn't appear in the
>> >> >>>
>> >> >>> /mnt/flume/clickstream/
>> >> >>>
>> >> >>> directory listing anywhere.
>> >> >>>
>> >> >>>
>> >> >>> On Mon, Nov 19, 2012 at 2:33 PM, Dan Young <danoyoung@gmail.com>
>> >> >>> wrote:
>> >> >>>>
>> >> >>>> Hello Brock,
>> >> >>>>
>> >> >>>> It seems like we get this message each time that logrotate
runs
>> >> >>>> and
>> >> >>>> is in
>> >> >>>> the process of copying the file to the SpoolingDirectory.
It seems
>> >> >>>> that
>> >> >>>> Flume starts reading the file as soon as it shows up in
the
>> >> >>>> SpoolingDirectory.....  Maybe it's trying to read the file
while
>> >> >>>> it's
>> >> >>>> still
>> >> >>>> being written to????
>> >> >>>>
>> >> >>>> 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN -
>> >> >>>>
>> >> >>>>
>> >> >>>> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)]
>> >> >>>> Could not find file:
>> >> >>>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239
>> >> >>>> java.io.FileNotFoundException:
>> >> >>>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239
>> >> >>>> (Permission
>> >> >>>> denied)
>> >> >>>> at java.io.FileInputStream.open(Native Method)
>> >> >>>> at java.io.FileInputStream.<init>(FileInputStream.java:138)
>> >> >>>> at java.io.FileReader.<init>(FileReader.java:72)
>> >> >>>> at
>> >> >>>>
>> >> >>>>
>> >> >>>> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322)
>> >> >>>> at
>> >> >>>>
>> >> >>>>
>> >> >>>> org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172)
>> >> >>>> at
>> >> >>>>
>> >> >>>>
>> >> >>>> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
>> >> >>>> at
>> >> >>>>
>> >> >>>>
>> >> >>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>> >> >>>> at
>> >> >>>>
>> >> >>>>
>> >> >>>> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>> >> >>>> at
>> >> >>>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>> >> >>>> at
>> >> >>>>
>> >> >>>>
>> >> >>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>> >> >>>> at
>> >> >>>>
>> >> >>>>
>> >> >>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>> >> >>>> at
>> >> >>>>
>> >> >>>>
>> >> >>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>> >> >>>> at
>> >> >>>>
>> >> >>>>
>> >> >>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>> >> >>>> at java.lang.Thread.run(Thread.java:722)
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland <brock@cloudera.com>
>> >> >>>> wrote:
>> >> >>>>>
>> >> >>>>> Ok, do you mind sharing your log rotate config to see
if we can
>> >> >>>>> reproduce?
>> >> >>>>>
>> >> >>>>> --
>> >> >>>>> Brock Noland
>> >> >>>>> Sent with Sparrow
>> >> >>>>>
>> >> >>>>> On Saturday, November 17, 2012 at 10:01 AM, Dan Young
wrote:
>> >> >>>>>
>> >> >>>>> Hey Brock,
>> >> >>>>>
>> >> >>>>> No I have not modified the conf while the agent was
running.
>> >> >>>>>
>> >> >>>>> /mnt/flume is local. Note that this is running on an
ec2 instance
>> >> >>>>> and
>> >> >>>>> the disk is the ephemeral drive, not EBS.
>> >> >>>>>
>> >> >>>>> Regards ,
>> >> >>>>>
>> >> >>>>> Dano
>> >> >>>>>
>> >> >>>>> On Nov 17, 2012 8:58 AM, "Brock Noland" <brock@cloudera.com>
>> >> >>>>> wrote:
>> >> >>>>>
>> >> >>>>> Hi,
>> >> >>>>>
>> >> >>>>> I highly doubt it's related to
>> >> >>>>> (https://issues.apache.org/jira/browse/FLUME-1721)
but have you
>> >> >>>>> modified the configuration file since starting the
agent?  If so,
>> >> >>>>> can
>> >> >>>>> you restart the agent and see if the error continues?
>> >> >>>>>
>> >> >>>>> Also, is /mnt/flume local disk or NAS?
>> >> >>>>>
>> >> >>>>> Brock
>> >> >>>>>
>> >> >>>>> On Sat, Nov 17, 2012 at 9:02 AM, Dan Young <danoyoung@gmail.com>
>> >> >>>>> wrote:
>> >> >>>>> > First a bit of context, I'm using logrotate to
monitor and copy
>> >> >>>>> > (cp
>> >> >>>>> > -p) log
>> >> >>>>> > files to a flume spooling directory source.  So
every hour,
>> >> >>>>> > logrotate
>> >> >>>>> > checks
>> >> >>>>> > for and copies a file from the source to the flume
destination.
>> >> >>>>> > I
>> >> >>>>> > see
>> >> >>>>> > the
>> >> >>>>> > following warning message in the flume logs.
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > 17 Nov 2012 14:47:07,682 WARN  [pool-10-thread-1]
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328)
>> >> >>>>> > -
>> >> >>>>> > Could not find file:
>> >> >>>>> > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623
>> >> >>>>> > java.io.FileNotFoundException:
>> >> >>>>> > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623
>> >> >>>>> > (Permission
>> >> >>>>> > denied)
>> >> >>>>> > at java.io.FileInputStream.open(Native Method)
>> >> >>>>> > at java.io.FileInputStream.<init>(FileInputStream.java:138)
>> >> >>>>> > at java.io.FileReader.<init>(FileReader.java:72)
>> >> >>>>> > at
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322)
>> >> >>>>> > at
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172)
>> >> >>>>> > at
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
>> >> >>>>> > at
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>> >> >>>>> > at
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>> >> >>>>> > at
>> >> >>>>> >
>> >> >>>>> > java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>> >> >>>>> > at
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>> >> >>>>> > at
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>> >> >>>>> > at
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>> >> >>>>> > at
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>> >> >>>>> > at java.lang.Thread.run(Thread.java:722)
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > Although it appears that Flume processes the log,
I'm curious
>> >> >>>>> > why
>> >> >>>>> > I''m
>> >> >>>>> > seeing this and if I have anything with permissions
incorrect?
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > Here's the permissions:
>> >> >>>>> >
>> >> >>>>> > source log directory under /var/log:
>> >> >>>>> > drwxrwxr-x 2 ubuntu    ubuntu   4096 Nov 17 14:47
clickstream
>> >> >>>>> >
>> >> >>>>> > source files:
>> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu   9055750 Nov 17 13:29
>> >> >>>>> > clickstream.log-2012-11-17-1353158953.gz
>> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu  13583565 Nov 17 14:17
>> >> >>>>> > clickstream.log-2012-11-17-1353161821.gz
>> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47
>> >> >>>>> > clickstream.log-2012-11-17-1353163623
>> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu  65648336 Nov 17 14:52
>> >> >>>>> > clickstream.log
>> >> >>>>> >
>> >> >>>>> > flume source directory under /mnt/flume:
>> >> >>>>> > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:48 clickstream
>> >> >>>>> >
>> >> >>>>> > flume source files:
>> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 13:29
>> >> >>>>> > clickstream.log-2012-11-17-1353158953.COMPLETED
>> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 17 14:17
>> >> >>>>> > clickstream.log-2012-11-17-1353161821.COMPLETED
>> >> >>>>> > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47
>> >> >>>>> > clickstream.log-2012-11-17-1353163623.COMPLETED
>> >> >>>>> >
>> >> >>>>> > Any insight would be appreciated.
>> >> >>>>> >
>> >> >>>>> > Regards,
>> >> >>>>> >
>> >> >>>>> > Dan
>> >> >>>>>
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> --
>> >> >>>>> Apache MRUnit - Unit testing MapReduce -
>> >> >>>>> http://incubator.apache.org/mrunit/
>> >> >>>>>
>> >> >>>>>
>> >> >>>>
>> >> >>>
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Apache MRUnit - Unit testing MapReduce -
>> >> > http://incubator.apache.org/mrunit/
>> >>
>> >>
>> >>
>> >> --
>> >> Apache MRUnit - Unit testing MapReduce -
>> >> http://incubator.apache.org/mrunit/
>>
>>
>>
>> --
>> Apache MRUnit - Unit testing MapReduce -
>> http://incubator.apache.org/mrunit/
>
>



-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/

Mime
View raw message