flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: About jdbc channel
Date Mon, 01 Oct 2012 00:39:30 GMT
Hi Yanzhi,

Flume-1.3.0 did have a bug, which we fixed recently. This bug caused the File Channel to not
delete some older files on time, causing a huge number of files to be deleted in certain cases.
The bug is https://issues.apache.org/jira/browse/FLUME-1606. It has been fixed, and the fix will
be in the final version of Flume-1.3.0. This should resolve the issue you are facing. You
can either wait for the release or check out trunk and build it. Let me know if you still
see massive backlogs. The performance of File Channel is likely to be at least an order of
magnitude better than that of the JDBC channel.

Also, it is not recommended to use File Channel on anything other than a local disk. It is
better not to use a network-mounted disk (considering the guarantees that most network file
systems give).
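
As a rough illustration of the local-disk advice (a sketch, not part of the original
message), a File Channel definition could look something like the fragment below. The agent
name agent1, the channel name ch1, and both directory paths are placeholders chosen for this
example. The property names type, checkpointDir, dataDirs and maxFileSize are the ones given
for the File Channel in the Flume User Guide for the 1.3.x line, so please check them against
the version you actually run.

```properties
# Hypothetical agent "agent1" with one durable channel "ch1".
agent1.channels = ch1

# Use the File Channel as the durable channel.
agent1.channels.ch1.type = file

# Placeholder paths: both should live on a local disk, not on a
# network-mounted file system.
agent1.channels.ch1.checkpointDir = /var/lib/flume/ch1/checkpoint
agent1.channels.ch1.dataDirs = /var/lib/flume/ch1/data

# Illustrative value only (1 GiB per data file), chosen to stay below
# the maximum the channel supports.
agent1.channels.ch1.maxFileSize = 1073741824
```

Each File Channel needs its own checkpoint and data directories, since the channel locks its
directories so that no other channel can use them.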


Thanks,
Hari


--  
Hari Shreedharan


On Sunday, September 30, 2012 at 4:32 PM, Yanzhi.liu wrote:

> Hello Hari:
>     I am using Flume 1.2.0, but I was talking about Flume 1.3.0. I am concerned that
> version 1.3.0 is not stable yet, but I will think about your suggestions.
> Thank you very much for your ideas!
> My Name:
> Yanzhi Liu
>  
> ------------------ Original Message ------------------
> From: "Hari Shreedharan" <hshreedharan@cloudera.com>
> Sent: Saturday, September 29, 2012, 3:04 PM
> To: "user" <user@flume.apache.org>
> Subject: Re: About jdbc channel
>  
>  
> Hi Yanzhi,  
>  
> I am not sure what file lock you are talking about. File Channel by itself does not do
> any time-based locking. The only locking it does is to ensure that multiple channels do not
> use the same data directory (so there is no issue of the lock being lost - the lock file is
> simply deleted at channel stop). Also, the file channel deletes files as the data gets
> transmitted. The file channel maxFileSize is configurable, and supports a maximum of around
> 1.52GB. Adding multiple HDFS sinks can improve performance too.
>  
> What version of Flume are you using? I'd suggest trying out File Channel from trunk (or
> the upcoming v1.3.0). JDBC channel is generally a lot slower. I have tested Flume in various
> configurations and never encountered issues with the file channel. Can you give me details
> on the file channel problems you faced? It might be a simple config issue, and easily fixable.
>  
> As for the JDBC channel, I am not sure of a good configuration - as I have not really
> used it much. Please wait for someone else to reply if you still feel the JDBC channel is
> better.
>  
>  
> Thanks  
> Hari
>  
> --  
> Hari Shreedharan
>  
>  
> On Friday, September 28, 2012 at 11:49 PM, Yanzhi.liu wrote:
>  
> > Hello Hari:
> >     Thanks for your question. I am using the JDBC channel and also the file channel. The
> > file channel has a problem: when more than one source feeds the file channel, a large
> > number of files accumulate in the file channel's dataDir. The HDFS sink cannot process
> > these files quickly enough, which causes a file lock to be lost, so the channel cannot
> > continue, eventually bringing the entire Flume cluster to a complete stop. To monitor this
> > better, I added the JDBC channel, so that counting the events in MongoDB can prevent data
> > loss.
> >     So I want to get a good configuration for the JDBC channel.
> > My Name:
> > Yanzhi Liu
> >  
> > ------------------ Original Message ------------------
> > From: "Hari Shreedharan" <hshreedharan@cloudera.com>
> > Sent: Saturday, September 29, 2012, 1:32 PM
> > To: "user" <user@flume.apache.org>
> > Subject: Re: About jdbc channel
> >  
> >  
> > Is there any specific reason that you are using the JDBC channel? I would recommend
> > using the File Channel, which is what we currently recommend for use as a durable
> > channel. We have improved the channel a lot in recent weeks. To take advantage of the
> > latest features added to the channel, you can build it and drop in the new jars, or
> > wait for the next release, which should happen soon.
> >  
> > Thanks,  
> > Hari
> >  
> > --  
> > Hari Shreedharan
> >  
> >  
> > On Friday, September 28, 2012 at 9:04 PM, Yanzhi.liu wrote:
> >  
> > > Hello:
> > >     I am using the MongoDB database. My Flume source is a custom directory source.
> > >     I configured a JDBC channel, but flume.log shows:
> > > 2012-09-28 20:56:49,468 INFO lifecycle.LifecycleSupervisor: Stopping component: org.apache.flume.channel.jdbc.JdbcChannel@1690ab
> > > 2012-09-28 20:56:49,612 INFO impl.JdbcChannelProviderImpl: Embedded Derby shutdown raised SQL STATE 45000 as expected.
> > > 2012-09-28 20:56:49,613 INFO properties.PropertiesFileConfigurationProvider: Creating channels
> > > 2012-09-28 20:56:49,613 WARN impl.JdbcChannelProviderImpl: No connection URL specified. Using embedded derby database instance.
> > > 2012-09-28 20:56:49,613 WARN impl.JdbcChannelProviderImpl: Overriding values for - driver: org.apache.derby.jdbc.EmbeddedDriver, user: saconnectUrl: jdbc:derby:/home/flume/.flume/jdbc-channel/db;create=true, jdbc properties file: null, dbtype: DERBY
> > > I want to know how to configure the JDBC channel so that it runs.
> > > Thanks very much!
> > > My Name:
> > > Yanzhi Liu
> > >  

