flume-user mailing list archives

From Gonzalo Herreros <gherre...@gmail.com>
Subject Re: Kafka Source Error
Date Mon, 07 Dec 2015 09:37:30 GMT
Kafka consumers keep track of their progress (the offset) in Zookeeper
(newer versions offer an option to change that, but Flume still uses the
old way).
What happens here is that the consumer tries to fetch messages from the
offset it has saved, but Kafka comes back saying that offset is no longer
present and asks the client to "reset" the offset (I'm not sure whether it
resets to the oldest or the newest).
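To make that reset behaviour concrete, here is a small Python sketch (my own simplification, not Kafka's actual code) of how a saved offset that falls outside the retained range gets resolved. The saved and latest offsets below are taken from the log later in this thread; the earliest offset is a made-up placeholder:

```python
# Simplified model of the old Kafka consumer's offset-reset behaviour:
# a saved offset outside [earliest, latest] triggers a reset.
def resolve_fetch_offset(saved, earliest, latest, reset_policy="largest"):
    """Return the offset the consumer will actually fetch from."""
    if earliest <= saved <= latest:
        return saved  # saved offset is still retained: resume normally
    # Out of range: the broker tells the client to reset. The old
    # consumer's default auto.offset.reset is "largest" (skip to the
    # newest offset, losing the deleted messages).
    return latest if reset_policy == "largest" else earliest

# saved=482243452 and latest=483146676 come from the error in this thread;
# earliest (483000000) is a placeholder. Messages between the saved and
# reset offsets are skipped, which is why data can be lost.
print(resolve_fetch_offset(482243452, 483000000, 483146676))  # → 483146676
```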

That might indicate either a conflict because some other Kafka cluster is
using the same Zookeeper and groupId, or that the Kafka retention is so low
that messages in the queue get deleted before they can be processed. (A
third option is that the Kafka cluster is completely messed up.)

In my view this is a Kafka issue rather than Flume's, so try to
troubleshoot Kafka first.
When you say "messages lost in the flume pipeline", do you mean the error
you are getting, or do you have some other issue?
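If the cause is a groupId clash, pinning an explicit, unique groupId on the Flume source is the usual fix. A minimal sketch of a Flume 1.6 Kafka source configuration (the agent, source, channel, topic, and host names here are made up):

```properties
# Hypothetical agent "tier1"; adjust names and addresses to your setup.
tier1.sources.src1.type = org.apache.flume.source.kafka.KafkaSource
tier1.sources.src1.channels = ch1
tier1.sources.src1.zookeeperConnect = zkhost:2181
tier1.sources.src1.topic = log
# An explicit, unique groupId avoids clashing with offsets saved by
# other consumers that happened to use the same group.
tier1.sources.src1.groupId = flume-log-agent
# Passthrough old-consumer property: where to resume when the saved
# offset is out of range ("smallest" = oldest retained, "largest" = newest).
tier1.sources.src1.kafka.auto.offset.reset = smallest
```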

On 7 December 2015 at 09:02, Zhishan Li <zhishanlee@gmail.com> wrote:

> Hi Gonzalo,
> Thanks for your reply. But I still cannot figure out why the offset is
> frequently reset. I am sure the groupId is unique for my Kafka cluster.
> I find that some messages are lost in the Flume pipeline, but I don't
> know the reason. Please do me a favour.
> Thanks,
> On 7 Dec, 2015, at 4:06 pm, Gonzalo Herreros <gherreros@gmail.com> wrote:
> What that means is that the KafkaSource is trying to read messages from
> the last time it was running (or at least the last time some client used
> Kafka with the same groupId), but they have already been deleted by Kafka,
> so it is warning you that there are messages that have been missed.
> Even if this is the first time you use the KafkaSource, maybe somebody
> used a Kafka consumer with the same groupId long ago. It's better if you
> make up your own groupId so you don't have strange conflicts.
> Regards,
> Gonzalo
> On 7 December 2015 at 04:37, Zhishan Li <zhishanlee@gmail.com> wrote:
>> When I use KafkaSource, the following error is raised:
> > 07 Dec 2015 04:13:11,571 ERROR [ConsumerFetcherThread-]
> > (kafka.utils.Logging$class.error:97)  - Current offset 482243452 for
> > partition [5] out of range; reset offset to 483146676
> > Current offset 482243452 for partition [log,5] out of range; reset
> > offset to 483146676
> > consumed offset: 482243452 doesn't match fetch offset: 483146676 for
> > log:5: fetched offset = 483147611: consumed offset = 482243452;
> > Consumer may lose data
>> But the default configuration of KafkaSource is used.
>> What happens during the agent running?
>> Thanks
