Hi Gonzalo,

Thanks for your reply, but I still cannot figure out why the offset is frequently reset. I am sure the groupId is unique within my Kafka cluster.

I also find that some messages are lost in the Flume pipeline, but I don't know the reason. Please do me a favour.

Thanks,



On 7 Dec, 2015, at 4:06 pm, Gonzalo Herreros <gherreros@gmail.com> wrote:

What that means is that the KafkaSource is trying to read messages from the last time it was running (or at least the last time some client used Kafka with the same groupId), but they have already been deleted by Kafka, so it is warning you that some messages have been missed.
Even if it is the first time you use the KafkaSource, maybe somebody used a Kafka consumer with the same groupId long ago. It's better if you make up your own groupId so you don't have strange conflicts.
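For instance, with the Flume 1.6 Kafka source the consumer group can be set via the groupId property in the agent's configuration file. A minimal sketch (the agent name, source name, ZooKeeper address, and topic below are placeholders, not taken from your setup):

```properties
# Hypothetical names/addresses for illustration; the key line is groupId.
agent.sources = kafka-src
agent.sources.kafka-src.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.kafka-src.zookeeperConnect = zk-host:2181
agent.sources.kafka-src.topic = log
# Pick a groupId nobody else on the cluster has used, so you don't
# inherit stale committed offsets from an old consumer.
agent.sources.kafka-src.groupId = my-unique-flume-group
```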

Regards,
Gonzalo


On 7 December 2015 at 04:37, Zhishan Li <zhishanlee@gmail.com> wrote:
When I use KafkaSource, the following error is raised:

07 Dec 2015 04:13:11,571 ERROR [ConsumerFetcherThread-] (kafka.utils.Logging$class.error:97)  - Current offset 482243452 for partition [5] out of range; reset offset to 483146676
Current offset 482243452 for partition [log,5] out of range; reset offset to 483146676
consumed offset: 482243452 doesn't match fetch offset: 483146676 for log:5: fetched offset = 483147611: consumed offset = 482243452;
  Consumer may lose data

But I am using the default configuration of KafkaSource.

What is happening while the agent is running?

Thanks