flume-user mailing list archives

From Hemanth Abbina <Heman...@eiqnetworks.com>
Subject Re: Possibility of persisting the connection
Date Tue, 17 Nov 2015 18:31:22 GMT
Hi Hari,

Thanks for the response. Agree with you on the HTTP source case.

Will check the Kafka sink again, to see what causes the reconnections.

Sent from my HTC

----- Reply message -----
From: "Hari Shreedharan" <hshreedharan@cloudera.com>
To: "user@flume.apache.org" <user@flume.apache.org>
Subject: Possibility of persisting the connection
Date: Tue, Nov 17, 2015 11:33 PM

Actually, in both cases the connections should be persistent. In the HTTP Source case, the client
decides when to close the connection - the HTTP Source is the server, and it does not close any
connections.
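
To illustrate the client-side point, here is a minimal sketch of a sender that keeps one HTTP connection open across several POSTs rather than opening a new one per request. It assumes the source uses Flume's default JSON handler (a JSON array of events, each with "headers" and "body"); the host, port, and the helper names build_payload and post_batches are hypothetical and not taken from this thread.

```python
import http.client
import json


def build_payload(bodies):
    """Encode event bodies as a JSON array of {"headers", "body"} objects."""
    return json.dumps([{"headers": {}, "body": b} for b in bodies])


def post_batches(host, port, batches):
    """POST every batch over a single HTTPConnection and return the status codes.

    host and port are placeholders for wherever the HTTP Source listens.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    statuses = []
    try:
        for batch in batches:
            conn.request(
                "POST",
                "/",
                body=build_payload(batch),
                headers={"Content-Type": "application/json"},
            )
            resp = conn.getresponse()
            resp.read()  # drain the response so the same socket can be reused
            statuses.append(resp.status)
    finally:
        conn.close()  # the client, not the server, decides when to close
    return statuses
```

Sending several events per request, as build_payload does, also reduces the number of round trips, independently of connection reuse.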

Kafka Sink uses the Kafka Producer API to talk to Kafka. If the connections are re-opened
it could be because of a bug in the Kafka API, or because of the way your events are being
partitioned between brokers (which is based on the event key you set).
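
As a rough illustration of why the event key matters, the sketch below models key-based partitioning: each key maps to one partition, and each partition is served by one broker, so the spread of keys determines how many brokers a producer has to talk to. This is a simplified model, not Kafka's implementation: CRC32 stands in for the real hash, and partition_for, brokers_touched, and the broker names are hypothetical.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition index.

    CRC32 is a stand-in hash for illustration; Kafka's partitioner
    uses a different hash function.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def brokers_touched(keys, partition_leaders):
    """Return the set of brokers that would receive events for these keys.

    partition_leaders[i] is the (hypothetical) broker leading partition i.
    """
    n = len(partition_leaders)
    return {partition_leaders[partition_for(k, n)] for k in keys}
```

In this model, events that all share one key always land on a single broker, while many distinct keys tend to spread across all of them.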

Hari Shreedharan

On Nov 17, 2015, at 9:58 AM, Hemanth Abbina <HemanthA@eiqnetworks.com>

Hi Gonzalo,

Thanks for your response.

No, the Kafka sink connection is not the same at all times. I have observed the connections closing
and reconnecting.

Sent from my HTC

----- Reply message -----
From: "Gonzalo Herreros" <gherreros@gmail.com>
To: "user" <user@flume.apache.org>
Subject: Possibility of persisting the connection
Date: Tue, Nov 17, 2015 11:08 PM

For the sink, I would be surprised if the connection to Kafka is not the same all the time.
For the HTTP source, you could create a custom source where you keep a long-lived HTTP connection
and have some way of detecting where a batch of events ends (e.g. a newline character).
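
The framing idea can be sketched as follows: read chunks from a long-lived stream, buffer them, and emit a batch each time a newline arrives. This is only an outline of the delimiter logic under that assumption, not a Flume source; split_batches is a hypothetical helper name.

```python
def split_batches(chunks):
    """Yield one batch per newline-terminated segment from an iterable of byte chunks.

    A batch may span several chunks; data after the last newline is treated
    as incomplete and is not emitted.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            batch, buffer = buffer.split(b"\n", 1)
            if batch:  # skip empty segments produced by consecutive newlines
                yield batch
```

For example, list(split_batches([b"a\nb", b"c\n", b"d"])) gives [b"a", b"bc"]: the second batch is assembled from two chunks, and the trailing b"d" is held back because its delimiter has not arrived.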


On 17 November 2015 at 17:16, Hemanth Abbina <HemanthA@eiqnetworks.com>

Though it's against the basic design principle of Flume, I have one question.

Is it possible to persist the connection between source & sink and re-use it?

We are using an HTTP source, File channel & Kafka sink, and with that configuration we are not getting
the expected throughput because the source & sink reconnect for every event.

So, would it be possible to re-use the same HTTP and Kafka connections for multiple transactions
(even with a custom source & sink)?

