One clarification: as Mubarak mentioned, there is already a Jira for this, FLUME-1318. So instead of filing a new issue, you can add your details and thoughts to that one.
Hi Inder,

On Mon, Jul 9, 2012 at 12:02 AM, Inder Pall <firstname.lastname@example.org> wrote:

> Arvind,
>
> to me this is an important use case for frequent prod rollouts. How about thinking in the direction of supporting graceful shutdown for agents?

I do believe that the Agent does shut down gracefully on interrupt. Specifically, the components are started in a specific order (FLUME-1236) and then shut down in the reverse order (FLUME-1325). If you find that is not the case, please do file a Jira with appropriate details.

> I can't think of an elegant solution at the moment that will address all cases. However, what are your thoughts on something along these lines:
>
> 1. The agent receives a shutdown signal.
> 2. It puts all its channels in isolation mode (wherein no sources can put events into them).
> 3. When the sinks attached to these channels have drained them, we do the real shutdown.
>
> I know we can find issues with this algorithm; however, I want to highlight the importance of graceful shutdown being supported as a first-class use case here.

I guess what you are asking for is a drain-and-shutdown semantic. I think it is a perfectly reasonable request and something we should consider carefully, as it will likely be used in production environments. In order to implement that, we would first need to create a system that allows sending soft interrupts, such as commands via a socket, and then create an implementation that provides the semantics you describe above, along with regular shutdown semantics.

The best way to go about this would be to start by filing a Jira and adding as many details as you can to clearly specify it. And perhaps even taking a crack at doing a patch for it!

Regards,
Arvind Prabhakar

> - Inder
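
For illustration only, the three-step drain-and-shutdown sequence proposed above could be modeled roughly as follows. This is a toy Python sketch, not Flume code: `DrainableChannel`, `run_sink`, and `drain_and_shutdown` are hypothetical names, and nothing here corresponds to an existing Flume API. It only shows the ordering: isolate the channel, let the sink drain it, then stop.

```python
import collections
import threading


class DrainableChannel:
    """Toy in-memory channel (hypothetical; not Flume's MemoryChannel)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._events = collections.deque()
        self._in_flight = 0      # taken by a sink but not yet committed
        self._isolated = False

    def put(self, event):
        # Step 2: once isolated, sources can no longer add events.
        with self._cond:
            if self._isolated:
                return False
            self._events.append(event)
            self._cond.notify_all()
            return True

    def take(self, timeout):
        with self._cond:
            if not self._events:
                self._cond.wait(timeout)
            if not self._events:
                return None
            self._in_flight += 1
            return self._events.popleft()

    def commit(self):
        # Called by the sink after an event has been delivered downstream.
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def isolate(self):
        with self._cond:
            self._isolated = True

    def wait_drained(self, timeout):
        # True once nothing is queued and nothing is in flight.
        with self._cond:
            return self._cond.wait_for(
                lambda: not self._events and self._in_flight == 0, timeout)


def run_sink(channel, delivered, stop):
    while not stop.is_set():
        event = channel.take(timeout=0.05)
        if event is not None:
            delivered.append(event)  # stand-in for writing downstream
            channel.commit()


def drain_and_shutdown(channel, stop, timeout=5.0):
    channel.isolate()                         # step 2: block the sources
    drained = channel.wait_drained(timeout)   # step 3: wait for the sinks
    stop.set()                                # then do the real shutdown
    return drained


def demo():
    channel, delivered, stop = DrainableChannel(), [], threading.Event()
    sink = threading.Thread(target=run_sink, args=(channel, delivered, stop))
    sink.start()
    for i in range(100):
        channel.put(f"event-{i}")
    drained = drain_and_shutdown(channel, stop)   # step 1: shutdown requested
    sink.join()
    rejected = not channel.put("late-event")
    return drained, delivered, rejected


if __name__ == "__main__":
    drained, delivered, rejected = demo()
    print(drained, len(delivered), rejected)
```

The timeout in `drain_and_shutdown` is a design choice in this sketch: if a sink is stuck, the shutdown still proceeds and the return value tells the caller that events were left behind.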
On Mon, Jul 9, 2012 at 12:16 PM, alo alt <email@example.com> wrote:
Simple solution:
Run two configs on different ports, with iptables transparently forwarding to both ports. Block the first one, and all events will be redirected to the other port. Wait 5 minutes; the memory channel should be empty by then. Make your changes, start the new config, redirect the traffic to this port, and change the other config.
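
As a rough illustration of the switch-over described above, here is a toy Python model. Everything in it is hypothetical: `Agent`, `Router`, and `drain` are made-up names, the ports 41414 and 41415 are arbitrary, and `Router` only imitates the forwarding rule rather than calling iptables. The one change from the procedure above is that it checks for an empty channel explicitly instead of waiting a fixed five minutes.

```python
import collections


class Agent:
    """Hypothetical stand-in for one Flume agent bound to one port."""

    def __init__(self, port):
        self.port = port
        self.channel = collections.deque()  # models the memory channel
        self.delivered = []                 # models the sink's destination

    def receive(self, event):
        self.channel.append(event)

    def run_sink_once(self, batch_size=10):
        # Move up to batch_size events from the channel to the destination.
        for _ in range(min(batch_size, len(self.channel))):
            self.delivered.append(self.channel.popleft())


class Router:
    """Models the forwarding rule: traffic goes to the first unblocked port."""

    def __init__(self, agents):
        self.agents = agents
        self.blocked = set()

    def send(self, event):
        for agent in self.agents:
            if agent.port not in self.blocked:
                agent.receive(event)
                return agent.port
        raise RuntimeError("all ports are blocked")


def drain(agent, max_rounds=1000):
    """Run the sink until the channel is empty; return True if it emptied."""
    rounds = 0
    while agent.channel and rounds < max_rounds:
        agent.run_sink_once()
        rounds += 1
    return not agent.channel


def rolling_restart_demo():
    first, second = Agent(41414), Agent(41415)
    router = Router([first, second])
    for i in range(25):
        router.send(f"event-{i}")         # all land on the first agent
    router.blocked.add(first.port)        # block the first port
    port_used = router.send("event-25")   # traffic now reaches the second agent
    safe_to_restart = drain(first)        # check for empty, not a fixed wait
    return port_used, safe_to_restart, len(first.delivered), len(second.channel)


if __name__ == "__main__":
    print(rolling_restart_demo())
```

Once `drain` reports that the first agent is empty, it can be restarted with the new config, and the same steps can be repeated in the other direction for the second agent.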
On Jul 9, 2012, at 8:29 AM, Arvind Prabhakar wrote:
> On Sun, Jul 8, 2012 at 11:18 PM, Senthilvel Rangaswamy <firstname.lastname@example.org> wrote:
>> We are using Flume 1.2.0 with memory channel. When we roll out new
>> we may need to restart Flume, at which point any events in the memory channel
>> are gone. Any ways to avoid this?
> One way to address this would be to make sure that the upstream sink or
> client can be routed to a different agent when necessary. That way when you
> do want to restart the file channel, you would first route all the traffic
> elsewhere, drain the channel and then do the shutdown as necessary. Once
> the system is back up, you could route the traffic back to this agent.
> I am sure that there are multiple other ways of doing this.
> Arvind Prabhakar
>> "If there's anything more important than my ego around, I want it
>> caught and shot now."
>> - Douglas Adams.