flume-user mailing list archives

From: Shiva Ram <shivaram.hadoop2...@gmail.com>
Subject: Re: WELCOME to user@flume.apache.org
Date: Fri, 02 Oct 2015 09:52:49 GMT

Thanks Ahmed Vila.

I will take your suggestions into account when I design the Flume agent.

*Thanks & Regards,*

*Shiva Ram*
*Website: http://datamaking.com*
*Facebook Page: http://www.facebook.com/datamaking*

On Fri, Oct 2, 2015 at 3:12 PM, Ahmed Vila <avila@devlogic.eu> wrote:

> Hi Shiva,
>
> If your files are immutable (once a file is placed in the directory, it
> is never changed afterwards), then the best source to use is the spooling
> directory source.
> If the files are mutable, avoid the spooling directory source: Flume will
> throw an exception and shut the source down, and you'll have to restart
> it.
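>
> As a rough illustration of the spooling directory setup described above,
> a minimal agent fragment might look like the sketch below. The agent name
> (a1), the component names (r1, c1) and all paths are placeholders assumed
> for the example, and the sink is left out:
>
> ```properties
> # Hypothetical agent "a1": spooling directory source feeding a file channel
> a1.sources = r1
> a1.channels = c1
>
> # Watch a directory that receives finished, immutable files (assumed path)
> a1.sources.r1.type = spooldir
> a1.sources.r1.spoolDir = /data/incoming
> a1.sources.r1.channels = c1
>
> # Durable channel, so buffered events survive an agent restart
> a1.channels.c1.type = file
> a1.channels.c1.checkpointDir = /var/flume/checkpoint
> a1.channels.c1.dataDirs = /var/flume/data
> ```
>
> Files would typically be written elsewhere and moved into the spool
> directory only once complete, so the source never sees a file that is
> still changing.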
>
> You can put Flume on a different server than the one where the files
> reside and have that folder mounted as a local folder via NFS or similar.
> That isn't an option if you would have to mount the source folder across
> a firewall, between two networks, or over the internet.
>
> With the exec source it's hard to achieve cross-node execution, as it has
> to execute the shell command you provide on a remote node.
> Even if you do achieve it, it will be very slow due to constant SSH
> negotiation.
>
> Either way, I would most definitely recommend putting Flume on the same
> node as the source folder, or at least as close to the source as possible,
> such as in the same network.
> That way you minimize the influence of network jitter and dropouts on the
> source. All sources that pull data will fail ungracefully if they encounter
> an error fetching data, and you'll end up restarting Flume.
>
> If HDFS is in a different network or across the internet, I would suggest
> chaining two Flume agents, one on each side of the wire, with an Avro sink
> on the source node and an Avro source on the destination node. They support
> the fundamentals for such a harsh transport environment: serialization,
> compression, SSL security over a single TCP connection, the need for only
> one open port, etc.
> Then you configure Flume on the destination to drain into HDFS via the
> HDFS sink.
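>
> The two-agent layout described above could be sketched roughly as
> follows. This is only an illustration under assumptions: the agent names
> (edge, collector), the host names, port 4141 and all paths are made up
> for the example, and each agent's section would live in its own
> configuration file on its own host:
>
> ```properties
> # --- Hypothetical agent "edge", on the node that holds the files ---
> edge.sources = r1
> edge.channels = c1
> edge.sinks = k1
>
> edge.sources.r1.type = spooldir
> edge.sources.r1.spoolDir = /data/incoming
> edge.sources.r1.channels = c1
>
> edge.channels.c1.type = file
>
> # Avro sink ships events to the collector over a single TCP port
> edge.sinks.k1.type = avro
> edge.sinks.k1.channel = c1
> edge.sinks.k1.hostname = collector.example.com
> edge.sinks.k1.port = 4141
> edge.sinks.k1.compression-type = deflate
>
> # --- Hypothetical agent "collector", on the Hadoop side ---
> collector.sources = r1
> collector.channels = c1
> collector.sinks = k1
>
> # Avro source listens on the one open port; compression must match the sink
> collector.sources.r1.type = avro
> collector.sources.r1.bind = 0.0.0.0
> collector.sources.r1.port = 4141
> collector.sources.r1.compression-type = deflate
> collector.sources.r1.channels = c1
>
> collector.channels.c1.type = file
>
> # HDFS sink drains the channel into the cluster (assumed path)
> collector.sinks.k1.type = hdfs
> collector.sinks.k1.channel = c1
> collector.sinks.k1.hdfs.path = hdfs://namenode.example.com/flume/events
> collector.sinks.k1.hdfs.fileType = DataStream
> ```
>
> SSL is left out of the sketch. It would be switched on with the ssl
> property on both ends, together with a truststore on the Avro sink and a
> keystore on the Avro source.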
>
>
> On Fri, Oct 2, 2015 at 7:08 AM, Shiva Ram <shivaram.hadoop2015@gmail.com>
> wrote:
>
>> A set of files is placed on a remote server [not a Hadoop cluster node].
>> Which source type is suitable for collecting these files from the remote
>> server into HDFS using Flume? From my initial study of Flume, I learned
>> that the source types "Exec" and "Spooling Directory" can be used to
>> collect these files. I want to know whether the Flume service should run
>> on the remote server [the source system from which I want to get the
>> data]. Thanks.
>>
>> *Thanks & Regards,*
>>
>> *Shiva Ram*
>> *Website: http://datamaking.com*
>> *Facebook Page: http://www.facebook.com/datamaking*
>>
>> On Fri, Oct 2, 2015 at 10:36 AM, <user-help@flume.apache.org> wrote:
>>
>>> Hi! This is the ezmlm program. I'm managing the
>>> user@flume.apache.org mailing list.
>>>
>>> Acknowledgment: I have added the address
>>>
>>>    shivaram.hadoop2015@gmail.com
>>>
>>> to the user mailing list.
>>>
>>> Welcome to user@flume.apache.org!
>>>
>>> Please save this message so that you know the address you are
>>> subscribed under, in case you later want to unsubscribe or change your
>>> subscription address.
>>>
>>>
>>> --- Administrative commands for the user list ---
>>>
>>> I can handle administrative requests automatically. Please
>>> do not send them to the list address! Instead, send
>>> your message to the correct command address:
>>>
>>> To subscribe to the list, send a message to:
>>>    <user-subscribe@flume.apache.org>
>>>
>>> To remove your address from the list, send a message to:
>>>    <user-unsubscribe@flume.apache.org>
>>>
>>> Send mail to the following for info and FAQ for this list:
>>>    <user-info@flume.apache.org>
>>>    <user-faq@flume.apache.org>
>>>
>>> Similar addresses exist for the digest list:
>>>    <user-digest-subscribe@flume.apache.org>
>>>    <user-digest-unsubscribe@flume.apache.org>
>>>
>>> To get messages 123 through 145 (a maximum of 100 per request), mail:
>>>    <user-get.123_145@flume.apache.org>
>>>
>>> To get an index with subject and author for messages 123-456 , mail:
>>>    <user-index.123_456@flume.apache.org>
>>>
>>> They are always returned as sets of 100, max 2000 per request,
>>> so you'll actually get 100-499.
>>>
>>> To receive all messages with the same subject as message 12345,
>>> send a short message to:
>>>    <user-thread.12345@flume.apache.org>
>>>
>>> The messages should contain one line or word of text to avoid being
>>> treated as sp@m, but I will ignore their content.
>>> Only the ADDRESS you send to is important.
>>>
>>> You can start a subscription for an alternate address,
>>> for example "john@host.domain", just add a hyphen and your
>>> address (with '=' instead of '@') after the command word:
>>> <user-subscribe-john=host.domain@flume.apache.org>
>>>
>>> To stop subscription for this address, mail:
>>> <user-unsubscribe-john=host.domain@flume.apache.org>
>>>
>>> In both cases, I'll send a confirmation message to that address. When
>>> you receive it, simply reply to it to complete your subscription.
>>>
>>> If despite following these instructions, you do not get the
>>> desired results, please contact my owner at
>>> user-owner@flume.apache.org. Please be patient, my owner is a
>>> lot slower than I am ;-)
>>>
>>
>
>
> --
>
> Best regards,
> Ahmed Vila | Senior software developer
> DevLogic | Sarajevo | Bosnia and Herzegovina
>
> Office : +387 33 942 123
> Mobile: +387 62 139 348
>
> Website: www.devlogic.eu
> E-mail   : avila@devlogic.eu
> ---------------------------------------------------------------------
> This e-mail and any attachment is for authorised use by the intended
> recipient(s) only. This email contains confidential information. It should
> not be copied, disclosed to, retained or used by, any party other than the
> intended recipient. Any unauthorised distribution, dissemination or copying
> of this E-mail or its attachments, and/or any use of any information
> contained in them, is strictly prohibited and may be illegal. If you are
> not an intended recipient then please promptly delete this e-mail and any
> attachment and all copies and inform the sender directly via email. Any
> emails that you send to us may be monitored by systems or persons other
> than the named communicant for the purposes of ascertaining whether the
> communication complies with the law and company policies.
