flume-user mailing list archives

From Ahmed Vila <av...@devlogic.eu>
Subject Re: WELCOME to user@flume.apache.org
Date Fri, 02 Oct 2015 09:42:23 GMT
Hi Shiva,

If your files are immutable (once a file is placed in the directory, it is
never changed afterwards), then the best source to use is the spooling
directory source.
If the files are mutable, avoid the spooling directory source: Flume will
throw an exception and shut the source down, so you'll have to restart it.
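
A minimal sketch of a spooling directory setup, assuming an agent named
"a1" and made-up paths (none of these names come from this thread). The
logger sink is only there so the source can be tried out on its own:

```properties
# Hypothetical agent name "a1"; all paths below are placeholders.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Spooling directory source: files dropped here must never change afterwards.
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /data/incoming
a1.sources.r1.channels = c1

# File channel keeps buffered events on disk across agent restarts.
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/lib/flume/checkpoint
a1.channels.c1.dataDirs = /var/lib/flume/data

# Logger sink, used here only to verify that events arrive.
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
```

A common way to keep files immutable is to write them elsewhere first and
then move the finished file into the spool directory.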

You can run Flume on a different server than the one where the files reside
and have that folder mounted as a local folder via NFS or similar.
That isn't an option if you would have to mount the source folder across a
firewall, between two networks, or over the internet.

With the exec source it's hard to achieve cross-node execution, as it would
have to run the shell command you provide on a remote node.
Even if you manage it, it will be very slow due to constant SSH
negotiation.

Either way, I would definitely recommend running Flume on the same node as
the source folder, or at least as close to the source as possible, such as
in the same network.
That way you minimize the influence of network jitter and dropouts on the
source. All sources that pull data will fail ungracefully if they encounter
an error fetching data, and you'll end up restarting Flume.

If HDFS is in another network or across the internet, I would suggest
chaining two Flume agents, one on each side of the wire, with an AvroSink on
the source node and an AvroSource on the destination node. They support the
fundamentals for such a harsh transport environment: serialization,
compression, and SSL security over a single TCP connection, so only one port
needs to be open.
Then you configure Flume on the destination to drain via the HDFS sink into
HDFS.
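
A rough sketch of that two-agent layout, with one config per node. The
agent names ("edge", "collector"), host names, port, paths and passwords
are all placeholders I made up, so treat this as a starting point under
those assumptions rather than a tested setup:

```properties
# ---- Source node: hypothetical agent "edge" ----
edge.sources = spool
edge.channels = ch
edge.sinks = toCollector

edge.sources.spool.type = spooldir
edge.sources.spool.spoolDir = /data/incoming
edge.sources.spool.channels = ch

edge.channels.ch.type = file
edge.channels.ch.checkpointDir = /var/lib/flume/edge/checkpoint
edge.channels.ch.dataDirs = /var/lib/flume/edge/data

# Avro sink: one outbound TCP connection, compressed and encrypted.
edge.sinks.toCollector.type = avro
edge.sinks.toCollector.channel = ch
edge.sinks.toCollector.hostname = collector.example.com
edge.sinks.toCollector.port = 4545
edge.sinks.toCollector.compression-type = deflate
edge.sinks.toCollector.ssl = true
edge.sinks.toCollector.truststore = /etc/flume/truststore.jks
edge.sinks.toCollector.truststore-password = changeit

# ---- Destination node: hypothetical agent "collector" ----
collector.sources = fromEdge
collector.channels = ch
collector.sinks = toHdfs

# Avro source: compression-type must match the sending Avro sink.
collector.sources.fromEdge.type = avro
collector.sources.fromEdge.bind = 0.0.0.0
collector.sources.fromEdge.port = 4545
collector.sources.fromEdge.compression-type = deflate
collector.sources.fromEdge.ssl = true
collector.sources.fromEdge.keystore = /etc/flume/keystore.jks
collector.sources.fromEdge.keystore-password = changeit
collector.sources.fromEdge.channels = ch

collector.channels.ch.type = file
collector.channels.ch.checkpointDir = /var/lib/flume/collector/checkpoint
collector.channels.ch.dataDirs = /var/lib/flume/collector/data

# HDFS sink drains the channel into HDFS as plain data files.
collector.sinks.toHdfs.type = hdfs
collector.sinks.toHdfs.channel = ch
collector.sinks.toHdfs.hdfs.path = hdfs://namenode.example.com:8020/flume/events
collector.sinks.toHdfs.hdfs.fileType = DataStream
```

With this layout only the collector's Avro port (4545 in the sketch) has to
be reachable from the source node.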


On Fri, Oct 2, 2015 at 7:08 AM, Shiva Ram <shivaram.hadoop2015@gmail.com>
wrote:

> A set of files is placed on a remote server [not a Hadoop cluster node].
> Which source type is suitable for collecting these files from the remote
> server to HDFS using Flume? From my initial study of Flume, I learned that
> the source types "Exec" and "Spooling Directory" can be used to collect
> these files. I want to know whether the Flume service should run on the
> remote server [the source system from where I want to get the data]. Thanks.
>
> *Thanks & Regards,*
>
> *Shiva Ram*
> *Website: http://datamaking.com*
> *Facebook Page: www.facebook.com/datamaking*


-- 

Best regards,
Ahmed Vila | Senior software developer
DevLogic | Sarajevo | Bosnia and Herzegovina

Office : +387 33 942 123
Mobile: +387 62 139 348

Website: www.devlogic.eu
E-mail   : avila@devlogic.eu
---------------------------------------------------------------------
This e-mail and any attachment is for authorised use by the intended
recipient(s) only. This email contains confidential information. It should
not be copied, disclosed to, retained or used by, any party other than the
intended recipient. Any unauthorised distribution, dissemination or copying
of this E-mail or its attachments, and/or any use of any information
contained in them, is strictly prohibited and may be illegal. If you are
not an intended recipient then please promptly delete this e-mail and any
attachment and all copies and inform the sender directly via email. Any
emails that you send to us may be monitored by systems or persons other
than the named communicant for the purposes of ascertaining whether the
communication complies with the law and company policies.
