flume-user mailing list archives

From ashutosh (Open Platform Development Team) <sharma.ashut...@kt.com>
Subject RE: flume ng error while going for hdfs sink
Date Fri, 06 Jul 2012 10:08:02 GMT
Hi Amit,

For your problem (1): There is a syntax error in your HDFS sink configuration, which is why
the file is being stored in SequenceFile format:
agent1.sinks.HDFS.hdfs.file.Type = DataStream
agent1.sinks.HDFS.hdfs.file.Format = Text

You need to correct it as below:
agent1.sinks.HDFS.hdfs.fileType = DataStream
agent1.sinks.HDFS.hdfs.writeFormat = Text
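
For reference, the HDFS sink defaults to SequenceFile when hdfs.fileType is not set (your hdfs.file.Type line is simply ignored as an unknown property), which is why you see the SEQ...LongWritable...BytesWritable header in your file. A minimal sketch of the corrected sink section, reusing the path from your own configuration:

agent1.sinks.HDFS.channel = ch1
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.path = hdfs://localhost:54310/user/hadoop-node1/flumeTest
# Write raw text events instead of the default SequenceFile container
agent1.sinks.HDFS.hdfs.fileType = DataStream
agent1.sinks.HDFS.hdfs.writeFormat = Text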

I hope this will solve your first problem.
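
Once you restart the agent with the corrected properties and resend the file, the content should land as plain text. Roughly, the commands would look like the following (option spellings may differ slightly across Flume NG versions, and the sample file path is just an example):

$ bin/flume-ng agent -n agent1 -c conf -f conf/flume.conf
$ bin/flume-ng avro-client -H localhost -p 41414 -F /tmp/sample.txt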
----------------------------------------
Thanks & Regards,
Ashutosh Sharma
----------------------------------------

From: Amit Handa [mailto:amithanda01@gmail.com]
Sent: Friday, July 06, 2012 6:44 PM
To: flume-user@incubator.apache.org
Subject: Re: flume ng error while going for hdfs sink

Hi,

@Mike, thanks for your reply.

1) After executing the Flume NG agent and the Avro client, the file is created in HDFS.
I used the same flume-ng setup today with Hadoop 1.0.1.
The problem I am now facing is that I am sending a normal text file through the Avro client,
but inside HDFS the file content comes out as shown below. I want the file content in HDFS
to be in normal text format.
HDFS file content:
"SEQ^F!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable^@^@^@^@^@^@^UªG^Oòá~v¾­z/<87>^[~ð^@^@^@)^@^@^@^H^@^@^A8[<8e>)Ú^@^@^@^]We
are modifying the file now^@^@^@

The text file content given through the AvroClient is:
                 We are modifying the file now

Kindly provide your inputs to resolve this issue.
My flume.conf file content is as follows:
# Define a memory channel called ch1 on agent1
agent1.channels.ch1.type = memory


# Define an Avro source called avro-source1 on agent1 and tell it
# to bind to 0.0.0.0:41414. Connect it to channel ch1.
agent1.sources.avro-source1.channels = ch1
agent1.sources.avro-source1.type = avro
agent1.sources.avro-source1.selector.type=replicating
agent1.sources.avro-source1.bind = 0.0.0.0
agent1.sources.avro-source1.port = 41414


# Define a hdfs sink that simply logs all events it receives
# and connect it to the other end of the same channel.
agent1.sinks.HDFS.channel = ch1
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.path = hdfs://localhost:54310/user/hadoop-node1/flumeTest
agent1.sinks.HDFS.hdfs.file.Type = DataStream
agent1.sinks.HDFS.hdfs.file.Format = Text

# Finally, now that we've defined all of our components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1
agent1.sources = avro-source1
agent1.sinks = HDFS
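
For reference, I am viewing the HDFS file content shown above with a command like the one below (the glob just matches whatever files the sink has rolled into that directory):

hadoop fs -cat hdfs://localhost:54310/user/hadoop-node1/flumeTest/*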


2) On the Flume NG side, I am still getting a security-related IOException when I start flume-ng
using the above configuration file.
The exception log on the flume-ng side is:
2012-07-06 11:14:42,957 (conf-file-poller-0) [DEBUG - org.apache.hadoop.security.Groups.<init>(Groups.java:59)]
Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
2012-07-06 11:14:42,961 (conf-file-poller-0) [DEBUG - org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)]
java.io.IOException: config()
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:214)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:187)
    at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:239)
    at org.apache.hadoop.security.KerberosName.<clinit>(KerberosName.java:83)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:212)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:187)
    at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:239)
    at org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:516)
    at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:239)
    at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
    at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadSinks(PropertiesFileConfigurationProvider.java:373)
    at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:223)
    at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:123)
    at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
    at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:202)




With Regards,
Amit Handa

On Fri, Jul 6, 2012 at 12:21 AM, Mike Percy <mpercy@cloudera.com> wrote:
On Thu, Jul 5, 2012 at 12:28 AM, Amit Handa <amithanda01@gmail.com> wrote:
Hi All,

While trying to run Flume NG with the HDFS sink and an Avro client, I am getting an IOException.
Kindly help in resolving this issue.

Exception log is as follows:
2012-07-05 12:01:32,789 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:70)]
Creating instance of sink HDFS typehdfs
2012-07-05 12:01:32,816 (conf-file-poller-0) [DEBUG - org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)]
java.io.IOException: config()
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:214)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:187)
    at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:239)
....

Nothing is wrong with this: you are running at DEBUG level, and Hadoop is giving you debug-level
output. If you don't want DEBUG-level messages from Hadoop while running Flume at DEBUG level,
you will need to add something like:

log4j.logger.org.apache.hadoop = INFO

to your log4j.properties file.
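
For example, if your conf/log4j.properties looks something like the sketch below (the appender name is just a placeholder for whatever your file already defines), you would append the Hadoop line:

log4j.rootLogger = DEBUG, LOGFILE
# Keep Flume components at DEBUG, but quiet Hadoop's internals down to INFO
log4j.logger.org.apache.hadoop = INFO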

Are you experiencing any problems with your setup?

Regards,
Mike



This E-mail may contain confidential information and/or copyright material. This email is
intended for the use of the addressee only. If you receive this email by mistake, please either
delete it without reproducing, distributing or retaining copies thereof or notify the sender
immediately.