When I configure a collectorSink to use a format different from the
default specified in the flume-conf.xml file, it continues to use the
default. I get the following warning:
2011-09-09 16:02:29,059 [Roll-TriggerThread-0] WARN conf.FlumeBuilder: Deprecated syntax: Expected a format spec but instead had a (String) raw
even though I'm using the following combination of configurations. Is
the XML file configuration supposed to supersede the runtime
configuration?
I'm running version "Flume 0.9.4-cdh3u1".
Here's the complete startup script I'm using:
#!/bin/sh
gnome-terminal -e "flume master"
sleep 10
flume shell -c localhost -e "exec config agent 'tail(\"/var/log/apache2/access.log\")' '[console, collectorSink(\"hdfs://localhost/flume/avro/\",\"log\",60000,avrojson)]'"
gnome-terminal -e "flume node -n agent"
Here's the relevant entry from my flume-conf.xml file:
<property>
<name>flume.collector.output.format</name>
<value>raw</value>
<description>The output format for the data written by a Flume
collector node. There are several formats available:
syslog - outputs events in a syslog-like format
log4j - outputs events in a pattern similar to Hadoop's log4j pattern
raw - Event body only. This is most similar to copying a file but
does not preserve any uniqifying metadata like host/timestamp/nanos.
avro - Avro Native file format. Default currently is uncompressed.
avrojson - this outputs data as json encoded by avro
avrodata - this outputs data as a avro binary encoded data
debug - used only for debugging
</description>
</property>