Hi,
My Flume setup is:
Source agent: cat source - File Channel - Avro Sink
Dest agent: Avro source - File Channel - HDFS Sink
There is only 1 source agent and 1 destination agent.
I measure throughput as the amount of data written to HDFS per second.
(The roll interval is 30 sec, so if a 60 MB file is generated in
30 sec, the throughput is 2 MB/sec.)
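To make the measurement explicit, here is a minimal sketch of the calculation. The function name and parameters are only illustrative (they are not part of Flume); it simply divides the size of one rolled HDFS file by the roll interval.

```python
def throughput_mb_per_sec(file_size_mb: float, roll_interval_sec: float) -> float:
    """Average write throughput for one rolled HDFS file.

    Hypothetical helper: file_size_mb is the size of the file produced
    during one roll interval, roll_interval_sec is that interval.
    """
    if roll_interval_sec <= 0:
        raise ValueError("roll interval must be positive")
    return file_size_mb / roll_interval_sec


# Example from the text: a 60 MB file rolled after 30 sec.
print(throughput_mb_per_sec(60, 30))  # 2.0
```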
I have run the source agent on various machines with different
hardware configurations.
(In all cases I run the Flume agent with the Java options
-DJAVA_OPTS="-Xms500m -Xmx1g -Dcom.sun.management.jmxremote
-XX:MaxDirectMemorySize=2g")
The JDK is 32-bit in all cases.
Experiment 1:
=====
RAM : 16 GB
Processor: Intel Xeon E5620 @ 2.40 GHz (16 cores).
64 bit Processor with 64 bit Kernel.
Throughput: 2 MB/sec
Experiment 2:
======
RAM : 4 GB
Processor: Intel Xeon E5504 @ 2.00 GHz (4 cores).
64 bit Processor with 32 bit Kernel.
Throughput : 30 KB/sec
Experiment 3:
======
RAM : 8 GB
Processor: Intel Xeon E5520 @ 2.27 GHz (16 cores).
64 bit Processor with 32 bit Kernel.
Throughput : 80 KB/sec
-- As can be seen, there is a huge difference in throughput
with the same configuration but different hardware.
-- In the first case, where throughput is higher, RES is around 160 MB;
in the other cases it is in the range of 40 MB - 50 MB.
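To put numbers on the gap, the sketch below compares the three measured throughputs against Experiment 1. It assumes 1 MB = 1024 KB, and the variable and function names are only illustrative.

```python
KB_PER_MB = 1024  # assumption: binary units (1 MB = 1024 KB)

# Measured throughput per experiment, converted to KB/sec.
throughput_kb_per_sec = {
    "Experiment 1": 2 * KB_PER_MB,  # 2 MB/sec
    "Experiment 2": 30,             # 30 KB/sec
    "Experiment 3": 80,             # 80 KB/sec
}


def baseline_ratio(rates: dict, baseline: str) -> dict:
    """Return how many times faster the baseline is than each entry."""
    base = rates[baseline]
    return {name: base / rate for name, rate in rates.items()}


ratios = baseline_ratio(throughput_kb_per_sec, "Experiment 1")
for name, ratio in ratios.items():
    # A ratio of 1.0 is the baseline itself.
    print(f"{name}: Experiment 1 is {ratio:.1f}x this rate")
```

Under that unit assumption, Experiment 1 is roughly 68 times faster than Experiment 2 and roughly 26 times faster than Experiment 3.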
Can anybody please give some insight into why there is this huge
difference in throughput?
What is the correlation between RAM and File Channel/HDFS sink
performance, and also with a 32-bit vs. 64-bit kernel?
Regards,
Jagadish