This appears to happen because, for metrics whose values cannot be converted to float (strings, for example), we still send the type as float, which causes Ganglia to log those messages. It should be fairly easy to write a patch to fix this. I filed FLUME-1870 to track it.
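
To illustrate the idea (this is not the actual FLUME-1870 patch, just a minimal Python sketch), the reporter could check whether a value parses as a number before declaring its type. The function name `ganglia_type_for` is made up for this example; `float` and `string` are the Ganglia type names being chosen between.

```python
def ganglia_type_for(value):
    """Pick the Ganglia type to declare for a metric value.

    Hypothetical helper: returns "float" only when the value really
    parses as a number, and falls back to "string" otherwise.
    """
    try:
        float(value)
    except (TypeError, ValueError):
        # Values such as "SINK" or "CHANNEL" end up here.
        return "string"
    return "float"


if __name__ == "__main__":
    for sample in ["42", "3.5", "SINK", "CHANNEL"]:
        print(sample, "->", ganglia_type_for(sample))
```

With a check like this, a metric such as `Type` with the value `SINK` would be announced as a string, so gmetad should not try to store it in an RRD as a float.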


Not sure when or how it broke, as I know of people using it in production. There is a way to configure it for different versions of Ganglia, such as 3.0 and 3.1. It might be worth trying both values to see whether the problem is with one or the other: http://flume.apache.org/FlumeUserGuide.html#ganglia-reporting
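
A rough sketch of what trying both values could look like, written as a small Python helper that builds the JVM flags for the agent. The property names (`flume.monitoring.type`, `flume.monitoring.hosts`, `flume.monitoring.isGanglia3`) are as I remember them from the user guide section linked above, so please check them against that page for your Flume version. The host `gmond.example.com:8649` and the function name are placeholders.

```python
def ganglia_flags(hosts, is_ganglia3):
    """Build JVM flags that enable Flume's Ganglia reporting.

    Assumption: the property names follow the Flume User Guide's
    Ganglia reporting section; verify them for your version.
    """
    flags = [
        "-Dflume.monitoring.type=ganglia",
        "-Dflume.monitoring.hosts=" + ",".join(hosts),
    ]
    if is_ganglia3:
        # Ask for the Ganglia 3.0 wire format instead of 3.1.
        flags.append("-Dflume.monitoring.isGanglia3=true")
    return flags


if __name__ == "__main__":
    hosts = ["gmond.example.com:8649"]  # placeholder host and port
    for is_ganglia3 in (False, True):
        print(" ".join(ganglia_flags(hosts, is_ganglia3)))
```

Restart the agent once with each set of flags and watch whether gmetad still logs the conversion errors.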

I have not had success with Ganglia either, I think due to the same issue you've encountered.

I have the same problem here, using Flume-NG 1.2 from CDH4.1.2.
I deleted all related RRDs; gmetad recreates them and I see those errors again.

Disk space, inodes, ... are fine. All RRDs not related to Flume-NG are fine, too.

These errors result in a gmetad crash after a while (more metrics => earlier crash). If I disable Ganglia support, gmetad runs without any problem.


Looks like the RRDs are damaged. Maybe the hard disk is full?

> We just noticed that Ganglia's gmetad is spamming messages like the following:
> Dec  9 03:13:20 om-pat-obs01 /usr/sbin/gmetad[17407]: RRD_update (/var/lib/ganglia/rrds/Flume KDDI/blog-wap02/flume.SINK.avro2.Type.rrd): /var/lib/ganglia/rrds/Flume KDDI/blog-wap02/flume.SINK.avro2.Type.rrd: conversion of 'SINK' to float not complete: tail 'SINK'
> Dec  9 03:13:20 om-pat-obs01 /usr/sbin/gmetad[17407]: RRD_update (/var/lib/ganglia/rrds/Flume KDDI/blog-wap02/flume.CHANNEL.ch1.Type.rrd): /var/lib/ganglia/rrds/Flume KDDI/blog-wap02/flume.CHANNEL.ch1.Type.rrd: conversion of 'CHANNEL' to float not complete: tail 'CHANNEL'
> Dec  9 03:13:20 om-pat-obs01 /usr/sbin/gmetad[17407]: RRD_update (/var/lib/ganglia/rrds/Flume KDDI/blog-wap11/flume.SINK.avro1.Type.rrd): /var/lib/ganglia/rrds/Flume KDDI/blog-wap11/flume.SINK.avro1.Type.rrd: conversion of 'SINK' to float not complete: tail 'SINK'
> Dec  9 03:13:20 om-pat-obs01 /usr/sbin/gmetad[17407]: RRD_update (/var/lib/ganglia/rrds/Flume KDDI/blog-wap11/flume.SOURCE.scribe.Type.rrd): /var/lib/ganglia/rrds/Flume KDDI/blog-wap11/flume.SOURCE.scribe.Type.rrd: conversion of 'SOURCE' to float not complete: tail 'SOURCE'
> The counters are tracked fine on the web interface, but we've had issues (gmetad crashing or not starting up; I'm still trying to get specifics from the responsible people). I can't say for sure whether this is a Ganglia problem or a problem with Flume's Ganglia support. Is anyone else who feeds their counters to Ganglia seeing something similar? Perhaps the Ganglia mailing list would be more appropriate; I'm not sure on this one.
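
For anyone who wants to see which metrics are affected, here is a small Python sketch that pulls the RRD path and the offending value out of gmetad log lines like the ones quoted above. The regular expression is based only on those four sample lines, so it is an assumption that other gmetad versions use the same wording. The function name and the log path are placeholders.

```python
import re

# Matches lines such as:
#   ... RRD_update (/path/metric.rrd): /path/metric.rrd: conversion of 'SINK' to float not complete: tail 'SINK'
RRD_ERROR = re.compile(
    r"RRD_update \((?P<path>.+?\.rrd)\): .*"
    r"conversion of '(?P<value>[^']*)' to float not complete"
)


def parse_rrd_errors(lines):
    """Return (rrd_path, offending_value) pairs found in the given log lines."""
    found = []
    for line in lines:
        match = RRD_ERROR.search(line)
        if match:
            found.append((match.group("path"), match.group("value")))
    return found


if __name__ == "__main__":
    # Placeholder log location; adjust for your syslog setup.
    with open("/var/log/messages", errors="replace") as log:
        for path, value in sorted(set(parse_rrd_errors(log))):
            print(f"{value!r} -> {path}")
```

On the sample lines above this reports only the `*.Type.rrd` files, which fits the string-sent-as-float explanation.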

