Hi,
I am using ScribeSource and have run into a string encoding problem.
The LogEntry class calls *iprot.readString()* when it reads a message.
However, thrift's *TBinaryProtocol* implementation of readString() always
converts the byte array to a string using "*UTF-8*". My scribe data is
sent in "*GBK*", so thrift decodes my message as "*UTF-8*" and the text
comes out garbled.
Does the flume scribe source currently accept only UTF-8 encoded messages?
It would be nice if other message encodings could be supported, either
detected automatically or set through configuration.
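To make the problem concrete, here is a minimal standalone sketch that uses only the JDK (no thrift or flume classes). The class and method names are made up for illustration. It assumes the JDK in use ships the GBK charset. It shows that GBK bytes do not survive being decoded as UTF-8, and that re-encoding to UTF-8 on the sending side is one possible workaround.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of thrift or flume.
class GbkUtf8Demo {
    // Assumes the running JDK provides the GBK charset.
    static final Charset GBK = Charset.forName("GBK");

    // Mimics the decoding step in readString(): bytes are always treated as UTF-8.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // True when the original bytes can still be recovered from the decoded string.
    static boolean survivesUtf8Decoding(byte[] raw) {
        byte[] back = decodeAsUtf8(raw).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(raw, back);
    }

    // Possible sender-side workaround: re-encode GBK bytes as UTF-8 before sending.
    static byte[] gbkToUtf8(byte[] gbkBytes) {
        return new String(gbkBytes, GBK).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file stays ASCII.
        String text = "\u4e2d\u6587";
        byte[] gbkBytes = text.getBytes(GBK);

        // GBK bytes are damaged by a UTF-8 decode.
        System.out.println("GBK survives UTF-8 decode: " + survivesUtf8Decoding(gbkBytes));

        // After transcoding, a UTF-8 decode gives back the original text.
        String decoded = decodeAsUtf8(gbkToUtf8(gbkBytes));
        System.out.println("Transcoded text matches: " + decoded.equals(text));
    }
}
```

Transcoding on the sender only helps when the sending application can be changed. It does nothing for data that has already been decoded incorrectly.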
LogEntry:
public void read(org.apache.thrift.protocol.TProtocol iprot) throws org.apache.thrift.TException {
  org.apache.thrift.protocol.TField field;
  iprot.readStructBegin();
  while (true)
  {
    field = iprot.readFieldBegin();
    if (field.type == org.apache.thrift.protocol.TType.STOP) {
      break;
    }
    switch (field.id) {
      case 1: // CATEGORY
        if (field.type == org.apache.thrift.protocol.TType.STRING) {
          this.category = iprot.readString();
        } else {
          org.apache.thrift.protocol.TProtocolUtil.skip(iprot, field.type);
        }
        break;
      case 2: // MESSAGE
        if (field.type == org.apache.thrift.protocol.TType.STRING) {
          this.message = iprot.readString();
        } else {
          org.apache.thrift.protocol.TProtocolUtil.skip(iprot, field.type);
        }
        break;
      default:
        org.apache.thrift.protocol.TProtocolUtil.skip(iprot, field.type);
    }
    iprot.readFieldEnd();
  }
  iprot.readStructEnd();
  // check for required fields of primitive type, which can't be
  // checked in the validate method
  validate();
}
TBinaryProtocol:
public String readString() throws TException {
  int size = this.readI32();
  if(this.trans_.getBytesRemainingInBuffer() >= size) {
    try {
      String e = new String(this.trans_.getBuffer(),
          this.trans_.getBufferPosition(), size, "UTF-8");
      this.trans_.consumeBuffer(size);
      return e;
    } catch (UnsupportedEncodingException var3) {
      throw new TException("JVM DOES NOT SUPPORT UTF-8");
    }
  } else {
    return this.readStringBody(size);
  }
}
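To illustrate what "through configuration" could mean, here is a rough sketch of a charset-aware variant of the same read: a 4-byte big-endian length followed by that many bytes, decoded with a charset passed in by the caller instead of a hard-coded "UTF-8". CharsetAwareReader, readString(ByteBuffer, Charset) and frame() are hypothetical names used only for this example. They are not existing thrift or flume APIs, and the sketch reads from a plain ByteBuffer rather than a thrift transport.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of thrift or flume.
class CharsetAwareReader {
    // Reads a length-prefixed string and decodes it with the given charset.
    static String readString(ByteBuffer in, Charset charset) {
        int size = in.getInt(); // ByteBuffer is big-endian by default
        if (size < 0 || size > in.remaining()) {
            throw new IllegalArgumentException("Invalid string length: " + size);
        }
        byte[] raw = new byte[size];
        in.get(raw);
        return new String(raw, charset);
    }

    // Builds a length-prefixed frame around raw bytes, for the demo below.
    static ByteBuffer frame(byte[] raw) {
        ByteBuffer buf = ByteBuffer.allocate(4 + raw.length);
        buf.putInt(raw.length).put(raw);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK"); // assumes the JDK provides GBK
        String text = "\u4e2d\u6587";         // two Chinese characters as escapes
        byte[] payload = text.getBytes(gbk);

        // Decoding with the sender's charset recovers the text.
        System.out.println(readString(frame(payload), gbk).equals(text));
        // Decoding the same bytes as UTF-8 does not.
        System.out.println(readString(frame(payload), StandardCharsets.UTF_8).equals(text));
    }
}
```

In a real source the charset would presumably come from a configuration property. That property is an assumption here, not an option I know to exist.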