From Deepak Subhramanian <deepak.subhraman...@gmail.com>
Subject Re: Json over netcat source
Date Fri, 09 May 2014 11:02:11 GMT
Sorry. My mistake. It is loading JSON data properly after the temporary fix.


On Thu, May 8, 2014 at 6:24 PM, Deepak Subhramanian <
deepak.subhramanian@gmail.com> wrote:

> Hi Ashish,
>
> Thanks for the solution. I made the changes and I can see the JSON message
> now. There is a JIRA raised on the same issue.
>
> https://issues.apache.org/jira/browse/FLUME-2126
>
>
> When I load JSON data from Hive, it automatically splits the JSON fields
> into separate columns. For some reason the ESSink doesn't load it the same
> way. I am not sure if I am setting the correct type. In the Hive table I
> have to set the parameter es.input.json to true. Is there a similar
> setting I have to set for the ESSink?
>
> Here is the raw data I am getting in Kibana.
>
> {
>   "_index": "test-2014-05-08",
>   "_type": "parsed_logs",
>   "_id": "7qSBgRx-Q_GLaCDWARs_Cg",
>   "_score": null,
>   "_source": {
>     "@message": "{\"action\":{\"id\":\"00001\"}}",
>     "@timestamp": "2014-05-08T16:48:44.180Z",
>     "@type": "application/json",
>     "@fields": {
>       "_attachment_mimetype": "application/json",
>       "timestamp": "1399567724180",
>       "_type": "application/json",
>       "type": "application/json"
>     }
>   },
>   "sort": [
>     1399567724180
>   ]
> }
>
>
>
> On Sun, Apr 13, 2014 at 4:56 PM, Ashish <paliwalashish@gmail.com> wrote:
>
>> A little more on the issue:
>>
>> builder.field(fieldName, tmp); calls the XContentBuilder API, where the
>> class type is determined and the appropriate method is called. Since tmp,
>> which is an instance of XContentBuilder, doesn't match any of the defined
>> if conditions, it falls through to the final else, where tmp.toString() is
>> called and the field(String, String) method is invoked, so we get the
>> object address in the index.
>>
>> Replacing
>> builder.field(fieldName, tmp);
>> with
>> builder.field(fieldName, tmp.string());
>>
>> shall make things work, but I am not sure if this would be the best way
>> to use the API.
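
The fallthrough described above can be illustrated with a minimal, self-contained sketch. It does not use the Elasticsearch API: NestedJson and MiniBuilder are hypothetical stand-ins for the nested XContentBuilder and the outer builder, and the type checks only approximate the dispatch the message describes. The point is only that an unrecognized type reaches the final else and is stored via toString(), while passing the JSON text explicitly avoids that.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FieldDispatchSketch {

    /** Hypothetical stand-in for the nested builder (tmp in the thread). */
    static final class NestedJson {
        private final String json;

        NestedJson(String json) {
            this.json = json;
        }

        /** Returns the JSON text, analogous to calling string() on tmp. */
        String string() {
            return json;
        }
        // toString() is deliberately not overridden, so Object.toString()
        // applies and yields a class-name@hash identity string.
    }

    /** Hypothetical stand-in for the outer builder. */
    static final class MiniBuilder {
        private final Map<String, String> fields = new LinkedHashMap<>();

        MiniBuilder field(String name, Object value) {
            if (value instanceof String) {
                fields.put(name, (String) value);
            } else if (value instanceof Number || value instanceof Boolean) {
                fields.put(name, String.valueOf(value));
            } else {
                // Final else: the type is not recognized, so toString() is used.
                fields.put(name, value.toString());
            }
            return this;
        }

        String get(String name) {
            return fields.get(name);
        }
    }

    public static void main(String[] args) {
        NestedJson tmp = new NestedJson("{\"action\":{\"id\":\"00001\"}}");

        // Passing the object itself ends up in the final else branch.
        MiniBuilder broken = new MiniBuilder().field("@message", tmp);
        // Passing the JSON text matches the String branch.
        MiniBuilder fixed = new MiniBuilder().field("@message", tmp.string());

        System.out.println("object passed: " + broken.get("@message"));
        System.out.println("string passed: " + fixed.get("@message"));
    }
}
```

Note that in this sketch the fixed variant stores the JSON as plain text, which is consistent with the escaped "@message" string visible in the Kibana output quoted earlier in the thread.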
>>
>> Got the answer from ES user list :)
>>
>> http://elasticsearch-users.115913.n3.nabble.com/Issue-with-posting-json-data-to-elastic-search-via-Flume-td4054017.html
>>
>> Can ES experts comment on the best way forward?
>>
>>
>>
>> On Sun, Apr 13, 2014 at 8:10 PM, Ashish <paliwalashish@gmail.com> wrote:
>>
>>> I have been able to reproduce the problem locally using the existing test
>>> cases inside the ES Sink. The problem does exist.
>>>
>>> From some initial investigation: the framework is able to detect the JSON
>>> content and tries to add it as a complex field.
>>> The timestamp is added only if present in the header.
>>>
>>> In the class org.apache.flume.sink.elasticsearch.ContentBuilderUtil
>>>
>>> public static void addComplexField(XContentBuilder builder, String fieldName,
>>>       XContentType contentType, byte[] data) throws IOException {
>>>     XContentParser parser = null;
>>>     try {
>>>       XContentBuilder tmp = jsonBuilder();
>>>       parser = XContentFactory.xContent(contentType).createParser(data);
>>>       parser.nextToken();
>>>       tmp.copyCurrentStructure(parser);
>>>       // This is where we might have an issue (the real action is
>>>       // happening inside this method call).
>>>       builder.field(fieldName, tmp);
>>>
>>> Can someone familiar with this part look further into this? I shall
>>> debug further as soon as I have free cycles.
>>>
>>> thanks
>>> ashish
>>>
>>>
>>>
>>> On Fri, Apr 11, 2014 at 5:24 PM, Deepak Subhramanian <
>>> deepak.subhramanian@gmail.com> wrote:
>>>
>>>> Thanks Simon. I am also struggling with no luck. I tried using the
>>>> latest Flume Elasticsearch sink jar built from 1.5SNAPSHOT, but still no
>>>> luck. I will try to see if it is an issue with the Elasticsearch API. When
>>>> I loaded JSON using Hive, it loaded the JSON properly. But we have to pass
>>>> a property es.input.json in Hive. Is there a way to pass the same in Flume?
>>>>
>>>> CREATE EXTERNAL TABLE json (data STRING)
>>>> STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
>>>> TBLPROPERTIES('es.resource' = '...',
>>>>               'es.input.json' = 'yes');
>>>>
>>>>
>>>
>>>
>>> --
>>> thanks
>>> ashish
>>>
>>> Blog: http://www.ashishpaliwal.com/blog
>>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>>
>>
>>
>>
>> --
>> thanks
>> ashish
>>
>> Blog: http://www.ashishpaliwal.com/blog
>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>
>
>
>
> --
> Deepak Subhramanian
>



-- 
Deepak Subhramanian
