flume-user mailing list archives

From Gonzalo Herreros <gherre...@gmail.com>
Subject Re: flume problem
Date Wed, 02 Mar 2016 12:03:17 GMT
That looks like JSON, but it's Avro.
You need to read it using the Avro library, or change your sink to serialize
text.
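
A minimal sketch of the second option, assuming the sink is Flume's HDFS sink (the thread mentions an HDFS sink, and the `SEQ` header in the quoted output is the signature of a Hadoop SequenceFile, which is the HDFS sink's default file type). The agent name `tier1` comes from the configuration quoted further down; the sink name `sink1` is hypothetical:

```properties
# Assumption: an HDFS sink named "sink1" on the "tier1" agent (sink name is hypothetical).
tier1.sinks.sink1.type = hdfs
# Write the raw event body instead of the default SequenceFile container.
tier1.sinks.sink1.hdfs.fileType = DataStream
# Serialize each event body as plain text.
tier1.sinks.sink1.hdfs.writeFormat = Text
```

With `DataStream`, the default `TEXT` serializer writes only the event body and, unless `serializer.appendNewline` is set to false, appends a newline after each event, so each post should end up on its own line.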

On 2 March 2016 at 11:28, Baris Akgun (Garanti Teknoloji) <
BarisAkgu@garanti.com.tr> wrote:

> I have added a sample of the sink output. Why does Flume add the yellow
> part? I thought the yellow part was the content type.
>
> Flume should write each separate post on a new line, am I right? In our
> example, Flume writes each new post directly after the previous one instead
> of starting a new line.
>
> Thanks
>
> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable
> ��mq��I$�����{��US-�\�I{"id":"tag:search.twitter.com
> ,2005:642910625514016769","objectType":"activity","actor":{"objectType":"person","id":"id:
> twitter.com:770439973","link":"http://www.twitter.com/CanKurnaz5","displayName":"Can
> Kurnaz","postedTime":"2012-08-20T23:40:34.000Z","image":"
> https://pbs.twimg.com/profile_images/642910076290879492/T4-UBuZE_normal.jpg","summary":null,"links":[{"href":null,"rel":"me"}],"friendsCount":62,"followersCount":40,"listedCount":1,"statusesCount":378,"twitterTimeZone":null,"verified":false,"utcOffset":null,"preferredUsername":"CanKurnaz5","languages":["tr"],"favoritesCount":507},"verb":"post","postedTime":"2015-09-13T04:00:12.000Z","generator":{"displayName":"Twitter
> for iPhone","link":"http://twitter.com/download/iphone
> "},"provider":{"objectType":"service","displayName":"Twitter","link":"
> http://www.twitter.com"},"link":"
> http://twitter.com/CanKurnaz5/statuses/642910625514016769","body":"@mbilgehandemir
> 32 milyon dolar garanti para güzel bende yerim o paraya
> dayak","object":{"objectType":"note","id":"object:search.twitter.com,2005:642910625514016769","summary":"@mbilgehandemir
> 32 milyon dolar garanti para güzel bende yerim o paraya dayak","link":"
> http://twitter.com/CanKurnaz5/statuses/642910625514016769
> ","postedTime":"2015-09-13T04:00:12.000Z"},"inReplyTo":{"link":"
> http://twitter.com/mbilgehandemir/statuses/642908818138120192"},"favoritesCount":0,"twitter_entities":{"hashtags":[],"trends":[],"urls":[],"user_mentions":[{"screen_name":"mbilgehandemir","name":"Bilgehan
> Demir","id":180756706,"id_str":"180756706","indices":[0,15]}],"symbols":[]},"twitter_filter_level":"low","twitter_lang":"tr","retweetCount":0,"gnip":{"matching_rules":[{"value":"(bio_lang:tr
> OR twitter_lang:tr OR lang:tr) (\"garantı\" OR \"garanti\" OR \"GARANTI\"
> OR \"GARANTİ\")","tag":null},{"value":"(bio_lang:tr OR twitter_lang:tr OR
> lang:tr) garanti -(#29ekimekadartakipteyiz OR #UnfsuzSeriTakipleselim OR
> #TatilAskinaTakipleselim OR #takipedenitakipederim OR #DurmaTakipleselim OR
> #SenSakrakTakipleselimYine OR #TakipedeneAnindaGeriTakip OR
> #TakipEdenTakipEdilir OR #Garanti_Takipçiyim OR #GARANTI_TAKIPÇIYIM OR
> #BicirBicirTakipleselim OR #geritakip OR #KosKosTakipvar OR
> #AnindaGeriTakip OR #KesinTakipVar OR #Hepimiz_Takipleselim OR
> #Hepimiz_Takipleselim OR #TakipleselimMutluOlalim OR #garantitakip OR
> #TakipcininiyisiRTyapar OR #Hepimiz_Takipleselim. OR
> #takibetakip)","tag":null},{"value":"(bio_lang:tr OR twitter_lang:tr OR
> lang:tr) garanti -(\"Garanti süresi\" OR \"Gülmek garanti\" OR \"Puan
> Garanti\" OR \"Final Garanti\" OR \"Garanti kupon\" OR \"Unfollow Garanti\"
> OR \"Garanti ediyorum\" OR \"Takip
> Garanti\")","tag":null},{"value":"(bio_lang:tr OR twitter_lang:tr OR
> lang:tr) garanti -(\"Gülmek garanti\" OR \"Garanti edebilirim\" OR \"Kilo
> garanti\" OR \"Cennet garanti\" OR \"Kupa Garanti\" OR \"Garanti
> veriyorum\" OR \"Garanti belgesi\")","tag":null},{"value":"(bio_lang:tr OR
> twitter_lang:tr OR lang:tr) garanti -(#CenkAkyolYalnizDegildir OR
> #TakipedeneAnindaGeriTakip OR #autofollowback OR #ifollowback OR #ff OR
> #takipedenitakipederim OR #TeamFollowBack OR #followme OR #takibetakip OR
> #takipedentakipedilir OR #garantiemakelaars OR #DostlarlaTakipleselim OR
> #takipedentakipedilir OR #KizliErkekliTakipleselim OR
> #TakipleselimMutluOlalim OR #BizSuperizYineTakiplesiyoruz OR
> #DurmaTakipleselim OR #takibetakip OR #FigürsüzlerIleGeceTakibi OR
> #MuratCuvallSayesindeTakiplesiyoruz)","tag":null},{"value":"(bio_lang:tr OR
> twitter_lang:tr OR lang:tr) garanti -(\"Garanti Takip\" OR \"Tur garanti\"
> OR \"Kupa garanti\" OR \"sampiyonluk garanti\" OR \"Garanti kapsami\" OR
> \"madalya garanti\" OR \"takipçi garanti\" OR \"lig garanti\" OR \"ligi
> garanti\" OR \"kopmak garanti\" OR \"Garanti_Takipciyim\" OR \"TAKİP EDENİ
> TAKİP EDERİM\" OR \"takip edeni takip
> ederim\")","tag":null},{"value":"(bio_lang:tr OR twitter_lang:tr OR
> lang:tr) garanti -(#takipcikazan OR #TakipleselimMutluOlalim OR
> #TakipVarDediler OR #EtkilesimIleSeriTakip OR #Cumatakipteyiz OR
> #TakipedeneAnindaGeriTakip OR #Takiplerdeyiz OR #AnindaGeriTakip OR
> #garanti_takipçiyim OR #takipçikazan)","tag":null},{"value":"(bio_lang:tr
> OR twitter_lang:tr OR lang:tr) (contains:garantı OR contains:garanti OR
> contains:paracard)","tag":null},{"value":"(bio_lang:tr OR twitter_lang:tr
> OR lang:tr) garanti -(#EtkilesimciTayfaTakiplesiyor OR
> #BugünPazarKarsilikliTakipleselim OR #TümHayranGruplariHaftaSonuTakibi OR
> #Takiplerdeyiz OR #MeteHorozogluHayranlariTakiplesiyor OR #garantilitakip
> OR #Garanti_Takipciyim OR #DurmaTakipleselim OR #Hepimiz_Takipleselim OR
> #100de100GeriTakip OR #Problemsiz_FullTakipteyiz OR
> #EtkilesimSeverlerIleSeriTakip)","tag":null}],"language":{"value":"tr"}}}
> ������mq��I$�����{��
>
> ?S-�\�
>
> 3{"id":"tag:search.twitter.com
> ,2005:642910743302668288","objectType":"activity","actor":{"objectType":"person","id":"id:
> twitter.com:347936349","link":"http://www.twitter.com/semagokcee","displayName":"
> ???? ???","postedTime":"2011-08-03T16:18:41.000Z","image":"
> https://pbs.twimg.com/profile_images/571609282766098432/ed6KzjNX_normal.jpeg","summary":null,"links":[{"href":null,"rel":"me"}],"friendsCount":654,"followersCount":510,"listedCount":10,"statusesCount":21790,"twitterTimeZone":"Baghdad","verified":false,"utcOffset":"10800","preferredUsername":"semagokcee","languages":["tr"],"location":{"objectType":"place","displayName":"AGD
> Eyüpsultan"},"favoritesCount":4531},"verb":"share","postedTime":"2015-09-13T04:00:40.000Z","generator":{"displayName":"Twitter
> for Android","link":"http://twitter.com/download/android
> "},"provider":{"objectType":"service","displayName":"Twitter","link":"
> http://www.twitter.com"},"link":"
> http://twitter.com/semagokcee/statuses/642910743302668288","body":"RT
> @Hadis_Tweet: \"Kim sabah namazını kılarsa, Allah'ın garantisi
> altındadır.\" (Kütüb-i Sitte, c.17, s.541)\n#Hadis","object":{"id":"tag:
> search.twitter.com
> ,2005:642906134421065728","objectType":"activity","actor":{"objectType":"person","id":"id:
> twitter.com:266950198","link":"http://www.twitter.com/Hadis_Tweet","displayName":"Hadis-i
> Şerif","postedTime":"2011-03-16T02:39:31.000Z","image":"
> https://pbs.twimg.com/profile_images/378800000408493403/279dd45b0a07d3965afefa59b1245f22_normal.png","summary":"Burada
> Hadis-i Şerif Paylaşılır.Gayemiz Hz.Muhammed 'in(a.s.m) Sahih
> Kaynaklardan Hadis-i Şerifler'ini paylaşmak ve duyurmaktır.Selam ve Dua
> ile @Dua_Kardesligi","links":[{"href":"http://hadistweet.blogspot.com.tr/","rel":"me"}],"friendsCount":3387,"followersCount":376308,"listedCount":382,"statusesCount":5025,"twitterTimeZone":"Istanbul","verified":false,"utcOffset":"10800","preferredUsername":"Hadis_Tweet","languages":["tr"],"favoritesCount":623},"verb":"post","postedTime":"2015-09-13T03:42:21.000Z","generator":{"displayName":"Twitter
> Web Client","link":"http://twitter.com
> "},"provider":{"objectType":"service","displayName":"Twitter","link":"
> http://www.twitter.com"},"link":"
> http://twitter.com/Hadis_Tweet/statuses/642906134421065728","body":"\"Kim
> sabah namazını kılarsa, Allah'ın garantisi altındadır.\" (Kütüb-i Sitte,
> c.17, s.541)\n#Hadis","object":{"objectType":"note","id":"object:
> search.twitter.com,2005:642906134421065728","summary":"\"Kim sabah
> namazını kılarsa, Allah'ın garantisi altındadır.\" (Kütüb-i Sitte, c.17,
> s.541)\n#Hadis","link":"
> http://twitter.com/Hadis_Tweet/statuses/642906134421065728","postedTime":"2015-09-13T03:42:21.000Z"},"favoritesCount":61,"twitter_entities":{"hashtags":[{"text":"Hadis","indices":[90,96]}],"trends":[],"urls":[],"user_mentions":[],"symbols":[]},"twitter_filter_level":"low","twitter_lang":"tr"},"favoritesCount":0,"twitter_entities":{"hashtags":[{"text":"Hadis","indices":[107,113]}],"trends":[],"urls":[],"user_mentions":[{"screen_name":"Hadis_Tweet","name":"Hadis-i
> Şerif","id":266950198,"id_str":"266950198","indices":[3,15]}],"symbols":[]},"twitter_filter_level":"low","twitter_lang":"tr","retweetCount":33,"gnip":{"matching_rules":[{"value":"(bio_lang:tr
> OR twitter_lang:tr OR lang:tr) (contains:garantı OR contains:garanti OR
> contains:paracard)","tag":null}],"language":{"value":"tr"}}}
>
>
>
>
>
> *From:* Gonzalo Herreros [mailto:gherreros@gmail.com]
> *Sent:* Wednesday, March 2, 2016 12:01 PM
> *To:* user
> *Subject:* Re: flume problem
>
>
>
> The channel serializes the Flume event as Avro, including the headers; the
> HTTP headers become event headers.
>
> However, the sink should store only the content, not the headers.
>
>
>
> On 2 March 2016 at 09:51, Baris Akgun (Garanti Teknoloji) <
> BarisAkgu@garanti.com.tr> wrote:
>
> No, we send Twitter JSON data, but in the Flume channel I see the content
> type text before each tweet. Is that normal? How can I send just the tweet
> JSON without any content type? I get the tweet JSON from the company GNIP.
>
>
>
> Thanks
> Sent from my iPhone
>
>
> On 2 Mar 2016 at 10:56, Gonzalo Herreros <gherreros@gmail.com> wrote:
>
> Could it be that you are serializing avro instead of json?
>
>
>
> On 2 March 2016 at 08:25, Baris Akgun (Garanti Teknoloji) <
> BarisAkgu@garanti.com.tr> wrote:
>
> Hi,
>
>
>
> When I send JSON data to Flume using HTTP POST, Flume adds
> **Content-Typeapplication/json** to each JSON post.
>
> In my Java HTTP POST code, I set the content type with
>
> **con.setRequestProperty("Content-Type", "application/json");**
>
> I am using the blob handler.
>
>
>
> **In flume conf file**
>
> tier1.sources.source1.type = org.apache.flume.source.http.HTTPSource
>
> tier1.sources.source1.handler = org.apache.flume.sink.solr.morphline.BlobHandler
>
>
>
> In the Flume channel, Flume adds the content type to each post, as you can
> see. After the HDFS sink, the content type text causes a problem when I try
> to parse the JSON with Spark SQL or a Hive SerDe.
>
>
>
> **The flume channel log data**
>
>
>
> *^LContent-Typeapplication/jsonú{"id":"+ag:_ea_ch.++i++e_.c-
>
> ^LContentTypeapplication/json‘{"id":"tag:search.twitter.com
> ,2005:642913165047648*
>
>
>
> Does anyone have an idea about this problem?
>
> Thanks a lot.
>
>
>
> *Barış Akgün*
> Analytical Data Warehouse and Big Data Management
> Specialist
>
>
>
> This message and attachments are confidential and intended solely for the
> individual(s) stated in this message. If you received this message although
> you are not the addressee, you are responsible to keep the message
> confidential. The sender has no responsibility for the accuracy or
> correctness of the information in the message and its attachments. Our
> company shall have no liability for any changes or late receiving, loss of
> integrity and confidentiality, viruses and any damages caused in anyway to
> your computer system.
>
>
>
