How about using Python?
From: Ashish [mailto:paliwalashish@gmail.com]
Sent: Tuesday, December 31, 2013 9:53 AM
To: user@flume.apache.org
Subject: Re: Event breaking in flume
Have a look at org.apache.flume.serialization.LineDeserializer in the flume-ng-core module.
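To illustrate the idea (this is not Flume code): a deserializer hands back one event per call, so a multi-line version needs a one-line look-ahead to notice where the next event starts. Below is a rough Python sketch of that logic only. The class name MultiLineReader, the method read_event and the timestamp pattern are assumptions made up for this example, with the pattern guessed from the sample log further down; a real deserializer would be written in Java against Flume's EventDeserializer interface.

```python
import io
import re

# Assumed timestamp layout, copied from the sample log: "10 Sep 2013 19:43:33,561 "
EVENT_START = re.compile(r"^\d{1,2} [A-Z][a-z]{2} \d{4} \d{2}:\d{2}:\d{2},\d{3} ")


class MultiLineReader:
    """Hypothetical helper: returns one multi-line event per read_event() call."""

    def __init__(self, stream):
        self._stream = stream
        self._lookahead = None  # first line of the next event, already consumed

    def read_event(self):
        """Return the next event as a string, or None at end of input."""
        lines = []
        if self._lookahead is not None:
            lines.append(self._lookahead)
            self._lookahead = None
        for raw in self._stream:
            line = raw.rstrip("\r\n")
            if lines and EVENT_START.match(line):
                self._lookahead = line  # this line belongs to the next event
                break
            lines.append(line)
        return "\n".join(lines) if lines else None
```

Calling read_event() repeatedly on an open log file (or an io.StringIO in a test) returns each timestamped line together with the stack-trace lines that follow it, and None once the input is exhausted.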
On Tue, Dec 31, 2013 at 9:24 AM, Chhaya Vishwakarma <Chhaya.Vishwakarma@lntinfotech.com> wrote:
Hi Brock,
Thanks. Using the spooling directory source with a deserializer looks good; however, I have
no idea how to write a custom deserializer.
Could you give me a hint on how to go about writing my own deserializer? It would be a
great help.
Regards,
Chhaya Vishwakarma
From: Brock Noland [mailto:brock@cloudera.com]
Sent: Monday, December 30, 2013 7:48 PM
To: user@flume.apache.org
Subject: Re: Event breaking in flume
Yes, it is possible to handle multi-line events, and handling stack traces is very commonplace.
However, using the exec source is going to be limiting. The "correct" solution is:
1) Use the spooling directory source
2) Write a little deserializer to handle your format.
Another solution is:
1) Replace newlines with something like __NL__ using a Perl script in your exec source
2) Use morphlines to replace __NL__ with \n
A third and less desirable solution would be:
1) Use the morphlines interceptor to merge multiple events into a single event. This will not
work well for a variety of reasons, the most common being that the exec source could hit
its "batch" size in the middle of a stack trace, in which case the stack trace will end up
in two different batches.
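For the second option, the escaping step could look roughly like the Python sketch below instead of Perl. It assumes that every event starts with a timestamp shaped like the one in the sample log; the function names escape_events and unescape are hypothetical names chosen for this example. The unescape function only mirrors what the morphlines step would do downstream, it is not a morphlines command.

```python
import re

# Assumed timestamp layout, copied from the sample log: "10 Sep 2013 19:43:33,561 "
EVENT_START = re.compile(r"^\d{1,2} [A-Z][a-z]{2} \d{4} \d{2}:\d{2}:\d{2},\d{3} ")
MARKER = "__NL__"


def escape_events(lines):
    """Yield one physical line per logical event, joining its lines with MARKER."""
    pending = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if pending and EVENT_START.match(line):
            # A new timestamp closes the event collected so far.
            yield MARKER.join(pending)
            pending = []
        pending.append(line)
    if pending:
        yield MARKER.join(pending)


def unescape(body):
    """Restore the original newlines (the job of the downstream step)."""
    return body.replace(MARKER, "\n")
```

Run as a filter, the script would loop over sys.stdin and print each escaped event with flush=True. One caveat of this sketch: when fed from tail, the most recent event is only emitted once the next timestamped line arrives.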
Brock
On Mon, Dec 30, 2013 at 5:05 AM, Joao Salcedo <joao.salcedo@gmail.com> wrote:
It looks like it is possible, based on regular-expression pattern matching:
http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#/readMultiLine
On Mon, Dec 30, 2013 at 9:56 PM, Chhaya Vishwakarma <Chhaya.Vishwakarma@lntinfotech.com> wrote:
So is it not possible to handle multi-line events in Flume?
From: Joao Salcedo [mailto:joao.salcedo@gmail.com]
Sent: Monday, December 30, 2013 4:22 PM
To: user@flume.apache.org
Subject: Re: Event breaking in flume
Maybe you can set up some morphlines and do some ETL on your events.
I hope this helps.
http://blog.cloudera.com/blog/2013/07/morphlines-the-easy-way-to-build-and-integrate-etl-apps-for-apache-hadoop/
Cheers
On Mon, Dec 30, 2013 at 9:34 PM, Ashish <paliwalashish@gmail.com> wrote:
I am not aware of any option for this out of the box. Maybe someone else can help.
An alternative is to write a custom source.
On Mon, Dec 30, 2013 at 3:56 PM, Chhaya Vishwakarma <Chhaya.Vishwakarma@lntinfotech.com> wrote:
Hi,
I am using an exec source with the tail command.
From: Ashish [mailto:paliwalashish@gmail.com]
Sent: Monday, December 30, 2013 3:48 PM
To: user@flume.apache.org
Subject: Re: Event breaking in flume
Which source are you using?
On Mon, Dec 30, 2013 at 3:23 PM, Chhaya Vishwakarma <Chhaya.Vishwakarma@lntinfotech.com> wrote:
Hi,
By default Flume treats one line as one event, but I want to break events on some other
criterion. How can this be achieved in Flume? Is it possible?
10 Sep 2013 19:43:33,561 [WebContainer : 9] ERROR - An Error has occured for com.marsh.framework.core.exception.MarshException: Record has been modified since last retrieved - Resubmit transaction
10 Sep 2013 19:43:33,561 [WebContainer : 9] ERROR - handleException():com.marsh.framework.core.exception.MarshException: Record has been modified since last retrieved - Resubmit transaction
at com.marsh.csa.serviceagreement.ServiceAgreementImpl.updateAgreement(ServiceAgreementImpl.java(Compiled Code))
at com.marsh.csa.serviceagreementmgmt.CSAManagerImpl.updateCSA(CSAManagerImpl.java(Compiled Code))
at com.marsh.csa.serviceagreementmgmt.ejb.EJSRemoteStatelessServiceagreementManager_3dcfd156.updateCSA(Unknown Source)
at com.marsh.csa.serviceagreementmgmt.ejb._ServiceagreementManagerRemote_Stub.updateCSA(_ServiceagreementManagerRemote_Stub.java(Compiled Code))
at com.marsh.csa.proxy.CSAProxy.updateCSA(CSAProxy.java(Compiled Code))
at com.marsh.csa.serviceagreement.SaveCSAAction.performAction(SaveCSAAction.java(Compiled Code))
at com.marsh.csa.serviceagreement.CSAAbstractStrutsAction.execute(CSAAbstractStrutsAction.java(Compiled Code))
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java(Inlined Compiled Code))
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java(Compiled Code))
Caused by: com.marsh.framework.core.exception.MarshException: Record has been modified since last retrieved - Resubmit transaction
at com.marsh.csa.serviceagreement.ServiceAgreementDAO.updateServiceAgreement(ServiceAgreementDAO.java(Compiled Code))
at com.marsh.csa.serviceagreement.ServiceAgreementDAO.update(ServiceAgreementDAO.java(Compiled Code))
at com.marsh.csa.serviceagreement.SAUpdateImpl.updateServiceAgreement(SAUpdateImpl.java(Compiled Code))
at com.marsh.csa.serviceagreement.SAUpdateImpl.update(SAUpdateImpl.java(Compiled Code))
... 26 more
Caused by: com.marsh.framework.core.exception.MarshException: Record has been modified since last retrieved - Resubmit transaction
at com.marsh.csa.serviceagreement.SaveCSAAction.performAction(SaveCSAAction.java(Compiled Code))
at com.marsh.csa.serviceagreement.CSAAbstractStrutsAction.execute(CSAAbstractStrutsAction.java(Compiled Code))
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java(Inlined Compiled Code))
at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java(Compiled Code))
at org.apache.struts.action.ActionServlet.process(ActionServlet.java(Inlined Compiled Code))
at org.apache.struts.action.ActionServlet.doPost(ActionServlet.java(Compiled Code))
at javax.servlet.http.HttpServlet.service(HttpServlet.java(Compiled Code))
at javax.servlet.http.HttpServlet.service(HttpServlet.java(Compiled Code))
at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java(Compiled Code))
This is a log file that I am writing to HBase. I want whatever is highlighted in yellow to be
one event and whatever is in gray to be another event.
Basically, I want to break the events on the date. Is that possible?
Regards,
Chhaya Vishwakarma
--
thanks
ashish
Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org