flume-user mailing list archives

From Patricio Eduardo Huichulef Carvajal <phuichu...@gmail.com>
Subject Flume Problem - Data Lost
Date Tue, 20 Jun 2017 20:54:55 GMT
Hello Folks:

     I would like to request your help with a Flume configuration that 
replicates files from one node to another. We currently have an issue 
with files being lost during the replication process.

     The following diagram represents the current architecture in which 
Flume is running, replicating files in Avro format to HDFS and Solr.


     When we check the data at both destinations, we find that not all 
of it was replicated from the source; some files are missing.

     Below is the configuration file for Node 2:


Node 2:

a1.sources = r1

# Describe/configure the source

a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 50001

a1.channels = c1 c2

# Use a channel c1 which buffers events in memory

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# Use a channel c2 which buffers events in memory

a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000

# Interceptor definition, in case multiplexing is used

// The customer wants to use the replicating selector. Is it necessary to 
keep the interceptor declaration in the file? As we understand it, the 
interceptor is for multiplexing only. //

a1.sources.r1.interceptors = pcgInterceptor
a1.sources.r1.interceptors.pcgInterceptor.type = pcg.PcgInterceptor$Builder
a1.sources.r1.interceptors.pcgInterceptor.solrServer = http://XXX.YYY.ZZZ.WWW:NN/solr/pcgs_dt_datos_panel_cntrl_shard1_replica1/
a1.sources.r1.interceptors.pcgInterceptor.paramKeys = TPO_REG,START_YEAR,START_MONTH,START_DAY,START_HOUR,START_MINUTE,START_SECONDS,END_YEAR,END_MONTH,END_DAY,END_HOUR,END_MINUTE,END_SECONDS,COD_EST,PROCESS_NAME,TPO_MLL,SGL_SIS,SGL_SUB_SIS,NOM_TAR,COD_SEC,PSO_ARQ,REG_LEI,REG_PCS,REG_RCH,NOM_ARQ,FEC_OPE,DISK,MEMORY,CPU,PID,RANKING,END_LINE

# Define channel selector and define mapping

a1.sources.r1.selector.type = replicating
a1.sinks = k1 k2

# Definition of the sinks (destinations)
# Describe the first sink, Solr sink k1, which stores manager's data only;
# it is associated with channel c1

a1.sinks.k1.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
a1.sinks.k1.channel = c1
a1.sinks.k1.morphlineFile = k1.conf
a1.sinks.k1.morphlineId = pcg
a1.sinks.k1.isProductionMode = true
a1.sinks.k1.batchSize = 1

# Describe the second sink, HDFS sink k2, which stores developer’s data only;
# it is associated with channel c2

a1.sinks.k2.type = hdfs
a1.sinks.k2.channel = c2
a1.sinks.k2.hdfs.path = hdfs://pcg/pcgs_dt_evnto
a1.sinks.k2.hdfs.rollInterval = 0
a1.sinks.k2.hdfs.rollSize = 1073741824
a1.sinks.k2.hdfs.rollCount = 0
a1.sinks.k2.hdfs.idleTimeout = 28800
a1.sinks.k2.hdfs.kerberosPrincipal = ingest@CORP.CORP
a1.sinks.k2.hdfs.kerberosKeytab = /home/ingest/ingest.keytab

# Bind the source and the sinks to the channels
# a1.sources.spoolDirectory.channels = c1 c2

a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2

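As a sanity check on the bindings above, a small script along the following lines could confirm that every channel named by a source or a sink is actually declared for the agent. This is only a sketch: `parse_properties`, `check_wiring`, and the inline `SAMPLE` are illustrative names that are not part of Flume, and the parser handles only plain `key = value` lines, not the full Java properties syntax.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of problems found in the source/sink-to-channel bindings."""
    declared = set(props.get(f"{agent}.channels", "").split())
    problems = []

    # A source may feed several channels (space-separated list).
    for source in props.get(f"{agent}.sources", "").split():
        used = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not used:
            problems.append(f"source {source} has no channels")
        for channel in used:
            if channel not in declared:
                problems.append(f"source {source} uses undeclared channel {channel}")

    # A sink drains exactly one channel.
    for sink in props.get(f"{agent}.sinks", "").split():
        channel = props.get(f"{agent}.sinks.{sink}.channel")
        if channel is None:
            problems.append(f"sink {sink} has no channel")
        elif channel not in declared:
            problems.append(f"sink {sink} uses undeclared channel {channel}")

    return problems


# Only the wiring-related keys from the configuration above.
SAMPLE = """
a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2
a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
"""

if __name__ == "__main__":
    issues = check_wiring(parse_properties(SAMPLE), "a1")
    print(issues or "wiring looks consistent")
```

A check like this only verifies that the names line up; it says nothing about whether events are dropped at runtime.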

I would appreciate your feedback on this question.

Best Regards
PEHC



