flume-user mailing list archives

From Saritha Ravi <Saritha.R...@chacha.com>
Subject Re: Flume-HBase
Date Thu, 06 Oct 2011 21:05:48 GMT
Thanks, Mingjie.

On 10/6/11 5:03 PM, "Mingjie Lai" <mjlai09@gmail.com> wrote:

> > You mentioned the collector needs to have direct connection to all
> > region servers and master..
> > Could you help me how I can do that.
>
>If you're deploying the nodes at ec2, you don't really need to worry
>about it. The ec2 default security group should allow full connection
>between all nodes which you bring up.
>
>The thing I was talking about is to avoid firewall/vlan blocking ports
>between your flume collector and hbase region servers (which occurs in
>our data center for cluster isolation). Again, I don't think you need to
>worry about it right now.
>
> > I already have hbase and flume master as well as flume collector, all on
> > different machines. I need to link all these together but don't know how...
>
>Here is a simple step-by-step example.
>
>1. flume collector: make sure it can access hbase
>
>2. create an hbase table; here are some shell scripts:
>$ cat > /tmp/test/list <<EOF
> > list
> > exit
> > EOF
>$ hbase shell /tmp/test/list
>TABLE 
>
>0 row(s) in 0.5820 seconds
>
>$ cat > /tmp/test/table <<EOF
> > create 't1', 'c1','c2'
> > exit
> > EOF
>
>$ hbase shell /tmp/test/table
>0 row(s) in 1.7270 seconds
>
>$ hbase shell /tmp/test/list
>TABLE 
>
>t1 
>
>1 row(s) in 0.4770 seconds
>
>3. copy hbase-site.xml to /usr/lib/flume/conf/ (assuming you're using CDH
>flume)
>
>4. copy hbase.jar to /usr/lib/flume/lib/
>
>5. try flume collector to write to hbase
>$ /usr/lib/flume/bin/flume node_nowatch -1 -s -n n1 \
>-c 'n1: tail("/tmp/test/list") | hbase ("t1", "%s", "c1", "", "%S", "c2", "", "%{body}");'
>
>6. scan hbase
>$ cat > /tmp/test/scan <<EOF
> > scan 't1'
> > exit
> > EOF
>$ bin/hbase shell /tmp/test/scan
>ROW                           COLUMN+CELL
>  1317934401                   column=c1:, timestamp=1317934402446, value=21
>  1317934401                   column=c2:, timestamp=1317934402446, value=list
>  1317934402                   column=c1:, timestamp=1317934402451, value=22
>  1317934402                   column=c2:, timestamp=1317934402451, value=exit
>2 row(s) in 0.5540 seconds
>
>After running it, you can be sure your collector can talk to hbase.
>You can then use your master to configure the collector. Please follow
>Otis's link for details.
>
>
>On 10/06/2011 06:34 AM, Saritha Ravi wrote:
>> Hi Mingjie,
>> Good Morning...
>> Thanks  for your response.
>> I would like to use the default hbase sink I downloaded from
>> https://github.com/cloudera/flume/tree/master/plugins/ (placed the
>> hbase-sink.jar in flume-home/lib and updated the flume-site.xml)
>> You mentioned the collector needs to have direct connection to all
>> region servers and master..
>> Could you help me how I can do that.
>>
>> I already have hbase and flume master as well as flume collector, all on
>> different machines. I need to link all these together but don't know how...
>>
>> Thanks,
>> Saritha.
>>
>> On 10/6/11 2:37 AM, "Mingjie Lai"<mjlai09@gmail.com>  wrote:
>>
>>>> Should the flume collector and hbase master be in the same cluster?
>>>
>>> In your case, the flume collector will be writing data to hbase as a
>>> regular hbase client. So it needs to access hbase thru either, 1) hbase
>>> java api, or 2) hbase rest/thrift gateway. If you want to use the
>>> default hbase sink (which uses java api), the collector needs to have
>>> direct connection to all region servers and master.
>>>
>>> On the other hand, you can also build your own new hbase REST/thrift
>>> sink. And in this case, the collector only needs to talk to the REST
>>> gateway.
>>>
>>>> Can anyone suggest
>>>> the basic steps for configuring these two in the ec2 cloud?
>>>
>>> I don't quite understand your question. It sounds like you already have
>>> hbase, so you can just add some extra machines for flume nodes,
>>> master, etc.
>>>
>>> -mingjie
>>>
>>> On 10/05/2011 07:48 PM, Saritha Ravi wrote:
>>>> Hi All,
>>>>
>>>> I need to configure Flume with hbase in the cloud. Could anyone help me
>>>> with this? Is there any better documentation?
>>>> Should the flume collector and hbase master be in the same cluster?
>>>> I was able to configure hbase (master, Zookeeper, region server using
>>>> WHIRR) and flume (Master, collector) from CDH3. Can anyone suggest
>>>> the basic steps for configuring these two in the ec2 cloud?
>>>>
>>>> Thanks,
>>>> Saritha.
>>>>
>>>
>>
>
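The connectivity requirement discussed in the quoted thread (the default hbase sink uses the java api, so the collector must reach the master and every region server directly) can be spot-checked from the collector host. The following is only a minimal sketch, assuming bash with /dev/tcp support and the coreutils `timeout` command. The function names `check_port` and `report` are hypothetical, the host names in the usage comment are placeholders, and the ports (2181, 60000, 60020) are assumed defaults for ZooKeeper, the hbase master and region servers in hbase releases of this period; your hbase-site.xml may override them.

```shell
#!/usr/bin/env bash
# check_port HOST PORT
# Hypothetical helper: succeeds (exit 0) if a TCP connection to HOST:PORT
# opens within 3 seconds, and fails otherwise.
check_port() {
  # Host and port are passed as positional arguments to the inner shell
  # so they are never re-parsed as code.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# report HOST PORT
# Prints one status line for the given endpoint.
report() {
  if check_port "$1" "$2"; then
    echo "ok      $1:$2"
  else
    echo "BLOCKED $1:$2"
  fi
}

# Example (placeholder hosts, assumed default ports):
#   report zk-host 2181          # ZooKeeper client port
#   report hbase-master 60000    # hbase master
#   report regionserver1 60020   # repeat for every region server
```

A "BLOCKED" line only says that no TCP connection could be opened within the timeout; it does not distinguish a firewall/vlan rule from a service that is simply not running on that host.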

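The quoted step-by-step example repeats the same pattern three times: write a few hbase shell commands to a file, end the file with `exit`, and pass it to `hbase shell`. As a rough sketch, that pattern could be wrapped in a small helper. The function name `write_hbase_script` is hypothetical and not part of hbase or flume; the paths and table definition in the usage comment are the ones from the thread.

```shell
#!/usr/bin/env bash
# write_hbase_script FILE CMD...
# Hypothetical helper: writes each CMD on its own line to FILE and appends
# "exit" so that `hbase shell FILE` terminates after running the commands.
write_hbase_script() {
  local out="$1"
  shift
  # The thread assumes /tmp/test already exists; create the directory if not.
  mkdir -p "$(dirname "$out")"
  {
    local cmd
    for cmd in "$@"; do
      printf '%s\n' "$cmd"
    done
    echo "exit"
  } > "$out"
}

# Example, mirroring the scripts in the quoted steps:
#   write_hbase_script /tmp/test/table "create 't1', 'c1','c2'"
#   write_hbase_script /tmp/test/scan  "scan 't1'"
#   hbase shell /tmp/test/table && hbase shell /tmp/test/scan
```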

