phoenix-user mailing list archives

From ivanybma@gmail.com <ivany...@gmail.com>
Subject Re: sqlline thin client upsert successfully, but got timeout problem when try to connect to the phoenix again
Date Tue, 10 Apr 2018 03:57:04 GMT
Hi, sorry for the late reply.
What I shared earlier was only part of my hbase-site.xml.

The full contents are below:
************************************hbase/conf/hbase-site.xml**************************
<configuration>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://broker.xxx-xxx.local:9000/hbase</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>broker.xxx-xxx.local</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>zookeeper.znode.parent</name>
  <value>/hbase</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>broker.xxx-xxx.local</value>
</property>
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
  <name>phoenix.transactions.enabled</name>
  <value>true</value>
</property>
<property>
  <name>data.tx.snapshot.dir</name>
  <value>/tmp/tephra/snapshots</value>
</property>
<property>
  <name>data.tx.timeout</name>
  <value>120</value>
</property>
<property>
  <name>phoenix.query.timeoutMs</name>
  <value>2800000</value>
</property>
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>2200000</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>2200000</value>
</property>
<property>
  <name>hbase.client.scanner.caching</name>
  <value>2000</value>
</property>
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>2200000</value>
</property>
</configuration>

*************************client:  phoenix/bin/hbase-site.xml*************************
<configuration>
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
  <name>phoenix.transactions.enabled</name>
  <value>true</value>
</property>
<property>
  <name>data.tx.snapshot.dir</name>
  <value>/tmp/tephra/snapshots</value>
</property>
<property>
  <name>data.tx.timeout</name>
  <value>120</value>
</property>
<property>
  <name>phoenix.query.timeoutMs</name>
  <value>2800000</value>
</property>
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>2200000</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>2200000</value>
</property>
<property>
  <name>hbase.client.scanner.caching</name>
  <value>2000</value>
</property>
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>2200000</value>
</property>
</configuration>
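
For reference, here is a minimal sketch of the server-side properties that the two suggestions quoted below (the secondary indexing requirements and the handler count) appear to point at. This is an assumption about what may be missing, not a confirmed fix. The two factory-class properties are the ones the secondary indexing page describes, alongside the WAL codec, for giving index and metadata RPCs higher priority; whether they are needed depends on the Phoenix and HBase versions in use, so check them against that page. hbase.regionserver.handler.count is the standard HBase handler setting, and the value 60 is purely illustrative.

```xml
<!-- Sketch only: merge into the existing <configuration> element of
     hbase-site.xml on each region server. Values are assumptions. -->
<property>
  <!-- Routes index updates to higher-priority handlers. -->
  <name>hbase.region.server.rpc.scheduler.factory.class</name>
  <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
</property>
<property>
  <!-- Tags index and metadata RPCs with the matching priority. -->
  <name>hbase.rpc.controllerfactory.class</name>
  <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
</property>
<property>
  <!-- Number of RPC handler threads per region server; 60 is an
       illustrative value, not a recommendation. -->
  <name>hbase.regionserver.handler.count</name>
  <value>60</value>
</property>
```

Changes to hbase-site.xml on the server side only take effect after the region servers are restarted.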


On 2018/04/09 18:01:14, Josh Elser <elserj@apache.org> wrote: 
> The hbase-site.xml elements you shared earlier, were those your entire 
> hbase-site contents or just part of it?
> 
> Make sure you have the required properties set as described on 
> https://phoenix.apache.org/secondary_indexing.html for your indexes. If 
> you're still seeing problems, you may need to increase the number of 
> handlers you configured HBase to use.
> 
> While in the stuck state, you may benefit from getting a thread-dump or 
> two from the client and your regionserver(s). This would help in 
> figuring out exactly where things are stuck (like the DEBUG logs would do).
> 
> On 4/9/18 1:30 PM, ivanybma@gmail.com wrote:
> > Thanks for your suggestion. I found something interesting, though I am not sure whether it is a potential cause: the indexes created on my tables.
> > I created a lot of indexes. After I removed all of them, things seemed to go better (no more hanging like that). So I suspect there is some incompatibility or other issue in the way I set up my cluster.
> > 
> > Some special options I used to create the table:
> > )c.DATA_BLOCK_ENCODING='FAST_DIFF', SALT_BUCKETS=3, COMPRESSION='GZ',TRANSACTIONAL=true;
> > and I created some indexes like this:
> > CREATE INDEX testing_IDX_2  ON xxx.xxx (field1, field2) INCLUDE (field3, field4)
> > 
> > 
> > 
> > On 2018/04/09 17:04:03, Josh Elser <elserj@apache.org> wrote:
> >> Have you looked at DEBUG logging on the client and server (HBase) side?
> >>
> >> The "Call exception" log messages imply that the client is repeatedly
> >> trying to issue an RPC to a RegionServer and failing. This should be
> >> where you focus your attention. It may be something trivial to fix
> >> related to configuration/security setup.
> >>
> >> On 4/8/18 2:04 AM, ivanybma@gmail.com wrote:
> >>> Hi, I ran into the tricky problem below.
> >>> Situation:
> >>> I successfully did an upsert into multiple tables with transactions enabled (and there are many indexes created on these tables).
> >>> Problem:
> >>> After the first upsert completed successfully, I tried to run the same upsert a 2nd, 3rd, ... time. Sometimes the 2nd works, and then the 3rd upsert gets a timeout exception. At that point the whole of Phoenix seems to hang and keeps retrying. I tried stopping the whole HBase cluster, including the Phoenix queryserver and Tephra, and restarting it; then when I tried to connect with sqlline.py, it hung again.
> >>>
> >>> hbase-site.xml setting:
> >>>     <property>
> >>>        <name>hbase.regionserver.wal.codec</name>
> >>>        <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
> >>> </property>
> >>>     <property>
> >>>     <name>phoenix.transactions.enabled</name>
> >>>     <value>true</value>
> >>> </property>
> >>> <property>
> >>>     <name>data.tx.snapshot.dir</name>
> >>>     <value>/tmp/tephra/snapshots</value>
> >>> </property>
> >>>     <property>
> >>>     <name>data.tx.timeout</name>
> >>>     <value>120</value>
> >>> </property>
> >>> <property>
> >>>     <name>phoenix.query.timeoutMs</name>
> >>>     <value>1800000</value>
> >>> </property>
> >>> <property>
> >>>     <name>hbase.regionserver.lease.period</name>
> >>>     <value>1200000</value>
> >>> </property>
> >>> <property>
> >>>     <name>hbase.rpc.timeout</name>
> >>>     <value>1200000</value>
> >>> </property>
> >>> <property>
> >>>     <name>hbase.client.scanner.caching</name>
> >>>     <value>1000</value>
> >>> </property>
> >>> <property>
> >>>     <name>hbase.client.scanner.timeout.period</name>
> >>>     <value>1200000</value>
> >>> </property>
> >>>
> >>>
> >>>
> >>> Below is some queryserver log:
> >>> 18/04/08 05:47:12 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=xxxx.xxxx.local:2181 sessionTimeout=90000 watcher=org.apache.tephra.zookeeper.TephraZKClientService$5@6700104f
> >>> 18/04/08 05:47:12 INFO zookeeper.ClientCnxn: Opening socket connection to server xxxx.xxxx.local/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
> >>> 18/04/08 05:47:12 INFO zookeeper.ClientCnxn: Socket connection established to xxxx.xxxx.local/127.0.0.1:2181, initiating session
> >>> 18/04/08 05:47:12 INFO zookeeper.ClientCnxn: Session establishment complete on server xxxx.xxxx.local/127.0.0.1:2181, sessionid = 0x162a3c72c9c0012, negotiated timeout = 90000
> >>> 18/04/08 05:57:39 INFO client.RpcRetryingCaller: Call exception, tries=10, retries=35, started=38310 ms ago, cancelled=false, msg=row 'SYSTEM.CATALOG,xxxLOAD_*N**_DIM,99999999999999' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=xxxx.xxxx.local,16201,1523166165622, seqNum=0
> >>> 18/04/08 05:57:49 INFO client.RpcRetryingCaller: Call exception, tries=11, retries=35, started=48335 ms ago, cancelled=false, msg=row 'SYSTEM.CATALOG,xxxxxLOAD_*N**_DIM,99999999999999' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=xxx.xxx.local,16201,1523166165622, seqNum=0
> >>>
> >>
> 
