phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Socket timeout while counting number of rows of a table
Date Wed, 15 Apr 2015 15:13:36 GMT
Hi Pernollet,
What kind of indexing are you using and how big are your tables and
cluster? Have you tried our new MR-based index build? It's not released
yet, but will be in 4.4. You can try it through the 4.x-HBase-0.98 branch.
Thanks,
James

On Wednesday, April 15, 2015, PERNOLLET Martin <
martin.pernollet-ext@sgcib.com> wrote:

> I ended up setting two parameters to 7,200,000 ms (2 hours):
> hbase.regionserver.lease.period and phoenix.query.timeoutMs.
>
>
>
> The first avoids the SocketTimeoutException. The second avoids the
> error (state=08000,code=101) while indexing a column.
>
>
>
> I am fine with that for the moment, but when the DB grows, I assume
> some requests will take longer than 2 hours, and I see no other way to deal
> with queries that need 2 hours than setting the timeouts and lease period to
> 2 hours. Let me know if you have a better way to do this.
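
As a rough sketch, the two settings described above might look like this in hbase-site.xml. The property names and the 7200000 value come from the message. Where each property belongs is an assumption based on advice elsewhere in this thread (phoenix.query.timeoutMs in the client-side file, the lease period on the region servers), so in a real deployment the two entries may live in separate files.

```xml
<!-- Sketch only: values taken from the message above (7200000 ms = 2 hours). -->
<configuration>
  <!-- Assumption: belongs in the region servers' hbase-site.xml -->
  <property>
    <name>hbase.regionserver.lease.period</name>
    <value>7200000</value>
  </property>
  <!-- Assumption: belongs in the client-side hbase-site.xml read by Phoenix -->
  <property>
    <name>phoenix.query.timeoutMs</name>
    <value>7200000</value>
  </property>
</configuration>
```
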
>
>
>
> Regards,
>
> *From:* Billy Watson [mailto:williamrwatson@gmail.com]
> *Sent:* Friday 10 April 2015 18:27
> *To:* user@phoenix.apache.org
> *Subject:* Re: Socket timeout while counting number of rows of a table
>
>
>
> Really long timeouts are bad b/c if you're experiencing a long-running
> process that would normally hit a timeout (i.e. something is wrong on the
> server or with the application) then it would take you much longer to hit
> your timeout.
>
>
>
> In other words, setting the long timeout won't fix anything, it'll just
> take way longer before you're alerted to the fact that something is wrong.
>
>
>
> If long timeouts weren't bad, then everyone would just set their software
> to always wait forever.
>
>
>    William Watson
> Software Engineer
>
> (904) 705-7056 PCS
>
>
>
> On Fri, Apr 10, 2015 at 12:13 PM, PERNOLLET Martin <
> martin.pernollet-ext@sgcib.com> wrote:
>
> Thanks!
>
>
>
> Concerning the timeout, I thought “the higher the better”. What would be bad
> about too long a timeout?
>
>
>
> From Ambari I can see the following GC logs. Could having the eden space 98%
> used be a cause of heavy GC? The latest file shows GC activity after I
> interrupted the request.
>
>
>
> File                  Size          Date
> gc.log-201504091622   630 bytes     Apr 9, 2015 4:34:07 PM
> gc.log-201504091634   630 bytes     Apr 9, 2015 6:11:25 PM
> gc.log-201504091811   630 bytes     Apr 9, 2015 6:19:14 PM
> gc.log-201504091819   52776 bytes   Apr 10, 2015 5:44:16 PM
>
>
>
> With the following content:
>
> gc.log-201504091622
> <http://cox.fr.world.socgen:60010/logs/gc.log-201504091622>
> 630 bytes, Apr 9, 2015 4:34:07 PM
>
> Heap
>  par new generation   total 307200K, used 267739K [0x00000000bc600000, 0x00000000d1350000, 0x00000000d1350000)
>   eden space 273088K,  98% used [0x00000000bc600000, 0x00000000ccb76cc0, 0x00000000cd0b0000)
>   from space 34112K,   0% used [0x00000000cd0b0000, 0x00000000cd0b0000, 0x00000000cf200000)
>   to   space 34112K,   0% used [0x00000000cf200000, 0x00000000cf200000, 0x00000000d1350000)
>  concurrent mark-sweep generation total 682688K, used 0K [0x00000000d1350000, 0x00000000fae00000, 0x00000000fae00000)
>  concurrent-mark-sweep perm gen total 21248K, used 20241K [0x00000000fae00000, 0x00000000fc2c0000, 0x0000000100000000)
>
>
>
> gc.log-201504091634
> <http://cox.fr.world.socgen:60010/logs/gc.log-201504091634>
> 630 bytes, Apr 9, 2015 6:11:25 PM
>
> Heap
>  par new generation   total 307200K, used 267702K [0x00000000bc600000, 0x00000000d1350000, 0x00000000d1350000)
>   eden space 273088K,  98% used [0x00000000bc600000, 0x00000000ccb6dab8, 0x00000000cd0b0000)
>   from space 34112K,   0% used [0x00000000cd0b0000, 0x00000000cd0b0000, 0x00000000cf200000)
>   to   space 34112K,   0% used [0x00000000cf200000, 0x00000000cf200000, 0x00000000d1350000)
>  concurrent mark-sweep generation total 682688K, used 0K [0x00000000d1350000, 0x00000000fae00000, 0x00000000fae00000)
>  concurrent-mark-sweep perm gen total 21248K, used 20240K [0x00000000fae00000, 0x00000000fc2c0000, 0x0000000100000000)
>
>
>
> gc.log-201504091811
> <http://cox.fr.world.socgen:60010/logs/gc.log-201504091811>
> 630 bytes, Apr 9, 2015 6:19:14 PM
>
> Heap
>  par new generation   total 307200K, used 267702K [0x00000000bc600000, 0x00000000d1350000, 0x00000000d1350000)
>   eden space 273088K,  98% used [0x00000000bc600000, 0x00000000ccb6d960, 0x00000000cd0b0000)
>   from space 34112K,   0% used [0x00000000cd0b0000, 0x00000000cd0b0000, 0x00000000cf200000)
>   to   space 34112K,   0% used [0x00000000cf200000, 0x00000000cf200000, 0x00000000d1350000)
>  concurrent mark-sweep generation total 682688K, used 0K [0x00000000d1350000, 0x00000000fae00000, 0x00000000fae00000)
>  concurrent-mark-sweep perm gen total 21248K, used 20240K [0x00000000fae00000, 0x00000000fc2c0000, 0x0000000100000000)
>
>
>
>
>
> gc.log-201504091819
> <http://cox.fr.world.socgen:60010/logs/gc.log-201504091819>
> 52776 bytes, Apr 10, 2015 5:44:16 PM
>
> 2015-04-09T18:19:18.858+0200: 3.320: [GC2015-04-09T18:19:18.858+0200: 3.320: [ParNew: 550528K->25330K(619328K), 0.0244110 secs] 550528K->25330K(1995584K), 0.0245570 secs] [Times: user=0.18 sys=0.04, real=0.03 secs]
> 2015-04-09T18:20:31.420+0200: 75.882: [GC2015-04-09T18:20:31.420+0200: 75.882: [ParNew: 575858K->17763K(619328K), 0.0115260 secs] 575858K->17763K(1995584K), 0.0117040 secs] [Times: user=0.10 sys=0.02, real=0.01 secs]
> 2015-04-09T18:24:45.356+0200: 329.818: [GC2015-04-09T18:24:45.356+0200: 329.818: [ParNew: 568291K->13634K(619328K), 0.0075260 secs] 568291K->13634K(1995584K), 0.0076920 secs] [Times: user=0.09 sys=0.01, real=0.01 secs]
> 2015-04-09T18:29:26.740+0200: 611.202: [GC2015-04-09T18:29:26.740+0200: 611.202: [ParNew: 564162K->13405K(619328K), 0.0058050 secs] 564162K->13405K(1995584K), 0.0059740 secs] [Times: user=0.08 sys=0.00, real=0.00 secs]
> 2015-04-09T18:34:35.852+0200: 920.314: [GC2015-04-09T18:34:35.852+0200: 920.314: [ParNew: 563933K->12630K(619328K), 0.0067330 secs] 563933K->12630K(1995584K), 0.0069170 secs] [Times: user=0.08 sys=0.00, real=0.01 secs]
> 2015-04-09T18:39:50.501+0200: 1234.963: [GC2015-04-09T18:39:50.501+0200: 1234.964: [ParNew: 563158K->13940K(619328K), 0.0062440 secs] 563158K->13940K(1995584K), 0.0064730 secs] [Times: user=0.08 sys=0.00, real=0.01 secs]
> 2015-04-09T18:44:58.858+0200: 1543.320: [GC2015-04-09T18:44:58.858+0200: 1543.320: [ParNew: 564468K->11237K(619328K), 0.0383070 secs] 564468K->17271K(1995584K), 0.0384820 secs] [Times: user=0.19 sys=0.01, real=0.04 secs]
> 2015-04-09T18:50:14.625+0200: 1859.087: [GC2015-04-09T18:50:14.625+0200: 1859.087: [ParNew: 561765K->3522K(619328K), 0.0082090 secs] 567799K->12213K(1995584K), 0.0084010 secs] [Times: user=0.08 sys=0.00, real=0.01 secs]
> 2015-04-09T18:55:31.114+0200: 2175.576: [GC2015-04-09T18:55:31.114+0200: 2175.576: [ParNew: 554050K->2478K(619328K), 0.0119800 secs] 562741K->11201K(1995584K), 0.0124310 secs] [Times: user=0.05 sys=0.01, real=0.01 secs]
> 2015-04-09T19:00:51.341+0200: 2495.804: [GC2015-04-09T19:00:51.341+0200: 2495.804: [ParNew: 553006K->2444K(619328K), 0.0031950 secs] 561729K->11182K(1995584K), 0.0033530 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
> 2015-04-09T19:06:13.198+0200: 2817.660: [GC2015-04-09T19:06:13.198+0200: 2817.660: [ParNew: 552972K->2276K(619328K), 0.0043040 secs] 561710K->11018K(1995584K), 0.0044850 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
> 2015-04-09T19:11:33.636+0200: 3138.098: [GC2015-04-09T19:11:33.636+0200: 3138.098: [ParNew: 552804K->2516K(619328K), 0.0034310 secs] 561546K->11260K(1995584K), 0.0035640 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
> 2015-04-09T19:16:54.653+0200: 3459.115: [GC2015-04-09T19:16:54.653+0200: 3459.115: [ParNew: 553044K->2477K(619328K), 0.0032890 secs] 561788K->11235K(1995584K), 0.0034050 secs] [Times: user=0.03 sys=0.01, real=0.01 secs]
> 2015-04-09T19:22:13.734+0200: 3778.197: [GC2015-04-09T19:22:13.734+0200: 3778.197: [ParNew: 553005K->2652K(619328K), 0.0039630 secs] 561763K->11417K(1995584K), 0.0040940 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
>
> ...
>
> *From:* Vladimir Rodionov [mailto:vladrodionov@gmail.com]
> *Sent:* Thursday 9 April 2015 21:04
> *To:* user@phoenix.apache.org
> *Subject:* Re: Socket timeout while counting number of rows of a table
>
>
>
> >> 1) Update hbase.rpc.timeout : 1200000 in client side hbase-site.xml
>
>
>
> Bad idea. 20 min of timeout?
>
>
>
> Check RS log files for unusual GC activity (always run hbase with GC stats
> on). That is probably what is going on in there.
>
>
>
> On Thu, Apr 9, 2015 at 11:27 AM, Samarth Jain <samarth.jain@gmail.com> wrote:
>
> Looking at the exception java.lang.RuntimeException:
> org.apache.phoenix.exception.PhoenixIOException:
> org.apache.phoenix.exception.PhoenixIOException: Failed after attempts=36,
> exceptions:
>
> Thu Apr 09 16:49:33 CEST 2015, null,
> java.net.SocketTimeoutException: callTimeout=60000, callDuration=62366
>
>
>
> it looks like it is *not* coming from the Phoenix phoenix.query.timeoutMs
> setting. You would need to do two things:
>
>
>
> 1) Update hbase.rpc.timeout : 1200000 in client side hbase-site.xml
>
> 2) Make sure the class path you are using on the client side to connect to
> the hbase cluster is picking up hbase-site.xml. Otherwise your overrides
> won't work.
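
A minimal sketch of step 1, assuming the property goes in the hbase-site.xml that sits on the client's class path. The property name and value are the ones quoted above. 1200000 ms is 20 minutes, which another reply in this thread considers too long, so treat the value as illustrative rather than recommended.

```xml
<!-- Client-side hbase-site.xml (sketch): raises the HBase RPC timeout. -->
<configuration>
  <property>
    <name>hbase.rpc.timeout</name>
    <!-- 1200000 ms = 20 minutes, the value suggested in this thread -->
    <value>1200000</value>
  </property>
</configuration>
```
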
>
>
>
>
>
>
>
> On Thu, Apr 9, 2015 at 11:13 AM, Thomas D'Silva <tdsilva@salesforce.com> wrote:
>
> The phoenix.query.timeoutMs property should be set in the
> hbase-site.xml of the client (in the phoenix/bin directory), not the
> server hbase-site.xml. See
> https://github.com/forcedotcom/phoenix/wiki/Tuning . Did you try just
> setting it in the client-side config before starting sqlline and
> running the query?
>
> Thanks,
> Thomas
>
>
> On Thu, Apr 9, 2015 at 9:29 AM, PERNOLLET Martin
> <martin.pernollet-ext@sgcib.com> wrote:
> > I should mention that I also tried changing these properties on the HBase side:
> >
> >
> >
> > hbase.regionserver.lease.period : 120000
> >
> > hbase.rpc.timeout : 1200000
> >
> >
> >
> > I am running on Hortonworks 2.2.0
> >
> > Phoenix 4.2.0
> >
> > HBase 0.98.4
> >
> >
> >
> > From: PERNOLLET Martin (EXT) ItecCttDir
> > Sent: Thursday 9 April 2015 17:52
> > To: 'user@phoenix.apache.org'
> > Subject: Socket timeout while counting number of rows of a table
> >
> >
> >
> > Hi,
> >
> >
> >
> > When I ask Phoenix to count the rows of an HBase table (select
> > count("UUID") from "bulk_1month"), it fails after one minute:
> >
> >
> >
> > java.lang.RuntimeException: org.apache.phoenix.exception.PhoenixIOException:
> > org.apache.phoenix.exception.PhoenixIOException: Failed after attempts=36,
> > exceptions:
> >
> > Thu Apr 09 16:49:33 CEST 2015, null, java.net.SocketTimeoutException:
> > callTimeout=60000, callDuration=62366: row '' on table 'bulk_1month' at
> > region=bulk_1month,,1428582098717.2b2c2f1b5eab43e15b5789c2aa0dfc80.,
> > hostname=reid,60020,1428590222546, seqNum=37
> >
> >
> >
> >         at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)
> >         at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2074)
> >         at sqlline.SqlLine.print(SqlLine.java:1735)
> >         at sqlline.SqlLine$Commands.execute(SqlLine.java:3683)
> >         at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
> >         at sqlline.SqlLine.dispatch(SqlLine.java:821)
> >         at sqlline.SqlLine.begin(SqlLine.java:699)
> >         at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
> >         at sqlline.SqlLine.main(SqlLine.java:424)
> >
> >
> >
> > A post (https://github.com/forcedotcom/phoenix/issues/730) suggested editing
> > the timeout values, so I added the following properties to the HBase
> > configuration via Ambari.
> >
> >
> >
> >     <property>
> >       <name>phoenix.query.keepAliveMs</name>
> >       <!-- changed timeout from 1 min to 10 min -->
> >       <value>600000</value>
> >     </property>
> >
> >     <property>
> >       <name>phoenix.query.timeoutMs</name>
> >       <!-- changed timeout from 60 sec to 2h -->
> >       <value>7200000</value>
> >     </property>
> >
> >
> >
> > Once HBase had restarted, I copied the updated HBase conf to the Phoenix bin/
> > directory:
> >
> >
> >
> > cp /etc/hbase/conf/hbase-site.xml  /usr/hdp/2.2.0.0-2041/phoenix/bin
> >
> >
> >
> > It made no difference to the actual timeout.
> >
> >
> >
> > Did I miss a property, or did I copy the HBase settings incorrectly?
> >
> >
> >
> > Thanks for your help!
> >
> >
> >
> >
> >
> > *************************************************************************
> > This message and any attachments (the "message") are confidential, intended
> > solely for the addressee(s), and may contain legally privileged information.
> > Any unauthorised use or dissemination is prohibited. E-mails are susceptible
> > to alteration.
> > Neither SOCIETE GENERALE nor any of its subsidiaries or affiliates shall be
> > liable for the message if altered, changed or falsified.
> > Please visit http://swapdisclosure.sgcib.com for important information with
> > respect to derivative products.
> > *************************************************************************
