phoenix-user mailing list archives

From: <marks1900-pos...@yahoo.com.au>
Subject: Re: Getting swamped with Phoenix *.tmp files on SELECT.
Date: Mon, 18 Apr 2016 20:43:47 GMT

I have narrowed this issue down to the SELECT statement below.  Once I have iterated through
the query results of this statement, I do ensure that close() is called on my ResultSet,
Statement and Connection.

For now, I am going with the suggested work-around and will implement something like tmpwatch
as a Java scheduled service.
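
For reference, a tmpwatch-style cleanup as a Java scheduled service might look roughly like the sketch below. The class name TmpFileSweeper, the methods sweep and start, and the age/period values are assumptions made for illustration; none of them come from Phoenix. Note that it removes every stale *.tmp file in the given directory, not only the ones Phoenix created, so the directory and age threshold should be chosen with care.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Phoenix: periodically deletes stale *.tmp files.
public class TmpFileSweeper {

    /** Deletes *.tmp files in dir last modified more than maxAge ago; returns how many were deleted. */
    static int sweep(Path dir, Duration maxAge) throws IOException {
        Instant cutoff = Instant.now().minus(maxAge);
        int deleted = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.tmp")) {
            for (Path file : stream) {
                if (Files.isRegularFile(file)
                        && Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)) {
                    try {
                        Files.delete(file);
                        deleted++;
                    } catch (IOException e) {
                        // The file may still be open (deletion of open files fails on Windows);
                        // leave it for the next pass.
                    }
                }
            }
        }
        return deleted;
    }

    /** Runs sweep(dir, maxAge) immediately and then again periodMinutes after each pass finishes. */
    static ScheduledExecutorService start(Path dir, Duration maxAge, long periodMinutes) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                sweep(dir, maxAge);
            } catch (IOException | RuntimeException e) {
                // Swallow and log so that one failed pass does not cancel the schedule.
                e.printStackTrace();
            }
        }, 0, periodMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

A caller could start it with something like TmpFileSweeper.start(Paths.get(System.getProperty("java.io.tmpdir")), Duration.ofHours(1), 10) at application startup and call shutdown() on the returned scheduler at undeploy. The one-hour age is only a guess at a value safely longer than any running query.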

Also, I think I found someone else who seems to be having this style of issue in Phoenix:
 https://issues.apache.org/jira/browse/PHOENIX-1395

--
String sql = "SELECT TR.ID"
    + "  ,TR.CLIENT_ID"
    + "  ,TR.BRAND_ID"
    + "  ,TR.SITE_ID"
    + "  ,TR.EMAIL"
    + "  ,COUNT(TS2.ID) + MAX(TR.repeatBrandShortVisit) AS repeatBrandShortVisit"
    + "  ,SUM(CASE WHEN TS2.SESSION_TYPE = 1 THEN 1 ELSE 0 END) + MAX(TR.repeatBrandLongVisit) AS repeatBrandLongVisit"
    + "  ,SUM(CASE WHEN TS2.SITE_ID = TR.SITE_ID THEN 1 ELSE 0 END) + MAX(TR.repeatSiteShortVisit) AS repeatSiteShortVisit"
    + "  ,SUM(CASE WHEN TS2.SITE_ID = TR.SITE_ID AND TS2.SESSION_TYPE = 1 THEN 1 ELSE 0 END) + MAX(TR.repeatSiteLongVisit) AS repeatSiteLongVisit"
    + "  FROM ("
    + "  SELECT TSE.ID"
    + "      ,TSE.CLIENT_ID"
    + "      ,TSE.BRAND_ID"
    + "      ,TSE.SITE_ID"
    + "      ,TSE.EMAIL"
    + "      ,COUNT(TS1.ID) AS repeatBrandShortVisit"
    + "      ,SUM(CASE WHEN TS1.SESSION_TYPE = 1 THEN 1 ELSE 0 END) AS repeatBrandLongVisit"
    + "      ,SUM(CASE WHEN TS1.SITE_ID = TSE.SITE_ID THEN 1 ELSE 0 END) AS repeatSiteShortVisit"
    + "      ,SUM(CASE WHEN TS1.SITE_ID = TSE.SITE_ID AND TS1.SESSION_TYPE = 1 THEN 1 ELSE 0 END) AS repeatSiteLongVisit"
    + "  FROM ("
    + "      SELECT ID"
    + "          ,CLIENT_ID"
    + "          ,BRAND_ID"
    + "          ,SITE_ID"
    + "          ,EMAIL"
    + "      FROM user.SESSION_EXPIRATION "
    + "      WHERE NEXT_CHECK <= CURRENT_TIME()"
    + "      LIMIT " + batchSize
    + "      ) AS TSE"
    + "  LEFT OUTER JOIN user.SESSION TS1"
    + "      ON TS1.CLIENT_ID = TSE.CLIENT_ID"
    + "      AND TS1.BRAND_ID = TSE.BRAND_ID"
    + "  GROUP BY TSE.ID"
    + "      ,TSE.CLIENT_ID"
    + "      ,TSE.BRAND_ID"
    + "      ,TSE.SITE_ID"
    + "      ,TSE.EMAIL"
    + "  ) AS TR"
    + "  LEFT OUTER JOIN user.SESSION TS2"
    + "      ON TS2.EMAIL = TR.EMAIL"
    + "      AND TS2.BRAND_ID = TR.BRAND_ID"
    + "  GROUP BY TR.ID"
    + "  ,TR.CLIENT_ID"
    + "  ,TR.BRAND_ID"
    + "  ,TR.SITE_ID"
    + "  ,TR.EMAIL";

--

      From: Samarth Jain <samarth@apache.org>
Cc: "user@phoenix.apache.org" <user@phoenix.apache.org>
 Sent: Monday, 18 April 2016, 15:39
 Subject: Re: Getting swamped with Phoenix *.tmp files on SELECT.
   
Marks,
FWIW, we had a problem with tmp files left over in case of failures - https://issues.apache.org/jira/browse/PHOENIX-1448.
But this has been fixed since the 4.2.1 release. To help us, can you post a sample query where
you are seeing tmp files left over? Are you sure the application is cleanly closing, in a
try-finally block, all the JDBC statements, result sets and Phoenix connections?

On Mon, Apr 18, 2016 at 8:54 AM, <Mark> wrote:

Currently I am running out of disk space as a direct result of these spool temp files
(350 GB+). Any ideas on how to address this?  These .tmp files never seem to be cleaned up
after each query.  Is there any work-around?

      From: Samarth Jain <samarth.jain@gmail.com>
 To: "user@phoenix.apache.org" <user@phoenix.apache.org> 
Sent: Friday, 15 April 2016, 17:00
 Subject: Re: Getting swamped with Phoenix *.tmp files on SELECT.
   
FWIW, with Phoenix 4.7, we no longer need to spool results on the client. Instead we rely
on pacing scanners as and when needed. To utilize the feature though, you would need to make
sure that you are using HBase versions that are at least as new as:
HBase 0.98.17 for HBase 0.98
HBase 1.0.3 for HBase 1.0
HBase 1.1.3 for HBase 1.1 and beyond

On Fri, Apr 15, 2016 at 1:51 PM, Alok Singh <alok@cloudability.com> wrote:

We ran into something similar; here is the ticket: https://issues.apache.org/jira/browse/PHOENIX-2685
The work-around that mitigated this issue for us was to lower the value of
phoenix.query.spoolThresholdBytes to 10 MB. It is counterintuitive, but, due to the way the
spooling iterator interacts with the global memory manager, it works.
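
As an illustration of that work-around, the threshold could be supplied as a client-side property along the lines of the sketch below. Only the property name phoenix.query.spoolThresholdBytes and the 10 MB value come from this thread; the class and method names are hypothetical, and passing the setting through the JDBC connection Properties (rather than, say, the client's hbase-site.xml) is an assumption about how the client is configured.

```java
import java.util.Properties;

// Hypothetical helper: builds connection properties that carry the spool threshold.
public class SpoolThresholdConfig {

    // Property name as quoted in the thread.
    static final String SPOOL_THRESHOLD_KEY = "phoenix.query.spoolThresholdBytes";

    /** Returns Properties with the spool threshold set to the given number of bytes. */
    static Properties clientProperties(long thresholdBytes) {
        Properties props = new Properties();
        props.setProperty(SPOOL_THRESHOLD_KEY, Long.toString(thresholdBytes));
        return props;
    }

    /** 10 MB expressed in bytes, the value suggested in the thread. */
    static long tenMegabytes() {
        return 10L * 1024 * 1024;
    }
}
```

The resulting object would then be handed to DriverManager.getConnection(url, props) when opening the Phoenix connection, with url being whatever JDBC URL the application already uses.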
Alok

Alok
alok@cloudability.com
On Fri, Apr 15, 2016 at 1:42 PM, <Mark> wrote:

I am using an Ambari HDP distribution of the Phoenix client
(/usr/hdp/2.3.4.0-3485/phoenix/phoenix-4.4.0.2.3.4.0-3485-client.jar), and to close database
connections I am using the standard Java JDBC try-with-resources process:
http://www.mastertheboss.com/jboss-server/jboss-datasource/using-try-with-resources-to-close-database-connections
https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html
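
For readers unfamiliar with the pattern, a minimal try-with-resources sketch using only the standard java.sql API is shown below. The class name, the method countRows, and the URL passed in are placeholders for illustration; this is not the poster's actual code.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Illustrative only: shows the closing pattern, not the application from this thread.
public class CloseResourcesExample {

    /** Runs the query, counts the rows, and closes everything that was opened here. */
    static int countRows(String jdbcUrl, String sql) throws SQLException {
        // Resources are closed automatically in reverse order of declaration
        // (ResultSet, then Statement, then Connection), even if an exception is thrown.
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        }
    }
}
```

With the Phoenix client jar on the classpath, jdbcUrl would be a Phoenix URL of the form jdbc:phoenix:<zookeeper quorum>. Without a matching driver registered, DriverManager.getConnection throws an SQLException before anything is opened.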

      From: Samarth Jain
Sent: Friday, 15 April 2016, 16:03
 Subject: Re: Getting swamped with Phoenix *.tmp files on SELECT.
  
What version of phoenix are you using? Is the application properly closing statements and
result sets?

On Friday, April 15, 2016, wrote:

I am running into an issue where a huge number of temporary files are being created in my
C:\Users\myuser\AppData\Local\Temp folder. They are around 20 MB each and never get cleaned
up.  These *.tmp files grew to around 200 GB before I stopped the server.
Example file names:
7a0967de-9dff-432b-bcfe-de30bc630add5176202498513378657.tmp
813e40e1-afa9-4847-919c-7c55f95f8a475501154042645376476.tmp
1329da43-561d-4e68-9120-56bd650a6ac98781585316402092121.tmp

Currently, I have my Phoenix Client jar deployed to Wildfly 10 as described here:  https://docs.jboss.org/author/display/TEIID/Phoenix+Data+Sources
These *.tmp files only appear when I run SELECT queries.
Any help would be appreciated.
