phoenix-user mailing list archives

From <marks1900-pos...@yahoo.com.au>
Subject Re: Getting swamped with Phoenix *.tmp files on SELECT.
Date Thu, 21 Apr 2016 18:22:13 GMT
I was unable to apply the PHOENIX-2556 patch (https://issues.apache.org/jira/secure/attachment/12780561/PHOENIX-2556.patch
from https://issues.apache.org/jira/browse/PHOENIX-2556) as suggested.  I am using a much
older version of Phoenix (4.4 rather than 4.7), so there are quite a few conflicts.

I am not sure when I will get the chance to spend more time on this.  In the meantime I have created
a JIRA issue to track this:  https://issues.apache.org/jira/browse/PHOENIX-2850
Maybe I can update these source files by hand.  Which files do you want updated, and what
changes do you want made?
https://github.com/hortonworks/phoenix-release/blob/HDP-2.4.0.0-tag/phoenix-core


      From: Samarth Jain <samarth@apache.org>
 To: "user@phoenix.apache.org" <user@phoenix.apache.org>; Maryann Xue <maryannxue@apache.org>

 Sent: Thursday, 21 April 2016, 11:54
 Subject: Re: Getting swamped with Phoenix *.tmp files on SELECT.
   
My first suggestion would be to upgrade to the latest version of Phoenix, though that does
not seem to be possible in your case. The second would be to try applying the patch in PHOENIX-2556
and see if that solves the issue for you. If that doesn't work either, please file a JIRA
and one of us will take a look at it.
- Samarth
On Wed, Apr 20, 2016 at 1:41 PM, <Mark> wrote:

So I did some debugging and found the code responsible for creating these *.tmp files.
https://github.com/hortonworks/phoenix-release/blob/HDP-2.4.0.0-tag/phoenix-core/src/main/java/org/apache/phoenix/iterate/MappedByteBufferQueue.java#L304
--org.apache.phoenix.iterate.MappedByteBufferQueue$MappedByteBufferSegmentQueue
--
private void flush(T entry) throws IOException {
    Queue<T> inMemQueue = getInMemoryQueue();
    int resultSize = sizeOf(entry);
    maxResultSize = Math.max(maxResultSize, resultSize);
    totalResultSize = hasMaxQueueSize ? maxResultSize * inMemQueue.size() : (totalResultSize + resultSize);
    if (totalResultSize >= thresholdBytes) {
        this.file = File.createTempFile(UUID.randomUUID().toString(), null);
        RandomAccessFile af = new RandomAccessFile(file, "rw");
        FileChannel fc = af.getChannel();
        int writeIndex = 0;
        mappingSize = Math.min(Math.max(maxResultSize, DEFAULT_MAPPING_SIZE), totalResultSize);
        MappedByteBuffer writeBuffer = fc.map(MapMode.READ_WRITE, writeIndex, mappingSize);
--

Is there any fix or workaround, so that these temp files don't exhaust my free disk space?
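
The excerpt above creates a temp file with a random UUID prefix once the buffered results cross a byte threshold. The following is a minimal sketch of that spill-on-threshold pattern, not Phoenix's implementation: the class `SpillBuffer` and its methods are hypothetical, and a plain `FileOutputStream` is swapped in for the `MappedByteBuffer` used in the excerpt. It is only meant to show where the `.tmp` file comes from and that whoever creates it also has to delete it on close.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.UUID;

/** Hypothetical illustration: buffers in memory, spills to a temp file past a threshold. */
public class SpillBuffer implements AutoCloseable {
    private final int thresholdBytes;
    private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private File file;            // stays null until the first spill
    private OutputStream fileOut; // stays null until the first spill

    SpillBuffer(int thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    void write(byte[] entry) throws IOException {
        if (fileOut == null && memory.size() + entry.length >= thresholdBytes) {
            // A null suffix makes createTempFile use ".tmp", in java.io.tmpdir.
            file = File.createTempFile(UUID.randomUUID().toString(), null);
            fileOut = new FileOutputStream(file);
            memory.writeTo(fileOut); // move what was buffered so far to disk
            memory.reset();
        }
        if (fileOut != null) {
            fileOut.write(entry);
        } else {
            memory.write(entry);
        }
    }

    /** Returns the spill file, or null if nothing has been spilled. */
    File spillFile() {
        return file;
    }

    @Override
    public void close() throws IOException {
        if (fileOut == null) {
            return;
        }
        try {
            fileOut.close();
        } finally {
            // Delete the spill file; fall back to deletion at JVM exit if that fails.
            if (!file.delete()) {
                file.deleteOnExit();
            }
        }
    }
}
```

If `close()` is never reached, or the delete fails, the file stays behind in the temp directory, which is the symptom described in this thread.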

      From: 
 To: "user@phoenix.apache.org" <user@phoenix.apache.org>;
 Sent: Tuesday, 19 April 2016, 12:00
 Subject: Re: Getting swamped with Phoenix *.tmp files on SELECT.
   
Would it be possible for you to upgrade to the latest version of Phoenix (4.7)? It is likely
that this bug has been fixed in the latest release. One possibly related JIRA that
was fixed in 4.7 is PHOENIX-2556, although Maryann would be the best person to comment on
this.
Maryann, would you mind taking a look?
On Mon, Apr 18, 2016 at 1:43 PM, <Mark> wrote:

I have narrowed this issue down to the select statement below.  After iterating through
the query results of this select statement, I do ensure that close is called on
my JDBC ResultSet, Statement and Connection.

For now, I am going with the suggested workaround and will implement something like tmpwatch as a
Java scheduled service.
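
A tmpwatch-like cleanup could be sketched roughly as follows. This is an assumption-laden illustration, not something posted in the thread: the class `TmpFileSweeper`, its method names, and the age and period values are hypothetical. The file-name pattern (a UUID followed by digits and `.tmp`) is inferred only from the example file names reported in this thread, so other temp files are left alone.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

/** Hypothetical helper: periodically deletes stale spool-style *.tmp files. */
public class TmpFileSweeper {
    // Assumed naming, based on the examples in this thread: <uuid><digits>.tmp
    private static final Pattern SPOOL_NAME =
            Pattern.compile("[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}\\d*\\.tmp");

    /** Deletes matching files last modified more than maxAge ago; returns the count. */
    static int sweep(Path dir, Duration maxAge) throws IOException {
        Instant cutoff = Instant.now().minus(maxAge);
        int deleted = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.tmp")) {
            for (Path file : stream) {
                String name = file.getFileName().toString();
                if (!SPOOL_NAME.matcher(name).matches()) {
                    continue; // not a spool-style name, leave it alone
                }
                try {
                    if (Files.isRegularFile(file)
                            && Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)
                            && Files.deleteIfExists(file)) {
                        deleted++;
                    }
                } catch (IOException e) {
                    // The file may still be in use; skip it and retry on the next run.
                }
            }
        }
        return deleted;
    }

    /** Runs sweep() on a daemon thread, starting now and repeating after each period. */
    static ScheduledExecutorService start(Path dir, Duration maxAge, long periodMinutes) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "tmp-file-sweeper");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                sweep(dir, maxAge);
            } catch (IOException e) {
                System.err.println("tmp sweep failed: " + e);
            }
        }, 0, periodMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

A caller might use `TmpFileSweeper.start(Paths.get(System.getProperty("java.io.tmpdir")), Duration.ofHours(1), 10)`. The maximum age should be comfortably longer than the longest-running query, so that files still being read are not removed.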

Also, I think I found someone else who seems to be having this style of issue in Phoenix:
 https://issues.apache.org/jira/browse/PHOENIX-1395

--
String sql =  "SELECT TR.ID"
    + "  ,TR.CLIENT_ID"
    + "  ,TR.BRAND_ID"
    + "  ,TR.SITE_ID"
    + "  ,TR.EMAIL"
    + "  ,COUNT(TS2.ID) + MAX(TR.repeatBrandShortVisit) AS repeatBrandShortVisit"
    + "  ,SUM(CASE WHEN TS2.SESSION_TYPE = 1 THEN 1 ELSE 0 END) + MAX(TR.repeatBrandLongVisit) AS repeatBrandLongVisit"
    + "  ,SUM(CASE WHEN TS2.SITE_ID = TR.SITE_ID THEN 1 ELSE 0 END) + MAX(TR.repeatSiteShortVisit) AS repeatSiteShortVisit"
    + "  ,SUM(CASE WHEN TS2.SITE_ID = TR.SITE_ID AND TS2.SESSION_TYPE = 1 THEN 1 ELSE 0 END) + MAX(TR.repeatSiteLongVisit) AS repeatSiteLongVisit"
    + "  FROM ("
    + "  SELECT TSE.ID"
    + "      ,TSE.CLIENT_ID"
    + "      ,TSE.BRAND_ID"
    + "      ,TSE.SITE_ID"
    + "      ,TSE.EMAIL"
    + "      ,COUNT(TS1.ID) AS repeatBrandShortVisit"
    + "      ,SUM(CASE WHEN TS1.SESSION_TYPE = 1 THEN 1 ELSE 0 END) AS repeatBrandLongVisit"
    + "      ,SUM(CASE WHEN TS1.SITE_ID = TSE.SITE_ID THEN 1 ELSE 0 END) AS repeatSiteShortVisit"
    + "      ,SUM(CASE WHEN TS1.SITE_ID = TSE.SITE_ID AND TS1.SESSION_TYPE = 1 THEN 1 ELSE 0 END) AS repeatSiteLongVisit"
    + "  FROM ("
    + "      SELECT ID"
    + "          ,CLIENT_ID"
    + "          ,BRAND_ID"
    + "          ,SITE_ID"
    + "          ,EMAIL"
    + "      FROM user.SESSION_EXPIRATION "
    + "      WHERE NEXT_CHECK <= CURRENT_TIME()"
    + "      LIMIT " + batchSize
    + "      ) AS TSE"
    + "  LEFT OUTER JOIN user.SESSION TS1"
    + "      ON TS1.CLIENT_ID = TSE.CLIENT_ID"
    + "      AND TS1.BRAND_ID = TSE.BRAND_ID"
    + "  GROUP BY TSE.ID"
    + "      ,TSE.CLIENT_ID"
    + "      ,TSE.BRAND_ID"
    + "      ,TSE.SITE_ID"
    + "      ,TSE.EMAIL"
    + "  ) AS TR"
    + "  LEFT OUTER JOIN user.SESSION TS2"
    + "      ON TS2.EMAIL = TR.EMAIL"
    + "      AND TS2.BRAND_ID = TR.BRAND_ID"
    + "  GROUP BY TR.ID"
    + "  ,TR.CLIENT_ID"
    + "  ,TR.BRAND_ID"
    + "  ,TR.SITE_ID"
    + "  ,TR.EMAIL";

--

      From: Samarth Jain <samarth@apache.org>
Cc: "user@phoenix.apache.org" <user@phoenix.apache.org>
 Sent: Monday, 18 April 2016, 15:39
 Subject: Re: Getting swamped with Phoenix *.tmp files on SELECT.
   
Marks,
FWIW, we had a problem with tmp files left over in case of failures - https://issues.apache.org/jira/browse/PHOENIX-1448.
But this has been fixed since 4.2.1 release. To help us, can you post a sample query where
you are seeing tmp files left over? Are you sure the application is cleanly closing, in a
try-finally block, all the JDBC statements, result sets and phoenix connections? 
On Mon, Apr 18, 2016 at 8:54 AM, <Mark> wrote:

Currently I am running out of disk space as a direct result of these spool temp files (350 GB+).
Any ideas on how to address this?  These .tmp files never seem to be cleaned up after
each query.  Is there any workaround?

      From: Samarth Jain <samarth.jain@gmail.com>
 To: "user@phoenix.apache.org" <user@phoenix.apache.org> 
Sent: Friday, 15 April 2016, 17:00
 Subject: Re: Getting swamped with Phoenix *.tmp files on SELECT.
   
FWIW, with Phoenix 4.7, we no longer need to spool results on the client. Instead we rely
on pacing scanners as and when needed. To utilize the feature, though, you would need to make
sure that you are using HBase versions that are at least as new as:
  HBase 0.98.17 for HBase 0.98
  HBase 1.0.3 for HBase 1.0
  HBase 1.1.3 for HBase 1.1 and beyond

On Fri, Apr 15, 2016 at 1:51 PM, Alok Singh <alok@cloudability.com> wrote:

We ran into something similar; here is the ticket: https://issues.apache.org/jira/browse/PHOENIX-2685
The workaround that mitigated this issue for us was to lower the value of phoenix.query.spoolThresholdBytes
to 10 MB. It is counterintuitive, but, due to the way the spooling iterator interacts with
the global memory manager, it works.
Alok
alok@cloudability.com
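
One way the workaround above might be applied from a JDBC client is sketched below. The property name and the 10 MB value come from the message above; passing the property through the connection `Properties` is an assumption that should be checked against the Phoenix version in use (setting it in the client-side hbase-site.xml is the other common route). The class and method names are hypothetical, and the JDBC URL is left as a placeholder.

```java
import java.util.Properties;

/** Hypothetical helper that builds connection properties with a lower spool threshold. */
public class SpoolThresholdConfig {
    static final String SPOOL_THRESHOLD_KEY = "phoenix.query.spoolThresholdBytes";

    static Properties withSpoolThreshold(long bytes) {
        Properties props = new Properties();
        props.setProperty(SPOOL_THRESHOLD_KEY, Long.toString(bytes));
        return props;
    }

    public static void main(String[] args) {
        // 10 MB, the value reported in the message above.
        Properties props = withSpoolThreshold(10L * 1024 * 1024);
        System.out.println(SPOOL_THRESHOLD_KEY + "=" + props.getProperty(SPOOL_THRESHOLD_KEY));
        // Assumed usage; requires the Phoenix client jar and a reachable cluster:
        // Connection conn = DriverManager.getConnection("jdbc:phoenix:<zookeeper quorum>", props);
    }
}
```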
On Fri, Apr 15, 2016 at 1:42 PM, <Mark> wrote:

I am using an Ambari HDP distribution of the Phoenix client (/usr/hdp/2.3.4.0-3485/phoenix/phoenix-4.4.0.2.3.4.0-3485-client.jar),
and to close database connections I am using the standard Java JDBC try-with-resources process
 (http://www.mastertheboss.com/jboss-server/jboss-datasource/using-try-with-resources-to-close-database-connections
, https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html).
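
For reference, the try-with-resources pattern described above looks roughly like this. It is a generic JDBC sketch, not the code from the application in question: the class name, the `countRows` helper, and the command-line arguments are hypothetical. Resources are closed in reverse order of declaration, so the ResultSet is closed before the Statement, and the Connection last.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical example of closing JDBC resources with try-with-resources. */
public class PhoenixQueryExample {

    /** Runs the query, iterates all rows, and closes the ResultSet and Statement. */
    static int countRows(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        } // rs.close() runs first, then stmt.close(), even if an exception is thrown
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.err.println("usage: PhoenixQueryExample <jdbc-url> <sql>");
            return;
        }
        // The connection is closed when this block exits.
        try (Connection conn = DriverManager.getConnection(args[0])) {
            System.out.println(countRows(conn, args[1]) + " rows");
        }
    }
}
```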

      From: Samarth Jain
Sent: Friday, 15 April 2016, 16:03
 Subject: Re: Getting swamped with Phoenix *.tmp files on SELECT.
  
What version of phoenix are you using? Is the application properly closing statements and
result sets?

On Friday, April 15, 2016, wrote:

I am running into an issue where a huge number of temporary files are being created in my C:\Users\myuser\AppData\Local\Temp
folder. They are around 20MB each and never get cleaned up.  These *.tmp files grew to around
200GB before I stopped the server.
Example file names:
7a0967de-9dff-432b-bcfe-de30bc630add5176202498513378657.tmp
813e40e1-afa9-4847-919c-7c55f95f8a475501154042645376476.tmp
1329da43-561d-4e68-9120-56bd650a6ac98781585316402092121.tmp

Currently, I have my Phoenix Client jar deployed to Wildfly 10 as described here:  https://docs.jboss.org/author/display/TEIID/Phoenix+Data+Sources
These *.tmp files only appear when I run SELECT queries.
Any help would be appreciated.


   





   



   



   



  