Yes – setting phoenix.stats.guidepost.per.region in your case would have created bigger chunks for the parallel queries (100G / 10).
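
In round numbers (sizes taken from your mail; the exact counts depend on the stats that actually get collected):

    default phoenix.stats.guidepost.width ~ 100 MB   =>  a 100 GB region splits into roughly 1,000 parallel scans
    phoenix.stats.guidepost.per.region = 10          =>  roughly 100 GB / 10 = 10 GB per scan, i.e. far fewer chunks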

 

Regards

Ram

 

From: Perko, Ralph J [mailto:Ralph.Perko@pnnl.gov]
Sent: Tuesday, November 18, 2014 5:29 AM
To: user@phoenix.apache.org
Subject: RE: PhoenixIOException - GlobalMemoryManager

 

Thank you for the response.  I will check it out.

 

I did some further research on the stats portion of Phoenix based on the documentation.  I set phoenix.stats.guidepost.per.region=10, ran 'UPDATE STATISTICS MyTable', and I am no longer getting the errors.  If I understand how the parallel querying works, with a default guidepost width of 100 MB, and given that this table holds about 1.5 TB of data with the HStoreFile size set to 100 GB to prevent splitting, perhaps the issue was there?  Please let me know if you think I'm off here.
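
For reference, the statistics refresh I ran from sqlline.py looked roughly like this (table name taken from the schema further down in this thread):

    -- with phoenix.stats.guidepost.per.region=10 in place:
    UPDATE STATISTICS t1_csv_data;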

 

Thanks

Ralph

 

 

From: Maryann Xue [mailto:maryann.xue@gmail.com]
Sent: Monday, November 17, 2014 3:45 PM
To: user@phoenix.apache.org
Subject: Re: PhoenixIOException - GlobalMemoryManager

 

Hi Ralph,

 

You may want to check this problem against the latest release of Phoenix, because we just incorporated a fix for a similar issue in our 3.2.1 RC1 and 4.2.1 RC1.

 

 

Thanks,

Maryann

 

On Mon, Nov 17, 2014 at 6:32 PM, Maryann Xue <maryann.xue@gmail.com> wrote:

Hi Ralph,

 

I think this is a known issue reported as PHOENIX-1011 (https://issues.apache.org/jira/browse/PHOENIX-1011). We are still looking at it. Will give you an update once it is solved. 

 

Thanks a lot for the very detailed information, Ralph!

 

 

Thanks,

Maryann

 

On Mon, Nov 17, 2014 at 12:24 PM, Perko, Ralph J <Ralph.Perko@pnnl.gov> wrote:

Hi, while importing data using the CsvBulkLoadTool I've run into an issue trying to query the data using sqlline.py.  The bulk load tool completed successfully with no errors.  However, when I attempt to query the data I get some exceptions:

 

java.lang.RuntimeException: org.apache.phoenix.exception.PhoenixIOException

        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)

 

followed by many GlobalMemoryManager errors:

 

WARN memory.GlobalMemoryManager: Orphaned chunk of xxxx bytes found during finalize

 

Most queries, though not all, produce this error, and it seems related to the existence of a secondary index table:

 

select * from TABLE limit 10;  -- ERROR – index not used

select <un-indexed field> from TABLE limit 10;  -- ERROR

 

If I run a query on an INTEGER column with a secondary index I do not get this error:

 

select distinct(fieldx) from TABLE limit 10;  -- SUCCESS!

 

However, a similar query on an indexed VARCHAR field produces a timeout error:

java.lang.RuntimeException: … PhoenixIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?

        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)

 

select count(*) … times out as well

 

Details:

Total records imported: 7.2B

Cluster size: 30 nodes

Splits: 40 (salted)

 

Phoenix version: 4.2.0

HBase version: 0.98

HDP distro 2.1.5

 

I can scan the data with no errors from hbase shell

 

Basic Phoenix table def:

 

CREATE TABLE IF NOT EXISTS t1_csv_data
(
    timestamp BIGINT NOT NULL,
    location VARCHAR NOT NULL,
    fileid VARCHAR NOT NULL,
    recnum INTEGER NOT NULL,
    field5 VARCHAR,
    ...
    field45 VARCHAR,
    CONSTRAINT pkey PRIMARY KEY (timestamp, location, fileid, recnum)
)
IMMUTABLE_ROWS=true, COMPRESSION='SNAPPY', SALT_BUCKETS=40,
SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

 

-- indexes

CREATE INDEX t1_csv_data_f1_idx ON t1_csv_data(somefield1) COMPRESSION='SNAPPY', SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

CREATE INDEX t1_csv_data_f2_idx ON t1_csv_data(somefield2) COMPRESSION='SNAPPY', SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

CREATE INDEX t1_csv_data_f3_idx ON t1_csv_data(somefield3) COMPRESSION='SNAPPY', SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

 

Thanks for your help,

Ralph

 



 

--

Thanks,
Maryann



 
