phoenix-user mailing list archives

From "Perko, Ralph J" <Ralph.Pe...@pnnl.gov>
Subject PhoenixIOException - GlobalMemoryManager
Date Mon, 17 Nov 2014 17:24:17 GMT
Hi, while importing data using the CsvBulkLoadTool I've run into an issue trying to query the
data using sqlline.py.  The bulk load itself was successful, with no errors.  However, when
I attempt to query the data I get some exceptions:

java.lang.RuntimeException: org.apache.phoenix.exception.PhoenixIOException
        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)

followed by many GlobalMemoryManager errors:

WARN memory.GlobalMemoryManager: Orphaned chunk of xxxx bytes found during finalize

Most queries, though not all, produce this error, and it seems related to the existence of a
secondary index table:

select * from TABLE limit 10;  -- ERROR - index not used
select <un-indexed field> from TABLE limit 10;  -- ERROR

If I run a query on an INTEGER column with a secondary index I do not get this error:

select distinct(fieldx) from TABLE limit 10;  -- SUCCESS!
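
In case it's useful, this is how I've been checking whether a plan hits an index (the field
and table names here are illustrative, not my exact schema):

    EXPLAIN select somefield1 from t1_csv_data limit 10;
    -- I'd expect the plan to mention T1_CSV_DATA_F1_IDX when the index is actually used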

However, a similar query on an indexed VARCHAR field produces a timeout error:
java.lang.RuntimeException: ... PhoenixIOException: Failed after retry of OutOfOrderScannerNextException:
was there a rpc timeout?
        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)

select count(*) ... times out as well
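
I haven't tuned any timeouts yet. If this really is just an rpc timeout, I assume the relevant
client-side settings would be something like the following in hbase-site.xml (the values are
placeholders, not what I'm currently running):

    <property>
      <name>hbase.rpc.timeout</name>
      <value>600000</value> <!-- example: 10 minutes -->
    </property>
    <property>
      <name>phoenix.query.timeoutMs</name>
      <value>600000</value> <!-- example: kept in line with the rpc timeout -->
    </property>

But I'd rather understand why the scans are slow enough to time out in the first place.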

Details:
Total records imported: 7.2B
Cluster size: 30 nodes
Splits: 40 (salted)

Phoenix version: 4.2.0
HBase version: 0.98
HDP distro: 2.1.5
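
The load itself was invoked roughly like this (the table name matches the DDL below; the input
path and zookeeper quorum are placeholders):

    hadoop jar phoenix-4.2.0-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
        --table T1_CSV_DATA \
        --input /path/to/input \
        --zookeeper zk-host:2181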

I can scan the data with no errors from the hbase shell.
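
For example, something like this against the physical table returns rows cleanly (the limit is
arbitrary):

    hbase(main):001:0> scan 'T1_CSV_DATA', {LIMIT => 10}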

Basic Phoenix table def:

CREATE TABLE IF NOT EXISTS t1_csv_data (
    timestamp BIGINT NOT NULL,
    location VARCHAR NOT NULL,
    fileid VARCHAR NOT NULL,
    recnum INTEGER NOT NULL,
    field5 VARCHAR,
    ...
    field45 VARCHAR,
    CONSTRAINT pkey PRIMARY KEY (timestamp, location, fileid, recnum)
)
IMMUTABLE_ROWS=true, COMPRESSION='SNAPPY', SALT_BUCKETS=40,
SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

-- indexes
CREATE INDEX t1_csv_data_f1_idx ON t1_csv_data(somefield1)
    COMPRESSION='SNAPPY', SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX t1_csv_data_f2_idx ON t1_csv_data(somefield2)
    COMPRESSION='SNAPPY', SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX t1_csv_data_f3_idx ON t1_csv_data(somefield3)
    COMPRESSION='SNAPPY', SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

Thanks for your help,
Ralph

