phoenix-user mailing list archives

From Samarth Jain <sama...@apache.org>
Subject Re: Problems with Phoenix bulk loader when using row_timestamp feature
Date Wed, 10 Aug 2016 05:05:22 GMT
Ryan,

Can you tell us what the explain plan says for the select count(*) query.
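For reference, a minimal way to pull it from the same sqlline session (table
name taken from your mail below):

0: jdbc:phoenix:localhost:2181> EXPLAIN SELECT COUNT(*) FROM historian.data;

The plan should indicate whether Phoenix is applying a time-range filter to
the scan on the row_timestamp column.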

- Samarth


On Tue, Aug 9, 2016 at 12:58 PM, Ryan Templeton <rtempleton@hortonworks.com>
wrote:

> I am working on a project that will be consuming sensor data. The “fact”
> table is defined as:
>
> CREATE TABLE historian.data (
> assetid unsigned_int not null,
> metricid unsigned_int not null,
> ts timestamp not null,
> val double
> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts))
> IMMUTABLE_ROWS=true;
>
> I generated a 1 million row CSV sample dataset and used the Phoenix bulk
> loader to load this data. The tool reported that all 1,000,000 rows were
> loaded successfully, which I confirmed via sqlline.
>
> I then dropped and recreated the table to take advantage of the
> row_timestamp feature:
>
> drop table historian.data;
> CREATE TABLE historian.data (
> assetid unsigned_int not null,
> metricid unsigned_int not null,
> ts timestamp not null,
> val double
> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts row_timestamp))
> IMMUTABLE_ROWS=true;
>
> I reran the bulk loader utility, which said it completed successfully:
>
> [rtempleton@M1 phoenix-client]$ bin/psql.py localhost:2181 -t
> HISTORIAN.DATA /tmp/data.csv
>
> SLF4J: Class path contains multiple SLF4J bindings.
>
> SLF4J: Found binding in [jar:file:/usr/hdp/2.4.3.0-
> 180/phoenix/phoenix-4.4.0.2.4.3.0-180-client.jar!/org/slf4j/
> impl/StaticLoggerBinder.class]
>
> SLF4J: Found binding in [jar:file:/usr/hdp/2.4.3.0-
> 180/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
>
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
>
> 16/08/08 20:34:43 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 16/08/08 20:34:44 WARN shortcircuit.DomainSocketFactory: The
> short-circuit local reads feature cannot be used because libhadoop cannot
> be loaded.
>
> csv columns from database.
>
> CSV Upsert complete. 1000000 rows upserted
>
> Time: 65.985 sec(s)
>
> But when I run “select count(*) from historian.data” I see that only the
> first 572 rows appear in the table. These rows correlate to the first
> 572 rows of the input file.
>
> 0: jdbc:phoenix:localhost:2181> select count(*) from historian.data;
>
> +------------------------------------------+
> |                 COUNT(1)                  |
> +------------------------------------------+
> | 572                                       |
> +------------------------------------------+
>
> 1 row selected (4.541 seconds)
>
> 0: jdbc:phoenix:localhost:2181> select min(ts), max(ts) from
> historian.data;
>
>
> +--------------------------+--------------------------+
> |          MIN(TS)         |          MAX(TS)         |
> +--------------------------+--------------------------+
> | 2016-08-08 11:05:15.000  | 2016-08-08 20:36:15.000  |
> +--------------------------+--------------------------+
>
> Any ideas?
>
>
> Thanks,
> Ryan
>
