phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sergey Soldatov <sergeysolda...@gmail.com>
Subject Re: Bulk loading into table vs view
Date Tue, 28 Nov 2017 19:55:55 GMT
Please take a look at https://phoenix.apache.org/views.html
All views are 'virtual' tables, so they don't have a dedicated physical
table and operates on top of the table that is specified in the view DDL.

Thanks,
Sergey

On Sat, Nov 25, 2017 at 6:25 AM, Eisenhut, Roman <roman.eisenhut@tum.de>
wrote:

> Dear Phoenix-Team,
>
>
>
> I did some test on bulk-loading data with the psql.py script in
> $PHOENIX_HOME/bin and the tpc-h data on my cluster with 1 master and 3 RS.
> I’ve found that it makes quite a difference whether you:
>
>    1. Create a table
>    2. Bulk load data into that table
>
> Or
>
>    1. Create a table
>    2. Create a view
>    3. Bulk load data in the view
>
>
>
> I was wondering where the overhead is coming from? (you can find my
> numbers below)
>
>
>
> Additionally, I created a view over a table which was already filled and
> phoenix returned “No rows affected”. At the same time I can’t find a table
> in HBase that reflects the view, which makes me wonder whether views are
> actually materialized somewhere. As I’m quite interested in the view
> functionality of Phoenix, I was wondering whether someone can explain what
> is happening when a view is created?
>
>
>
> Best regards,
>
> Roman
>
>
>
>
>
> *psql.py -t X -d '|' X.csv, where X = table name*
>
> *ID*
>
> *TABLE*
>
> *region*
>
> *nation*
>
> *supplier*
>
> *customer*
>
> *part*
>
> *partsupp*
>
> *orders*
>
> *lineitem*
>
> *5*
>
> *25.00*
>
> *10,000*
>
> *150,000*
>
> *200,000*
>
> *800,000*
>
> *1,500,000*
>
> *6,001,215*
>
> *1*
>
> 0.068
>
> 0.11
>
> 2.959
>
> 18.789
>
> 27.881
>
> 107.03
>
> 164.853
>
> 1007.315
>
> *2*
>
> 0.124
>
> 0.093
>
> 2.993
>
> 19.62
>
> 26.954
>
> 80.671
>
> 169.038
>
> 1039.294
>
> *3*
>
> 0.07
>
> 0.092
>
> 2.795
>
> 20.745
>
> 29.036
>
> 76.855
>
> 177.765
>
> 1042.642
>
> *4*
>
> 0.132
>
> 0.101
>
> 2.89
>
> 20.527
>
> 28.121
>
> 78.956
>
> 180.145
>
> 1019.047
>
> *5*
>
> 0.072
>
> 0.116
>
> 3.334
>
> 27.494
>
> 28.891
>
> 75.455
>
> 166.668
>
> 1011.299
>
> *MIN*
>
> 0.068
>
> 0.092
>
> 2.795
>
> 18.789
>
> 26.954
>
> 75.455
>
> 164.853
>
> 1007.315
>
> *MAX*
>
> 0.132
>
> 0.116
>
> 3.334
>
> 27.494
>
> 29.036
>
> 107.03
>
> 180.145
>
> 1042.642
>
> *AVG*
>
> 0.0932
>
> 0.1024
>
> 2.9942
>
> 21.435
>
> 28.1766
>
> 83.7934
>
> 171.6938
>
> 1023.919
>
>
>
> *psql.py -t X_VIEW -d '|' X.csv, where X = table name*
>
> *ID*
>
> *VIEW*
>
> *region*
>
> *nation*
>
> *supplier*
>
> *customer*
>
> *part*
>
> *partsupp*
>
> *orders*
>
> *lineitem*
>
> *5*
>
> *25.00*
>
> *10,000*
>
> *150,000*
>
> *200,000*
>
> *800,000*
>
> *1,500,000*
>
> *6,001,215*
>
> *1*
>
> 0.103
>
> 0.159
>
> 2.644
>
> 22.702
>
> 28.424
>
> 93.897
>
> 201.449
>
>
>
> *2*
>
> 0.097
>
> 0.138
>
> 2.641
>
> 20.926
>
> 32.014
>
> 95.195
>
> 190.939
>
>
>
> *3*
>
> 0.123
>
> 0.076
>
> 3.097
>
> 19.88
>
> 38.426
>
> 90.613
>
> 193.376
>
>
>
> *4*
>
> 0.092
>
> 0.098
>
> 3.14
>
> 23.522
>
> 29.509
>
> 99.443
>
> 192.348
>
>
>
> *5*
>
> 0.089
>
> 0.146
>
> 2.938
>
> 22.196
>
> 34.407
>
> 93.898
>
> 198.012
>
>
>
> *MIN*
>
> 0.089
>
> 0.076
>
> 2.641
>
> 19.88
>
> 28.424
>
> 90.613
>
> 190.939
>
> 0
>
> *MAX*
>
> 0.123
>
> 0.159
>
> 3.14
>
> 23.522
>
> 38.426
>
> 99.443
>
> 201.449
>
> 0
>
> *AVG*
>
> 0.1008
>
> 0.1234
>
> 2.892
>
> 21.8452
>
> 32.556
>
> 94.6092
>
> 195.2248
>
> #DIV/0!
>
>
>

Mime
View raw message