phoenix-user mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: Upsert is EXTREMELY slow
Date Wed, 11 Jul 2018 14:05:27 GMT
The explain plan for your tables isn't a substitute for the DDLs. Please 
provide those.
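
(For reference, a hypothetical sketch of the kind of DDL in question, issued
through the Phoenix JDBC driver. Every column name and type below is a
placeholder; the real definitions are exactly what's missing.)

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.Statement;

  public class DdlSketch {
      public static void main(String[] args) throws Exception {
          try (Connection conn =
                   DriverManager.getConnection("jdbc:phoenix:localhost:2181:/hbase");
               Statement stmt = conn.createStatement()) {
              // Placeholder table DDL: the actual primary key and column types matter here
              stmt.execute("CREATE TABLE IF NOT EXISTS VBQL_PHOENIX_TRANSCRIPT5 ("
                      + " ID VARCHAR PRIMARY KEY,"
                      + " ACCOUNT VARCHAR,"
                      + " TRANSCRIPT VARCHAR)");
              // Placeholder index DDL: which columns are indexed/covered drives write cost
              stmt.execute("CREATE INDEX IF NOT EXISTS VBQL_PHOENIX_TRANSCRIPT_INDEX5"
                      + " ON VBQL_PHOENIX_TRANSCRIPT5 (ACCOUNT)");
          }
      }
  }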

How about sharing your complete hbase-site.xml and hbase-env.sh files, 
rather than just the snippets you have shared? A full picture is often needed.

Given that HBase cannot run directly on S3, please also describe how 
you have this setup functioning. Do you have an HDFS instance running 
for the HBase WALs? Finally, assuming that you're using EC2 if you're 
using S3, what kind of nodes do you have?

As a courtesy, please take the time to consolidate your thoughts into 
one email instead of sending multiple in rapid succession like this. 
Remember that Apache communities are volunteer-based.

On 7/11/18 7:53 AM, alchemist wrote:
> 
> I have written a threaded program to batch-upsert data into Phoenix.  I am
> using Phoenix because of its secondary index capability.  I am getting
> extremely slow write performance.
> 
> The explain plan for the table looks like the following:
> 
> 0: jdbc:phoenix:localhost:2181:/hbase> EXPLAIN select count(1) from VBQL_PHOENIX_TRANSCRIPT5;
> +--------------------------------------------------------------------------------------------------------------+-----------------+----------------+
> |                                                     PLAN                                                      | EST_BYTES_READ  | EST_ROWS_READ  |
> +--------------------------------------------------------------------------------------------------------------+-----------------+----------------+
> | CLIENT 100-CHUNK 6838539 ROWS 314572800 BYTES PARALLEL 100-WAY FULL SCAN OVER VBQL_PHOENIX_TRANSCRIPT_INDEX5  | 314572800       | 6838539        |
> |     SERVER FILTER BY FIRST KEY ONLY                                                                           | 314572800       | 6838539        |
> |     SERVER AGGREGATE INTO SINGLE ROW                                                                          | 314572800       | 6838539        |
> +--------------------------------------------------------------------------------------------------------------+-----------------+----------------+
> 
> I have a secondary index; explaining a query against it directly looks like this:
> 
> 0: jdbc:phoenix:localhost:2181:/hbase> EXPLAIN select count(1) from VBQL_PHOENIX_TRANSCRIPT_INDEX5;
> +--------------------------------------------------------------------------------------------------------------+-----------------+----------------+
> |                                                     PLAN                                                      | EST_BYTES_READ  | EST_ROWS_READ  |
> +--------------------------------------------------------------------------------------------------------------+-----------------+----------------+
> | CLIENT 100-CHUNK 6838539 ROWS 314572800 BYTES PARALLEL 100-WAY FULL SCAN OVER VBQL_PHOENIX_TRANSCRIPT_INDEX5  | 314572800       | 6838539        |
> |     SERVER FILTER BY FIRST KEY ONLY                                                                           | 314572800       | 6838539        |
> |     SERVER AGGREGATE INTO SINGLE ROW                                                                          | 314572800       | 6838539        |
> +--------------------------------------------------------------------------------------------------------------+-----------------+----------------+
> 
> 
> These are the tuning settings used in HBase.  The problem is that it is very
> hard to scale: I tried adding more nodes to the HBase cluster, and I also
> tried adding more threads to the client program, but it is not scaling beyond
> 6K per minute, which is VERY SLOW.  Any help is greatly appreciated.
> 
>   <property>
>        <name>index.writer.threads.max</name>
>        <value>30</value>
>   </property>
>   <property>
>        <name>index.builder.threads.max</name>
>        <value>30</value>
>   </property>
>   <property>
>        <name>phoenix.query.threadPoolSize</name>
>        <value>256</value>
>   </property>
>   <property>
>        <name>index.builder.threads.keepalivetime</name>
>        <value>90000</value>
>   </property>
>   <property>
>        <name>phoenix.query.timeoutMs</name>
>        <value>90000</value>
>   </property>
> 
>   The rest of the settings I am using are the defaults.
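> 
> (As an aside, a minimal sketch of how client-side Phoenix settings can also be
> passed as JDBC connection properties rather than only through hbase-site.xml;
> phoenix.mutate.batchSize is a real Phoenix property, but the value here is
> purely illustrative, not a recommendation:)
> 
>   // requires: java.sql.Connection, java.sql.DriverManager, java.util.Properties
>   Properties props = new Properties();
>   props.setProperty("phoenix.mutate.batchSize", "1000");  // rows flushed per batch on commit
>   Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181:/hbase", props);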
> 
> Source code (simple multithreaded; writes need to be saved individually /
> we cannot use batch writes):
> 
>   public void generateAndInsertDataToHbase(int iterationCount) {
>     try {
>       int iterations = 0;
>       Instant start = Instant.now();
>       ExecutorService executorService = Executors.newCachedThreadPool();
>       while (true) {
>         if (iterations == iterationCount) {
>           verifyRowCountAndExit(start, executorService);
>         }
>         Future<Void> future = executorService.submit(new Callable<Void>() {
>           public Void call() throws Exception {
>             List<VbqlBean> vbqlList = VbqlUtils.generateRandomAccountVbql(accountCategoryMap);
>             if (vbqlList.size() > 0) {
>               addVbqlAccountToHbaseAsBatch(vbqlList, connStr);
>             }
>             return null;
>           }
>         });
>         future.get();  // waits for each task before submitting the next, so tasks run serially
>         iterations++;
>       }
>     } catch (Exception e) {
>       e.printStackTrace();
>     }
>   }
>   
> 
>   public void addVbqlAccountToHbaseAsBatch(List<VbqlBean> vbqlList, String connStr)
>       throws Exception {
>     Connection connection = null;
>     try {
>       connection = DBUtils.initSingleConnection(connStr);
>       // connection.setAutoCommit(false);
>       if (vbqlList.size() > 0) {
>         for (VbqlBean vbqlBean : vbqlList) {
>           // each row is executed via connStr rather than the connection opened above
>           DBUtils.executeSingleVbqlQuery(connStr, vbqlBean, tableName);
>         }
>       }
>       connection.commit();
>     } finally {
>       if (connection != null) try { connection.close(); } catch (SQLException ignore) {}
>     }
>   }
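> 
> (For comparison, a minimal sketch of the same write path using one shared
> connection with autoCommit off and periodic commits, which is the pattern the
> Phoenix tuning guide recommends for bulk upserts. The UPSERT column list and
> the VbqlBean getters are assumed, since the real schema isn't shown:)
> 
>   public void addVbqlAccountToHbaseBatched(List<VbqlBean> vbqlList, String connStr)
>       throws Exception {
>     final int batchSize = 1000;  // commit interval; tune for the workload
>     try (Connection connection = DriverManager.getConnection(connStr);
>          PreparedStatement ps = connection.prepareStatement(
>              "UPSERT INTO " + tableName + " (ID, ACCOUNT, TRANSCRIPT) VALUES (?, ?, ?)")) {
>       connection.setAutoCommit(false);        // buffer mutations client-side
>       int pending = 0;
>       for (VbqlBean bean : vbqlList) {
>         ps.setString(1, bean.getId());        // getters assumed; schema is hypothetical
>         ps.setString(2, bean.getAccount());
>         ps.setString(3, bean.getTranscript());
>         ps.executeUpdate();                   // queued locally until commit
>         if (++pending % batchSize == 0) {
>           connection.commit();                // one round trip per batch, not per row
>         }
>       }
>       connection.commit();                    // flush the remainder
>     }
>   }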
> 
> --
> Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/
> 
