Hi, 

I'm using Phoenix 4.6. In my use case I have a table that keeps a sliding window of 7 days' worth of data. The table has 3 local indexes, and we have approximately 150 producers inserting data in real time, in batches of 300-1500 events.
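
For context, each producer is basically doing batched upserts through the Phoenix JDBC driver, roughly like the sketch below (the connection string, column names and Event class are placeholders, not my real schema):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Timestamp;
import java.util.List;

public class BiddingEventsProducer {

    // One batch of 300-1500 events per commit; column names are placeholders.
    public void writeBatch(List<Event> batch) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181")) {
            conn.setAutoCommit(false); // buffer the whole batch client-side
            String sql = "UPSERT INTO BIDDING_EVENTS (EVENT_ID, EVENT_TS, PAYLOAD) VALUES (?, ?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (Event e : batch) {
                    ps.setString(1, e.id);
                    ps.setTimestamp(2, new Timestamp(e.ts));
                    ps.setString(3, e.payload);
                    ps.executeUpdate();
                }
            }
            // One commit sends the mutations for the whole batch; the region
            // server then also has to apply the 3 local index updates per row.
            conn.commit();
        }
    }

    public static class Event {
        String id;
        long ts;
        String payload;
    }
}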

A few days ago I started to get a lot of errors like the ones below. There were so many of them that cluster performance dropped significantly, and my disks' read bandwidth was extremely high while the write bandwidth stayed normal. I can confirm that during that period no readers were running, only producers.

ERROR [B.defaultRpcServer.handler=25,queue=5,port=16020] parallel.BaseTaskRunner: Found a failed task because: org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 (INT10): Unable to find cached index metadata.  key=4276342695061435086 region=BIDDING_EVENTS,\xFEK\x17\xE4\xB1~K\x08,1458435680333.ee29454d68f5b679a8e8cc775dd0edfa. Index update failed
java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 (INT10): Unable to find cached index metadata.  key=4276342695061435086 region=BIDDING_EVENTS,\xFEK\x17\xE4\xB1~K\x08,1458435680333.ee29454d68f5b679a8e8cc775dd0edfa. Index update failed
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 (INT10): Unable to find cached index metadata.  key=4276342695061435086 region=BIDDING_EVENTS,\xFEK\x17\xE4\xB1~K\x08,1458435680333.ee29454d68f5b679a8e8cc775dd0edfa. Index update failed
Caused by: java.sql.SQLException: ERROR 2008 (INT10): Unable to find cached index metadata.  key=4276342695061435086 region=BIDDING_EVENTS,\xFEK\x17\xE4\xB1~K\x08,1458435680333.ee29454d68f5b679a8e8cc775dd0edfa.
INFO  [B.defaultRpcServer.handler=25,queue=5,port=16020] parallel.TaskBatch: Aborting batch of tasks because Found a failed task because: org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 (INT10): Unable to find cached index metadata.  key=4276342695061435086 region=BIDDING_EVENTS,\xFEK\x17\xE4\xB1~K\x08,1458435680333.ee29454d68f5b679a8e8cc775dd0edfa. Index update failed
ERROR [B.defaultRpcServer.handler=25,queue=5,port=16020] builder.IndexBuildManager: Found a failed index update!
INFO  [B.defaultRpcServer.handler=25,queue=5,port=16020] util.IndexManagementUtil: Rethrowing org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 (INT10): Unable to find cached index metadata.  key=4276342695061435086 region=BIDDING_EVENTS,\xFEK\x17\xE4\xB1~K\x08,1458435680333.ee29454d68f5b679a8e8cc775dd0edfa. Index update failed

I searched for the error and made some configuration changes on the server side (a sketch of the kind of change is below). After changing those properties I restarted the cluster and the errors were gone, but the disks' read bandwidth was still very high and I was getting responseTooSlow warnings. As a quick workaround I created fresh tables, and then the problems went away.
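
The changes were of the kind usually suggested for ERROR 2008 (INT10), i.e. bumping the server-side index metadata cache settings in the region servers' hbase-site.xml. The snippet below is only an illustrative sketch; the values are examples, not my exact settings:

<!-- hbase-site.xml on the region servers; values are examples only -->
<property>
  <!-- TTL, in ms, of server-side cache entries, including the index metadata sent along with writes -->
  <name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
  <value>60000</value>
</property>
<property>
  <!-- Maximum size, in bytes, of the server-side metadata cache before eviction starts -->
  <name>phoenix.coprocessor.maxMetaDataCacheSize</name>
  <value>40960000</value>
</property>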

Now, after one day of running with the new tables, I have started to see the problem again. I think it happened during a major compaction, but I would like to better understand the reasons for and consequences of these problems.

- What are the major consequences of these errors? I assume the index data is not written to the index table, right? And why was the read bandwidth of my disks so high even with no readers running, and after I had changed those properties?

- Is there an optimal or recommended value for those properties, or am I missing some tuning of other properties related to the metadata cache?

Thank you,
Pedro