phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Josh Elser <els...@apache.org>
Subject Re: Issue using ANY ARRAY feature
Date Mon, 09 Dec 2019 14:48:15 GMT
Hi Simon,

Thanks for replying back with your fix. We appreciate when folks do this 
so that others can also see the solution.

http://phoenix.apache.org/download.html only publishes the "latest" 
release for a line that we're maintaining. That's why you'll see 4.14.3 
listed on the website, not 4.14.0/1/2.

If you see a release x.y.z, you can reasonably assume[1] that you'll 
also find release x.y.z' where z'=[0,z). We expect that compatibility is 
maintained in the bugfix releases to some line, so there is no reason to 
not update to the latest version.

- Josh

[1] The one caveat here is that if there is a security-issues, we may 
explicitly pull a release from being downloaded.

On 12/8/19 6:12 PM, Simon Mottram wrote:
> Update:
> 
> Just in case anyone hits this issue in future with the AWS managed
> HBase, the fix is to use a very specific version of the driver
> 
> For thick client:
> 
>          <dependency>
>              <groupId>org.apache.phoenix</groupId>
>              <artifactId>phoenix-core</artifactId>
>              <version>4.14.1-HBase-1.4</version>
>          </dependency>
> 
> For thin client:
> 
>          <dependency>
>              <groupId>org.apache.phoenix</groupId>
>              <artifactId>phoenix-queryserver-client</artifactId>
>              <version>4.14.1-HBase-1.4</version>
>          </dependency>
> 
> This required support help from AWS as this driver version is not
> mentioned on the official Apache Phoenix download page
> 
> Regards
> 
> Simon
> 
> 
> On Sun, 2019-11-17 at 21:03 +0000, Simon Mottram wrote:
>> Phoenix Version: 4.14.2-HBase-1.4
>> HBase Version: AWS EMR
>> Release
>> label:emr-5.24.1
>> Hadoop distribution:Amazon
>> Applications:Phoenix 4.14.1,
>> Hue 4.4.0, HBase 1.4.9
>>
>>
>> Having an issue where ANY(ARRAY[]) stops the query returning any
>> results when used in a 'AND' conjunction
>>
>> e.g (fielda = 'a') AND (fieldb = any(array['a','b','c]))
>>
>> Always returns zero results, if i change to disjunction (OR) it works
>> fine but obviously isn't what's wanted here.  Excuse the long post
>> but
>> I wanted to be as clear as possible.
>>
>> It's quite possible I have misunderstood the way ANY() works but...
>>
>> Here's a simple query that returns a correct number of results:
>>
>> SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
>>   FROM DEV_OAPI.OBSERVATION
>> WHERE (BIOMATERIALNAME = 'ROOT00386') AND (TRIALNAME =
>> 'TRIAL00015')
>>   ORDER BY OBSERVATIONDATE DESC
>>   LIMIT 10
>>   OFFSET 0
>>
>> So there's definitely records where biomaterial name and trialname
>> have
>> these values
>>
>> If I change it to
>>
>> SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
>>   FROM DEV_OAPI.OBSERVATION
>> WHERE (TRIALNAME = ANY(ARRAY['TRIAL00015', 'SOMETHING ELSE']))
>>   ORDER BY OBSERVATIONDATE DESC
>>   LIMIT 10
>>   OFFSET 0
>>
>> I get valid results
>>
>> So the ANY(ARRAY[]) function works
>>
>> Here's the explain which looks very odd to more, but it works
>>
>> PLAN
>>    
>>                                                                       
>>    
>> CLIENT 1-CHUNK 50574 ROWS 314572800 BYTES PARALLEL 1-WAY FULL SCAN
>> OVER
>> DEV_OAPI.OBSERVATION
>>    
>>                                                                       
>>    
>> SERVER FILTER BY
>> org.apache.phoenix.expression.function.ArrayAnyComparisonExpression
>> [children=[ARRAY['TRIAL00015'], TRIALNAME =
>> org.apache.phoenix.expression.function.ArrayElemRefExpression
>> [children=[ARRAY['TRIAL00015'], 1]]]]
>>      SERVER TOP 10 ROWS SORTED BY [OBSERVATIONDATE
>> DESC]
>>   C
>> CLIENT MERGE
>> SORT
>>    
>> CLIENT LIMIT 10
>>
>> So far so good, BUT.
>>
>> However if I combine the ARRAY expression with any expression using
>> AND
>> I get zero results, even tho as above both sides of the conjunction
>> return true.
>>
>> e.g.
>>
>> SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
>>   FROM DEV_OAPI.OBSERVATION
>> WHERE (BIOMATERIALNAME = 'ROOT00386') AND (TRIALNAME =
>> ANY(ARRAY['TRIAL00015', 'SOMETHING ELSE']))
>>   ORDER BY OBSERVATIONDATE DESC
>>   LIMIT 10
>>   OFFSET 0
>>
>> Explain (newlines added):
>>   SERVER FILTER BY
>>   (BIOMATERIALNAME = 'SCION00424'
>>   AND
>>   org.apache.phoenix.expression.function.ArrayAnyComparisonExpression
>> [children=[ARRAY['TRIAL00015','SOMETHING ELSE'],
>>   TRIALNAME =
>> org.apache.phoenix.expression.function.ArrayElemRefExpression
>> [children=[ARRAY['TRIAL00015','SOMETHING ELSE'], 1]]]])
>>
>> Just out of interest I tried with strings only in the array check
>>
>> SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
>>   FROM DEV_OAPI.OBSERVATION
>> WHERE (BIOMATERIALNAME = 'ROOT00386') AND ('TRIAL00015' =
>> ANY(ARRAY['TRIAL00015']))
>>   ORDER BY OBSERVATIONDATE DESC
>>   LIMIT 10
>>   OFFSET 0
>>
>> This works fine (in a kind of unhelpful way)
>>
>> I have tested using the thick client:
>>          <!-- HBASE THICK CLIENT DEPS -->
>>          <dependency>
>>              <groupId>org.apache.phoenix</groupId>
>>              <artifactId>phoenix-core</artifactId>
>>              <version>4.14.2-HBase-1.4</version>
>>          </dependency>
>>
>> and the thin client
>>
>>           <dependency>
>>              <groupId>org.apache.phoenix</groupId>
>>              <artifactId>phoenix-queryserver-client</artifactId>
>>              <version>4.14.2-HBase-1.4</version>
>>          </dependency>
>>
>> I've tried using braces and re-ordering but any query of the form:
>> <anyclause> AND field = ANY(ARRAY['value1'...])
>> Returns zero results regardless of values
>>
>> We can't change version as we are using the Amazon AWS EMR managed
>> stack and no other phoenix libraries work.
>>
>> Thanks for taking the time to read this far!
>>
>> Cheers
>>
>> Simon

Mime
View raw message