phoenix-user mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: split count for mapreduce jobs with PhoenixInputFormat
Date Wed, 30 Jan 2019 23:24:56 GMT
Please do not take this advice lightly. Adding (or increasing) salt 
buckets can have a serious impact on the execution of your queries.

On 1/30/19 5:33 PM, venkata subbarayudu wrote:
> You may recreate the table with the SALT_BUCKETS table option to get a 
> reasonable number of regions, and you may try adding a secondary index 
> to make the query run faster in case your MapReduce job applies specific filters.
> 
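
A minimal, untested sketch of that suggestion via the Phoenix JDBC driver; 
the connection URL, table, column, and index names are placeholders, not 
from this thread:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class SaltAndIndexSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder ZooKeeper quorum in the JDBC URL.
            try (Connection conn =
                     DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
                 Statement stmt = conn.createStatement()) {
                // SALT_BUCKETS pre-splits the table into 16 regions; see the
                // caution at the top of this thread before relying on it.
                stmt.execute("CREATE TABLE MY_TABLE ("
                        + "ID BIGINT NOT NULL PRIMARY KEY, VAL VARCHAR) "
                        + "SALT_BUCKETS = 16");
                // A secondary index lets filters on VAL avoid a full scan.
                stmt.execute("CREATE INDEX MY_TABLE_VAL_IDX ON MY_TABLE (VAL)");
            }
        }
    }
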
> On Thu 31 Jan, 2019, 12:09 AM Thomas D'Silva <tdsilva@salesforce.com> wrote:
> 
>     If stats are enabled, PhoenixInputFormat will generate a split per
>     guidepost.
> 
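
More guideposts therefore means more splits. A hedged sketch of tuning this 
(the width value and table name are placeholders; smaller GUIDE_POSTS_WIDTH 
produces more guideposts once statistics are recollected):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class GuidepostSketch {
        public static void main(String[] args) throws Exception {
            try (Connection conn =
                     DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
                 Statement stmt = conn.createStatement()) {
                // Smaller guidepost width (in bytes) => more guideposts
                // per region => more input splits for the MapReduce job.
                stmt.execute(
                        "ALTER TABLE MY_TABLE SET GUIDE_POSTS_WIDTH = 100000000");
                // Recollect statistics so the new guideposts take effect.
                stmt.execute("UPDATE STATISTICS MY_TABLE ALL");
            }
        }
    }
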
>     On Wed, Jan 30, 2019 at 7:31 AM Josh Elser <elserj@apache.org> wrote:
> 
>         You can extend/customize PhoenixInputFormat with your own code
>         to increase the number of InputSplits and Mappers.
> 
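
One possible shape for that customization, as an untested sketch; the 
subclass name is invented and the actual split-narrowing logic is left as 
a placeholder:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    import org.apache.phoenix.mapreduce.PhoenixInputFormat;

    public class FinerGrainedPhoenixInputFormat<T extends DBWritable>
            extends PhoenixInputFormat<T> {

        @Override
        public List<InputSplit> getSplits(JobContext context)
                throws IOException, InterruptedException {
            // Start from the splits Phoenix computed (one per region, or
            // one per guidepost when statistics are enabled).
            List<InputSplit> coarse = super.getSplits(context);
            List<InputSplit> fine = new ArrayList<>();
            for (InputSplit split : coarse) {
                // Placeholder: subdivide each split's scan range here
                // (e.g. by intermediate row keys) and add the pieces.
                fine.add(split);
            }
            return fine;
        }
    }
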
>         On 1/30/19 6:43 AM, Edwin Litterst wrote:
>          > Hi,
>          > I am using PhoenixInputFormat as the input source for mapreduce jobs.
>          > The split count (which determines how many mappers are used for
>          > the job) is always equal to the number of regions of the table
>          > from which I select the input.
>          > Is there a way to increase the number of splits? My job runs too
>          > slowly with only one mapper per region.
>          > (Increasing the number of regions is not an option.)
>          > Regards,
>          > Eddie
> 
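
For reference, a typical (untested) sketch of wiring a job to Phoenix, with 
the subclass sketched above swapped in; the table, query, and DBWritable 
class below are placeholders:

    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    import org.apache.phoenix.mapreduce.util.PhoenixMapReduceUtil;

    public class JobSetupSketch {
        // Placeholder record type mapping the query's columns.
        public static class MyTableWritable implements DBWritable {
            public long id;
            public String val;
            @Override public void readFields(ResultSet rs) throws SQLException {
                id = rs.getLong("ID");
                val = rs.getString("VAL");
            }
            @Override public void write(PreparedStatement ps) throws SQLException {
                ps.setLong(1, id);
                ps.setString(2, val);
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "phoenix-mr");
            // Configures PhoenixInputFormat under the hood.
            PhoenixMapReduceUtil.setInput(job, MyTableWritable.class,
                    "MY_TABLE", "SELECT ID, VAL FROM MY_TABLE");
            // Swap in the subclass sketched above to post-process splits.
            job.setInputFormatClass(FinerGrainedPhoenixInputFormat.class);
            // ... mapper, reducer, and output configuration go here ...
        }
    }
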
