phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Taylor <jamestay...@apache.org>
Subject Re: system.catalog and system.stats entries slows down bulk MR inserts by 20-25X (Phoenix 4.4)
Date Mon, 07 Dec 2015 23:54:18 GMT
I can't explain that. Seems like the new guidepost width is not taking
effect. Maybe there's some "special" means of updating a region server
config property in HDP? When you update this parameter, do you see less
guideposts created after a major compaction occurs?



On Mon, Dec 7, 2015 at 3:46 PM, Thangamani, Arun <thangar@cobalt.com> wrote:

> I bounced the region servers with phoenix.stats.guidepost.width = 10737418240
> (which is the max file size set from ambari)
>
> Like Matt, I am seeing entries created in the SYSTEM.STATS table as well.
> Any other suggestions James?
>
> From: Matt Kowalczyk <mattk@cloudability.com>
> Reply-To: "user@phoenix.apache.org" <user@phoenix.apache.org>
> Date: Monday, December 7, 2015 at 2:52 PM
> To: "user@phoenix.apache.org" <user@phoenix.apache.org>
> Subject: Re: system.catalog and system.stats entries slows down bulk MR
> inserts by 20-25X (Phoenix 4.4)
>
> I'm sorry I poorly communicated in the previous e-mail. I meant to provide
> a list of things that I did. I bounced and then performed a major
> compaction and then ran the select count(*) query.
>
> On Mon, Dec 7, 2015 at 2:49 PM, James Taylor <jamestaylor@apache.org>
> wrote:
>
>> You need to bounce the cluster *before* major compaction or the region
>> server will continue to use the old guideposts setting during compaction.
>>
>> On Mon, Dec 7, 2015 at 2:45 PM, Matt Kowalczyk <mattk@cloudability.com>
>> wrote:
>>
>>> bounced, just after major compaction, with the setting as indicated
>>> above. I'm unable to disable the stats table.
>>>
>>> select count(*) from system.stats where physical_name = 'XXXXX';
>>> +------------------------------------------+
>>> |                 COUNT(1)                 |
>>> +------------------------------------------+
>>> | 653                                      |
>>> +------------------------------------------+
>>> 1 row selected (0.036 seconds)
>>>
>>>
>>> On Mon, Dec 7, 2015 at 2:41 PM, James Taylor <jamestaylor@apache.org>
>>> wrote:
>>>
>>>> Yes, setting that property is another way to disable stats. You'll need
>>>> to bounce your cluster after setting either of these, and stats won't be
>>>> updated until a major compaction occurs.
>>>>
>>>>
>>>> On Monday, December 7, 2015, Matt Kowalczyk <mattk@cloudability.com>
>>>> wrote:
>>>>
>>>>> I've set, phoenix.stats.guidepost.per.region to 1 and continue to see
>>>>> entries added to the system.stats table. I believe this should have the
>>>>> same effect? I'll try setting the guidepost width though.
>>>>>
>>>>>
>>>>> On Mon, Dec 7, 2015 at 12:11 PM, James Taylor <jamestaylor@apache.org>
>>>>> wrote:
>>>>>
>>>>>> You can disable stats through setting
>>>>>> the phoenix.stats.guidepost.width config parameter to a larger value
in the
>>>>>> server side hbase-site.xml. The default is 104857600 (or 10MB). If
you set
>>>>>> it to your MAX_FILESIZE (the size you allow a region to grow to before
it
>>>>>> splits - default 20GB), then you're essentially disabling it. You
could
>>>>>> also try increasing it somewhere in between to maybe 5 or 10GB.
>>>>>>
>>>>>> Thanks,
>>>>>> James
>>>>>>
>>>>>> On Mon, Dec 7, 2015 at 10:25 AM, Matt Kowalczyk <
>>>>>> mattk@cloudability.com> wrote:
>>>>>>
>>>>>>> We're also encountering slow downs after bulk MR inserts. I've
only
>>>>>>> measured slow downs in the query path (since our bulk inserts
workloads
>>>>>>> vary in size it hasn't been clear that we see slow downs here
but i'll now
>>>>>>> measure this as well). The subject of my reported issue was titled,
"stats
>>>>>>> table causing slow queries".
>>>>>>>
>>>>>>> the stats table seems to be re-built during compactions and and
I
>>>>>>> have to actively purge the table to regain sane query times.
Would be sweet
>>>>>>> if the stats feature could be disabled.
>>>>>>>
>>>>>>> On Mon, Dec 7, 2015 at 9:53 AM, Thangamani, Arun <thangar@cobalt.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> This is on hbase-1.1.1.2.3.0.0-2557 if that would make any
>>>>>>>> difference in analysis. Thanks
>>>>>>>>
>>>>>>>> From: Arun Thangamani <thangar@cobalt.com>
>>>>>>>> Date: Monday, December 7, 2015 at 12:13 AM
>>>>>>>> To: "user@phoenix.apache.org" <user@phoenix.apache.org>
>>>>>>>> Subject: system.catalog and system.stats entries slows down
bulk
>>>>>>>> MR inserts by 20-25X (Phoenix 4.4)
>>>>>>>>
>>>>>>>> Hello, I noticed an issue with bulk insert through map reduce
in
>>>>>>>> phoenix 4.4.0.2.3.0.0-2557, using outline of the code below
>>>>>>>>
>>>>>>>> Normally the inserts of about 25 million rows complete in
about 5
>>>>>>>> mins, there are 5 region servers and the phoenix table has
32 buckets
>>>>>>>> But sometimes (maybe after major compactions or region movement?),
>>>>>>>> writes simply slow down to 90 mins, when I truncate SYSTEM.STATS
hbase
>>>>>>>> table, the inserts get a little faster (60 mins), but when
I truncate both
>>>>>>>> SYSTEM.CATALOG & SYSTEM.STATS tables, and recreate the
phoenix table def(s)
>>>>>>>> the inserts go back to 5 mins, the workaround of truncating
SYSTEM tables
>>>>>>>> is not sustainable for long, can someone help and let me
know if there is a
>>>>>>>> patch available for this? Thanks in advance.
>>>>>>>>
>>>>>>>> Job job = Job.getInstance(conf, NAME);
>>>>>>>> // Set the target Phoenix table and the columns
>>>>>>>> PhoenixMapReduceUtil.setOutput(job, tableName,
>>>>>>>> "WEB_ID,WEB_PAGE_LABEL,DEVICE_TYPE," +
>>>>>>>>
>>>>>>>>         "WIDGET_INSTANCE_ID,WIDGET_TYPE,WIDGET_VERSION,WIDGET_CONTEXT,"
>>>>>>>> +
>>>>>>>>         "TOTAL_CLICKS,TOTAL_CLICK_VIEWS,TOTAL_HOVER_TIME_MS,TOTAL_TIME_ON_PAGE_MS,TOTAL_VIEWABLE_TIME_MS,"
>>>>>>>> +
>>>>>>>>
>>>>>>>>         "VIEW_COUNT,USER_SEGMENT,DIM_DATE_KEY,VIEW_DATE,VIEW_DATE_TIMESTAMP,ROW_NUMBER");
>>>>>>>> FileInputFormat.setInputPaths(job, inputPath);
>>>>>>>> job.setMapperClass(WidgetPhoenixMapper.class);
>>>>>>>> job.setMapOutputKeyClass(NullWritable.class);
>>>>>>>> job.setMapOutputValueClass(WidgetPagesStatsWritable.class);
>>>>>>>> job.setOutputFormatClass(PhoenixOutputFormat.class);
>>>>>>>> TableMapReduceUtil.addDependencyJars(job);
>>>>>>>> job.setNumReduceTasks(0);
>>>>>>>> job.waitForCompletion(true);
>>>>>>>>
>>>>>>>> public static class WidgetPhoenixMapper extends
>>>>>>>> Mapper<LongWritable, Text, NullWritable, WidgetPagesStatsWritable>
{
>>>>>>>>     @Override
>>>>>>>>     public void map(LongWritable longWritable, Text text,
Context
>>>>>>>> context) throws IOException, InterruptedException {
>>>>>>>>         Configuration conf = context.getConfiguration();
>>>>>>>>         String rundateString = conf.get("rundate");
>>>>>>>>         PagesSegmentWidgetLineParser parser = new
>>>>>>>> PagesSegmentWidgetLineParser();
>>>>>>>>         try {
>>>>>>>>             PagesSegmentWidget pagesSegmentWidget =
>>>>>>>> parser.parse(text.toString());
>>>>>>>>
>>>>>>>>             if (pagesSegmentWidget != null) {
>>>>>>>>                 WidgetPagesStatsWritable widgetPagesStatsWritable
=
>>>>>>>> new WidgetPagesStatsWritable();
>>>>>>>>                 WidgetPagesStats widgetPagesStats = new
>>>>>>>> WidgetPagesStats();
>>>>>>>>
>>>>>>>>
>>>>>>>>                 widgetPagesStats.setWebId(pagesSegmentWidget.getWebId());
>>>>>>>>
>>>>>>>>                 widgetPagesStats.setWebPageLabel(pagesSegmentWidget.getWebPageLabel());
>>>>>>>>
>>>>>>>>                 widgetPagesStats.setWidgetInstanceId(pagesSegmentWidget.getWidgetInstanceId());
>>>>>>>>                 …..
>>>>>>>>
>>>>>>>>
>>>>>>>>                 widgetPagesStatsWritable.setWidgetPagesStats(widgetPagesStats);
>>>>>>>>                 context.write(NullWritable.get(),
>>>>>>>> widgetPagesStatsWritable);
>>>>>>>>             }
>>>>>>>>
>>>>>>>>         }catch (Exception e){
>>>>>>>>             e.printStackTrace();
>>>>>>>>         }
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> public final class WidgetPagesStats {
>>>>>>>>     private String webId;
>>>>>>>>     private String webPageLabel;
>>>>>>>>     private long widgetInstanceId;
>>>>>>>>     private String widgetType;
>>>>>>>>
>>>>>>>>         …
>>>>>>>>     @Override
>>>>>>>>     public boolean equals(Object o) {
>>>>>>>>
>>>>>>>>         ..
>>>>>>>>     }
>>>>>>>>     @Override
>>>>>>>>     public int hashCode() {
>>>>>>>>
>>>>>>>>         ..
>>>>>>>>     }
>>>>>>>>     @Override
>>>>>>>>     public String toString() {
>>>>>>>>         return "WidgetPhoenix{“….
>>>>>>>>                 '}';
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> public class WidgetPagesStatsWritable implements DBWritable,
>>>>>>>> Writable {
>>>>>>>>
>>>>>>>>     private WidgetPagesStats widgetPagesStats;
>>>>>>>>
>>>>>>>>     public void readFields(DataInput input) throws IOException
{
>>>>>>>>         widgetPagesStats.setWebId(input.readLine());
>>>>>>>>         widgetPagesStats.setWebPageLabel(input.readLine());
>>>>>>>>         widgetPagesStats.setWidgetInstanceId(input.readLong());
>>>>>>>>         widgetPagesStats.setWidgetType(input.readLine());
>>>>>>>>
>>>>>>>>         …
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     public void write(DataOutput output) throws IOException
{
>>>>>>>>         output.writeBytes(widgetPagesStats.getWebId());
>>>>>>>>         output.writeBytes(widgetPagesStats.getWebPageLabel());
>>>>>>>>
>>>>>>>>         output.writeLong(widgetPagesStats.getWidgetInstanceId());
>>>>>>>>         output.writeBytes(widgetPagesStats.getWidgetType());
>>>>>>>>
>>>>>>>>         ..
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     public void readFields(ResultSet rs) throws SQLException
{
>>>>>>>>         widgetPagesStats.setWebId(rs.getString("WEB_ID"));
>>>>>>>>
>>>>>>>>         widgetPagesStats.setWebPageLabel(rs.getString("WEB_PAGE_LABEL"));
>>>>>>>>
>>>>>>>>         widgetPagesStats.setWidgetInstanceId(rs.getLong("WIDGET_INSTANCE_ID"));
>>>>>>>>         widgetPagesStats.setWidgetType(rs.getString("WIDGET_TYPE"));
>>>>>>>>
>>>>>>>>         …
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     public void write(PreparedStatement pstmt) throws SQLException
{
>>>>>>>>         Connection connection = pstmt.getConnection();
>>>>>>>>         PhoenixConnection phoenixConnection = (PhoenixConnection)
>>>>>>>> connection;
>>>>>>>>         //connection.getClientInfo().setProperty("scn",
>>>>>>>> Long.toString(widgetPhoenix.getViewDateTimestamp()));
>>>>>>>>
>>>>>>>>         pstmt.setString(1, widgetPagesStats.getWebId());
>>>>>>>>         pstmt.setString(2, widgetPagesStats.getWebPageLabel());
>>>>>>>>         pstmt.setString(3, widgetPagesStats.getDeviceType());
>>>>>>>>
>>>>>>>>         pstmt.setLong(4, widgetPagesStats.getWidgetInstanceId());
>>>>>>>>
>>>>>>>>         …
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     public WidgetPagesStats getWidgetPagesStats() {
>>>>>>>>         return widgetPagesStats;
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     public void setWidgetPagesStats(WidgetPagesStats
>>>>>>>> widgetPagesStats) {
>>>>>>>>         this.widgetPagesStats = widgetPagesStats;
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> ------------------------------
>>>>>>>> This message and any attachments are intended only for the
use of
>>>>>>>> the addressee and may contain information that is privileged
and
>>>>>>>> confidential. If the reader of the message is not the intended
recipient or
>>>>>>>> an authorized representative of the intended recipient, you
are hereby
>>>>>>>> notified that any dissemination of this communication is
strictly
>>>>>>>> prohibited. If you have received this communication in error,
notify the
>>>>>>>> sender immediately by return email and delete the message
and any
>>>>>>>> attachments from your system.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>>
>

Mime
View raw message