phoenix-user mailing list archives

From Matt Kowalczyk <ma...@cloudability.com>
Subject Re: system.catalog and system.stats entries slows down bulk MR inserts by 20-25X (Phoenix 4.4)
Date Mon, 07 Dec 2015 22:52:50 GMT
I'm sorry I poorly communicated in the previous e-mail. I meant to provide
a list of things that I did. I bounced and then performed a major
compaction and then ran the select count(*) query.

On Mon, Dec 7, 2015 at 2:49 PM, James Taylor <jamestaylor@apache.org> wrote:

> You need to bounce the cluster *before* major compaction or the region
> server will continue to use the old guideposts setting during compaction.
>
> On Mon, Dec 7, 2015 at 2:45 PM, Matt Kowalczyk <mattk@cloudability.com>
> wrote:
>
>> bounced, just after major compaction, with the setting as indicated
>> above. I'm unable to disable the stats table.
>>
>> select count(*) from system.stats where physical_name = 'XXXXX';
>> +------------------------------------------+
>> |                 COUNT(1)                 |
>> +------------------------------------------+
>> | 653                                      |
>> +------------------------------------------+
>> 1 row selected (0.036 seconds)
>>
>>
>> On Mon, Dec 7, 2015 at 2:41 PM, James Taylor <jamestaylor@apache.org>
>> wrote:
>>
>>> Yes, setting that property is another way to disable stats. You'll need
>>> to bounce your cluster after setting either of these, and stats won't be
>>> updated until a major compaction occurs.
>>>
>>>
>>> On Monday, December 7, 2015, Matt Kowalczyk <mattk@cloudability.com>
>>> wrote:
>>>
>>>> I've set phoenix.stats.guidepost.per.region to 1 and continue to see
>>>> entries added to the system.stats table. I believe this should have the
>>>> same effect? I'll try setting the guidepost width, though.
>>>>
>>>>
>>>> On Mon, Dec 7, 2015 at 12:11 PM, James Taylor <jamestaylor@apache.org>
>>>> wrote:
>>>>
>>>>> You can disable stats by setting the phoenix.stats.guidepost.width
>>>>> config parameter to a larger value in the server-side hbase-site.xml.
>>>>> The default is 104857600 (or 100MB). If you set it to your MAX_FILESIZE
>>>>> (the size you allow a region to grow to before it splits - default
>>>>> 20GB), then you're essentially disabling it. You could also try
>>>>> increasing it to somewhere in between, maybe 5 or 10GB.
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
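The effect of phoenix.stats.guidepost.width can be estimated with simple division. Below is a minimal Java sketch, assuming roughly one guidepost is recorded per guidepost-width bytes of region data (a simplification). The class and method names are hypothetical and not part of Phoenix; the 20GB region size is the figure quoted in the message above.

```java
final class GuidepostEstimate {
    // Default phoenix.stats.guidepost.width quoted in the thread, in bytes.
    static final long DEFAULT_WIDTH_BYTES = 104_857_600L;

    /**
     * Rough estimate of guideposts in one region, ASSUMING one guidepost
     * per widthBytes of data. Real counts depend on the stored data.
     */
    static long guidepostsPerRegion(long regionBytes, long widthBytes) {
        if (widthBytes <= 0) {
            throw new IllegalArgumentException("width must be positive");
        }
        return regionBytes / widthBytes;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long gib = 1024L * mib;
        long regionBytes = 20L * gib; // MAX_FILESIZE figure from the message

        // 104857600 bytes is 100 MiB.
        System.out.println(DEFAULT_WIDTH_BYTES / mib);
        // Default width: about 204 guideposts for a full 20 GiB region.
        System.out.println(guidepostsPerRegion(regionBytes, DEFAULT_WIDTH_BYTES));
        // Width of 5 GiB: about 4 guideposts per full region.
        System.out.println(guidepostsPerRegion(regionBytes, 5L * gib));
        // Width equal to the region size: at most 1, which is why this
        // setting effectively disables stats.
        System.out.println(guidepostsPerRegion(regionBytes, regionBytes));
    }
}
```

Under that assumption, raising the width from the default to 5GB would cut the per-region guidepost count by roughly a factor of 50.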
>>>>> On Mon, Dec 7, 2015 at 10:25 AM, Matt Kowalczyk <mattk@cloudability.com>
>>>>> wrote:
>>>>>
>>>>>> We're also encountering slowdowns after bulk MR inserts. I've only
>>>>>> measured slowdowns in the query path (since our bulk insert workloads
>>>>>> vary in size, it hasn't been clear that we see slowdowns here, but
>>>>>> I'll now measure this as well). My reported issue was titled
>>>>>> "stats table causing slow queries".
>>>>>>
>>>>>> The stats table seems to be rebuilt during compactions, and I have to
>>>>>> actively purge the table to regain sane query times. It would be sweet
>>>>>> if the stats feature could be disabled.
>>>>>>
>>>>>> On Mon, Dec 7, 2015 at 9:53 AM, Thangamani, Arun <thangar@cobalt.com>
>>>>>> wrote:
>>>>>>
>>>>>>> This is on hbase-1.1.1.2.3.0.0-2557 if that would make any
>>>>>>> difference in analysis. Thanks
>>>>>>>
>>>>>>> From: Arun Thangamani <thangar@cobalt.com>
>>>>>>> Date: Monday, December 7, 2015 at 12:13 AM
>>>>>>> To: "user@phoenix.apache.org" <user@phoenix.apache.org>
>>>>>>> Subject: system.catalog and system.stats entries slows down bulk MR
>>>>>>> inserts by 20-25X (Phoenix 4.4)
>>>>>>>
>>>>>>> Hello, I noticed an issue with bulk insert through MapReduce in
>>>>>>> Phoenix 4.4.0.2.3.0.0-2557, using the outline of the code below.
>>>>>>>
>>>>>>> Normally the inserts of about 25 million rows complete in about 5
>>>>>>> mins; there are 5 region servers and the Phoenix table has 32 buckets.
>>>>>>> But sometimes (maybe after major compactions or region movement?),
>>>>>>> writes simply slow down to 90 mins. When I truncate the SYSTEM.STATS
>>>>>>> HBase table, the inserts get a little faster (60 mins), but when I
>>>>>>> truncate both the SYSTEM.CATALOG and SYSTEM.STATS tables and recreate
>>>>>>> the Phoenix table def(s), the inserts go back to 5 mins. The workaround
>>>>>>> of truncating SYSTEM tables is not sustainable for long. Can someone
>>>>>>> help and let me know if there is a patch available for this? Thanks in
>>>>>>> advance.
>>>>>>>
>>>>>>> Job job = Job.getInstance(conf, NAME);
>>>>>>> // Set the target Phoenix table and the columns
>>>>>>> PhoenixMapReduceUtil.setOutput(job, tableName,
>>>>>>>         "WEB_ID,WEB_PAGE_LABEL,DEVICE_TYPE," +
>>>>>>>         "WIDGET_INSTANCE_ID,WIDGET_TYPE,WIDGET_VERSION,WIDGET_CONTEXT," +
>>>>>>>         "TOTAL_CLICKS,TOTAL_CLICK_VIEWS,TOTAL_HOVER_TIME_MS,TOTAL_TIME_ON_PAGE_MS,TOTAL_VIEWABLE_TIME_MS," +
>>>>>>>         "VIEW_COUNT,USER_SEGMENT,DIM_DATE_KEY,VIEW_DATE,VIEW_DATE_TIMESTAMP,ROW_NUMBER");
>>>>>>> FileInputFormat.setInputPaths(job, inputPath);
>>>>>>> job.setMapperClass(WidgetPhoenixMapper.class);
>>>>>>> job.setMapOutputKeyClass(NullWritable.class);
>>>>>>> job.setMapOutputValueClass(WidgetPagesStatsWritable.class);
>>>>>>> job.setOutputFormatClass(PhoenixOutputFormat.class);
>>>>>>> TableMapReduceUtil.addDependencyJars(job);
>>>>>>> job.setNumReduceTasks(0);
>>>>>>> job.waitForCompletion(true);
>>>>>>>
>>>>>>> public static class WidgetPhoenixMapper
>>>>>>>         extends Mapper<LongWritable, Text, NullWritable, WidgetPagesStatsWritable> {
>>>>>>>     @Override
>>>>>>>     public void map(LongWritable longWritable, Text text, Context context)
>>>>>>>             throws IOException, InterruptedException {
>>>>>>>         Configuration conf = context.getConfiguration();
>>>>>>>         String rundateString = conf.get("rundate");
>>>>>>>         PagesSegmentWidgetLineParser parser = new PagesSegmentWidgetLineParser();
>>>>>>>         try {
>>>>>>>             PagesSegmentWidget pagesSegmentWidget = parser.parse(text.toString());
>>>>>>>
>>>>>>>             if (pagesSegmentWidget != null) {
>>>>>>>                 WidgetPagesStatsWritable widgetPagesStatsWritable = new WidgetPagesStatsWritable();
>>>>>>>                 WidgetPagesStats widgetPagesStats = new WidgetPagesStats();
>>>>>>>
>>>>>>>                 widgetPagesStats.setWebId(pagesSegmentWidget.getWebId());
>>>>>>>                 widgetPagesStats.setWebPageLabel(pagesSegmentWidget.getWebPageLabel());
>>>>>>>                 widgetPagesStats.setWidgetInstanceId(pagesSegmentWidget.getWidgetInstanceId());
>>>>>>>                 …..
>>>>>>>
>>>>>>>                 widgetPagesStatsWritable.setWidgetPagesStats(widgetPagesStats);
>>>>>>>                 context.write(NullWritable.get(), widgetPagesStatsWritable);
>>>>>>>             }
>>>>>>>         } catch (Exception e) {
>>>>>>>             e.printStackTrace();
>>>>>>>         }
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>> public final class WidgetPagesStats {
>>>>>>>     private String webId;
>>>>>>>     private String webPageLabel;
>>>>>>>     private long widgetInstanceId;
>>>>>>>     private String widgetType;
>>>>>>>
>>>>>>>         …
>>>>>>>     @Override
>>>>>>>     public boolean equals(Object o) {
>>>>>>>
>>>>>>>         ..
>>>>>>>     }
>>>>>>>     @Override
>>>>>>>     public int hashCode() {
>>>>>>>
>>>>>>>         ..
>>>>>>>     }
>>>>>>>     @Override
>>>>>>>     public String toString() {
>>>>>>>         return "WidgetPhoenix{"….
>>>>>>>                 '}';
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>> public class WidgetPagesStatsWritable implements DBWritable, Writable {
>>>>>>>
>>>>>>>     private WidgetPagesStats widgetPagesStats;
>>>>>>>
>>>>>>>     public void readFields(DataInput input) throws IOException {
>>>>>>>         widgetPagesStats.setWebId(input.readLine());
>>>>>>>         widgetPagesStats.setWebPageLabel(input.readLine());
>>>>>>>         widgetPagesStats.setWidgetInstanceId(input.readLong());
>>>>>>>         widgetPagesStats.setWidgetType(input.readLine());
>>>>>>>
>>>>>>>         …
>>>>>>>     }
>>>>>>>
>>>>>>>     public void write(DataOutput output) throws IOException {
>>>>>>>         output.writeBytes(widgetPagesStats.getWebId());
>>>>>>>         output.writeBytes(widgetPagesStats.getWebPageLabel());
>>>>>>>
>>>>>>>         output.writeLong(widgetPagesStats.getWidgetInstanceId());
>>>>>>>         output.writeBytes(widgetPagesStats.getWidgetType());
>>>>>>>
>>>>>>>         ..
>>>>>>>     }
>>>>>>>
>>>>>>>     public void readFields(ResultSet rs) throws SQLException {
>>>>>>>         widgetPagesStats.setWebId(rs.getString("WEB_ID"));
>>>>>>>
>>>>>>>         widgetPagesStats.setWebPageLabel(rs.getString("WEB_PAGE_LABEL"));
>>>>>>>
>>>>>>>         widgetPagesStats.setWidgetInstanceId(rs.getLong("WIDGET_INSTANCE_ID"));
>>>>>>>         widgetPagesStats.setWidgetType(rs.getString("WIDGET_TYPE"));
>>>>>>>
>>>>>>>         …
>>>>>>>     }
>>>>>>>
>>>>>>>     public void write(PreparedStatement pstmt) throws SQLException {
>>>>>>>         Connection connection = pstmt.getConnection();
>>>>>>>         PhoenixConnection phoenixConnection = (PhoenixConnection) connection;
>>>>>>>         //connection.getClientInfo().setProperty("scn", Long.toString(widgetPhoenix.getViewDateTimestamp()));
>>>>>>>
>>>>>>>         pstmt.setString(1, widgetPagesStats.getWebId());
>>>>>>>         pstmt.setString(2, widgetPagesStats.getWebPageLabel());
>>>>>>>         pstmt.setString(3, widgetPagesStats.getDeviceType());
>>>>>>>
>>>>>>>         pstmt.setLong(4, widgetPagesStats.getWidgetInstanceId());
>>>>>>>
>>>>>>>         …
>>>>>>>     }
>>>>>>>
>>>>>>>     public WidgetPagesStats getWidgetPagesStats() {
>>>>>>>         return widgetPagesStats;
>>>>>>>     }
>>>>>>>
>>>>>>>     public void setWidgetPagesStats(WidgetPagesStats widgetPagesStats) {
>>>>>>>         this.widgetPagesStats = widgetPagesStats;
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>
>
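A side note on the Writable methods in the quoted job code, separate from the stats question. DataOutput.writeBytes writes the characters of a string with no length prefix or terminator, so DataInput.readLine on the reading side has no field boundary to stop at. In a map-only job that writes through PhoenixOutputFormat this path may never be exercised, which could be why it went unnoticed. A minimal sketch of a length-prefixed round trip using only the JDK's writeUTF/readUTF follows; the class name, method names, and sample values are hypothetical, and it covers only three of the fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class Utf8RoundTrip {
    /** Serializes two strings and a long; writeUTF stores each string's length first. */
    static byte[] write(String webId, String webPageLabel, long widgetInstanceId) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(webId);
            out.writeUTF(webPageLabel);
            out.writeLong(widgetInstanceId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the fields back in the same order they were written. */
    static Object[] read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String webId = in.readUTF();
            String webPageLabel = in.readUTF();
            long widgetInstanceId = in.readLong();
            return new Object[] {webId, webPageLabel, widgetInstanceId};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Object[] fields = read(write("web-1", "home", 42L));
        // prints: web-1 | home | 42
        System.out.println(fields[0] + " | " + fields[1] + " | " + fields[2]);
    }
}
```

Note that writeUTF is limited to strings whose encoded form fits in 65535 bytes, so very long values would need a different encoding.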
