phoenix-user mailing list archives

From Matt Kowalczyk <ma...@cloudability.com>
Subject Re: system.catalog and system.stats entries slows down bulk MR inserts by 20-25X (Phoenix 4.4)
Date Mon, 07 Dec 2015 22:31:57 GMT
I've set phoenix.stats.guidepost.per.region to 1 and continue to see
entries added to the SYSTEM.STATS table. Shouldn't this have the same
effect? I'll try setting the guidepost width as well.


On Mon, Dec 7, 2015 at 12:11 PM, James Taylor <jamestaylor@apache.org>
wrote:

> You can disable stats by setting the phoenix.stats.guidepost.width
> config parameter to a larger value in the server-side hbase-site.xml. The
> default is 104857600 bytes (100MB). If you set it to your MAX_FILESIZE (the
> size you allow a region to grow to before it splits - default 20GB), then
> you're essentially disabling it. You could also try increasing it to
> somewhere in between, maybe 5 or 10GB.
>
> Thanks,
> James
>
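The sizes discussed above are easy to get wrong by a factor of ten, so here is a minimal sketch of the byte arithmetic, assuming binary units (1 MB = 1024 * 1024 bytes). The class name GuidepostWidthSketch and its helper are hypothetical; only the property name phoenix.stats.guidepost.width and the figures 104857600, 5GB, 10GB and 20GB come from the thread. It prints candidate hbase-site.xml entries and is not a recommendation for any particular value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: byte values for the guidepost widths mentioned in the thread. */
final class GuidepostWidthSketch {
    static final long MIB = 1024L * 1024L;
    static final long GIB = 1024L * MIB;

    /** Renders one hbase-site.xml property element for the given width in bytes. */
    static String toProperty(long widthBytes) {
        return "<property>\n"
             + "  <name>phoenix.stats.guidepost.width</name>\n"
             + "  <value>" + widthBytes + "</value>\n"
             + "</property>";
    }

    public static void main(String[] args) {
        // Labels are illustrative; check the MAX_FILESIZE actually configured on your cluster.
        Map<String, Long> candidates = new LinkedHashMap<>();
        candidates.put("default quoted in the thread (100 MiB)", 100 * MIB);
        candidates.put("in between (5 GiB)", 5 * GIB);
        candidates.put("in between (10 GiB)", 10 * GIB);
        candidates.put("MAX_FILESIZE figure quoted in the thread (20 GiB)", 20 * GIB);

        for (Map.Entry<String, Long> e : candidates.entrySet()) {
            System.out.println("<!-- " + e.getKey() + " -->");
            System.out.println(toProperty(e.getValue()));
        }
    }
}
```

The first entry works out to 104857600, which matches the default value quoted above and shows that it corresponds to 100 MiB.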
> On Mon, Dec 7, 2015 at 10:25 AM, Matt Kowalczyk <mattk@cloudability.com>
> wrote:
>
>> We're also encountering slowdowns after bulk MR inserts. I've only
>> measured slowdowns in the query path (since our bulk insert workloads
>> vary in size, it hasn't been clear whether we see slowdowns there, but I'll
>> now measure this as well). My reported issue was titled "stats
>> table causing slow queries".
>>
>> The stats table seems to be rebuilt during compactions, and I have to
>> actively purge the table to regain sane query times. Would be sweet if the
>> stats feature could be disabled.
>>
>> On Mon, Dec 7, 2015 at 9:53 AM, Thangamani, Arun <thangar@cobalt.com>
>> wrote:
>>
>>> This is on hbase-1.1.1.2.3.0.0-2557 if that would make any difference in
>>> analysis. Thanks
>>>
>>> From: Arun Thangamani <thangar@cobalt.com>
>>> Date: Monday, December 7, 2015 at 12:13 AM
>>> To: "user@phoenix.apache.org" <user@phoenix.apache.org>
>>> Subject: system.catalog and system.stats entries slows down bulk MR
>>> inserts by 20-25X (Phoenix 4.4)
>>>
>>> Hello, I noticed an issue with bulk insert through MapReduce in Phoenix
>>> 4.4.0.2.3.0.0-2557, using the outline of the code below.
>>>
>>> Normally the inserts of about 25 million rows complete in about 5 mins;
>>> there are 5 region servers and the Phoenix table has 32 buckets.
>>> But sometimes (maybe after major compactions or region movement?)
>>> writes slow down to 90 mins. When I truncate the SYSTEM.STATS HBase
>>> table, the inserts get a little faster (60 mins), but when I truncate both
>>> the SYSTEM.CATALOG & SYSTEM.STATS tables and recreate the Phoenix table
>>> def(s), the inserts go back to 5 mins. The workaround of truncating SYSTEM
>>> tables is not sustainable for long. Can someone help and let me know if
>>> there is a patch available for this? Thanks in advance.
>>>
>>> Job job = Job.getInstance(conf, NAME);
>>> // Set the target Phoenix table and the columns
>>> PhoenixMapReduceUtil.setOutput(job, tableName,
>>>         "WEB_ID,WEB_PAGE_LABEL,DEVICE_TYPE," +
>>>         "WIDGET_INSTANCE_ID,WIDGET_TYPE,WIDGET_VERSION,WIDGET_CONTEXT," +
>>>         "TOTAL_CLICKS,TOTAL_CLICK_VIEWS,TOTAL_HOVER_TIME_MS,TOTAL_TIME_ON_PAGE_MS,TOTAL_VIEWABLE_TIME_MS," +
>>>         "VIEW_COUNT,USER_SEGMENT,DIM_DATE_KEY,VIEW_DATE,VIEW_DATE_TIMESTAMP,ROW_NUMBER");
>>> FileInputFormat.setInputPaths(job, inputPath);
>>> job.setMapperClass(WidgetPhoenixMapper.class);
>>> job.setMapOutputKeyClass(NullWritable.class);
>>> job.setMapOutputValueClass(WidgetPagesStatsWritable.class);
>>> job.setOutputFormatClass(PhoenixOutputFormat.class);
>>> TableMapReduceUtil.addDependencyJars(job);
>>> job.setNumReduceTasks(0);
>>> job.waitForCompletion(true);
>>>
>>> public static class WidgetPhoenixMapper extends Mapper<LongWritable, Text, NullWritable, WidgetPagesStatsWritable> {
>>>     @Override
>>>     public void map(LongWritable longWritable, Text text, Context context) throws IOException, InterruptedException {
>>>         Configuration conf = context.getConfiguration();
>>>         String rundateString = conf.get("rundate");
>>>         PagesSegmentWidgetLineParser parser = new PagesSegmentWidgetLineParser();
>>>         try {
>>>             PagesSegmentWidget pagesSegmentWidget = parser.parse(text.toString());
>>>
>>>             if (pagesSegmentWidget != null) {
>>>                 WidgetPagesStatsWritable widgetPagesStatsWritable = new WidgetPagesStatsWritable();
>>>                 WidgetPagesStats widgetPagesStats = new WidgetPagesStats();
>>>
>>>                 widgetPagesStats.setWebId(pagesSegmentWidget.getWebId());
>>>                 widgetPagesStats.setWebPageLabel(pagesSegmentWidget.getWebPageLabel());
>>>                 widgetPagesStats.setWidgetInstanceId(pagesSegmentWidget.getWidgetInstanceId());
>>>                 …..
>>>
>>>                 widgetPagesStatsWritable.setWidgetPagesStats(widgetPagesStats);
>>>                 context.write(NullWritable.get(), widgetPagesStatsWritable);
>>>             }
>>>         } catch (Exception e) {
>>>             e.printStackTrace();
>>>         }
>>>     }
>>> }
>>>
>>> public final class WidgetPagesStats {
>>>     private String webId;
>>>     private String webPageLabel;
>>>     private long widgetInstanceId;
>>>     private String widgetType;
>>>
>>>         …
>>>     @Override
>>>     public boolean equals(Object o) {
>>>
>>>         ..
>>>     }
>>>     @Override
>>>     public int hashCode() {
>>>
>>>         ..
>>>     }
>>>     @Override
>>>     public String toString() {
>>>         return "WidgetPhoenix{“….
>>>                 '}';
>>>     }
>>> }
>>>
>>> public class WidgetPagesStatsWritable implements DBWritable, Writable {
>>>
>>>     private WidgetPagesStats widgetPagesStats;
>>>
>>>     public void readFields(DataInput input) throws IOException {
>>>         widgetPagesStats.setWebId(input.readLine());
>>>         widgetPagesStats.setWebPageLabel(input.readLine());
>>>         widgetPagesStats.setWidgetInstanceId(input.readLong());
>>>         widgetPagesStats.setWidgetType(input.readLine());
>>>
>>>         …
>>>     }
>>>
>>>     public void write(DataOutput output) throws IOException {
>>>         output.writeBytes(widgetPagesStats.getWebId());
>>>         output.writeBytes(widgetPagesStats.getWebPageLabel());
>>>
>>>         output.writeLong(widgetPagesStats.getWidgetInstanceId());
>>>         output.writeBytes(widgetPagesStats.getWidgetType());
>>>
>>>         ..
>>>     }
>>>
>>>     public void readFields(ResultSet rs) throws SQLException {
>>>         widgetPagesStats.setWebId(rs.getString("WEB_ID"));
>>>         widgetPagesStats.setWebPageLabel(rs.getString("WEB_PAGE_LABEL"));
>>>
>>>         widgetPagesStats.setWidgetInstanceId(rs.getLong("WIDGET_INSTANCE_ID"));
>>>         widgetPagesStats.setWidgetType(rs.getString("WIDGET_TYPE"));
>>>
>>>         …
>>>     }
>>>
>>>     public void write(PreparedStatement pstmt) throws SQLException {
>>>         Connection connection = pstmt.getConnection();
>>>         PhoenixConnection phoenixConnection = (PhoenixConnection) connection;
>>>         //connection.getClientInfo().setProperty("scn", Long.toString(widgetPhoenix.getViewDateTimestamp()));
>>>
>>>         pstmt.setString(1, widgetPagesStats.getWebId());
>>>         pstmt.setString(2, widgetPagesStats.getWebPageLabel());
>>>         pstmt.setString(3, widgetPagesStats.getDeviceType());
>>>
>>>         pstmt.setLong(4, widgetPagesStats.getWidgetInstanceId());
>>>
>>>         …
>>>     }
>>>
>>>     public WidgetPagesStats getWidgetPagesStats() {
>>>         return widgetPagesStats;
>>>     }
>>>
>>>     public void setWidgetPagesStats(WidgetPagesStats widgetPagesStats) {
>>>         this.widgetPagesStats = widgetPagesStats;
>>>     }
>>> }
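A side note on the Writable methods posted above: DataOutput.writeBytes emits one byte per character with no length prefix or terminator, while DataInput.readLine reads up to the next line terminator, so write(DataOutput) and readFields(DataInput) as posted do not mirror each other. With job.setNumReduceTasks(0) that path may never be exercised, so this is probably unrelated to the slowdown. The sketch below shows one symmetric alternative using writeUTF/readUTF. WidgetRecordSketch is a hypothetical stand-in that uses only java.io so it runs without Hadoop on the classpath; it is not the poster's class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for the posted Writable, limited to the four fields shown. */
final class WidgetRecordSketch {
    String webId;
    String webPageLabel;
    long widgetInstanceId;
    String widgetType;

    /** writeUTF stores a length prefix, so each string can be read back exactly. */
    void write(DataOutput out) throws IOException {
        out.writeUTF(webId);
        out.writeUTF(webPageLabel);
        out.writeLong(widgetInstanceId);
        out.writeUTF(widgetType);
    }

    /** Reads the fields in the same order and with the matching calls. */
    void readFields(DataInput in) throws IOException {
        webId = in.readUTF();
        webPageLabel = in.readUTF();
        widgetInstanceId = in.readLong();
        widgetType = in.readUTF();
    }

    /** Serializes to an in-memory buffer and deserializes into a fresh instance. */
    static WidgetRecordSketch roundTrip(WidgetRecordSketch original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                original.write(out);
            }
            WidgetRecordSketch copy = new WidgetRecordSketch();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that writeUTF throws a NullPointerException on null strings, so a real implementation would need a convention for missing values.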
>>>
>>>
>>>
>>
>>
>
