Posted to dev@phoenix.apache.org by "Thangamani, Arun" <Ar...@cdk.com> on 2016/01/16 08:14:22 UTC

Re: system.catalog and system.stats entries slows down bulk MR inserts by 20-25X (Phoenix 4.4)

James/Nick D,

I just created https://issues.apache.org/jira/browse/PHOENIX-2607 with the details of the performance issue that I noticed with PhoenixMapReduceUtil.
In short, it looks like when the connection used by MapReduceUtil has an earlier timestamp than the latest data timestamp of the table after stats collection, Phoenix repeatedly looks up/rebuilds the table definition, slowing us down tremendously.

1) Also, in Phoenix 4.4, setting phoenix.stats.guidepost.per.region => 1 doesn't stop guideposts from being gathered when running MapReduceUtil.
It does stop them for plain upserts, so I do not have the same problem with multi-threaded upserts; I have detailed that in the JIRA as well.
2) Without setting phoenix.stats.useCurrentTime => false, any form of timestamped MR upserts slows down after stats collection.

I will help in any way I can to move this issue forward. I am a little new to the source code, so I am working through the details of table definitions and their caching process to see if I can spot a workaround or suggest a patch.

Thanks
Arun


From: James Taylor <ja...@apache.org>
Reply-To: "user@phoenix.apache.org" <us...@phoenix.apache.org>
Date: Wednesday, December 30, 2015 at 11:05 PM
To: user <us...@phoenix.apache.org>
Subject: Re: system.catalog and system.stats entries slows down bulk MR inserts by 20-25X (Phoenix 4.4)

Hi Arun,
Yes, please file a JIRA with sample data and a program that'll demonstrate this issue. The smaller the data and simpler you can make it, the better. A unit test is ideal. Have you profiled it? With a 25x slowdown, there's bound to be something glaring.

A few questions and some comments for you:
1) Is PhoenixMapReduceUtil always the way you insert data?
2) Do your queries execute more slowly without stats being in place and what kinds of queries are you doing? You may want to leave the default settings for stats if queries become slower without them, as it doesn't sound like this is the root cause.

For the config parameters you're setting, if you want to stick with disabling stats, just set this one to 1 (to be honest, I'm not sure what happens when it's set to 0):
    phoenix.stats.guidepost.per.region 1

If you set the above parameter, then phoenix.stats.guidepost.width won't be used, so there's no need to set any of these:
    phoenix.stats.useCurrentTime
    phoenix.stats.updateFrequency
    phoenix.stats.guidepost.width
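
For reference, the recommended setting (phoenix.stats.guidepost.per.region = 1) would look like this in the server-side hbase-site.xml — a minimal sketch; the <property> wrapper is standard HBase configuration boilerplate, and the region servers need a bounce afterward, as noted downthread:

<property>
    <name>phoenix.stats.guidepost.per.region</name>
    <value>1</value>
</property>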

Thanks,
James


On Wed, Dec 30, 2015 at 7:53 AM, Thangamani, Arun <th...@cobalt.com> wrote:
Hey James, thanks for your previous replies. I am relatively new to the Phoenix codebase, so I was trying to understand my original problem a little better.
I can now be a bit more specific; a summary of the problem follows:

  1.  I use PhoenixMapReduceUtil to insert data – the code that I use is listed below.
  2.  While inserting, I also set the current timestamp (SCN) of the connection to a specific date; the idea is to set a 60-day time-to-live on the HBase table and expire data that is more than 60 days old.
  3.  As long as I load data with incrementally increasing timestamps, I don't have an issue.
     *   That is, loading timestamp 100, then timestamp 110, then timestamp 120, etc., is fine (actual timestamps 1434870000000, 1435042800000, 1435215600000, 1435302000000).
     *   The SCN is set on the job configuration as follows:

// Pin the MR job's Phoenix connection to a specific timestamp (SCN)
conf.set(PhoenixRuntime.CURRENT_SCN_ATTRIB, "" + timestamp);
conf.set(PhoenixConfigurationUtil.CURRENT_SCN_VALUE, "" + timestamp);

  4.  But if I go back in time even once – say, after timestamp=120 I insert timestamp=115 – inserts slow down to 25x below the original speed, and I am not able to recover (unless I delete the stats table and recreate the table definition).
  5.  Executing a DELETE on SYSTEM.STATS brings back some speed, but inserts are still 20x slower without recreating the table definition (see the sketch just after this list).
  6.  I looked through the code and set phoenix.stats.useCurrentTime to false; this is supposed to make stats ignore the current timestamp and assume –1 as the timestamp, but that didn't help either.
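
For reference, the partial workaround from item 5 is a plain DELETE against the stats table (a sketch — 'MY_TABLE' is a placeholder for the physical table name):

DELETE FROM SYSTEM.STATS WHERE PHYSICAL_NAME = 'MY_TABLE';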

Overall, we are using Phoenix as a caching layer that holds data for 60 days for some analytics, and the querying side works well for us.
But sometimes the data doesn't arrive in time and we may have to insert it with an older timestamp, so it would be great if you could take a look at this issue.
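
A minimal sketch of how such a connection pinned to an older SCN can be opened outside of MapReduce (the class name, the JDBC URL "jdbc:phoenix:zk-host", and the timestamp value are placeholders; CURRENT_SCN_ATTRIB is the same attribute used in the job configuration above):

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;
import org.apache.phoenix.util.PhoenixRuntime;

public class LateDataUpsertSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Pin all upserts on this connection to an older HBase cell timestamp (SCN)
        props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(1435042800000L));
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host", props)) {
            // UPSERT statements executed here are written at the pinned timestamp
        }
    }
}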

My current settings on the region servers (I tried to avoid SYSTEM.STATS altogether with these settings, but couldn't):
phoenix.stats.useCurrentTime false
phoenix.stats.updateFrequency 315360000000
phoenix.stats.guidepost.width 536870912000
phoenix.stats.guidepost.per.region 0

I can upload my code and sample data set and the table structure to a JIRA if that would help.

Thanks in advance
Arun



From: James Taylor <ja...@apache.org>
Reply-To: "user@phoenix.apache.org" <us...@phoenix.apache.org>
Date: Monday, December 7, 2015 at 3:54 PM
To: user <us...@phoenix.apache.org>
Subject: Re: system.catalog and system.stats entries slows down bulk MR inserts by 20-25X (Phoenix 4.4)

I can't explain that. It seems like the new guidepost width is not taking effect. Maybe there's some "special" means of updating a region server config property in HDP? When you update this parameter, do you see fewer guideposts created after a major compaction occurs?



On Mon, Dec 7, 2015 at 3:46 PM, Thangamani, Arun <th...@cobalt.com> wrote:
I bounced the region servers with phoenix.stats.guidepost.width = 10737418240 (which is the max file size set from Ambari).

Like Matt, I am seeing entries created in the SYSTEM.STATS table as well. Any other suggestions James?

From: Matt Kowalczyk <ma...@cloudability.com>
Reply-To: "user@phoenix.apache.org" <us...@phoenix.apache.org>
Date: Monday, December 7, 2015 at 2:52 PM
To: "user@phoenix.apache.org" <us...@phoenix.apache.org>
Subject: Re: system.catalog and system.stats entries slows down bulk MR inserts by 20-25X (Phoenix 4.4)

Sorry, I communicated that poorly in the previous e-mail. I meant to provide the list of things I did: I bounced, then performed a major compaction, and then ran the select count(*) query.

On Mon, Dec 7, 2015 at 2:49 PM, James Taylor <ja...@apache.org> wrote:
You need to bounce the cluster *before* major compaction or the region server will continue to use the old guideposts setting during compaction.

On Mon, Dec 7, 2015 at 2:45 PM, Matt Kowalczyk <ma...@cloudability.com> wrote:
Bounced just after major compaction, with the setting as indicated above. I'm still unable to disable the stats table.

select count(*) from system.stats where physical_name = 'XXXXX';
+------------------------------------------+
|                 COUNT(1)                 |
+------------------------------------------+
| 653                                      |
+------------------------------------------+
1 row selected (0.036 seconds)


On Mon, Dec 7, 2015 at 2:41 PM, James Taylor <ja...@apache.org> wrote:
Yes, setting that property is another way to disable stats. You'll need to bounce your cluster after setting either of these, and stats won't be updated until a major compaction occurs.
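
After the bounce, a major compaction can also be triggered manually from the HBase shell (a sketch; 'MY_TABLE' is a placeholder for the table name):

hbase(main):001:0> major_compact 'MY_TABLE'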


On Monday, December 7, 2015, Matt Kowalczyk <ma...@cloudability.com> wrote:
I've set phoenix.stats.guidepost.per.region to 1 and continue to see entries added to the SYSTEM.STATS table. I believe this should have the same effect? I'll try setting the guidepost width, though.


On Mon, Dec 7, 2015 at 12:11 PM, James Taylor <ja...@apache.org> wrote:
You can disable stats by setting the phoenix.stats.guidepost.width config parameter to a larger value in the server-side hbase-site.xml. The default is 104857600 (100 MB). If you set it to your MAX_FILESIZE (the size you allow a region to grow to before it splits – 10 GB by default), then you're essentially disabling stats. You could also try something in between, maybe 5 or 10 GB.

Thanks,
James

On Mon, Dec 7, 2015 at 10:25 AM, Matt Kowalczyk <ma...@cloudability.com> wrote:
We're also encountering slowdowns after bulk MR inserts. I've only measured slowdowns in the query path (since our bulk insert workloads vary in size, it hasn't been clear that we see slowdowns there, but I'll now measure this as well). The subject of my reported issue was "stats table causing slow queries".

The stats table seems to be rebuilt during compactions, and I have to actively purge it to regain sane query times. It would be sweet if the stats feature could be disabled.

On Mon, Dec 7, 2015 at 9:53 AM, Thangamani, Arun <th...@cobalt.com> wrote:
This is on hbase-1.1.1.2.3.0.0-2557, if that makes any difference for the analysis. Thanks.

From: Arun Thangamani <th...@cobalt.com>
Date: Monday, December 7, 2015 at 12:13 AM
To: "user@phoenix.apache.org" <us...@phoenix.apache.org>
Subject: system.catalog and system.stats entries slows down bulk MR inserts by 20-25X (Phoenix 4.4)

Hello, I noticed an issue with bulk inserts through MapReduce in Phoenix 4.4.0.2.3.0.0-2557, using the outline of the code below.

Normally, inserts of about 25 million rows complete in about 5 minutes; there are 5 region servers, and the Phoenix table has 32 buckets.
But sometimes (maybe after major compactions or region movement?) writes slow down to 90 minutes. When I truncate the SYSTEM.STATS HBase table, the inserts get a little faster (60 minutes), but when I truncate both the SYSTEM.CATALOG and SYSTEM.STATS tables and recreate the Phoenix table definition(s), the inserts go back to 5 minutes. The workaround of truncating the SYSTEM tables is not sustainable in the long run; can someone help and let me know if there is a patch available for this? Thanks in advance.
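
For completeness, the truncation workaround described above amounts to the following in the HBase shell (a sketch; note that truncating SYSTEM.CATALOG additionally requires recreating the Phoenix table definitions, as mentioned):

hbase(main):001:0> truncate 'SYSTEM.STATS'
hbase(main):002:0> truncate 'SYSTEM.CATALOG'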

Job job = Job.getInstance(conf, NAME);
// Set the target Phoenix table and the columns
PhoenixMapReduceUtil.setOutput(job, tableName, "WEB_ID,WEB_PAGE_LABEL,DEVICE_TYPE," +

        "WIDGET_INSTANCE_ID,WIDGET_TYPE,WIDGET_VERSION,WIDGET_CONTEXT," +
        "TOTAL_CLICKS,TOTAL_CLICK_VIEWS,TOTAL_HOVER_TIME_MS,TOTAL_TIME_ON_PAGE_MS,TOTAL_VIEWABLE_TIME_MS," +
        "VIEW_COUNT,USER_SEGMENT,DIM_DATE_KEY,VIEW_DATE,VIEW_DATE_TIMESTAMP,ROW_NUMBER");
FileInputFormat.setInputPaths(job, inputPath);
job.setMapperClass(WidgetPhoenixMapper.class);
job.setMapOutputKeyClass(NullWritable.class);
job.setMapOutputValueClass(WidgetPagesStatsWritable.class);
job.setOutputFormatClass(PhoenixOutputFormat.class);
TableMapReduceUtil.addDependencyJars(job);
job.setNumReduceTasks(0);
job.waitForCompletion(true);

public static class WidgetPhoenixMapper extends Mapper<LongWritable, Text, NullWritable, WidgetPagesStatsWritable> {
    @Override
    public void map(LongWritable longWritable, Text text, Context context) throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        String rundateString = conf.get("rundate");
        PagesSegmentWidgetLineParser parser = new PagesSegmentWidgetLineParser();
        try {
            PagesSegmentWidget pagesSegmentWidget = parser.parse(text.toString());

            if (pagesSegmentWidget != null) {
                WidgetPagesStatsWritable widgetPagesStatsWritable = new WidgetPagesStatsWritable();
                WidgetPagesStats widgetPagesStats = new WidgetPagesStats();

                widgetPagesStats.setWebId(pagesSegmentWidget.getWebId());
                widgetPagesStats.setWebPageLabel(pagesSegmentWidget.getWebPageLabel());
                widgetPagesStats.setWidgetInstanceId(pagesSegmentWidget.getWidgetInstanceId());
                …..

                widgetPagesStatsWritable.setWidgetPagesStats(widgetPagesStats);
                context.write(NullWritable.get(), widgetPagesStatsWritable);
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

public final class WidgetPagesStats {
    private String webId;
    private String webPageLabel;
    private long widgetInstanceId;
    private String widgetType;

        …
    @Override
    public boolean equals(Object o) {

        ..
    }
    @Override
    public int hashCode() {

        ..
    }
    @Override
    public String toString() {
        return "WidgetPhoenix{“….
                '}';
    }
}

public class WidgetPagesStatsWritable implements DBWritable, Writable {

    private WidgetPagesStats widgetPagesStats;

    public void readFields(DataInput input) throws IOException {
        // Guard against NPE: the wrapped bean is not created by the no-arg constructor
        if (widgetPagesStats == null) {
            widgetPagesStats = new WidgetPagesStats();
        }
        // readUTF pairs with the writeUTF calls in write(DataOutput) below;
        // DataInput.readLine() would not round-trip with writeBytes()
        widgetPagesStats.setWebId(input.readUTF());
        widgetPagesStats.setWebPageLabel(input.readUTF());
        widgetPagesStats.setWidgetInstanceId(input.readLong());
        widgetPagesStats.setWidgetType(input.readUTF());

        …
    }

    public void write(DataOutput output) throws IOException {
        // writeUTF length-prefixes each string so it can be read back with readUTF
        output.writeUTF(widgetPagesStats.getWebId());
        output.writeUTF(widgetPagesStats.getWebPageLabel());

        output.writeLong(widgetPagesStats.getWidgetInstanceId());
        output.writeUTF(widgetPagesStats.getWidgetType());

        ..
    }

    public void readFields(ResultSet rs) throws SQLException {
        widgetPagesStats.setWebId(rs.getString("WEB_ID"));
        widgetPagesStats.setWebPageLabel(rs.getString("WEB_PAGE_LABEL"));
        widgetPagesStats.setWidgetInstanceId(rs.getLong("WIDGET_INSTANCE_ID"));
        widgetPagesStats.setWidgetType(rs.getString("WIDGET_TYPE"));

        …
    }

    public void write(PreparedStatement pstmt) throws SQLException {
        Connection connection = pstmt.getConnection();
        PhoenixConnection phoenixConnection = (PhoenixConnection) connection; // currently unused
        //connection.getClientInfo().setProperty("scn", Long.toString(widgetPhoenix.getViewDateTimestamp()));

        pstmt.setString(1, widgetPagesStats.getWebId());
        pstmt.setString(2, widgetPagesStats.getWebPageLabel());
        pstmt.setString(3, widgetPagesStats.getDeviceType());

        pstmt.setLong(4, widgetPagesStats.getWidgetInstanceId());

        …
    }

    public WidgetPagesStats getWidgetPagesStats() {
        return widgetPagesStats;
    }

    public void setWidgetPagesStats(WidgetPagesStats widgetPagesStats) {
        this.widgetPagesStats = widgetPagesStats;
    }
}

