Posted to dev@carbondata.apache.org by foryou2030 <gi...@git.apache.org> on 2016/09/02 10:29:42 UTC

[GitHub] incubator-carbondata pull request #123: [CARBONDATA-204] Clear query statist...

GitHub user foryou2030 opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/123

    [CARBONDATA-204] Clear query statistics map when timeout

    # Why raise this PR?
    I found two query statistics issues:
    1. Query statistics that are never printed are kept in queryStatisticsMap, which can cause an "out of memory" error in a long-running driver.
    2. In some scenarios the driver cannot record "sql_parse_time", so the driver statistics log is never output. We should always output block_allocation_time and block_identification_time.
    # How to solve?
    1. Add a function to check for query statistics timeout; once a query times out, remove its queryId from the map.
    2. Add a check on the size of each queryStatisticsMap entry: if the query statistics contain only block_allocation_time and block_identification_time, then output them.
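
The timeout check in step 1 could look roughly like the sketch below. It is not the code from this pull request: the class name, the `TIMEOUT_NANOS` value and the use of `List<String>` in place of `List<QueryStatistic>` are assumptions made for illustration. It also assumes, as the diff in this thread suggests, that each queryId is the `System.nanoTime()` reading taken when the query started. `removeIf` requires Java 8 or later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the CarbonData implementation.
final class StatisticsTimeoutSketch {
  // Assumed value; the real constant's value is not shown in this thread.
  static final long TIMEOUT_NANOS = 60L * 1_000_000_000L;

  // Removes every entry whose queryId is unknown or older than the timeout.
  // Returns the number of entries removed.
  static int evictExpired(Map<String, List<String>> statsByQueryId, long nowNanos) {
    int before = statsByQueryId.size();
    statsByQueryId.entrySet().removeIf(e -> isExpired(e.getKey(), nowNanos));
    return before - statsByQueryId.size();
  }

  // Assumption: the queryId is a System.nanoTime() reading stored as a string.
  static boolean isExpired(String queryId, long nowNanos) {
    if (queryId == null || queryId.isEmpty()) {
      return true; // unknown query: nothing will ever print it
    }
    try {
      return nowNanos - Long.parseLong(queryId) > TIMEOUT_NANOS;
    } catch (NumberFormatException ex) {
      return true; // an unparseable id is treated as stale in this sketch
    }
  }
}
```

Passing the current time in as a parameter, rather than calling `System.nanoTime()` inside the method, keeps the check easy to exercise with fixed values.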

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/foryou2030/incubator-carbondata fix_stat

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/123.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #123
    
----
commit e10868a5154ccc15196b23428db09005c3affc85
Author: foryou2030 <fo...@126.com>
Date:   2016-09-02T10:22:03Z

    clear query statistics map

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #123: [CARBONDATA-204] Clear queryStatisti...

Posted by foryou2030 <gi...@git.apache.org>.
Github user foryou2030 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/123#discussion_r77511283
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/querystatistics/DriverQueryStatisticsRecorder.java ---
    @@ -78,106 +83,148 @@ public synchronized void recordStatisticsForDriver(QueryStatistic statistic, Str
        */
       public void logStatisticsAsTableDriver() {
         synchronized (lock) {
    -      String tableInfo = collectDriverStatistics();
    -      if (null != tableInfo) {
    -        LOGGER.statistic(tableInfo);
    +      Iterator<Map.Entry<String, List<QueryStatistic>>> entries =
    +              queryStatisticsMap.entrySet().iterator();
    +      while (entries.hasNext()) {
    +        Map.Entry<String, List<QueryStatistic>> entry = entries.next();
    +        String queryId = entry.getKey();
    +        // clear the unknown query statistics
    +        if(StringUtils.isEmpty(queryId)) {
    +          queryStatisticsMap.remove(queryId);
    --- End diff --
    
    ok, handled



[GitHub] incubator-carbondata pull request #123: [CARBONDATA-204] Clear queryStatisti...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-carbondata/pull/123



[GitHub] incubator-carbondata pull request #123: [CARBONDATA-204] Clear queryStatisti...

Posted by foryou2030 <gi...@git.apache.org>.
Github user foryou2030 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/123#discussion_r77506707
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/querystatistics/DriverQueryStatisticsRecorder.java ---
    @@ -78,106 +83,148 @@ public synchronized void recordStatisticsForDriver(QueryStatistic statistic, Str
        */
       public void logStatisticsAsTableDriver() {
         synchronized (lock) {
    -      String tableInfo = collectDriverStatistics();
    -      if (null != tableInfo) {
    -        LOGGER.statistic(tableInfo);
    +      Iterator<Map.Entry<String, List<QueryStatistic>>> entries =
    +              queryStatisticsMap.entrySet().iterator();
    +      while (entries.hasNext()) {
    +        Map.Entry<String, List<QueryStatistic>> entry = entries.next();
    +        String queryId = entry.getKey();
    +        // clear the unknown query statistics
    +        if(StringUtils.isEmpty(queryId)) {
    +          queryStatisticsMap.remove(queryId);
    --- End diff --
    
    I tried this, but it caused some exceptions.



[GitHub] incubator-carbondata pull request #123: [CARBONDATA-204] Clear queryStatisti...

Posted by foryou2030 <gi...@git.apache.org>.
Github user foryou2030 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/123#discussion_r77488614
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/querystatistics/DriverQueryStatisticsRecorder.java ---
    @@ -78,106 +82,142 @@ public synchronized void recordStatisticsForDriver(QueryStatistic statistic, Str
        */
       public void logStatisticsAsTableDriver() {
         synchronized (lock) {
    -      String tableInfo = collectDriverStatistics();
    -      if (null != tableInfo) {
    -        LOGGER.statistic(tableInfo);
    +      for (String key: queryStatisticsMap.keySet()) {
    +        // print sql_parse_t,load_meta_t,block_allocation_t,block_identification_t
    +        // or just print block_allocation_t,block_identification_t
    +        if (queryStatisticsMap.get(key).size() >= 2) {
    +          String tableInfo = collectDriverStatistics(key);
    +          if (null != tableInfo) {
    +            LOGGER.statistic(tableInfo);
    +          }
    +        }
    +        // clear timeout query statistics
    +        if(StringUtils.isEmpty(key)) {
    +          queryStatisticsMap.remove(key);
    +        } else {
    +          long interval = System.nanoTime() - Long.parseLong(key);
    +          if (interval > QueryStatisticsConstants.CLEAR_STATISTICS_TIMEOUT) {
    +            queryStatisticsMap.remove(key);
    +          }
    +        }
           }
         }
       }
     
       /**
        * Below method will parse queryStatisticsMap and put time into table
        */
    -  public String collectDriverStatistics() {
    -    for (String key: queryStatisticsMap.keySet()) {
    -      try {
    -        // TODO: get the finished query, and print Statistics
    -        if (queryStatisticsMap.get(key).size() > 3) {
    -          String sql_parse_time = "";
    -          String load_meta_time = "";
    -          String block_allocation_time = "";
    -          String block_identification_time = "";
    -          Double driver_part_time_tmp = 0.0;
    -          String splitChar = " ";
    -          // get statistic time from the QueryStatistic
    -          for (QueryStatistic statistic : queryStatisticsMap.get(key)) {
    -            switch (statistic.getMessage()) {
    -              case QueryStatisticsConstants.SQL_PARSE:
    -                sql_parse_time += statistic.getTimeTaken() + splitChar;
    -                driver_part_time_tmp += statistic.getTimeTaken();
    -                break;
    -              case QueryStatisticsConstants.LOAD_META:
    -                load_meta_time += statistic.getTimeTaken() + splitChar;
    -                driver_part_time_tmp += statistic.getTimeTaken();
    -                break;
    -              case QueryStatisticsConstants.BLOCK_ALLOCATION:
    -                block_allocation_time += statistic.getTimeTaken() + splitChar;
    -                driver_part_time_tmp += statistic.getTimeTaken();
    -                break;
    -              case QueryStatisticsConstants.BLOCK_IDENTIFICATION:
    -                block_identification_time += statistic.getTimeTaken() + splitChar;
    -                driver_part_time_tmp += statistic.getTimeTaken();
    -                break;
    -              default:
    -                break;
    -            }
    -          }
    -          String driver_part_time = driver_part_time_tmp + splitChar;
    -          // structure the query statistics info table
    -          StringBuilder tableInfo = new StringBuilder();
    -          int len1 = 8;
    -          int len2 = 20;
    -          int len3 = 21;
    -          int len4 = 22;
    -          String line = "+" + printLine("-", len1) + "+" + printLine("-", len2) + "+" +
    -              printLine("-", len3) + "+" + printLine("-", len4) + "+";
    -          String line2 = "|" + printLine(" ", len1) + "+" + printLine("-", len2) + "+" +
    -              printLine(" ", len3) + "+" + printLine("-", len4) + "+";
    -          // table header
    -          tableInfo.append(line).append("\n");
    -          tableInfo.append("|" + printLine(" ", (len1 - "Module".length())) + "Module" + "|" +
    -              printLine(" ", (len2 - "Operation Step".length())) + "Operation Step" + "|" +
    -              printLine(" ", (len3 + len4 + 1 - "Query Cost".length())) +
    -              "Query Cost" + "|" + "\n");
    -          // driver part
    -          tableInfo.append(line).append("\n");
    -          tableInfo.append("|" + printLine(" ", len1) + "|" +
    -              printLine(" ", (len2 - "SQL parse".length())) + "SQL parse" + "|" +
    -              printLine(" ", len3) + "|" +
    -              printLine(" ", (len4 - sql_parse_time.length())) + sql_parse_time + "|" + "\n");
    -          tableInfo.append(line2).append("\n");
    -          tableInfo.append("|" +printLine(" ", (len1 - "Driver".length())) + "Driver" + "|" +
    -              printLine(" ", (len2 - "Load meta data".length())) + "Load meta data" + "|" +
    -              printLine(" ", (len3 - driver_part_time.length())) + driver_part_time + "|" +
    -              printLine(" ", (len4 - load_meta_time.length())) +
    -              load_meta_time + "|" + "\n");
    -          tableInfo.append(line2).append("\n");
    -          tableInfo.append("|" +
    -              printLine(" ", (len1 - "Part".length())) + "Part" + "|" +
    -              printLine(" ", (len2 - "Block allocation".length())) +
    -              "Block allocation" + "|" +
    -              printLine(" ", len3) + "|" +
    -              printLine(" ", (len4 - block_allocation_time.length())) +
    -              block_allocation_time + "|" + "\n");
    -          tableInfo.append(line2).append("\n");
    -          tableInfo.append("|" +
    -              printLine(" ", len1) + "|" +
    -              printLine(" ", (len2 - "Block identification".length())) +
    -              "Block identification" + "|" +
    -              printLine(" ", len3) + "|" +
    -              printLine(" ", (len4 - block_identification_time.length())) +
    -              block_identification_time + "|" + "\n");
    -          tableInfo.append(line).append("\n");
    -
    -          // once the statistics be printed, remove it from the map
    -          queryStatisticsMap.remove(key);
    -          // show query statistic as "query id" + "table"
    -          return "Print query statistic for query id: " + key + "\n" + tableInfo.toString();
    +  public String collectDriverStatistics(String key) {
    +    String sql_parse_time = "";
    +    String load_meta_time = "";
    +    String block_allocation_time = "";
    +    String block_identification_time = "";
    +    Double driver_part_time_tmp = 0.0;
    +    Double driver_part_time_tmp2 = 0.0;
    +    String splitChar = " ";
    +    try {
    +      // get statistic time from the QueryStatistic
    +      for (QueryStatistic statistic : queryStatisticsMap.get(key)) {
    --- End diff --
    
    ok, handled, pls check



[GitHub] incubator-carbondata pull request #123: [CARBONDATA-204] Clear queryStatisti...

Posted by Vimal-Das <gi...@git.apache.org>.
Github user Vimal-Das commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/123#discussion_r77435539
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/querystatistics/DriverQueryStatisticsRecorder.java ---
    @@ -78,106 +82,142 @@ public synchronized void recordStatisticsForDriver(QueryStatistic statistic, Str
        */
       public void logStatisticsAsTableDriver() {
         synchronized (lock) {
    -      String tableInfo = collectDriverStatistics();
    -      if (null != tableInfo) {
    -        LOGGER.statistic(tableInfo);
    +      for (String key: queryStatisticsMap.keySet()) {
    +        // print sql_parse_t,load_meta_t,block_allocation_t,block_identification_t
    +        // or just print block_allocation_t,block_identification_t
    +        if (queryStatisticsMap.get(key).size() >= 2) {
    --- End diff --
    
    This call can return null. The loop takes a key from keySet() and then looks it up again with get(); if the entry is removed in the meantime, get() returns null.
    
    Solution: use an iterator over entrySet()
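
A minimal sketch of the entrySet() approach, under stated assumptions: the class and method names are hypothetical, and `List<String>` stands in for `List<QueryStatistic>`. Each entry hands over the key together with its value, so the loop never has to look the key up a second time.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the CarbonData implementation.
final class EntrySetIterationSketch {
  // Returns the ids of queries that hold at least minStats statistics.
  static List<String> printableQueryIds(Map<String, List<String>> statsByQueryId, int minStats) {
    List<String> ids = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : statsByQueryId.entrySet()) {
      List<String> stats = entry.getValue(); // value comes with the entry, no get(key)
      if (stats != null && stats.size() >= minStats) {
        ids.add(entry.getKey());
      }
    }
    return ids;
  }
}
```

The null check on the value is kept only because some map types, such as `java.util.HashMap`, permit null values.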



[GitHub] incubator-carbondata pull request #123: [CARBONDATA-204] Clear queryStatisti...

Posted by Vimal-Das <gi...@git.apache.org>.
Github user Vimal-Das commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/123#discussion_r77435578
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/querystatistics/DriverQueryStatisticsRecorder.java ---
    @@ -78,106 +82,142 @@ public synchronized void recordStatisticsForDriver(QueryStatistic statistic, Str
        */
       public void logStatisticsAsTableDriver() {
         synchronized (lock) {
    -      String tableInfo = collectDriverStatistics();
    -      if (null != tableInfo) {
    -        LOGGER.statistic(tableInfo);
    +      for (String key: queryStatisticsMap.keySet()) {
    +        // print sql_parse_t,load_meta_t,block_allocation_t,block_identification_t
    +        // or just print block_allocation_t,block_identification_t
    +        if (queryStatisticsMap.get(key).size() >= 2) {
    +          String tableInfo = collectDriverStatistics(key);
    +          if (null != tableInfo) {
    +            LOGGER.statistic(tableInfo);
    +          }
    +        }
    +        // clear timeout query statistics
    +        if(StringUtils.isEmpty(key)) {
    +          queryStatisticsMap.remove(key);
    +        } else {
    +          long interval = System.nanoTime() - Long.parseLong(key);
    +          if (interval > QueryStatisticsConstants.CLEAR_STATISTICS_TIMEOUT) {
    +            queryStatisticsMap.remove(key);
    +          }
    +        }
           }
         }
       }
     
       /**
        * Below method will parse queryStatisticsMap and put time into table
        */
    -  public String collectDriverStatistics() {
    -    for (String key: queryStatisticsMap.keySet()) {
    -      try {
    -        // TODO: get the finished query, and print Statistics
    -        if (queryStatisticsMap.get(key).size() > 3) {
    -          String sql_parse_time = "";
    -          String load_meta_time = "";
    -          String block_allocation_time = "";
    -          String block_identification_time = "";
    -          Double driver_part_time_tmp = 0.0;
    -          String splitChar = " ";
    -          // get statistic time from the QueryStatistic
    -          for (QueryStatistic statistic : queryStatisticsMap.get(key)) {
    -            switch (statistic.getMessage()) {
    -              case QueryStatisticsConstants.SQL_PARSE:
    -                sql_parse_time += statistic.getTimeTaken() + splitChar;
    -                driver_part_time_tmp += statistic.getTimeTaken();
    -                break;
    -              case QueryStatisticsConstants.LOAD_META:
    -                load_meta_time += statistic.getTimeTaken() + splitChar;
    -                driver_part_time_tmp += statistic.getTimeTaken();
    -                break;
    -              case QueryStatisticsConstants.BLOCK_ALLOCATION:
    -                block_allocation_time += statistic.getTimeTaken() + splitChar;
    -                driver_part_time_tmp += statistic.getTimeTaken();
    -                break;
    -              case QueryStatisticsConstants.BLOCK_IDENTIFICATION:
    -                block_identification_time += statistic.getTimeTaken() + splitChar;
    -                driver_part_time_tmp += statistic.getTimeTaken();
    -                break;
    -              default:
    -                break;
    -            }
    -          }
    -          String driver_part_time = driver_part_time_tmp + splitChar;
    -          // structure the query statistics info table
    -          StringBuilder tableInfo = new StringBuilder();
    -          int len1 = 8;
    -          int len2 = 20;
    -          int len3 = 21;
    -          int len4 = 22;
    -          String line = "+" + printLine("-", len1) + "+" + printLine("-", len2) + "+" +
    -              printLine("-", len3) + "+" + printLine("-", len4) + "+";
    -          String line2 = "|" + printLine(" ", len1) + "+" + printLine("-", len2) + "+" +
    -              printLine(" ", len3) + "+" + printLine("-", len4) + "+";
    -          // table header
    -          tableInfo.append(line).append("\n");
    -          tableInfo.append("|" + printLine(" ", (len1 - "Module".length())) + "Module" + "|" +
    -              printLine(" ", (len2 - "Operation Step".length())) + "Operation Step" + "|" +
    -              printLine(" ", (len3 + len4 + 1 - "Query Cost".length())) +
    -              "Query Cost" + "|" + "\n");
    -          // driver part
    -          tableInfo.append(line).append("\n");
    -          tableInfo.append("|" + printLine(" ", len1) + "|" +
    -              printLine(" ", (len2 - "SQL parse".length())) + "SQL parse" + "|" +
    -              printLine(" ", len3) + "|" +
    -              printLine(" ", (len4 - sql_parse_time.length())) + sql_parse_time + "|" + "\n");
    -          tableInfo.append(line2).append("\n");
    -          tableInfo.append("|" +printLine(" ", (len1 - "Driver".length())) + "Driver" + "|" +
    -              printLine(" ", (len2 - "Load meta data".length())) + "Load meta data" + "|" +
    -              printLine(" ", (len3 - driver_part_time.length())) + driver_part_time + "|" +
    -              printLine(" ", (len4 - load_meta_time.length())) +
    -              load_meta_time + "|" + "\n");
    -          tableInfo.append(line2).append("\n");
    -          tableInfo.append("|" +
    -              printLine(" ", (len1 - "Part".length())) + "Part" + "|" +
    -              printLine(" ", (len2 - "Block allocation".length())) +
    -              "Block allocation" + "|" +
    -              printLine(" ", len3) + "|" +
    -              printLine(" ", (len4 - block_allocation_time.length())) +
    -              block_allocation_time + "|" + "\n");
    -          tableInfo.append(line2).append("\n");
    -          tableInfo.append("|" +
    -              printLine(" ", len1) + "|" +
    -              printLine(" ", (len2 - "Block identification".length())) +
    -              "Block identification" + "|" +
    -              printLine(" ", len3) + "|" +
    -              printLine(" ", (len4 - block_identification_time.length())) +
    -              block_identification_time + "|" + "\n");
    -          tableInfo.append(line).append("\n");
    -
    -          // once the statistics be printed, remove it from the map
    -          queryStatisticsMap.remove(key);
    -          // show query statistic as "query id" + "table"
    -          return "Print query statistic for query id: " + key + "\n" + tableInfo.toString();
    +  public String collectDriverStatistics(String key) {
    +    String sql_parse_time = "";
    +    String load_meta_time = "";
    +    String block_allocation_time = "";
    +    String block_identification_time = "";
    +    Double driver_part_time_tmp = 0.0;
    +    Double driver_part_time_tmp2 = 0.0;
    +    String splitChar = " ";
    +    try {
    +      // get statistic time from the QueryStatistic
    +      for (QueryStatistic statistic : queryStatisticsMap.get(key)) {
    --- End diff --
    
    There is a possibility of null here as well. Pass entry.getValue() instead, once the calling method iterates over entrySet().
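
One way to read this suggestion is that the collecting method receives the list of statistics itself rather than a key it must resolve against the map. The sketch below assumes a hypothetical `Stat` class as a stand-in for `QueryStatistic`, and a hypothetical method that only sums the time taken; the table formatting from the real method is left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the CarbonData implementation.
final class CollectFromValueSketch {
  // Stand-in for QueryStatistic: a message label and the time taken.
  static final class Stat {
    final String message;
    final double timeTaken;

    Stat(String message, double timeTaken) {
      this.message = message;
      this.timeTaken = timeTaken;
    }
  }

  // Works on the list passed in by the caller, so it never touches the map.
  static double driverPartTime(List<Stat> stats) {
    double total = 0.0;
    for (Stat stat : stats) {
      total += stat.timeTaken;
    }
    return total;
  }
}
```

A caller iterating over entrySet() would then invoke `driverPartTime(entry.getValue())`.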



[GitHub] incubator-carbondata pull request #123: [CARBONDATA-204] Clear queryStatisti...

Posted by foryou2030 <gi...@git.apache.org>.
Github user foryou2030 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/123#discussion_r77488586
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/querystatistics/DriverQueryStatisticsRecorder.java ---
    @@ -78,106 +82,142 @@ public synchronized void recordStatisticsForDriver(QueryStatistic statistic, Str
        */
       public void logStatisticsAsTableDriver() {
         synchronized (lock) {
    -      String tableInfo = collectDriverStatistics();
    -      if (null != tableInfo) {
    -        LOGGER.statistic(tableInfo);
    +      for (String key: queryStatisticsMap.keySet()) {
    +        // print sql_parse_t,load_meta_t,block_allocation_t,block_identification_t
    +        // or just print block_allocation_t,block_identification_t
    +        if (queryStatisticsMap.get(key).size() >= 2) {
    --- End diff --
    
    ok, thanks
    handled, pls check



[GitHub] incubator-carbondata pull request #123: [CARBONDATA-204] Clear queryStatisti...

Posted by Vimal-Das <gi...@git.apache.org>.
Github user Vimal-Das commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/123#discussion_r77496731
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/carbon/querystatistics/DriverQueryStatisticsRecorder.java ---
    @@ -78,106 +83,148 @@ public synchronized void recordStatisticsForDriver(QueryStatistic statistic, Str
        */
       public void logStatisticsAsTableDriver() {
         synchronized (lock) {
    -      String tableInfo = collectDriverStatistics();
    -      if (null != tableInfo) {
    -        LOGGER.statistic(tableInfo);
    +      Iterator<Map.Entry<String, List<QueryStatistic>>> entries =
    +              queryStatisticsMap.entrySet().iterator();
    +      while (entries.hasNext()) {
    +        Map.Entry<String, List<QueryStatistic>> entry = entries.next();
    +        String queryId = entry.getKey();
    +        // clear the unknown query statistics
    +        if(StringUtils.isEmpty(queryId)) {
    +          queryStatisticsMap.remove(queryId);
    --- End diff --
    
    Use Iterator.remove() for better safety.
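
A minimal sketch of removal through the iterator, assuming the map is a `java.util.HashMap` (the actual map type is not shown in this thread) and using `List<String>` in place of `List<QueryStatistic>`. With a `HashMap`, calling `map.remove(key)` inside the loop and then continuing to iterate normally makes the iterator throw `ConcurrentModificationException`, whereas `Iterator.remove()` is the supported way to drop the entry just returned by `next()`.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the CarbonData implementation.
final class IteratorRemoveSketch {
  // Drops entries whose queryId is null or empty while iterating.
  static void dropUnknownQueries(Map<String, List<String>> statsByQueryId) {
    Iterator<Map.Entry<String, List<String>>> entries =
        statsByQueryId.entrySet().iterator();
    while (entries.hasNext()) {
      Map.Entry<String, List<String>> entry = entries.next();
      String queryId = entry.getKey();
      if (queryId == null || queryId.isEmpty()) {
        entries.remove(); // removes the entry last returned by next()
      }
    }
  }
}
```

`Iterator.remove()` may be called at most once per call to `next()`; a second call without an intervening `next()` throws `IllegalStateException`.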

