You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@quickstep.apache.org by hbdeshmukh <gi...@git.apache.org> on 2016/06/16 19:13:00 UTC

[GitHub] incubator-quickstep pull request #36: QUICKSTEP-24 Report generation support...

GitHub user hbdeshmukh opened a pull request:

    https://github.com/apache/incubator-quickstep/pull/36

    QUICKSTEP-24 Report generation support for work order execution statistics

    This PR provides an initial version of the support required to report individual work order execution performance. Some highlights are described below: 
    
    - A flag to enable work order profiling report generation.
    - At the end of each query, a report is generated which includes worker
      ID, its NUMA socket, the operator that produced the WorkOrder and the
      execution time in microseconds.
    - The output is printed on stdout in CSV format as of now.
    
    As this is a rudimentary support for the functionality, there is a lot of future work in this regards, which includes printing of CPU core information, printing operator name, allowing user to specify a file where the output can be written, extend the support for rebuild WorkOrder statistics etc.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-quickstep workorder-time-reporting

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quickstep/pull/36.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #36
    
----
commit 3d1a4be275d84ae10236877e1a3e85006fe70159
Author: Harshad Deshmukh <hb...@apache.org>
Date:   2016-06-16T19:03:34Z

    Basic support to report individual work order profiling results
    
    - A flag to enable work order profiling report generation.
    - At the end of each query, a report is generated which includes worker
      ID, its NUMA socket, the operator that produced the WorkOrder and the
      execution time in microseconds.
    - The output is printed on stdout in CSV format as of now.
    - As this is a rudimentary support for the functionality, there is a lot of
      future work in this regards, which includes printing of CPU core information,
      printing operator name, allowing user to specify a file where the output can
      be written etc.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #36: QUICKSTEP-24 Report generation support...

Posted by zuyu <gi...@git.apache.org>.
Github user zuyu commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/36#discussion_r67460146
  
    --- Diff: CMakeLists.txt ---
    @@ -185,6 +185,11 @@ if (ENABLE_GOOGLE_PROFILER)
       set(LIBS ${LIBS} profiler)
     endif()
     
    +# This option enables profiling individual work order performances. If enabled, 
    +# at the end of each query execution, we print on stdout in CSV format the 
    +# statistics about every work order that was executed. 
    +option(ENABLE_WORKORDER_PROFILING "Enable profiling individual WorkOrder execution" OFF)
    --- End diff --
    
    Cool.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #36: QUICKSTEP-24 Report generation support...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-quickstep/pull/36


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #36: QUICKSTEP-24 Report generation support...

Posted by zuyu <gi...@git.apache.org>.
Github user zuyu commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/36#discussion_r67461549
  
    --- Diff: query_execution/Foreman.cpp ---
    @@ -229,4 +229,22 @@ void Foreman::sendWorkerMessage(const size_t worker_thread_index,
           << worker_directory_->getClientID(worker_thread_index);
     }
     
    +#ifdef QUICKSTEP_ENABLE_WORKORDER_PROFILING
    +void Foreman::printWorkOrderProfilingResults(std::FILE *out) const {
    +  const std::vector<
    +      std::tuple<std::size_t, std::size_t, std::size_t, std::size_t>>
    +      &recorded_times = policy_enforcer_->getProfilingResults();
    +  fputs("Worker ID, NUMA Socket, Query ID, Operator ID, Time (microseconds)\n", out);
    +  for (auto workorder_entry : recorded_times) {
    +    // Note: Index of the "worker thread index" in the tuple is 0.
    +    const std::size_t worker_id = std::get<0>(workorder_entry);
    +    fprintf(out, "%lu,", worker_id);
    +    fprintf(out, "%d,", worker_directory_->getNUMANode(worker_id));
    +    fprintf(out, "%lu,", std::get<1>(workorder_entry));
    +    fprintf(out, "%lu,", std::get<2>(workorder_entry));
    +    fprintf(out, "%lu\n", std::get<3>(workorder_entry));
    --- End diff --
    
    How about merging these `fprintf`s into one?
    
    Although this is not in a critical path, we still save tons of system calls.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #36: QUICKSTEP-24 Report generation support...

Posted by hbdeshmukh <gi...@git.apache.org>.
Github user hbdeshmukh commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/36#discussion_r67450274
  
    --- Diff: CMakeLists.txt ---
    @@ -185,6 +185,11 @@ if (ENABLE_GOOGLE_PROFILER)
       set(LIBS ${LIBS} profiler)
     endif()
     
    +# This option enables profiling individual work order performances. If enabled, 
    +# at the end of each query execution, we print on stdout in CSV format the 
    +# statistics about every work order that was executed. 
    +option(ENABLE_WORKORDER_PROFILING "Enable profiling individual WorkOrder execution" OFF)
    --- End diff --
    
    Hi @zuyu 
    
    Jianqiao and I had a chat about this and we agreed that gflags should be a better option instead of compile time flags. I will make that change. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #36: QUICKSTEP-24 Report generation support...

Posted by pateljm <gi...@git.apache.org>.
Github user pateljm commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/36#discussion_r67605754
  
    --- Diff: query_execution/Foreman.cpp ---
    @@ -229,4 +229,22 @@ void Foreman::sendWorkerMessage(const size_t worker_thread_index,
           << worker_directory_->getClientID(worker_thread_index);
     }
     
    +#ifdef QUICKSTEP_ENABLE_WORKORDER_PROFILING
    +void Foreman::printWorkOrderProfilingResults(std::FILE *out) const {
    +  const std::vector<
    +      std::tuple<std::size_t, std::size_t, std::size_t, std::size_t>>
    +      &recorded_times = policy_enforcer_->getProfilingResults();
    +  fputs("Worker ID, NUMA Socket, Query ID, Operator ID, Time (microseconds)\n", out);
    +  for (auto workorder_entry : recorded_times) {
    +    // Note: Index of the "worker thread index" in the tuple is 0.
    +    const std::size_t worker_id = std::get<0>(workorder_entry);
    +    fprintf(out, "%lu,", worker_id);
    +    fprintf(out, "%d,", worker_directory_->getNUMANode(worker_id));
    +    fprintf(out, "%lu,", std::get<1>(workorder_entry));
    +    fprintf(out, "%lu,", std::get<2>(workorder_entry));
    +    fprintf(out, "%lu\n", std::get<3>(workorder_entry));
    --- End diff --
    
    @zuyu makes a good point. Would be ok to do that in a separate PR, if it is too much to do in this one. Thanks @hbdeshmukh for this feature! 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep issue #36: QUICKSTEP-24 Report generation support for wo...

Posted by hbdeshmukh <gi...@git.apache.org>.
Github user hbdeshmukh commented on the issue:

    https://github.com/apache/incubator-quickstep/pull/36
  
    Hi @pateljm 
    
    Rebased with master. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #36: QUICKSTEP-24 Report generation support...

Posted by zuyu <gi...@git.apache.org>.
Github user zuyu commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/36#discussion_r67445043
  
    --- Diff: CMakeLists.txt ---
    @@ -185,6 +185,11 @@ if (ENABLE_GOOGLE_PROFILER)
       set(LIBS ${LIBS} profiler)
     endif()
     
    +# This option enables profiling individual work order performances. If enabled, 
    +# at the end of each query execution, we print on stdout in CSV format the 
    +# statistics about every work order that was executed. 
    +option(ENABLE_WORKORDER_PROFILING "Enable profiling individual WorkOrder execution" OFF)
    --- End diff --
    
    What do you think of using `gflags` in the run-time, instead of this compilation time flag? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep issue #36: QUICKSTEP-24 Report generation support for wo...

Posted by pateljm <gi...@git.apache.org>.
Github user pateljm commented on the issue:

    https://github.com/apache/incubator-quickstep/pull/36
  
    @hbdeshmukh This is a neat feature. LGTM. Will merge once Travis comes back green. (BTW, we don't need to edit the copyright files anymore, but don't worry about it in this PR.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep issue #36: QUICKSTEP-24 Report generation support for wo...

Posted by hbdeshmukh <gi...@git.apache.org>.
Github user hbdeshmukh commented on the issue:

    https://github.com/apache/incubator-quickstep/pull/36
  
    Hi @pateljm 
    
    I have made the requested changes. The API to get the stats is changed a bit. Now the functions ``printWorkOrderProfilingResults()`` and ``getProfilingResults()`` require a query ID to be passed, so that we can print results for queries with specific ID. Typically, in our experiments we ignore the first run of the query, so we can use the profiling results for subsequent runs. 
    
    I have performed basic testing on the command line with more than one queries as input and it worked alright. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep issue #36: QUICKSTEP-24 Report generation support for wo...

Posted by pateljm <gi...@git.apache.org>.
Github user pateljm commented on the issue:

    https://github.com/apache/incubator-quickstep/pull/36
  
    @hbdeshmukh Ready to merge, but need to rebase this ahead of the master. Let's close this one first before PR #38.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---