Posted to issues@orc.apache.org by GitBox <gi...@apache.org> on 2022/08/17 15:49:07 UTC

[GitHub] [orc] coderex2522 opened a new pull request, #1223: ORC-1252:[C++] Expose IO metrics for write operation

coderex2522 opened a new pull request, #1223:
URL: https://github.com/apache/orc/pull/1223

   
   ### What changes were proposed in this pull request?
   This pull request adds IO metrics for the write operation. The csv-import tool can display these metrics.
   
   ### Why are the changes needed?
   This patch exposes metrics for the process of writing ORC files.
   
   
   ### How was this patch tested?
   The patch can be tested by running the csv-import tool with the -m option.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [orc] dongjoon-hyun closed pull request #1223: ORC-1252:[C++] Expose IO metrics for write operation

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #1223: ORC-1252:[C++] Expose IO metrics for write operation
URL: https://github.com/apache/orc/pull/1223




[GitHub] [orc] coderex2522 commented on pull request #1223: ORC-1252:[C++] Expose IO metrics for write operation

Posted by GitBox <gi...@apache.org>.
coderex2522 commented on PR #1223:
URL: https://github.com/apache/orc/pull/1223#issuecomment-1218997602

   > * Do we need to consider the perf regression in this PR?
   
   The IO metrics for the write operation reuse the existing SCOPED_STOPWATCH macro. The build option BUILD_CPP_ENABLE_METRICS is off by default, so there is no perf regression here.
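
A minimal sketch of how a scoped-stopwatch macro can compile away when metrics are disabled. This is not ORC's actual definition of SCOPED_STOPWATCH: the names SKETCH_ENABLE_METRICS, SketchMetrics, ScopedStopwatch, SKETCH_SCOPED_STOPWATCH and timedWork are hypothetical, and the switch only stands in for a build option such as BUILD_CPP_ENABLE_METRICS.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical switch standing in for a build option; set to 0 to disable.
#ifndef SKETCH_ENABLE_METRICS
#define SKETCH_ENABLE_METRICS 1
#endif

// Hypothetical metrics holder using the two counter names from the diff.
struct SketchMetrics {
  std::atomic<uint64_t> IOCount{0};
  std::atomic<uint64_t> IOBlockingLatencyUs{0};
};

// RAII timer: on scope exit, adds the elapsed microseconds to one counter
// and increments the other. Null pointers mean "do not record".
class ScopedStopwatch {
 public:
  ScopedStopwatch(std::atomic<uint64_t>* latencyUs, std::atomic<uint64_t>* count)
      : latencyUs_(latencyUs),
        count_(count),
        start_(std::chrono::steady_clock::now()) {}

  ~ScopedStopwatch() {
    if (latencyUs_ == nullptr || count_ == nullptr) return;
    auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start_);
    latencyUs_->fetch_add(static_cast<uint64_t>(elapsed.count()));
    count_->fetch_add(1);
  }

  ScopedStopwatch(const ScopedStopwatch&) = delete;
  ScopedStopwatch& operator=(const ScopedStopwatch&) = delete;

 private:
  std::atomic<uint64_t>* latencyUs_;
  std::atomic<uint64_t>* count_;
  std::chrono::steady_clock::time_point start_;
};

#if SKETCH_ENABLE_METRICS
#define SKETCH_SCOPED_STOPWATCH(METRICS_PTR, LATENCY, COUNT) \
  ScopedStopwatch sketchStopwatch_(                          \
      (METRICS_PTR) ? &(METRICS_PTR)->LATENCY : nullptr,     \
      (METRICS_PTR) ? &(METRICS_PTR)->COUNT : nullptr)
#else
// Expands to nothing, so a build with metrics disabled does no timing work.
#define SKETCH_SCOPED_STOPWATCH(METRICS_PTR, LATENCY, COUNT)
#endif

// Example of an instrumented function.
inline void timedWork(SketchMetrics* metrics) {
  (void)metrics;  // avoids an unused-parameter warning when metrics are off
  SKETCH_SCOPED_STOPWATCH(metrics, IOBlockingLatencyUs, IOCount);
  // ... the call being measured would go here ...
}
```

With the switch set to 0, the macro line in timedWork reduces to an empty statement, which is the sense in which a disabled build should see no overhead.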




[GitHub] [orc] coderex2522 commented on a diff in pull request #1223: ORC-1252:[C++] Expose IO metrics for write operation

Posted by GitBox <gi...@apache.org>.
coderex2522 commented on code in PR #1223:
URL: https://github.com/apache/orc/pull/1223#discussion_r950107415


##########
c++/src/OrcFile.cc:
##########
@@ -161,6 +163,7 @@ DIAGNOSTIC_POP
     }
 
     void write(const void* buf, size_t length) override {
+      SCOPED_STOPWATCH(metrics, IOBlockingLatencyUs, IOCount);

Review Comment:
   To support all file systems, the macro SCOPED_STOPWATCH for IO metrics is placed in class BufferedOutputStream(OutputStream.cc).
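
The placement described above can be illustrated with a rough sketch, assuming a simplified design in which a buffering layer sits in front of an abstract sink. This is not ORC's BufferedOutputStream: the types SketchWriterMetrics, SketchSink, MemorySink and SketchBufferedStream are hypothetical, and the timing is written inline rather than through the macro.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the metrics struct discussed in this thread.
struct SketchWriterMetrics {
  std::atomic<uint64_t> IOCount{0};
  std::atomic<uint64_t> IOBlockingLatencyUs{0};
};

// Hypothetical minimal sink interface; a local-file, HDFS, or in-memory
// implementation would each override write().
class SketchSink {
 public:
  virtual ~SketchSink() = default;
  virtual void write(const void* buf, size_t length) = 0;
};

// In-memory sink used only for this example.
class MemorySink : public SketchSink {
 public:
  void write(const void* buf, size_t length) override {
    data.append(static_cast<const char*>(buf), length);
  }
  std::string data;
};

// Buffers small appends and times every call into the underlying sink.
// Because the timing lives here, no sink implementation has to change.
class SketchBufferedStream {
 public:
  SketchBufferedStream(SketchSink* sink, size_t capacity,
                       SketchWriterMetrics* metrics)
      : sink_(sink), capacity_(capacity), metrics_(metrics) {}

  void append(const std::string& bytes) {
    buffer_ += bytes;
    if (buffer_.size() >= capacity_) flush();
  }

  void flush() {
    if (buffer_.empty()) return;
    auto start = std::chrono::steady_clock::now();
    sink_->write(buffer_.data(), buffer_.size());  // the measured call
    auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start);
    if (metrics_ != nullptr) {
      metrics_->IOBlockingLatencyUs.fetch_add(
          static_cast<uint64_t>(elapsed.count()));
      metrics_->IOCount.fetch_add(1);
    }
    buffer_.clear();
  }

 private:
  SketchSink* sink_;
  size_t capacity_;
  SketchWriterMetrics* metrics_;
  std::string buffer_;
};
```

Measuring at the call site means IOCount tracks flushes to the sink rather than individual appends, and the same counters work for any sink passed in.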





[GitHub] [orc] coderex2522 commented on a diff in pull request #1223: ORC-1252:[C++] Expose IO metrics for write operation

Posted by GitBox <gi...@apache.org>.
coderex2522 commented on code in PR #1223:
URL: https://github.com/apache/orc/pull/1223#discussion_r950104690


##########
c++/include/orc/Writer.hh:
##########
@@ -28,6 +28,7 @@
 #include <set>
 #include <string>
 #include <vector>
+#include <atomic>

Review Comment:
   done





[GitHub] [orc] wgtmac commented on a diff in pull request #1223: ORC-1252:[C++] Expose IO metrics for write operation

Posted by GitBox <gi...@apache.org>.
wgtmac commented on code in PR #1223:
URL: https://github.com/apache/orc/pull/1223#discussion_r948658168


##########
c++/include/orc/Writer.hh:
##########
@@ -46,6 +47,13 @@ namespace orc {
 
   class Timezone;
 
+  /**
+   * Expose the IO metrics for write operation.
+   */
+  struct WriterMetrics {
+    std::atomic<uint64_t> IOCount{0};

Review Comment:
   Please add comments to explain these fields.
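
As an illustration of what documented fields could look like, here is a small sketch. It is not the struct from the patch: WriterMetricsSketch and averageLatencyUs are hypothetical names, and the field descriptions are assumptions inferred from the names IOCount and IOBlockingLatencyUs that appear in the diff.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical mirror of the struct under review. The meanings given in the
// comments are assumptions based on the field names only.
struct WriterMetricsSketch {
  // Number of write calls issued to the underlying output stream.
  std::atomic<uint64_t> IOCount{0};
  // Total time, in microseconds, spent blocked in those write calls.
  std::atomic<uint64_t> IOBlockingLatencyUs{0};
};

// Hypothetical helper: mean blocking time per write, or 0 if nothing
// has been written yet.
inline double averageLatencyUs(const WriterMetricsSketch& m) {
  const uint64_t count = m.IOCount.load();
  if (count == 0) return 0.0;
  return static_cast<double>(m.IOBlockingLatencyUs.load()) /
         static_cast<double>(count);
}
```

Stating the unit and what each counter accumulates in the comment makes derived values such as the average above unambiguous for callers.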



##########
c++/src/OrcFile.cc:
##########
@@ -161,6 +163,7 @@ DIAGNOSTIC_POP
     }
 
     void write(const void* buf, size_t length) override {
+      SCOPED_STOPWATCH(metrics, IOBlockingLatencyUs, IOCount);

Review Comment:
   This may not be the right place. We should measure where it is called. Otherwise we cannot support HDFS and other file systems.



##########
c++/include/orc/Writer.hh:
##########
@@ -28,6 +28,7 @@
 #include <set>
 #include <string>
 #include <vector>
+#include <atomic>

Review Comment:
   Please sort the includes alphabetically.





[GitHub] [orc] dongjoon-hyun commented on pull request #1223: ORC-1252:[C++] Expose IO metrics for write operation

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #1223:
URL: https://github.com/apache/orc/pull/1223#issuecomment-1221778697

   Thank you, @coderex2522 and @wgtmac !




[GitHub] [orc] coderex2522 commented on a diff in pull request #1223: ORC-1252:[C++] Expose IO metrics for write operation

Posted by GitBox <gi...@apache.org>.
coderex2522 commented on code in PR #1223:
URL: https://github.com/apache/orc/pull/1223#discussion_r950107415


##########
c++/src/OrcFile.cc:
##########
@@ -161,6 +163,7 @@ DIAGNOSTIC_POP
     }
 
     void write(const void* buf, size_t length) override {
+      SCOPED_STOPWATCH(metrics, IOBlockingLatencyUs, IOCount);

Review Comment:
   To support all file systems, the macro SCOPED_STOPWATCH for IO metrics is placed in class BufferedOutputStream(OutputStream.cc).



##########
c++/include/orc/Writer.hh:
##########
@@ -46,6 +47,13 @@ namespace orc {
 
   class Timezone;
 
+  /**
+   * Expose the IO metrics for write operation.
+   */
+  struct WriterMetrics {
+    std::atomic<uint64_t> IOCount{0};

Review Comment:
   done


