Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/08 19:24:04 UTC

[GitHub] [hudi] alexeykudinkin opened a new pull request, #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

alexeykudinkin opened a new pull request, #5266:
URL: https://github.com/apache/hudi/pull/5266

   ## What is the purpose of the pull request
   
   Fixing performance hits in reading Column Stats Index:
   
   1. [HUDI-3834] The `Builder` classes generated by Avro 1.10 degrade performance substantially: by default they rely on `SpecificData.getForSchema`, which loads the corresponding model's class using reflection. This is costly when executed on the hot path: reading the full Column Stats Index of 800k records took **60s**, whereas it now takes only **3s**.
   
   2. Reducing the memory churn caused by excessive creation of Hadoop `Path` objects: the `Path` constructor is not lightweight and produces a fair amount of garbage, which adds pressure on the GC. This change removes such avoidable allocations.
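
To illustrate the first point, here is a minimal sketch of why a reflective lookup on the hot path is costly and how caching its result helps. It does not reproduce the actual change in this pull request or Avro's internals: `ModelClassCache`, `lookupUncached` and `lookupCached` are hypothetical names, and `Class.forName` merely stands in for the reflective class loading described above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not Hudi or Avro code: caches the result of a
// reflective class lookup so the hot path pays for it only once per class name.
final class ModelClassCache {
    // Counts how many times the reflective lookup actually ran.
    static final AtomicInteger REFLECTIVE_LOOKUPS = new AtomicInteger();

    private static final ConcurrentMap<String, Class<?>> CACHE = new ConcurrentHashMap<>();

    // Slow path: resolves the class by name through reflection on every call.
    static Class<?> lookupUncached(String className) {
        REFLECTIVE_LOOKUPS.incrementAndGet();
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown class: " + className, e);
        }
    }

    // Fast path: consults the cache first and falls back to reflection on a miss.
    static Class<?> lookupCached(String className) {
        return CACHE.computeIfAbsent(className, ModelClassCache::lookupUncached);
    }

    public static void main(String[] args) {
        // Simulates a hot loop over 800k records that all share one model class.
        for (int i = 0; i < 800_000; i++) {
            lookupCached("java.lang.StringBuilder");
        }
        System.out.println("reflective lookups: " + REFLECTIVE_LOOKUPS.get());
    }
}
```

In this sketch the loop performs 800,000 lookups but triggers the reflective call only on the first one; every later call is a plain map read. The same reasoning applies to any per-record work whose result depends only on the schema or class rather than on the record itself.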
   
   ## Brief change log
   
   See above
   
   ## Verify this pull request
   
   This pull request is already covered by existing tests.
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] alexeykudinkin commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1094115388

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093545549

   ## CI report:
   
   * ba21b8b51bb088b83a159392fc4b6e6a9115fa04 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093690865

   ## CI report:
   
   * 79639f70c86407d2cce694670fad2170e3f59ee7 UNKNOWN
   * 723c55a7638c2110494863da74d56da06dd261d0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7936) 
   


[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093425691

   ## CI report:
   
   * abf82b3871b7b65708932fd4e7115b279f64964b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924) 
   


[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1094125899

   ## CI report:
   
   * 79639f70c86407d2cce694670fad2170e3f59ee7 UNKNOWN
   * a87c363fbd49ab8ce17a5f1d77d92205b7f9ada4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7947) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7951) 
   


[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093363965

   ## CI report:
   
   * 0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922) 
   * abf82b3871b7b65708932fd4e7115b279f64964b UNKNOWN
   


[GitHub] [hudi] nsivabalan commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093990776

   @alexeykudinkin: there are some CI failures. Can you please check them out?




[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093586902

   ## CI report:
   
   * 658ba51ecffd10771161141864da7e7e4435a6e0 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7932) 
   * 79639f70c86407d2cce694670fad2170e3f59ee7 UNKNOWN
   * 723c55a7638c2110494863da74d56da06dd261d0 UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1094094057

   ## CI report:
   
   * 79639f70c86407d2cce694670fad2170e3f59ee7 UNKNOWN
   * 723c55a7638c2110494863da74d56da06dd261d0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7936) 
   * a87c363fbd49ab8ce17a5f1d77d92205b7f9ada4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7947) 
   


[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1094115670

   ## CI report:
   
   * 79639f70c86407d2cce694670fad2170e3f59ee7 UNKNOWN
   * a87c363fbd49ab8ce17a5f1d77d92205b7f9ada4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7947) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7951) 
   


[GitHub] [hudi] nsivabalan merged pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
nsivabalan merged PR #5266:
URL: https://github.com/apache/hudi/pull/5266




[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093283339

   ## CI report:
   
   * 0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922) 
   


[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093518493

   ## CI report:
   
   * abf82b3871b7b65708932fd4e7115b279f64964b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924) 
   * ba21b8b51bb088b83a159392fc4b6e6a9115fa04 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930) 
   


[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1094108241

   ## CI report:
   
   * 79639f70c86407d2cce694670fad2170e3f59ee7 UNKNOWN
   * a87c363fbd49ab8ce17a5f1d77d92205b7f9ada4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7947) 
   


[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093357009

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922",
       "triggerID" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093365714

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922",
       "triggerID" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "triggerType" : "PUSH"
     }, {
       "hash" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924",
       "triggerID" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922) 
   * abf82b3871b7b65708932fd4e7115b279f64964b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
alexeykudinkin commented on code in PR #5266:
URL: https://github.com/apache/hudi/pull/5266#discussion_r846555002


##########
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##########
@@ -159,18 +166,34 @@ public static HoodieTableMetaClient reload(HoodieTableMetaClient oldMetaClient)
    */
   private void readObject(java.io.ObjectInputStream in) throws IOException, ClassNotFoundException {
     in.defaultReadObject();
+    String basePathStr = in.readUTF();
+    String metaPathStr = in.readUTF();
+
     fs = null; // will be lazily initialized
+    basePath = new LazyCachingPath(basePathStr);
+    metaPath = new LazyCachingPath(metaPathStr);
   }
 
   private void writeObject(java.io.ObjectOutputStream out) throws IOException {
     out.defaultWriteObject();
+    out.writeBytes(basePath.toString());
+    out.writeBytes(metaPath.toString());
+  }
+
+  /**
+   * Returns base path of the table
+   */
+  public Path getBasePathV2() {
+    return basePath;
   }
 
   /**
    * @return Base path
+   * @deprecated please use {@link #getBasePathV2()}
    */
+  @Deprecated

Review Comment:
   We should gradually migrate all uses of `getBasePath` to `getBasePathV2`, and then rename `getBasePathV2` back to `getBasePath`.
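
   The custom serialization in the hunk above can be sketched in isolation. This is a minimal, self-contained illustration and not the PR's code: `TableLocation` and `PathLike` are hypothetical stand-ins for `HoodieTableMetaClient` and Hadoop's `Path`. Since `readUTF` reads the length-prefixed format that `writeUTF` produces, the sketch pairs those two calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a class that holds a non-serializable, path-like field.
public class TableLocation implements Serializable {
  private static final long serialVersionUID = 1L;

  // Hypothetical non-serializable value type (plays the role of a Path here).
  static final class PathLike {
    private final String value;
    PathLike(String value) { this.value = value; }
    @Override public String toString() { return value; }
  }

  // transient: skipped by default serialization, written and restored manually below.
  private transient PathLike basePath;

  public TableLocation(String basePath) {
    this.basePath = new PathLike(basePath);
  }

  public String getBasePath() {
    return basePath.toString();
  }

  private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    // writeUTF emits a 2-byte length prefix followed by modified UTF-8 bytes.
    out.writeUTF(basePath.toString());
  }

  private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    // readUTF consumes exactly the record written by writeUTF.
    basePath = new PathLike(in.readUTF());
  }

  // Serializes the given instance to a byte array and reads it back.
  public static TableLocation roundTrip(TableLocation original)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(original);
    }
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (TableLocation) in.readObject();
    }
  }
}
```

   Note that `writeUTF` is limited to strings whose encoded form fits in 65535 bytes, which should be ample for typical table paths.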





[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093279292

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093576216

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922",
       "triggerID" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "triggerType" : "PUSH"
     }, {
       "hash" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924",
       "triggerID" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930",
       "triggerID" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "triggerType" : "PUSH"
     }, {
       "hash" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7932",
       "triggerID" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "79639f70c86407d2cce694670fad2170e3f59ee7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "79639f70c86407d2cce694670fad2170e3f59ee7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ba21b8b51bb088b83a159392fc4b6e6a9115fa04 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930) 
   * 658ba51ecffd10771161141864da7e7e4435a6e0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7932) 
   * 79639f70c86407d2cce694670fad2170e3f59ee7 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093547669

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922",
       "triggerID" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "triggerType" : "PUSH"
     }, {
       "hash" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924",
       "triggerID" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930",
       "triggerID" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "triggerType" : "PUSH"
     }, {
       "hash" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ba21b8b51bb088b83a159392fc4b6e6a9115fa04 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930) 
   * 658ba51ecffd10771161141864da7e7e4435a6e0 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] nsivabalan commented on a diff in pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on code in PR #5266:
URL: https://github.com/apache/hudi/pull/5266#discussion_r846516629


##########
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##########
@@ -159,18 +166,34 @@ public static HoodieTableMetaClient reload(HoodieTableMetaClient oldMetaClient)
    */
   private void readObject(java.io.ObjectInputStream in) throws IOException, ClassNotFoundException {
     in.defaultReadObject();
+    String basePathStr = in.readUTF();
+    String metaPathStr = in.readUTF();
+
     fs = null; // will be lazily initialized
+    basePath = new LazyCachingPath(basePathStr);
+    metaPath = new LazyCachingPath(metaPathStr);
   }
 
   private void writeObject(java.io.ObjectOutputStream out) throws IOException {
     out.defaultWriteObject();
+    out.writeBytes(basePath.toString());
+    out.writeBytes(metaPath.toString());
+  }
+
+  /**
+   * Returns base path of the table
+   */
+  public Path getBasePathV2() {
+    return basePath;
   }
 
   /**
    * @return Base path
+   * @deprecated please use {@link #getBasePathV2()}
    */
+  @Deprecated

Review Comment:
   I see we are still using `getBasePath()` in our `BaseRelation` classes. Do we need to switch them to `getBasePathV2()`?



##########
hudi-common/src/main/java/org/apache/hudi/hadoop/LazyCachingPath.java:
##########
@@ -20,19 +20,46 @@
 
 import org.apache.hadoop.fs.Path;
 
+import javax.annotation.concurrent.ThreadSafe;
 import java.net.URI;
 
 /**
+ * This is an extension of the {@code Path} class allowing to avoid repetitive
+ * computations (like {@code getFileName}, {@code toString}) which are secured
+ * by its immutability
+ *
  * NOTE: This class is thread-safe
  */
-public class FileNameCachingPath extends Path {
+@ThreadSafe
+public class LazyCachingPath extends Path {
 
-  // NOTE: volatile keyword is redundant here and put mostly for reader notice, since all
+  // NOTE: `volatile` keyword is redundant here and put mostly for reader notice, since all
   //       reads/writes to references are always atomic (including 64-bit JVMs)
   //       https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.7
   private volatile String fileName;
+  private volatile String s;

Review Comment:
   Minor: rename `s` to `path`.
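
   For readers following along, the lazy-caching idea in this hunk can be sketched without Hadoop on the classpath. This is only an illustration under stated assumptions: `LazyCachingLocation` is a hypothetical class built on `java.net.URI`, not the PR's `LazyCachingPath`, and the field names `fileName` and `fullPath` are chosen here for readability.

```java
import java.net.URI;

// Hypothetical, Hadoop-free illustration of caching derived values of an immutable location.
public class LazyCachingLocation {
  private final URI uri;

  // Lazily populated caches. Two threads may race and both compute the value,
  // which is harmless: the result is deterministic and String is immutable.
  private volatile String fileName;
  private volatile String fullPath;

  public LazyCachingLocation(String location) {
    this.uri = URI.create(location);
  }

  // Returns the last path segment, computing it only on first use.
  public String getName() {
    String cached = fileName;
    if (cached == null) {
      String path = uri.getPath();
      cached = path.substring(path.lastIndexOf('/') + 1);
      fileName = cached;
    }
    return cached;
  }

  // Returns the full string form, computing it only on first use.
  @Override
  public String toString() {
    String cached = fullPath;
    if (cached == null) {
      cached = uri.toString();
      fullPath = cached;
    }
    return cached;
  }
}
```

   Reading each field into a local variable first keeps every call to a single volatile read on the fast path.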





[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093516227

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922",
       "triggerID" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "triggerType" : "PUSH"
     }, {
       "hash" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924",
       "triggerID" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * abf82b3871b7b65708932fd4e7115b279f64964b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924) 
   * ba21b8b51bb088b83a159392fc4b6e6a9115fa04 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1094092978

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922",
       "triggerID" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "triggerType" : "PUSH"
     }, {
       "hash" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924",
       "triggerID" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930",
       "triggerID" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "triggerType" : "PUSH"
     }, {
       "hash" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7932",
       "triggerID" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "79639f70c86407d2cce694670fad2170e3f59ee7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "79639f70c86407d2cce694670fad2170e3f59ee7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "723c55a7638c2110494863da74d56da06dd261d0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7936",
       "triggerID" : "723c55a7638c2110494863da74d56da06dd261d0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a87c363fbd49ab8ce17a5f1d77d92205b7f9ada4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a87c363fbd49ab8ce17a5f1d77d92205b7f9ada4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 79639f70c86407d2cce694670fad2170e3f59ee7 UNKNOWN
   * 723c55a7638c2110494863da74d56da06dd261d0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7936) 
   * a87c363fbd49ab8ce17a5f1d77d92205b7f9ada4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093560316

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922",
       "triggerID" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "triggerType" : "PUSH"
     }, {
       "hash" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924",
       "triggerID" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930",
       "triggerID" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "triggerType" : "PUSH"
     }, {
       "hash" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7932",
       "triggerID" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ba21b8b51bb088b83a159392fc4b6e6a9115fa04 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930) 
   * 658ba51ecffd10771161141864da7e7e4435a6e0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7932) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093600536

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922",
       "triggerID" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "triggerType" : "PUSH"
     }, {
       "hash" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924",
       "triggerID" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930",
       "triggerID" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "triggerType" : "PUSH"
     }, {
       "hash" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7932",
       "triggerID" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "79639f70c86407d2cce694670fad2170e3f59ee7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "79639f70c86407d2cce694670fad2170e3f59ee7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "723c55a7638c2110494863da74d56da06dd261d0",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7936",
       "triggerID" : "723c55a7638c2110494863da74d56da06dd261d0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 658ba51ecffd10771161141864da7e7e4435a6e0 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7932) 
   * 79639f70c86407d2cce694670fad2170e3f59ee7 UNKNOWN
   * 723c55a7638c2110494863da74d56da06dd261d0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7936) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #5266: [HUDI-3834] Fixing performance hits in reading Column Stats Index

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5266:
URL: https://github.com/apache/hudi/pull/5266#issuecomment-1093578147

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7922",
       "triggerID" : "0ca9aaa1ef3eca6faa9431722cbafba2dd2f2603",
       "triggerType" : "PUSH"
     }, {
       "hash" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7924",
       "triggerID" : "abf82b3871b7b65708932fd4e7115b279f64964b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7930",
       "triggerID" : "ba21b8b51bb088b83a159392fc4b6e6a9115fa04",
       "triggerType" : "PUSH"
     }, {
       "hash" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7932",
       "triggerID" : "658ba51ecffd10771161141864da7e7e4435a6e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "79639f70c86407d2cce694670fad2170e3f59ee7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "79639f70c86407d2cce694670fad2170e3f59ee7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 658ba51ecffd10771161141864da7e7e4435a6e0 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=7932) 
   * 79639f70c86407d2cce694670fad2170e3f59ee7 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

