Posted to issues@orc.apache.org by GitBox <gi...@apache.org> on 2022/09/29 11:45:06 UTC

[GitHub] [orc] deshanxiao opened a new pull request, #1267: [ORC-1283] Bugfix: ENABLE_INDEXES does not take effect

deshanxiao opened a new pull request, #1267:
URL: https://github.com/apache/orc/pull/1267

   ### What changes were proposed in this pull request?
   This PR aims to fix the problem that ENABLE_INDEXES does not take effect.
   
   ### Why are the changes needed?
   Currently, even when the ORC config `ENABLE_INDEXES` is set to `false`, ORC still writes the index, because whether an index is written depends only on the `ROW_INDEX_STRIDE` setting.
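
The reported behavior can be pictured with a small model. This is only a sketch under stated assumptions: the class and method names are hypothetical and this is not ORC's writer code. It assumes, as the review comments in this thread suggest, that a stride greater than 0 is what turns index writing on.

```java
// Hypothetical model of the writer's decision; not ORC's actual implementation.
class IndexDecisionSketch {

    // Behavior described in the report: only the stride is consulted,
    // so the enableIndexes flag is silently ignored.
    static boolean writesIndexBeforeFix(boolean enableIndexes, int rowIndexStride) {
        return rowIndexStride > 0;
    }

    // Intended behavior: an index is written only if the flag allows it
    // and the stride is positive.
    static boolean writesIndexAfterFix(boolean enableIndexes, int rowIndexStride) {
        return enableIndexes && rowIndexStride > 0;
    }

    public static void main(String[] args) {
        int stride = 10000; // illustrative stride value
        System.out.println(writesIndexBeforeFix(false, stride));
        System.out.println(writesIndexAfterFix(false, stride));
    }
}
```

With the flag set to `false` and a positive stride, the first method still reports that an index is written, which is the mismatch this PR describes.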
   
   
   ### How was this patch tested?
   Existing unit tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [orc] dongjoon-hyun commented on pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #1267:
URL: https://github.com/apache/orc/pull/1267#issuecomment-1372498490

   Thank you, @PengleiShi and @deshanxiao .




[GitHub] [orc] deshanxiao commented on pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
deshanxiao commented on PR #1267:
URL: https://github.com/apache/orc/pull/1267#issuecomment-1263028041

   Sure, will add test later.




[GitHub] [orc] dongjoon-hyun commented on a diff in pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #1267:
URL: https://github.com/apache/orc/pull/1267#discussion_r1065591998


##########
java/core/src/java/org/apache/orc/OrcFile.java:
##########
@@ -905,6 +907,10 @@ public int getRowIndexStride() {
       return rowIndexStrideValue;
     }
 
+    public boolean isBuildIndex() {

Review Comment:
   Here is the PR with your authorship.
   - https://github.com/apache/orc/pull/1371





[GitHub] [orc] PengleiShi commented on pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
PengleiShi commented on PR #1267:
URL: https://github.com/apache/orc/pull/1267#issuecomment-1371897450

   > Could you file a JIRA please, @PengleiShi ?
   
   @dongjoon-hyun  https://issues.apache.org/jira/browse/ORC-1343




[GitHub] [orc] deshanxiao commented on a diff in pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
deshanxiao commented on code in PR #1267:
URL: https://github.com/apache/orc/pull/1267#discussion_r1065590489


##########
java/core/src/java/org/apache/orc/OrcFile.java:
##########
@@ -905,6 +907,10 @@ public int getRowIndexStride() {
       return rowIndexStrideValue;
     }
 
+    public boolean isBuildIndex() {

Review Comment:
   Yeah, we could re-implement it as `rowIndexStrideValue > 0`.





[GitHub] [orc] guiyanakuang commented on a diff in pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on code in PR #1267:
URL: https://github.com/apache/orc/pull/1267#discussion_r1065603218


##########
java/core/src/java/org/apache/orc/OrcFile.java:
##########
@@ -905,6 +907,10 @@ public int getRowIndexStride() {
       return rowIndexStrideValue;
     }
 
+    public boolean isBuildIndex() {

Review Comment:
   > @deshanxiao It seems that we cannot revert this because this is a `public` API.
   
   Agreed. It looks like the only option is to set `orc.row.index.stride` to 0 when `orc.create.index=false`.
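
A minimal sketch of that idea, combined with the suggestion elsewhere in this thread to derive `isBuildIndex()` from the stride. `WriterOptionsSketch` is a hypothetical stand-in, not the real `OrcFile.WriterOptions`; the default stride of 10000 and the `buildIndex` setter name are assumptions made for illustration.

```java
// Hypothetical stand-in for a writer options class; not ORC's actual API.
class WriterOptionsSketch {

    // Assumed default stride, used here for illustration only.
    private int rowIndexStrideValue = 10000;

    WriterOptionsSketch rowIndexStride(int stride) {
        this.rowIndexStrideValue = stride;
        return this;
    }

    // Disabling the index is expressed by forcing the stride to 0,
    // so the stride stays the single source of truth.
    WriterOptionsSketch buildIndex(boolean build) {
        if (!build) {
            this.rowIndexStrideValue = 0;
        }
        return this;
    }

    int getRowIndexStride() {
        return rowIndexStrideValue;
    }

    // The public getter is kept, but derived from the stride.
    boolean isBuildIndex() {
        return rowIndexStrideValue > 0;
    }
}
```

In this sketch the order of calls matters: calling `rowIndexStride(5000)` after `buildIndex(false)` would turn the index back on, so a real implementation would need to decide which setting wins.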





[GitHub] [orc] dongjoon-hyun commented on pull request #1267: [ORC-1283] Bugfix: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #1267:
URL: https://github.com/apache/orc/pull/1267#issuecomment-1262726020

   Also, cc @williamhyun , @guiyanakuang 




[GitHub] [orc] dongjoon-hyun commented on a diff in pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #1267:
URL: https://github.com/apache/orc/pull/1267#discussion_r1065555056


##########
java/core/src/java/org/apache/orc/OrcFile.java:
##########
@@ -905,6 +907,10 @@ public int getRowIndexStride() {
       return rowIndexStrideValue;
     }
 
+    public boolean isBuildIndex() {

Review Comment:
   @deshanxiao It seems that we cannot revert this because this is a `public` API.





[GitHub] [orc] guiyanakuang commented on pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on PR #1267:
URL: https://github.com/apache/orc/pull/1267#issuecomment-1263011750

   > Thank you for making a PR, @deshanxiao . Do you think you can make a test case for this one?
   
   +1




[GitHub] [orc] PengleiShi commented on pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
PengleiShi commented on PR #1267:
URL: https://github.com/apache/orc/pull/1267#issuecomment-1371736730

   @deshanxiao Hi, have you tested Spark reading ORC files that have no index? It seems ORC filter pushdown has to be set to false; otherwise the read fails with this error:
   ```
   if (indexes[columnIx] == null) {
     throw new AssertionError("Index is not populated for " + columnIx);
   }
   ```
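
To make the failure mode concrete, here is a small self-contained model of it. Everything in it is hypothetical: `MissingIndexSketch`, `pickRowGroups`, and the `Object[]` used to stand in for per-column row indexes are illustrative names, not the ORC reader's code. It only shows why the check above trips when pushdown is on and the file has no index, and why turning pushdown off avoids it.

```java
import java.util.Arrays;

// Hypothetical model of row-group selection; not the ORC reader's implementation.
class MissingIndexSketch {

    // indexes[columnIx] == null models a file written without a row index.
    static boolean[] pickRowGroups(Object[] indexes, int columnIx,
                                   boolean pushdownEnabled, int rowGroupCount) {
        boolean[] include = new boolean[rowGroupCount];
        Arrays.fill(include, true);
        if (!pushdownEnabled) {
            // Without pushdown the index is never consulted: read everything.
            return include;
        }
        if (indexes[columnIx] == null) {
            throw new AssertionError("Index is not populated for " + columnIx);
        }
        // A real reader would evaluate the predicate against index statistics
        // here and clear entries for row groups that cannot match.
        return include;
    }
}
```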
   




[GitHub] [orc] deshanxiao commented on pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
deshanxiao commented on PR #1267:
URL: https://github.com/apache/orc/pull/1267#issuecomment-1271325439

   Thank you! @dongjoon-hyun 




[GitHub] [orc] dongjoon-hyun commented on pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #1267:
URL: https://github.com/apache/orc/pull/1267#issuecomment-1371866194

   Could you file a JIRA please, @PengleiShi ?




[GitHub] [orc] deshanxiao commented on pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
deshanxiao commented on PR #1267:
URL: https://github.com/apache/orc/pull/1267#issuecomment-1372253655

   Thank you for reporting the issue. Let's discuss it in [ORC-1343](https://issues.apache.org/jira/browse/ORC-1343) 
   
   Thank you @PengleiShi @dongjoon-hyun 




[GitHub] [orc] dongjoon-hyun commented on a diff in pull request #1267: ORC-1283: ENABLE_INDEXES does not take effect

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #1267:
URL: https://github.com/apache/orc/pull/1267#discussion_r1065607654


##########
java/core/src/java/org/apache/orc/OrcFile.java:
##########
@@ -905,6 +907,10 @@ public int getRowIndexStride() {
       return rowIndexStrideValue;
     }
 
+    public boolean isBuildIndex() {

Review Comment:
   Thank you, @guiyanakuang .


