You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/07/08 05:16:26 UTC

[GitHub] [iceberg] huaxingao opened a new pull request, #5227: backport #5204 to Spark3.2

huaxingao opened a new pull request, #5227:
URL: https://github.com/apache/iceberg/pull/5227

   This PR backports the changes in https://github.com/apache/iceberg/pull/5204 to Spark 3.2


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5227:
URL: https://github.com/apache/iceberg/pull/5227#discussion_r917177221


##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java:
##########
@@ -100,7 +100,13 @@ public Filter[] pushFilters(Filter[] filters) {
     List<Filter> pushed = Lists.newArrayListWithExpectedSize(filters.length);
 
     for (Filter filter : filters) {
-      Expression expr = SparkFilters.convert(filter);
+      Expression expr = null;
+      try {
+        expr = SparkFilters.convert(filter);
+      } catch (IllegalArgumentException e) {
+        // converting to Iceberg Expression failed, so this expression cannot be pushed down

Review Comment:
   Yea there was a similar comment on : https://github.com/apache/iceberg/pull/5204#discussion_r914817448, but seemed the positive filters.  I'm ok to log unsuccessful filters too (could also add the exprs not successfully bound as well in the next statement)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] huaxingao commented on pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
huaxingao commented on PR #5227:
URL: https://github.com/apache/iceberg/pull/5227#issuecomment-1179100180

   @szehon-ho Do we need to back-port to 3.1, 3.0 and 2.4 too?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on PR #5227:
URL: https://github.com/apache/iceberg/pull/5227#issuecomment-1179260602

   Thanks @huaxingao for backport and @hililiwei for additional review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5227:
URL: https://github.com/apache/iceberg/pull/5227#discussion_r917177221


##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java:
##########
@@ -100,7 +100,13 @@ public Filter[] pushFilters(Filter[] filters) {
     List<Filter> pushed = Lists.newArrayListWithExpectedSize(filters.length);
 
     for (Filter filter : filters) {
-      Expression expr = SparkFilters.convert(filter);
+      Expression expr = null;
+      try {
+        expr = SparkFilters.convert(filter);
+      } catch (IllegalArgumentException e) {
+        // converting to Iceberg Expression failed, so this expression cannot be pushed down

Review Comment:
   Yea there was a similar comment on : https://github.com/apache/iceberg/pull/5204#discussion_r914817448, but I think the point was the pushed filters are logged.  
   
   I'm ok to log unsuccessful filters too if it helps (could also add the exprs unsuccessfully bound as well in the next statement)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] bryanck commented on a diff in pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
bryanck commented on code in PR #5227:
URL: https://github.com/apache/iceberg/pull/5227#discussion_r917098256


##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java:
##########
@@ -100,7 +100,13 @@ public Filter[] pushFilters(Filter[] filters) {
     List<Filter> pushed = Lists.newArrayListWithExpectedSize(filters.length);
 
     for (Filter filter : filters) {
-      Expression expr = SparkFilters.convert(filter);
+      Expression expr = null;
+      try {
+        expr = SparkFilters.convert(filter);
+      } catch (IllegalArgumentException e) {
+        // converting to Iceberg Expression failed, so this expression cannot be pushed down

Review Comment:
   We should probably log something here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] huaxingao commented on a diff in pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
huaxingao commented on code in PR #5227:
URL: https://github.com/apache/iceberg/pull/5227#discussion_r917196630


##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java:
##########
@@ -100,7 +100,13 @@ public Filter[] pushFilters(Filter[] filters) {
     List<Filter> pushed = Lists.newArrayListWithExpectedSize(filters.length);
 
     for (Filter filter : filters) {
-      Expression expr = SparkFilters.convert(filter);
+      Expression expr = null;
+      try {
+        expr = SparkFilters.convert(filter);
+      } catch (IllegalArgumentException e) {
+        // converting to Iceberg Expression failed, so this expression cannot be pushed down

Review Comment:
   I am OK to have a follow-up to log the unsuccessful filters and also the unsuccessfully bound exprs in the next statement



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on PR #5227:
URL: https://github.com/apache/iceberg/pull/5227#issuecomment-1179194873

   @huaxingao yea I think its always beneficial if the issue exists in Spark, I was more eager about fixing 3.2 personally but I think we can do it to the others as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] huaxingao commented on pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
huaxingao commented on PR #5227:
URL: https://github.com/apache/iceberg/pull/5227#issuecomment-1179265583

   Thanks a lot @szehon-ho @hililiwei 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] bryanck commented on a diff in pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
bryanck commented on code in PR #5227:
URL: https://github.com/apache/iceberg/pull/5227#discussion_r917194149


##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java:
##########
@@ -100,7 +100,13 @@ public Filter[] pushFilters(Filter[] filters) {
     List<Filter> pushed = Lists.newArrayListWithExpectedSize(filters.length);
 
     for (Filter filter : filters) {
-      Expression expr = SparkFilters.convert(filter);
+      Expression expr = null;
+      try {
+        expr = SparkFilters.convert(filter);
+      } catch (IllegalArgumentException e) {
+        // converting to Iceberg Expression failed, so this expression cannot be pushed down

Review Comment:
   It could be useful to know that a filter was intentionally not pushed down for a particular reason



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5227:
URL: https://github.com/apache/iceberg/pull/5227#discussion_r917177221


##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java:
##########
@@ -100,7 +100,13 @@ public Filter[] pushFilters(Filter[] filters) {
     List<Filter> pushed = Lists.newArrayListWithExpectedSize(filters.length);
 
     for (Filter filter : filters) {
-      Expression expr = SparkFilters.convert(filter);
+      Expression expr = null;
+      try {
+        expr = SparkFilters.convert(filter);
+      } catch (IllegalArgumentException e) {
+        // converting to Iceberg Expression failed, so this expression cannot be pushed down

Review Comment:
   Yea there was a similar comment on : https://github.com/apache/iceberg/pull/5204#discussion_r914817448, but I think the point was the pushed filters are logged.  
   
   I'm ok to log unsuccessful filters too if it helps (could also add the exprs not successfully bound as well in the next statement)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho merged pull request #5227: backport #5204 to Spark3.2

Posted by GitBox <gi...@apache.org>.
szehon-ho merged PR #5227:
URL: https://github.com/apache/iceberg/pull/5227


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org