Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/02 15:29:39 UTC

[GitHub] [iceberg] Omega359 opened a new issue, #6106: SparkBatchQueryScan logs too much

Omega359 opened a new issue, #6106:
URL: https://github.com/apache/iceberg/issues/6106

   ### Apache Iceberg version
   
   1.0.0 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   `SparkBatchQueryScan` has two INFO-level log statements that can produce enormous log lines. For example, here is one such line:
   
   `22/11/02 15:22:56 INFO SparkBatchQueryScan: Trying to filter 2500 files using runtime filter (ref(name="<field name>") in ("020999", "610602", "003585", "004740", "008878", "011891", "012528", "020107", "600428", "610014", "015905", "012833", "610020", "610502", "004336", "015581", "020545", "005285", "009893", "015995", "004682", "004915", "006631", "017134", "610480", "007895", "610415", "003858", "014211", "019595", "610378", "610524", "009430", "017688", "400023", "601341", "610144", "610455", "610468", "004766", "019587", "610272", "610084", "610170", "004527", "017010", "018951", "021684", "610097", "019025",...`
   
   The list in the log line above contains a few thousand items. If `runtimeFilterExpr` is to be logged, it should be trimmed to a reasonable length before logging, or the log level should be lowered to DEBUG.
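
The trimming suggested above could look roughly like the following. This is only a minimal sketch in plain Java, not code from Iceberg: the class name `LogAbbreviator`, the method `abbreviate`, and the 200-character limit used in `main` are all hypothetical and chosen purely for illustration.

```java
// Hypothetical helper, not part of Iceberg: shortens a string before it is logged.
public final class LogAbbreviator {
    private LogAbbreviator() {}

    // Returns text unchanged if it fits within maxChars; otherwise keeps the
    // first maxChars characters and appends a note with the number dropped.
    public static String abbreviate(String text, int maxChars) {
        if (maxChars < 0) {
            throw new IllegalArgumentException("maxChars must be non-negative");
        }
        if (text == null || text.length() <= maxChars) {
            return text;
        }
        int omitted = text.length() - maxChars;
        return text.substring(0, maxChars) + "... (" + omitted + " more chars)";
    }

    public static void main(String[] args) {
        // Build a long IN-list similar in shape to the one in the report.
        StringBuilder filter = new StringBuilder("ref(name=\"field\") in (");
        for (int i = 0; i < 3000; i++) {
            if (i > 0) {
                filter.append(", ");
            }
            filter.append('"').append(String.format("%06d", i)).append('"');
        }
        filter.append(')');

        // Only the abbreviated form would be handed to the logger.
        System.out.println(abbreviate(filter.toString(), 200));
    }
}
```

At a call site, the string form of the filter expression would be passed through such a helper before reaching the logger, so the cost of a log line stays bounded regardless of how many values the runtime filter carries.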


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue closed issue #6106: SparkBatchQueryScan logs too much

Posted by GitBox <gi...@apache.org>.
rdblue closed issue #6106: SparkBatchQueryScan logs too much
URL: https://github.com/apache/iceberg/issues/6106




[GitHub] [iceberg] nastra commented on issue #6106: SparkBatchQueryScan logs too much

Posted by GitBox <gi...@apache.org>.
nastra commented on issue #6106:
URL: https://github.com/apache/iceberg/issues/6106#issuecomment-1300721503

   I think those 2 log lines in `SparkBatchQueryScan` should be using `ExpressionUtil.sanitize(..)`. @Omega359 would you like to open up a PR for this and I'll make sure to review it?

