You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/18 12:44:17 UTC

[GitHub] [spark] HeartSaVioR opened a new pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

HeartSaVioR opened a new pull request #30411:
URL: https://github.com/apache/spark/pull/30411


   ### What changes were proposed in this pull request?
   
   Two new options, _modifiiedBefore_  and _modifiedAfter_, is provided expecting a value in 'YYYY-MM-DDTHH:mm:ss' format.  _PartioningAwareFileIndex_ considers these options during the process of checking for files, just before considering applied _PathFilters_ such as `pathGlobFilter.`  In order to filter file results, a new PathFilter class was derived for this purpose.  General house-keeping around classes extending PathFilter was performed for neatness.  It became apparent support was needed to handle multiple potential path filters.  Logic was introduced for this purpose and the associated tests written.  
   
   ### Why are the changes needed?
   
   When loading files from a data source, there can often times be thousands of file within a respective file path.  In many cases I've seen, we want to start loading from a folder path and ideally be able to begin loading files having modification dates past a certain point.  This would mean out of thousands of potential files, only the ones with modification dates greater than the specified timestamp would be considered.  This saves a ton of time automatically and reduces significant complexity managing this in code.
   
   ### Does this PR introduce _any_ user-facing change?
   
   This PR introduces an option that can be used with batch-based Spark file data sources.  A documentation update was made to reflect an example and usage of the new data source option.
   
   **Example Usages**  
   _Load all CSV files modified after date:_ 
   `spark.read.format("csv").option("modifiedAfter","2020-06-15T05:00:00").load()`
   
   _Load all CSV files modified before date:_
   `spark.read.format("csv").option("modifiedBefore","2020-06-15T05:00:00").load()`  
   
   _Load all CSV files modified between two dates:_ 
   `spark.read.format("csv").option("modifiedAfter","2019-01-15T05:00:00").option("modifiedBefore","2020-06-15T05:00:00").load()  
   `
   
   ### How was this patch tested?
   
   A handful of unit tests were added to support the positive, negative, and edge case code paths. It's also live in a handful of our Databricks dev environments.  


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731816785


   **[Test build #131506 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131506/testReport)** for PR 30411 at commit [`ce00b6d`](https://github.com/apache/spark/commit/ce00b6dc54ed681e7172db2856fb444d51d3f75c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729751429


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35891/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730863022


   > The origin PR has been open for months, and I only refactored a bit & fixed the doc. I'll merge this in early next week if there's no further comment.
   
   Yea, it looks fine to me. Thanks for the take-over, @HeartSaVioR and thanks a lot for the valuable contribution, @cchighman !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730849515


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35996/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730920895






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729684868


   cc. @maropu @gengliangwang @cloud-fan @bart-samwel @zsxwing @dongjoon-hyun @HyukjinKwon 
   
   I've cc-ed all reviewers who left at least one review comment in original PR (#28841)
   
   Just FYI to @cchighman as he's an author of original PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR edited a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR edited a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731866114


   Thanks all for reviewing, and thanks again @cchighman for providing great stuff!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729822162






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731865270


   **[Test build #131506 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131506/testReport)** for PR 30411 at commit [`ce00b6d`](https://github.com/apache/spark/commit/ce00b6dc54ed681e7172db2856fb444d51d3f75c).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730836902


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35996/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729728567


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35886/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730852959


   Thanks for the review, @maropu !
   
   The origin PR has been open for months, and I only refactored a bit & fixed the doc. I'll merge this in early next week if there's no further comment.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730817075


   **[Test build #131392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131392/testReport)** for PR 30411 at commit [`ce00b6d`](https://github.com/apache/spark/commit/ce00b6dc54ed681e7172db2856fb444d51d3f75c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729820414


   **[Test build #131281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131281/testReport)** for PR 30411 at commit [`80170b4`](https://github.com/apache/spark/commit/80170b4feb2fe9ab24ccfc694b7d7361bc4eec66).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729741922






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729711255


   **[Test build #131287 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131287/testReport)** for PR 30411 at commit [`d216fcb`](https://github.com/apache/spark/commit/d216fcb9562e05116b6d01e29255cc64e8a10d4c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729653671


   **[Test build #131281 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131281/testReport)** for PR 30411 at commit [`80170b4`](https://github.com/apache/spark/commit/80170b4feb2fe9ab24ccfc694b7d7361bc4eec66).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729711255


   **[Test build #131287 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131287/testReport)** for PR 30411 at commit [`d216fcb`](https://github.com/apache/spark/commit/d216fcb9562e05116b6d01e29255cc64e8a10d4c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729741889


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35890/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amadav commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
amadav commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-767906694


   Does it still need to be merged?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730849501


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35996/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730920919


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/131392/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729692226






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729685754


   **[Test build #131284 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131284/testReport)** for PR 30411 at commit [`088b630`](https://github.com/apache/spark/commit/088b6309c60a624f61e989dce416c248ce7c2858).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-733440592


   I am supportive of this change FWIW. I think it's good to have this feature.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729876666






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729685754


   **[Test build #131284 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131284/testReport)** for PR 30411 at commit [`088b630`](https://github.com/apache/spark/commit/088b6309c60a624f61e989dce416c248ce7c2858).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729893684


   **[Test build #131287 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131287/testReport)** for PR 30411 at commit [`d216fcb`](https://github.com/apache/spark/commit/d216fcb9562e05116b6d01e29255cc64e8a10d4c).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729766257


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35891/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729875388


   **[Test build #131285 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131285/testReport)** for PR 30411 at commit [`71456f2`](https://github.com/apache/spark/commit/71456f25dab7c283a6de821606b8259261da82fa).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731866114


   Thanks all for reviewing and merging, and thanks again @cchighman for providing great stuff!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731839149






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729728598






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729692201


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35885/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731865503






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731865448


   OK. No further comments, and test passed. Merging to master.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729895116






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729653671


   **[Test build #131281 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131281/testReport)** for PR 30411 at commit [`80170b4`](https://github.com/apache/spark/commit/80170b4feb2fe9ab24ccfc694b7d7361bc4eec66).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729687177






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731815952


   retest this, please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-767929291


   No this is merged and will be available in new minor version, release 3.1.1.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729725979


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35890/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR closed pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR closed pull request #30411:
URL: https://github.com/apache/spark/pull/30411


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a change in pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #30411:
URL: https://github.com/apache/spark/pull/30411#discussion_r526424771



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/PathFilterSuite.scala
##########
@@ -0,0 +1,307 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import java.io.File
+import java.time.{LocalDateTime, ZoneId, ZoneOffset}
+import java.time.format.DateTimeFormatter
+
+import scala.util.Random
+
+import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
+import org.apache.spark.sql.catalyst.util.{stringToFile, DateTimeUtils}
+import org.apache.spark.sql.test.SharedSparkSession
+import org.apache.spark.sql.types.{StringType, StructField, StructType}
+
+class PathFilterSuite extends QueryTest with SharedSparkSession {
+  import testImplicits._
+
+  test("SPARK-31962: modifiedBefore specified" +
+      " and sharing same timestamp with file last modified time.") {
+    withTempDir { dir =>
+      val curTime = LocalDateTime.now(ZoneOffset.UTC)
+      executeTest(dir, Seq(curTime), 0, modifiedBefore = Some(formatTime(curTime)))
+    }
+  }
+
+  test("SPARK-31962: modifiedAfter specified" +
+      " and sharing same timestamp with file last modified time.") {
+    withTempDir { dir =>
+      val curTime = LocalDateTime.now(ZoneOffset.UTC)
+      executeTest(dir, Seq(curTime), 0, modifiedAfter = Some(formatTime(curTime)))
+    }
+  }
+
+  test("SPARK-31962: modifiedBefore and modifiedAfter option" +
+      " share same timestamp with file last modified time.") {
+    withTempDir { dir =>
+      val curTime = LocalDateTime.now(ZoneOffset.UTC)
+      val formattedTime = formatTime(curTime)
+      executeTest(dir, Seq(curTime), 0, modifiedBefore = Some(formattedTime),
+        modifiedAfter = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: modifiedBefore and modifiedAfter option" +
+      " share same timestamp with earlier file last modified time.") {
+    withTempDir { dir =>
+      val curTime = LocalDateTime.now(ZoneOffset.UTC)
+      val fileTime = curTime.minusDays(3)
+      val formattedTime = formatTime(curTime)
+      executeTest(dir, Seq(fileTime), 0, modifiedBefore = Some(formattedTime),
+        modifiedAfter = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: modifiedBefore and modifiedAfter option" +
+      " share same timestamp with later file last modified time.") {
+    withTempDir { dir =>
+      val curTime = LocalDateTime.now(ZoneOffset.UTC)
+      val formattedTime = formatTime(curTime)
+      executeTest(dir, Seq(curTime), 0, modifiedBefore = Some(formattedTime),
+        modifiedAfter = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: when modifiedAfter specified with a past date") {
+    withTempDir { dir =>
+      val curTime = LocalDateTime.now(ZoneOffset.UTC)
+      val pastTime = curTime.minusYears(1)
+      val formattedTime = formatTime(pastTime)
+      executeTest(dir, Seq(curTime), 1, modifiedAfter = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: when modifiedBefore specified with a future date") {
+    withTempDir { dir =>
+      val curTime = LocalDateTime.now(ZoneOffset.UTC)
+      val futureTime = curTime.plusYears(1)
+      val formattedTime = formatTime(futureTime)
+      executeTest(dir, Seq(curTime), 1, modifiedBefore = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: with modifiedBefore option provided using a past date") {
+    withTempDir { dir =>
+      val curTime = LocalDateTime.now(ZoneOffset.UTC)
+      val pastTime = curTime.minusYears(1)
+      val formattedTime = formatTime(pastTime)
+      executeTest(dir, Seq(curTime), 0, modifiedBefore = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: modifiedAfter specified with a past date, multiple files, one valid") {
+    withTempDir { dir =>
+      val fileTime1 = LocalDateTime.now(ZoneOffset.UTC)
+      val fileTime2 = LocalDateTime.ofEpochSecond(0, 0, ZoneOffset.UTC)
+      val pastTime = fileTime1.minusYears(1)
+      val formattedTime = formatTime(pastTime)
+      executeTest(dir, Seq(fileTime1, fileTime2), 1, modifiedAfter = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: modifiedAfter specified with a past date, multiple files, both valid") {
+    withTempDir { dir =>
+      val curTime = LocalDateTime.now(ZoneOffset.UTC)
+      val pastTime = curTime.minusYears(1)
+      val formattedTime = formatTime(pastTime)
+      executeTest(dir, Seq(curTime, curTime), 2, modifiedAfter = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: modifiedAfter specified with a past date, multiple files, none valid") {
+    withTempDir { dir =>
+      val fileTime = LocalDateTime.ofEpochSecond(0, 0, ZoneOffset.UTC)
+      val pastTime = LocalDateTime.now(ZoneOffset.UTC).minusYears(1)
+      val formattedTime = formatTime(pastTime)
+      executeTest(dir, Seq(fileTime, fileTime), 0, modifiedAfter = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: modifiedBefore specified with a future date, multiple files, both valid") {
+    withTempDir { dir =>
+      val fileTime = LocalDateTime.ofEpochSecond(0, 0, ZoneOffset.UTC)
+      val futureTime = LocalDateTime.now(ZoneOffset.UTC).plusYears(1)
+      val formattedTime = formatTime(futureTime)
+      executeTest(dir, Seq(fileTime, fileTime), 2, modifiedBefore = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: modifiedBefore specified with a future date, multiple files, one valid") {
+    withTempDir { dir =>
+      val curTime = LocalDateTime.now(ZoneOffset.UTC)
+      val fileTime1 = LocalDateTime.ofEpochSecond(0, 0, ZoneOffset.UTC)
+      val fileTime2 = curTime.plusDays(3)
+      val formattedTime = formatTime(curTime)
+      executeTest(dir, Seq(fileTime1, fileTime2), 1, modifiedBefore = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: modifiedBefore specified with a future date, multiple files, none valid") {
+    withTempDir { dir =>
+      val fileTime = LocalDateTime.now(ZoneOffset.UTC).minusDays(1)
+      val formattedTime = formatTime(fileTime)
+      executeTest(dir, Seq(fileTime, fileTime), 0, modifiedBefore = Some(formattedTime))
+    }
+  }
+
+  test("SPARK-31962: modifiedBefore/modifiedAfter is specified with an invalid date") {
+    executeTestWithBadOption(
+      Map("modifiedBefore" -> "2024-05+1 01:00:00"),
+      Seq("The timestamp provided", "modifiedbefore", "2024-05+1 01:00:00"))
+
+    executeTestWithBadOption(
+      Map("modifiedAfter" -> "2024-05+1 01:00:00"),
+      Seq("The timestamp provided", "modifiedafter", "2024-05+1 01:00:00"))
+  }
+
+  test("SPARK-31962: modifiedBefore/modifiedAfter - empty option") {
+    executeTestWithBadOption(
+      Map("modifiedBefore" -> ""),
+      Seq("The timestamp provided", "modifiedbefore"))
+
+    executeTestWithBadOption(
+      Map("modifiedAfter" -> ""),
+      Seq("The timestamp provided", "modifiedafter"))
+  }
+
+  test("SPARK-31962: modifiedBefore/modifiedAfter filter takes into account local timezone " +
+      "when specified as an option.") {
+    Seq("modifiedbefore", "modifiedafter").foreach { filterName =>
+      // CET = UTC + 1 hour, HST = UTC - 10 hours
+      Seq("CET", "HST").foreach { tzId =>
+        testModifiedDateFilterWithTimezone(tzId, filterName)
+      }
+    }
+  }
+
+  test("Option pathGlobFilter: filter files correctly") {

Review comment:
       Note to reviewers: two tests were moved from FileBasedDataSourceSuite as this PR adds the specific suite for filtering option.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730917327


   **[Test build #131392 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131392/testReport)** for PR 30411 at commit [`ce00b6d`](https://github.com/apache/spark/commit/ce00b6dc54ed681e7172db2856fb444d51d3f75c).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729822162






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #30411:
URL: https://github.com/apache/spark/pull/30411#discussion_r527325885



##########
File path: examples/src/main/scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala
##########
@@ -81,6 +81,27 @@ object SQLDataSourceExample {
     // |file1.parquet|
     // +-------------+
     // $example off:load_with_path_glob_filter$
+    // $example on:load_with_modified_time_filter$
+    val beforeFilterDF = spark.read.format("parquet")
+        // Files modified before 07/01/2020 at 05:30 are allowed
+        .option("modifiedBefore", "2020-07-01T05:30:00")

Review comment:
       nit: two indents to follow the other examples.

##########
File path: python/pyspark/sql/readwriter.py
##########
@@ -765,7 +832,8 @@ def jdbc(self, url, table, column=None, lowerBound=None, upperBound=None, numPar
         """
         if properties is None:
             properties = dict()
-        jprop = JavaClass("java.util.Properties", self._spark._sc._gateway._gateway_client)()
+        jprop = JavaClass("java.util.Properties",
+                          self._spark._sc._gateway._gateway_client)()

Review comment:
       a unnecessary change?

##########
File path: docs/sql-data-sources-generic-options.md
##########
@@ -119,3 +119,40 @@ To load all files recursively, you can use:
 {% include_example recursive_file_lookup r/RSparkSQLExample.R %}
 </div>
 </div>
+
+### Modification Time Path Filters
+
+`modifiedBefore` and `modifiedAfter` are options that can be 
+applied together or separately in order to achieve greater
+granularity over which files may load during a Spark batch query.
+(Structured Streaming file source doesn't support these options.)

Review comment:
       nit: `(Structured Streaming file source doesn't support these options.)` -> `Note that Structured Streaming file sources don't support these options.`?

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamOptions.scala
##########
@@ -32,6 +33,16 @@ class FileStreamOptions(parameters: CaseInsensitiveMap[String]) extends Logging
 
   def this(parameters: Map[String, String]) = this(CaseInsensitiveMap(parameters))
 
+  checkDisallowedOptions(parameters)
+
+  private def checkDisallowedOptions(options: Map[String, String]): Unit = {
+    Seq(ModifiedBeforeFilter.PARAM_NAME, ModifiedAfterFilter.PARAM_NAME).foreach { param =>
+      if (parameters.contains(param)) {
+        throw new IllegalArgumentException(s"option '$param' is not allowed in file stream source")

Review comment:
       nit: `file stream source` -> `file stream sources`?

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala
##########
@@ -57,13 +57,10 @@ abstract class PartitioningAwareFileIndex(
   protected def leafDirToChildrenFiles: Map[Path, Array[FileStatus]]
 
   private val caseInsensitiveMap = CaseInsensitiveMap(parameters)
+  protected val pathFilters = PathFilterFactory.create(caseInsensitiveMap)

Review comment:
       `protected` -> `private`?

##########
File path: examples/src/main/scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala
##########
@@ -81,6 +81,27 @@ object SQLDataSourceExample {
     // |file1.parquet|
     // +-------------+
     // $example off:load_with_path_glob_filter$
+    // $example on:load_with_modified_time_filter$
+    val beforeFilterDF = spark.read.format("parquet")
+        // Files modified before 07/01/2020 at 05:30 are allowed
+        .option("modifiedBefore", "2020-07-01T05:30:00")
+        .load("examples/src/main/resources/dir1");
+    beforeFilterDF.show();
+    // +-------------+
+    // |         file|
+    // +-------------+
+    // |file1.parquet|
+    // +-------------+
+    val afterFilterDF = spark.read.format("parquet")
+         // Files modified after 06/01/2020 at 05:30 are allowed
+        .option("modifiedAfter", "2020-06-01T05:30:00")

Review comment:
       ditto

##########
File path: python/pyspark/sql/readwriter.py
##########
@@ -777,7 +845,8 @@ def jdbc(self, url, table, column=None, lowerBound=None, upperBound=None, numPar
                                                int(numPartitions), jprop))
         if predicates is not None:
             gateway = self._spark._sc._gateway
-            jpredicates = utils.toJArray(gateway, gateway.jvm.java.lang.String, predicates)
+            jpredicates = utils.toJArray(
+                gateway, gateway.jvm.java.lang.String, predicates)

Review comment:
       ditto (I have the same comments on the changes below, too).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729687077


   First commit squashed the commits in origin PR. Remaining commits are my own to address my own review comments. So if you've done with the origin PR and it looked good to you, you may probably want to only look at remaining commits.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731839149






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730920895


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729876666






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729695526


   **[Test build #131285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131285/testReport)** for PR 30411 at commit [`71456f2`](https://github.com/apache/spark/commit/71456f25dab7c283a6de821606b8259261da82fa).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729678786


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35885/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730817075


   **[Test build #131392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131392/testReport)** for PR 30411 at commit [`ce00b6d`](https://github.com/apache/spark/commit/ce00b6dc54ed681e7172db2856fb444d51d3f75c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729766283






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731832826


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36110/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729687177






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729766283






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729728598






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729741922






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730849509


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731865503






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729695526


   **[Test build #131285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131285/testReport)** for PR 30411 at commit [`71456f2`](https://github.com/apache/spark/commit/71456f25dab7c283a6de821606b8259261da82fa).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730849509






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-730949018


   Build failure is not related. I'll let it go and check just before the merge.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729687167


   **[Test build #131284 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131284/testReport)** for PR 30411 at commit [`088b630`](https://github.com/apache/spark/commit/088b6309c60a624f61e989dce416c248ce7c2858).
    * This patch **fails Scala style tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729708500


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35886/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-729895116






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731839141


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36110/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30411:
URL: https://github.com/apache/spark/pull/30411#issuecomment-731816785


   **[Test build #131506 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131506/testReport)** for PR 30411 at commit [`ce00b6d`](https://github.com/apache/spark/commit/ce00b6dc54ed681e7172db2856fb444d51d3f75c).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org