Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/10/06 00:38:58 UTC
[GitHub] [spark] HyukjinKwon opened a new pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
HyukjinKwon opened a new pull request #34187:
URL: https://github.com/apache/spark/pull/34187
### What changes were proposed in this pull request?
This PR fixes the test failure:
```
Running tests...
----------------------------------------------------------------------
  test_read_images (pyspark.ml.tests.test_image.ImageFileFormatTest) ... ERROR (12.050s)
======================================================================
ERROR [12.050s]: test_read_images (pyspark.ml.tests.test_image.ImageFileFormatTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/ml/tests/test_image.py", line 35, in test_read_images
    self.assertEqual(df.count(), 4)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/dataframe.py", line 507, in count
    return int(self._jdf.count())
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/utils.py", line 98, in deco
    return f(*a, **kw)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o32.count.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost task 1.0 in stage 0.0 (TID 1, amp-jenkins-worker-05.amp, executor driver): javax.imageio.IIOException: Unsupported Image Type
    at com.sun.imageio.plugins.jpeg.JPEGImageReader.readInternal(JPEGImageReader.java:1079)
    at com.sun.imageio.plugins.jpeg.JPEGImageReader.read(JPEGImageReader.java:1050)
    at javax.imageio.ImageIO.read(ImageIO.java:1448)
    at javax.imageio.ImageIO.read(ImageIO.java:1352)
```
This exception apparently occurs when a malformed image is read with the `dropInvalid` option on. However, the code that calls `ImageIO.read` does not catch `javax.imageio.IIOException` for an invalid image, because that exception is not a `RuntimeException`.
In fact, `javax.imageio.IIOException` signals "run-time failure of reading" (see also https://docs.oracle.com/javase/8/docs/api/javax/imageio/IIOException.html).
Therefore, this PR adds `javax.imageio.IIOException` to the exceptions caught when reading an image, so that malformed images are handled properly.
I am not yet sure why the test is flaky rather than failing consistently. However, the fix should be correct.
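As a minimal illustration of the point above (a standalone sketch, not the code changed in this PR): `IIOException` extends `java.io.IOException`, so a handler that catches only `RuntimeException` lets it propagate. The class and method names below are hypothetical, and `decode` merely stands in for an image read that rejects a malformed file.

```java
import java.io.IOException;
import javax.imageio.IIOException;

public class IIOExceptionHierarchyDemo {

    // Hypothetical stand-in for an image read that rejects a malformed file.
    static void decode() throws IOException {
        throw new IIOException("Unsupported Image Type");
    }

    // Handler that guards against RuntimeException only: IIOException escapes.
    static String catchRuntimeOnly() throws IOException {
        try {
            decode();
            return "decoded";
        } catch (RuntimeException e) {
            return "invalid";
        }
    }

    // Handler that additionally catches IIOException: the failure is absorbed.
    static String catchRuntimeAndIIO() throws IOException {
        try {
            decode();
            return "decoded";
        } catch (RuntimeException | IIOException e) {
            return "invalid";
        }
    }

    public static void main(String[] args) throws IOException {
        Exception e = new IIOException("example");
        System.out.println(e instanceof IOException);       // true
        System.out.println(e instanceof RuntimeException);  // false

        System.out.println(catchRuntimeAndIIO());            // invalid
        try {
            catchRuntimeOnly();
        } catch (IIOException escaped) {
            // The RuntimeException-only handler did not intercept the failure.
            System.out.println("escaped: " + escaped.getMessage());
        }
    }
}
```

In this sketch the first handler corresponds to the behavior seen in the stack trace, where the exception reaches the caller and aborts the job.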
### Why are the changes needed?
To fix the flaky tests, see https://github.com/apache/spark/runs/3802639160 as an example.
### Does this PR introduce _any_ user-facing change?
With the `dropInvalid` option enabled, users will be able to read malformed data even when `javax.imageio.IIOException` is thrown.
### How was this patch tested?
Existing unit tests. We should track whether the tests are still flaky.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935245516
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143862/
[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935301463
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48375/
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935164968
**[Test build #143862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143862/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).
[GitHub] [spark] codecov-commenter removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
codecov-commenter removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994
# [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#34187](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e9b698) into [master](https://codecov.io/gh/apache/spark/commit/38d39812c176e4b52a08397f7936f87ea32930e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38d3981) will **decrease** coverage by `8.15%`.
> The diff coverage is `89.50%`.
> :exclamation: Current head 2e9b698 differs from pull request most recent head 90957ff. Consider uploading reports for the commit 90957ff to get more accurate results
[![Impacted file tree graph](https://codecov.io/gh/apache/spark/pull/34187/graphs/tree.svg?width=650&height=150&src=pr&token=R9pHLWgWi8&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@            Coverage Diff             @@
##           master   #34187      +/-   ##
==========================================
- Coverage   90.23%   82.08%   -8.16%
==========================================
  Files         288      249      -39
  Lines       61113    55769    -5344
  Branches     8994     8505     -489
==========================================
- Hits        55146    45778    -9368
- Misses       4625     8812    +4187
+ Partials     1342     1179     -163
```
| Flag | Coverage Δ | |
|---|---|---|
| unittests | `82.06% <89.50%> (-8.15%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [python/pyspark/sql/functions.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2Z1bmN0aW9ucy5weQ==) | `90.39% <ø> (-1.18%)` | :arrow_down: |
| [python/pyspark/pandas/namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL25hbWVzcGFjZS5weQ==) | `77.17% <63.63%> (-0.30%)` | :arrow_down: |
| [python/pyspark/pandas/tests/test\_dataframe.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL3Rlc3RzL3Rlc3RfZGF0YWZyYW1lLnB5) | `94.75% <78.57%> (-0.16%)` | :arrow_down: |
| [python/pyspark/sql/session.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3Nlc3Npb24ucHk=) | `81.25% <83.65%> (-1.65%)` | :arrow_down: |
| [python/pyspark/sql/catalog.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2NhdGFsb2cucHk=) | `90.29% <90.90%> (-1.30%)` | :arrow_down: |
| [python/pyspark/sql/window.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3dpbmRvdy5weQ==) | `92.50% <92.00%> (-1.95%)` | :arrow_down: |
| [python/pyspark/pandas/accessors.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2FjY2Vzc29ycy5weQ==) | `92.01% <100.00%> (-0.14%)` | :arrow_down: |
| [python/pyspark/pandas/frame.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2ZyYW1lLnB5) | `96.17% <100.00%> (-0.01%)` | :arrow_down: |
| [python/pyspark/pandas/groupby.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2dyb3VwYnkucHk=) | `94.30% <100.00%> (-0.02%)` | :arrow_down: |
| [python/pyspark/pandas/indexes/multi.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2luZGV4ZXMvbXVsdGkucHk=) | `93.98% <100.00%> (+0.05%)` | :arrow_up: |
| ... and [93 more](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d6786e0...90957ff](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [spark] HyukjinKwon closed pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #34187:
URL: https://github.com/apache/spark/pull/34187
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937794941
**[Test build #143974 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143974/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937811482
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143974/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937759908
[GitHub] [spark] HyukjinKwon edited a comment on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937360888
Let me get this in in a few more days if there are no more comments. I believe this PR fixes the source to follow the original intention (reading a malformed file permissively, or showing which image is malformed, when the `dropInvalid` option is on), and it won't silently ignore exceptions from the filesystem since the file has already been read at this point.
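To make the reasoning above concrete, here is a rough sketch under stated assumptions; it is not Spark's implementation. Once the file bytes are in memory, any exception raised while decoding can only come from the image data, so it is treated as "invalid image" rather than a filesystem failure. `PermissiveImageDecodeDemo`, `decodeOrNull`, `load`, and `tinyPng` are hypothetical names, and the `dropInvalid` parameter only mirrors the option name.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.imageio.ImageIO;

public class PermissiveImageDecodeDemo {

    static {
        // Avoid ImageIO's temp-file cache; this demo only works on in-memory bytes.
        ImageIO.setUseCache(false);
    }

    /** Returns the decoded image, or null when the bytes cannot be decoded. */
    static BufferedImage decodeOrNull(byte[] bytes) {
        try {
            // ImageIO.read itself returns null when no registered reader recognizes the input.
            return ImageIO.read(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            // The bytes are already in memory, so a failure here points at the image
            // data (for example an IIOException), not at the filesystem.
            return null;
        }
    }

    /**
     * Simplified model of the dropInvalid behavior (an assumption, not Spark's code):
     * invalid entries are omitted when dropInvalid is true and kept with a null image otherwise.
     */
    static Map<String, BufferedImage> load(Map<String, byte[]> files, boolean dropInvalid) {
        Map<String, BufferedImage> result = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> file : files.entrySet()) {
            BufferedImage image = decodeOrNull(file.getValue());
            if (image != null || !dropInvalid) {
                result.put(file.getKey(), image);
            }
        }
        return result;
    }

    /** Builds a tiny valid PNG in memory so the demo needs no files on disk. */
    static byte[] tinyPng() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB), "png", out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("good.png", tinyPng());
        files.put("bad.jpg", "not an image".getBytes(StandardCharsets.UTF_8));

        System.out.println(load(files, true).keySet());   // only the decodable entry
        System.out.println(load(files, false).keySet());  // both entries; bad.jpg maps to null
    }
}
```

Reading the bytes from storage would happen before `decodeOrNull` is called, so an `IOException` from that step would still propagate in this model.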
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937811482
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143974/
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937839892
**[Test build #143989 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143989/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935244629
**[Test build #143862 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143862/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937867149
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143989/
[GitHub] [spark] WeichenXu123 commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
WeichenXu123 commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937662618
Why was the previous test flaky?
[GitHub] [spark] codecov-commenter commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994
# [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#34187](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e9b698) into [master](https://codecov.io/gh/apache/spark/commit/38d39812c176e4b52a08397f7936f87ea32930e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38d3981) will **decrease** coverage by `17.74%`.
> The diff coverage is `88.80%`.
> :exclamation: Current head 2e9b698 differs from pull request most recent head 90957ff. Consider uploading reports for the commit 90957ff to get more accurate results
[![Impacted file tree graph](https://codecov.io/gh/apache/spark/pull/34187/graphs/tree.svg?width=650&height=150&src=pr&token=R9pHLWgWi8&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@             Coverage Diff             @@
##           master   #34187       +/-   ##
===========================================
- Coverage   90.23%   72.48%   -17.75%
===========================================
  Files         288      240       -48
  Lines       61113    46182    -14931
  Branches     8994     7669     -1325
===========================================
- Hits        55146    33477    -21669
- Misses       4625    11352     +6727
- Partials     1342     1353       +11
```
| Flag | Coverage Δ | |
|---|---|---|
| unittests | `72.48% <88.80%> (-17.73%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [python/pyspark/sql/functions.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2Z1bmN0aW9ucy5weQ==) | `89.68% <ø> (-1.88%)` | :arrow_down: |
| [python/pyspark/pandas/frame.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2ZyYW1lLnB5) | `33.70% <16.66%> (-62.48%)` | :arrow_down: |
| [python/pyspark/pandas/namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL25hbWVzcGFjZS5weQ==) | `76.43% <63.63%> (-1.03%)` | :arrow_down: |
| [python/pyspark/sql/session.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3Nlc3Npb24ucHk=) | `81.25% <83.65%> (-1.65%)` | :arrow_down: |
| [python/pyspark/sql/catalog.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2NhdGFsb2cucHk=) | `90.29% <90.90%> (-1.30%)` | :arrow_down: |
| [python/pyspark/sql/window.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3dpbmRvdy5weQ==) | `92.50% <92.00%> (-1.95%)` | :arrow_down: |
| [python/pyspark/pandas/accessors.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2FjY2Vzc29ycy5weQ==) | `88.65% <100.00%> (-3.50%)` | :arrow_down: |
| [python/pyspark/pandas/groupby.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2dyb3VwYnkucHk=) | `75.47% <100.00%> (-18.85%)` | :arrow_down: |
| [python/pyspark/pandas/indexes/multi.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2luZGV4ZXMvbXVsdGkucHk=) | `53.48% <100.00%> (-40.45%)` | :arrow_down: |
| [python/pyspark/pandas/tests/test\_namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL3Rlc3RzL3Rlc3RfbmFtZXNwYWNlLnB5) | `98.87% <100.00%> (+0.10%)` | :arrow_up: |
| ... and [122 more](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d6786e0...90957ff](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937676711
**[Test build #143966 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143966/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).
[GitHub] [spark] HyukjinKwon commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-938237567
Merged to master.
[GitHub] [spark] SparkQA removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937760289
**[Test build #143989 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143989/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937832074
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48458/
[GitHub] [spark] srowen commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937721900
That seems reasonable to me, yes.
[GitHub] [spark] HyukjinKwon commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937360888
Let me merge this in a few more days if there are no more comments. I believe this PR fixes the source in line with its original intention (reading a malformed file permissively, or showing which image is malformed when the `dropInvalid` option is on), and it won't silently ignore exceptions from the filesystem, since the file has already been read at this point.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935245516
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143862/
[GitHub] [spark] HyukjinKwon commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935179555
If it fails to read for whatever reason, it will be stored as an invalid image in the row: https://github.com/apache/spark/blob/0494dc90af48ce7da0625485a4dc6917a244d580/mllib/src/main/scala/org/apache/spark/ml/source/image/ImageFileFormat.scala#L87
So, previously, the job would fail midway with `javax.imageio.IIOException`.
After this change, the image (bytes) will be stored in the returned row as an invalid image.
The test at https://github.com/apache/spark/blob/master/python/pyspark/ml/tests/test_image.py#L27-L32 actually reads https://github.com/apache/spark/blob/master/data/mllib/images/origin/kittens/not-image.txt, which is not an image file (so I think the code should be able to handle `javax.imageio.IIOException` anyway).
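The behaviour described above can be illustrated with a minimal, standalone sketch in plain Python. It is not Spark's code: `decode_png_size`, `UnsupportedImageError`, `to_row`, and the row fields (including the `-1` sentinels) are hypothetical stand-ins chosen for illustration. The point is only that a decode failure becomes a placeholder row instead of aborting the whole job.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


class UnsupportedImageError(Exception):
    """Hypothetical stand-in for a decoder error such as javax.imageio.IIOException."""


def decode_png_size(data):
    """Hypothetical decoder: return (width, height) from a PNG header, or raise."""
    if not data.startswith(PNG_SIGNATURE):
        raise UnsupportedImageError("Unsupported Image Type")
    # After the 8-byte signature come the 4-byte chunk length and the 4-byte
    # chunk type ("IHDR"), then big-endian width and height.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def to_row(origin, data):
    """Decode permissively: a failed decode yields a placeholder row, not an error."""
    try:
        width, height = decode_png_size(data)
    except Exception:
        # Assumption for this sketch: any decode failure marks the row invalid.
        return {"origin": origin, "width": -1, "height": -1, "valid": False}
    return {"origin": origin, "width": width, "height": height, "valid": True}
```

With this shape, a text file such as `not-image.txt` produces a row with `valid` set to `False`, and a count over all rows still succeeds. A truncated PNG raises a different exception type (`struct.error`) inside the decoder, which is why the sketch catches broadly rather than only the declared decoder error.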
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937760289
**[Test build #143989 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143989/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935301463
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48375/
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937881313
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48458/
[GitHub] [spark] SparkQA removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937676711
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937721501
**[Test build #143974 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143974/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).
[GitHub] [spark] codecov-commenter edited a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994
# [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#34187](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e9b698) into [master](https://codecov.io/gh/apache/spark/commit/38d39812c176e4b52a08397f7936f87ea32930e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38d3981) will **decrease** coverage by `8.15%`.
> The diff coverage is `89.50%`.
> :exclamation: Current head 2e9b698 differs from pull request most recent head 90957ff. Consider uploading reports for the commit 90957ff to get more accurate results
[![Impacted file tree graph](https://codecov.io/gh/apache/spark/pull/34187/graphs/tree.svg?width=650&height=150&src=pr&token=R9pHLWgWi8&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@            Coverage Diff             @@
##           master   #34187      +/-   ##
==========================================
- Coverage   90.23%   82.08%   -8.16%
==========================================
  Files         288      249      -39
  Lines       61113    55769    -5344
  Branches     8994     8505     -489
==========================================
- Hits        55146    45778    -9368
- Misses       4625     8812    +4187
+ Partials     1342     1179     -163
```
| Flag | Coverage Δ | |
|---|---|---|
| unittests | `82.06% <89.50%> (-8.15%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [python/pyspark/sql/functions.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2Z1bmN0aW9ucy5weQ==) | `90.39% <ø> (-1.18%)` | :arrow_down: |
| [python/pyspark/pandas/namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL25hbWVzcGFjZS5weQ==) | `77.17% <63.63%> (-0.30%)` | :arrow_down: |
| [python/pyspark/pandas/tests/test\_dataframe.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL3Rlc3RzL3Rlc3RfZGF0YWZyYW1lLnB5) | `94.75% <78.57%> (-0.16%)` | :arrow_down: |
| [python/pyspark/sql/session.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3Nlc3Npb24ucHk=) | `81.25% <83.65%> (-1.65%)` | :arrow_down: |
| [python/pyspark/sql/catalog.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2NhdGFsb2cucHk=) | `90.29% <90.90%> (-1.30%)` | :arrow_down: |
| [python/pyspark/sql/window.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3dpbmRvdy5weQ==) | `92.50% <92.00%> (-1.95%)` | :arrow_down: |
| [python/pyspark/pandas/accessors.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2FjY2Vzc29ycy5weQ==) | `92.01% <100.00%> (-0.14%)` | :arrow_down: |
| [python/pyspark/pandas/frame.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2ZyYW1lLnB5) | `96.17% <100.00%> (-0.01%)` | :arrow_down: |
| [python/pyspark/pandas/groupby.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2dyb3VwYnkucHk=) | `94.30% <100.00%> (-0.02%)` | :arrow_down: |
| [python/pyspark/pandas/indexes/multi.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2luZGV4ZXMvbXVsdGkucHk=) | `93.98% <100.00%> (+0.05%)` | :arrow_up: |
| ... and [93 more](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d6786e0...90957ff](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935216250
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48375/
[GitHub] [spark] SparkQA removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937721501
**[Test build #143974 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143974/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937759908
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143966/
[GitHub] [spark] WeichenXu123 commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
WeichenXu123 commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937668269
According to the code here:
https://github.com/apache/spark/blob/2e9b698d3100f18595dadbf35abb06502c3d6123/mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala#L134
In `ImageIO.read(new ByteArrayInputStream(bytes))`, `bytes` is an array in memory, so can a real IO exception happen at all? It seems a real IO exception is impossible. If so, we can catch all kinds of exceptions raised by `ImageIO.read` and regard all of them as invalid-image cases. @srowen @HyukjinKwon What do you think?
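The reasoning above separates two phases: reading the file from storage, where I/O errors are real, and decoding from an in-memory buffer, where they are not. A rough Python sketch of that split follows. The `decode` and `load_image` functions are hypothetical and only mimic the pattern; they are not Spark's implementation, and Python's `except Exception` is used here as the nearest analogue of a broad catch, not as an exact equivalent of catching `Throwable` in Scala.

```python
import os
import tempfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode(data):
    """Hypothetical in-memory decoder standing in for ImageIO.read on a byte stream."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("Unsupported Image Type")
    return {"n_bytes": len(data)}


def load_image(path):
    # Phase 1: filesystem read. Errors here are real I/O failures, so they are
    # deliberately not caught and propagate to the caller.
    with open(path, "rb") as f:
        data = f.read()

    # Phase 2: decode from memory. No filesystem access happens here, so any
    # exception is treated as "invalid image" and mapped to None.
    try:
        return decode(data)
    except Exception:
        return None
```

Under these assumptions a missing file still surfaces as an error, while a readable file with undecodable content is reported as an invalid image, which matches the distinction drawn in the comment.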
[GitHub] [spark] SparkQA removed a comment on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935164968
**[Test build #143862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143862/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).
[GitHub] [spark] WeichenXu123 commented on a change in pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
WeichenXu123 commented on a change in pull request #34187:
URL: https://github.com/apache/spark/pull/34187#discussion_r724113748
##########
File path: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala
##########
@@ -133,9 +133,7 @@ object ImageSchema {
val img = try {
ImageIO.read(new ByteArrayInputStream(bytes))
} catch {
- // Catch runtime exception because `ImageIO` may throw unexpected `RuntimeException`.
- // But do not catch the declared `IOException` (regarded as FileSystem failure)
- case _: RuntimeException => null
+ case _: Throwable => null
Review comment:
Let's add a comment saying that, because we read from in-memory bytes, no real IO exceptions can happen, so we can treat every exception as an invalid-image exception. Also mention that `IIOException` may be raised when an invalid image is hit.
[GitHub] [spark] codecov-commenter removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
codecov-commenter removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994
# [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#34187](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e9b698) into [master](https://codecov.io/gh/apache/spark/commit/38d39812c176e4b52a08397f7936f87ea32930e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38d3981) will **decrease** coverage by `8.15%`.
> The diff coverage is `89.50%`.
> :exclamation: Current head 2e9b698 differs from pull request most recent head 90957ff. Consider uploading reports for the commit 90957ff to get more accurate results
[![Impacted file tree graph](https://codecov.io/gh/apache/spark/pull/34187/graphs/tree.svg?width=650&height=150&src=pr&token=R9pHLWgWi8&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #34187 +/- ##
==========================================
- Coverage 90.23% 82.08% -8.16%
==========================================
Files 288 249 -39
Lines 61113 55769 -5344
Branches 8994 8505 -489
==========================================
- Hits 55146 45778 -9368
- Misses 4625 8812 +4187
+ Partials 1342 1179 -163
```
| Flag | Coverage Δ | |
|---|---|---|
| unittests | `82.06% <89.50%> (-8.15%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [python/pyspark/sql/functions.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2Z1bmN0aW9ucy5weQ==) | `90.39% <ø> (-1.18%)` | :arrow_down: |
| [python/pyspark/pandas/namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL25hbWVzcGFjZS5weQ==) | `77.17% <63.63%> (-0.30%)` | :arrow_down: |
| [python/pyspark/pandas/tests/test\_dataframe.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL3Rlc3RzL3Rlc3RfZGF0YWZyYW1lLnB5) | `94.75% <78.57%> (-0.16%)` | :arrow_down: |
| [python/pyspark/sql/session.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3Nlc3Npb24ucHk=) | `81.25% <83.65%> (-1.65%)` | :arrow_down: |
| [python/pyspark/sql/catalog.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2NhdGFsb2cucHk=) | `90.29% <90.90%> (-1.30%)` | :arrow_down: |
| [python/pyspark/sql/window.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3dpbmRvdy5weQ==) | `92.50% <92.00%> (-1.95%)` | :arrow_down: |
| [python/pyspark/pandas/accessors.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2FjY2Vzc29ycy5weQ==) | `92.01% <100.00%> (-0.14%)` | :arrow_down: |
| [python/pyspark/pandas/frame.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2ZyYW1lLnB5) | `96.17% <100.00%> (-0.01%)` | :arrow_down: |
| [python/pyspark/pandas/groupby.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2dyb3VwYnkucHk=) | `94.30% <100.00%> (-0.02%)` | :arrow_down: |
| [python/pyspark/pandas/indexes/multi.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2luZGV4ZXMvbXVsdGkucHk=) | `93.98% <100.00%> (+0.05%)` | :arrow_up: |
| ... and [93 more](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d6786e0...90957ff](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937881357
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48458/
[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937759908
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143966/
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937740011
**[Test build #143966 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143966/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937867149
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143989/
[GitHub] [spark] codecov-commenter edited a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994
[GitHub] [spark] SparkQA removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937676711
**[Test build #143966 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143966/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).
[GitHub] [spark] codecov-commenter commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994
# [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#34187](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e9b698) into [master](https://codecov.io/gh/apache/spark/commit/38d39812c176e4b52a08397f7936f87ea32930e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38d3981) will **decrease** coverage by `17.74%`.
> The diff coverage is `88.80%`.
> :exclamation: Current head 2e9b698 differs from pull request most recent head 90957ff. Consider uploading reports for the commit 90957ff to get more accurate results
[![Impacted file tree graph](https://codecov.io/gh/apache/spark/pull/34187/graphs/tree.svg?width=650&height=150&src=pr&token=R9pHLWgWi8&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #34187 +/- ##
===========================================
- Coverage 90.23% 72.48% -17.75%
===========================================
Files 288 240 -48
Lines 61113 46182 -14931
Branches 8994 7669 -1325
===========================================
- Hits 55146 33477 -21669
- Misses 4625 11352 +6727
- Partials 1342 1353 +11
```
| Flag | Coverage Δ | |
|---|---|---|
| unittests | `72.48% <88.80%> (-17.73%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [python/pyspark/sql/functions.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2Z1bmN0aW9ucy5weQ==) | `89.68% <ø> (-1.88%)` | :arrow_down: |
| [python/pyspark/pandas/frame.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2ZyYW1lLnB5) | `33.70% <16.66%> (-62.48%)` | :arrow_down: |
| [python/pyspark/pandas/namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL25hbWVzcGFjZS5weQ==) | `76.43% <63.63%> (-1.03%)` | :arrow_down: |
| [python/pyspark/sql/session.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3Nlc3Npb24ucHk=) | `81.25% <83.65%> (-1.65%)` | :arrow_down: |
| [python/pyspark/sql/catalog.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2NhdGFsb2cucHk=) | `90.29% <90.90%> (-1.30%)` | :arrow_down: |
| [python/pyspark/sql/window.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3dpbmRvdy5weQ==) | `92.50% <92.00%> (-1.95%)` | :arrow_down: |
| [python/pyspark/pandas/accessors.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2FjY2Vzc29ycy5weQ==) | `88.65% <100.00%> (-3.50%)` | :arrow_down: |
| [python/pyspark/pandas/groupby.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2dyb3VwYnkucHk=) | `75.47% <100.00%> (-18.85%)` | :arrow_down: |
| [python/pyspark/pandas/indexes/multi.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2luZGV4ZXMvbXVsdGkucHk=) | `53.48% <100.00%> (-40.45%)` | :arrow_down: |
| [python/pyspark/pandas/tests/test\_namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL3Rlc3RzL3Rlc3RfbmFtZXNwYWNlLnB5) | `98.87% <100.00%> (+0.10%)` | :arrow_up: |
| ... and [122 more](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d6786e0...90957ff](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [spark] HyukjinKwon commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935183053
BTW, just for extra clarification, the input data has already been fully read from the file at that point: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/source/image/ImageFileFormat.scala#L77-L83
So it is unlikely to be a failure from reading the actual FS. Yes, it is still possible that the disk holds corrupt data, but I don't think there is anything else we can do in that case. Other sources like CSV, Text or JSON likely won't be able to handle that either.
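The separation described above can be sketched as two steps: read the file fully first, then decode from memory. This is only an illustration under stated assumptions. `ReadThenDecodeSketch` and its methods are hypothetical names, and `Files.readAllBytes` stands in for whatever Spark actually uses at the linked lines.

```scala
import java.awt.image.BufferedImage
import java.io.ByteArrayInputStream
import java.nio.file.{Files, Path}
import javax.imageio.ImageIO

object ReadThenDecodeSketch {
  // Step 1: file system access. An IOException raised here is a genuine
  // IO failure and is allowed to propagate to the caller.
  def readFully(path: Path): Array[Byte] = Files.readAllBytes(path)

  // Step 2: decoding from in-memory bytes. Any failure at this point is
  // attributed to the image data, so it maps to None (invalid image).
  def decode(bytes: Array[Byte]): Option[BufferedImage] =
    try Option(ImageIO.read(new ByteArrayInputStream(bytes)))
    catch { case _: Throwable => None }

  def load(path: Path): Option[BufferedImage] = decode(readFully(path))
}
```

Because the read completes before decoding starts, a missing or unreadable file still surfaces as an `IOException`, while undecodable bytes only produce `None`.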
[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937881357
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48458/
[GitHub] [spark] HyukjinKwon commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937671010
Yeah, actually that's what I thought too. That's probably better.
[GitHub] [spark] WeichenXu123 commented on a change in pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source
Posted by GitBox <gi...@apache.org>.
WeichenXu123 commented on a change in pull request #34187:
URL: https://github.com/apache/spark/pull/34187#discussion_r724113748
##########
File path: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala
##########
@@ -133,9 +133,7 @@ object ImageSchema {
val img = try {
ImageIO.read(new ByteArrayInputStream(bytes))
} catch {
- // Catch runtime exception because `ImageIO` may throw unexpected `RuntimeException`.
- // But do not catch the declared `IOException` (regarded as FileSystem failure)
- case _: RuntimeException => null
+ case _: Throwable => null
Review comment:
Let's add a comment saying that, because we read from in-memory bytes, no real I/O exceptions will happen, so we can treat all exceptions as invalid-image errors. Also mention that an `IIOException` may be raised when we hit an invalid image.
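A minimal sketch of how such a comment might read in context. The object and method names (`DecodeSketch`, `decodeOrNull`) are hypothetical and used only for illustration; they are not Spark's actual code. Only `ImageIO.read` and `ByteArrayInputStream` are real JDK APIs here.

```scala
import java.awt.image.BufferedImage
import java.io.ByteArrayInputStream
import javax.imageio.ImageIO

object DecodeSketch {
  // Hypothetical helper: decodes already-loaded bytes, returning null for an invalid image.
  def decodeOrNull(bytes: Array[Byte]): BufferedImage =
    try {
      // ImageIO.read itself returns null when no registered reader recognizes the data.
      ImageIO.read(new ByteArrayInputStream(bytes))
    } catch {
      // The bytes are already in memory, so no real filesystem I/O happens here.
      // Any failure, including javax.imageio.IIOException raised for an invalid or
      // unsupported image, is therefore treated as an invalid image.
      case _: Throwable => null
    }
}
```

Note that `Throwable` also covers fatal errors such as `OutOfMemoryError`; whether that breadth is acceptable is a design choice for the PR, not something this sketch settles.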
[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935285977
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48375/
[GitHub] [spark] HyukjinKwon edited a comment on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937360888
Let me get this in in a few more days if there are no more comments. I believe this PR fixes the source to follow the original intention (reading a malformed file permissively, or showing which image is malformed when the `dropInvalid` option is on), and it won't silently ignore exceptions from the filesystem, since the file has already been read at this point.
[GitHub] [spark] srowen commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source
Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935162666
Tough one. I assume we want to return `null` when trying to read an invalid or unsupported file, but fail outright in case of an I/O problem. This sounds like a type of "I/O problem", but it does appear to arise when the data was read just fine but isn't supported or doesn't make sense: https://github.com/frohoff/jdk8u-dev-jdk/blob/master/src/share/classes/com/sun/imageio/plugins/jpeg/JPEGImageReader.java#L1068
So yes, this seems reasonable.
Yes, it's weird that it happens only sometimes in a test, though ... maybe a corrupted checkout on a worker?
After this, we'd unexpectedly be getting a null for some image in this test. Is that just irrelevant for the test?
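To make the distinction concrete, here is a rough sketch, assuming the file read and the decode are separate steps. `ReadThenDecodeSketch` and `readImage` are hypothetical names, not Spark's API. The read may throw an `IOException` that propagates to the caller, while a decode failure only yields `None`. It uses `scala.util.Try`, which catches non-fatal exceptions only, so it is slightly narrower than the `Throwable` catch in the diff.

```scala
import java.awt.image.BufferedImage
import java.io.ByteArrayInputStream
import java.nio.file.{Files, Paths}
import javax.imageio.ImageIO
import scala.util.Try

object ReadThenDecodeSketch {
  // Hypothetical helper: fails outright on a real I/O problem,
  // but returns None when the bytes were read fine and cannot be decoded.
  def readImage(path: String): Option[BufferedImage] = {
    // Step 1: real filesystem I/O. An IOException here is NOT caught.
    val bytes = Files.readAllBytes(Paths.get(path))
    // Step 2: in-memory decode. Non-fatal failures and a null result both become None.
    Try(ImageIO.read(new ByteArrayInputStream(bytes))).toOption.flatMap(Option(_))
  }
}
```

With this split, a missing file surfaces as an exception, whereas an unsupported or corrupted image shows up as an empty result that a caller could drop or report.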