Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/10/06 00:38:58 UTC

[GitHub] [spark] HyukjinKwon opened a new pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

HyukjinKwon opened a new pull request #34187:
URL: https://github.com/apache/spark/pull/34187


   ### What changes were proposed in this pull request?
   
   This PR fixes the following test failure:
   
   ```
   Running tests...
   ----------------------------------------------------------------------
   test_read_images (pyspark.ml.tests.test_image.ImageFileFormatTest) ... ERROR (12.050s)
   
   ======================================================================
   ERROR [12.050s]: test_read_images (pyspark.ml.tests.test_image.ImageFileFormatTest)
   ----------------------------------------------------------------------
   Traceback (most recent call last):
   File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/ml/tests/test_image.py", line 35, in test_read_images
   self.assertEqual(df.count(), 4)
   File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/dataframe.py", line 507, in count
   return int(self._jdf.count())
   File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in _call_
   answer, self.gateway_client, self.target_id, self.name)
   File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/utils.py", line 98, in deco
   return f(*a, **kw)
   File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/protocol.py", line 328, in get_return_value
   format(target_id, ".", name), value)
   py4j.protocol.Py4JJavaError: An error occurred while calling o32.count.
   : org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost task 1.0 in stage 0.0 (TID 1, amp-jenkins-worker-05.amp, executor driver): javax.imageio.IIOException: Unsupported Image Type
   at com.sun.imageio.plugins.jpeg.JPEGImageReader.readInternal(JPEGImageReader.java:1079)
   at com.sun.imageio.plugins.jpeg.JPEGImageReader.read(JPEGImageReader.java:1050)
   at javax.imageio.ImageIO.read(ImageIO.java:1448)
   at javax.imageio.ImageIO.read(ImageIO.java:1352)
   ```
   
   This exception apparently occurs when handling malformed images while the `dropInvalid` option is on. However, the handling around `ImageIO.read` does not catch `javax.imageio.IIOException`, which is thrown for an invalid image and is not a `RuntimeException`.
   
   In fact, `javax.imageio.IIOException` signals "run-time failure of reading" (see also https://docs.oracle.com/javase/8/docs/api/javax/imageio/IIOException.html).
   
   Therefore, this PR also catches `javax.imageio.IIOException` when reading an image, so that malformed images are handled properly.
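   
   As a rough illustration of the point above, the following standalone Java sketch shows that `IIOException` is a checked `IOException` rather than a `RuntimeException`, and how a decoder might treat both kinds of failure as "invalid image". The class `ImageDecodeSketch` and the method `decodeOrNull` are hypothetical names introduced here; this is not the code in the Spark image source.
   
   ```java
   import java.awt.image.BufferedImage;
   import java.io.ByteArrayInputStream;
   import java.io.ByteArrayOutputStream;
   import java.io.IOException;
   import javax.imageio.IIOException;
   import javax.imageio.ImageIO;
   
   // Hypothetical helper for illustration only; not Spark's implementation.
   final class ImageDecodeSketch {
   
       // Returns the decoded image, or null when the bytes are not a readable image.
       // Other IOExceptions are deliberately left to propagate to the caller.
       static BufferedImage decodeOrNull(byte[] bytes) throws IOException {
           try {
               // ImageIO.read returns null when no registered reader accepts the input.
               return ImageIO.read(new ByteArrayInputStream(bytes));
           } catch (IIOException e) {
               // Checked exception (a subclass of IOException) raised by a reader.
               return null;
           } catch (RuntimeException e) {
               // Unchecked failures that a reader may raise on malformed input.
               return null;
           }
       }
   
       public static void main(String[] args) throws IOException {
           ImageIO.setUseCache(false); // keep intermediate streams in memory
   
           // Catching RuntimeException alone does not cover IIOException.
           System.out.println(IOException.class.isAssignableFrom(IIOException.class));
           System.out.println(RuntimeException.class.isAssignableFrom(IIOException.class));
   
           // A tiny PNG written in memory decodes successfully.
           ByteArrayOutputStream out = new ByteArrayOutputStream();
           ImageIO.write(new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB), "png", out);
           System.out.println(decodeOrNull(out.toByteArray()) != null);
   
           // Arbitrary bytes are not an image, so the helper yields null.
           System.out.println(decodeOrNull(new byte[] {1, 2, 3}) == null);
       }
   }
   ```
   
   Keeping the two `catch` clauses separate makes it explicit which failure class each one covers; a real fix may choose to catch more or less than this sketch does.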
   
   I am not yet sure why the test is flaky rather than consistently failing. However, the fix should be correct.
   
   ### Why are the changes needed?
   
   To fix the flaky tests; see https://github.com/apache/spark/runs/3802639160 for an example.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Users will be able to read malformed data when the `dropInvalid` option is enabled, even in cases where `javax.imageio.IIOException` is thrown.
   
   ### How was this patch tested?
   
   Existing unit tests. We should track whether the tests are still flaky.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935245516


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143862/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935301463


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48375/
   




[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935164968


   **[Test build #143862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143862/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).




[GitHub] [spark] codecov-commenter removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
codecov-commenter removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994


   # [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#34187](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e9b698) into [master](https://codecov.io/gh/apache/spark/commit/38d39812c176e4b52a08397f7936f87ea32930e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38d3981) will **decrease** coverage by `8.15%`.
   > The diff coverage is `89.50%`.
   
   > :exclamation: Current head 2e9b698 differs from pull request most recent head 90957ff. Consider uploading reports for the commit 90957ff to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/spark/pull/34187/graphs/tree.svg?width=650&height=150&src=pr&token=R9pHLWgWi8&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #34187      +/-   ##
   ==========================================
   - Coverage   90.23%   82.08%   -8.16%     
   ==========================================
     Files         288      249      -39     
     Lines       61113    55769    -5344     
     Branches     8994     8505     -489     
   ==========================================
   - Hits        55146    45778    -9368     
   - Misses       4625     8812    +4187     
   + Partials     1342     1179     -163     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests | `82.06% <89.50%> (-8.15%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [python/pyspark/sql/functions.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2Z1bmN0aW9ucy5weQ==) | `90.39% <ø> (-1.18%)` | :arrow_down: |
   | [python/pyspark/pandas/namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL25hbWVzcGFjZS5weQ==) | `77.17% <63.63%> (-0.30%)` | :arrow_down: |
   | [python/pyspark/pandas/tests/test\_dataframe.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL3Rlc3RzL3Rlc3RfZGF0YWZyYW1lLnB5) | `94.75% <78.57%> (-0.16%)` | :arrow_down: |
   | [python/pyspark/sql/session.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3Nlc3Npb24ucHk=) | `81.25% <83.65%> (-1.65%)` | :arrow_down: |
   | [python/pyspark/sql/catalog.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2NhdGFsb2cucHk=) | `90.29% <90.90%> (-1.30%)` | :arrow_down: |
   | [python/pyspark/sql/window.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3dpbmRvdy5weQ==) | `92.50% <92.00%> (-1.95%)` | :arrow_down: |
   | [python/pyspark/pandas/accessors.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2FjY2Vzc29ycy5weQ==) | `92.01% <100.00%> (-0.14%)` | :arrow_down: |
   | [python/pyspark/pandas/frame.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2ZyYW1lLnB5) | `96.17% <100.00%> (-0.01%)` | :arrow_down: |
   | [python/pyspark/pandas/groupby.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2dyb3VwYnkucHk=) | `94.30% <100.00%> (-0.02%)` | :arrow_down: |
   | [python/pyspark/pandas/indexes/multi.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2luZGV4ZXMvbXVsdGkucHk=) | `93.98% <100.00%> (+0.05%)` | :arrow_up: |
   | ... and [93 more](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d6786e0...90957ff](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   




[GitHub] [spark] HyukjinKwon closed pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #34187:
URL: https://github.com/apache/spark/pull/34187


   




[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937794941


   **[Test build #143974 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143974/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937811482


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143974/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937759908








[GitHub] [spark] HyukjinKwon edited a comment on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937360888


   Let me get this in in a few more days if there are no more comments. I believe this PR fixes the source to follow the original intention (reading a malformed file permissively, or showing which image is malformed, when the `dropInvalid` option is on), and it won't silently ignore exceptions from the filesystem, since the file has already been read at this point.
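   
   To make the reasoning above concrete, here is a minimal Java sketch under stated assumptions: the file contents are already in memory as byte arrays, so any exception during decoding is treated as a malformed image rather than a filesystem failure. The class `DropInvalidSketch`, its methods, and the `ok:`/`invalid:` row strings are hypothetical, and the mapping of the `dropInvalid` flag (skip malformed entries when true, keep them flagged by origin when false) is an assumption about the option's semantics, not a copy of Spark's code.
   
   ```java
   import java.io.ByteArrayInputStream;
   import java.io.IOException;
   import java.util.ArrayList;
   import java.util.LinkedHashMap;
   import java.util.List;
   import java.util.Map;
   import javax.imageio.ImageIO;
   
   // Hypothetical model for illustration; names and row format are assumptions.
   final class DropInvalidSketch {
   
       // Decodes bytes that were already loaded; any decoding failure means "invalid image".
       static boolean isDecodable(byte[] bytes) {
           try {
               return ImageIO.read(new ByteArrayInputStream(bytes)) != null;
           } catch (IOException | RuntimeException e) {
               // The source bytes were read earlier, so this is treated as a decoding problem.
               return false;
           }
       }
   
       // dropInvalid == true: skip malformed entries; false: keep them, flagged by origin.
       static List<String> scan(Map<String, byte[]> files, boolean dropInvalid) {
           List<String> rows = new ArrayList<>();
           for (Map.Entry<String, byte[]> file : files.entrySet()) {
               if (isDecodable(file.getValue())) {
                   rows.add("ok:" + file.getKey());
               } else if (!dropInvalid) {
                   rows.add("invalid:" + file.getKey());
               }
           }
           return rows;
       }
   
       public static void main(String[] args) {
           ImageIO.setUseCache(false); // avoid temporary cache files on disk
           Map<String, byte[]> files = new LinkedHashMap<>();
           files.put("not-an-image.txt", new byte[] {1, 2, 3});
           System.out.println(scan(files, true));
           System.out.println(scan(files, false));
       }
   }
   ```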




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937811482


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143974/
   




[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937839892


   **[Test build #143989 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143989/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935244629


   **[Test build #143862 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143862/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937867149


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143989/
   




[GitHub] [spark] WeichenXu123 commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
WeichenXu123 commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937662618


   Why is the previous test flaky?




[GitHub] [spark] codecov-commenter commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994


   # [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#34187](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e9b698) into [master](https://codecov.io/gh/apache/spark/commit/38d39812c176e4b52a08397f7936f87ea32930e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38d3981) will **decrease** coverage by `17.74%`.
   > The diff coverage is `88.80%`.
   
   > :exclamation: Current head 2e9b698 differs from pull request most recent head 90957ff. Consider uploading reports for the commit 90957ff to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/spark/pull/34187/graphs/tree.svg?width=650&height=150&src=pr&token=R9pHLWgWi8&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #34187       +/-   ##
   ===========================================
   - Coverage   90.23%   72.48%   -17.75%     
   ===========================================
     Files         288      240       -48     
     Lines       61113    46182    -14931     
     Branches     8994     7669     -1325     
   ===========================================
   - Hits        55146    33477    -21669     
   - Misses       4625    11352     +6727     
   - Partials     1342     1353       +11     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests | `72.48% <88.80%> (-17.73%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [python/pyspark/sql/functions.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2Z1bmN0aW9ucy5weQ==) | `89.68% <ø> (-1.88%)` | :arrow_down: |
   | [python/pyspark/pandas/frame.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2ZyYW1lLnB5) | `33.70% <16.66%> (-62.48%)` | :arrow_down: |
   | [python/pyspark/pandas/namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL25hbWVzcGFjZS5weQ==) | `76.43% <63.63%> (-1.03%)` | :arrow_down: |
   | [python/pyspark/sql/session.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3Nlc3Npb24ucHk=) | `81.25% <83.65%> (-1.65%)` | :arrow_down: |
   | [python/pyspark/sql/catalog.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2NhdGFsb2cucHk=) | `90.29% <90.90%> (-1.30%)` | :arrow_down: |
   | [python/pyspark/sql/window.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3dpbmRvdy5weQ==) | `92.50% <92.00%> (-1.95%)` | :arrow_down: |
   | [python/pyspark/pandas/accessors.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2FjY2Vzc29ycy5weQ==) | `88.65% <100.00%> (-3.50%)` | :arrow_down: |
   | [python/pyspark/pandas/groupby.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2dyb3VwYnkucHk=) | `75.47% <100.00%> (-18.85%)` | :arrow_down: |
   | [python/pyspark/pandas/indexes/multi.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2luZGV4ZXMvbXVsdGkucHk=) | `53.48% <100.00%> (-40.45%)` | :arrow_down: |
   | [python/pyspark/pandas/tests/test\_namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL3Rlc3RzL3Rlc3RfbmFtZXNwYWNlLnB5) | `98.87% <100.00%> (+0.10%)` | :arrow_up: |
   | ... and [122 more](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d6786e0...90957ff](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937676711


   **[Test build #143966 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143966/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).




[GitHub] [spark] HyukjinKwon commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-938237567


   Merged to master.




[GitHub] [spark] SparkQA removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937760289


   **[Test build #143989 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143989/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).




[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937832074


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48458/
   




[GitHub] [spark] srowen commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937721900


   That seems reasonable to me yes




[GitHub] [spark] HyukjinKwon commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937360888


   Let me get this in in a few more days if there are no more comments. I believe this PR fixes the source in line with the original intention (reading a malformed file permissively, or showing which image is malformed when the `dropInvalid` option is on), and it won't silently ignore exceptions from the filesystem, since the file has already been read at this point.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935245516


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143862/
   




[GitHub] [spark] HyukjinKwon commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935179555


   If it fails to read for whatever reason, the file is stored as an invalid image in the row: https://github.com/apache/spark/blob/0494dc90af48ce7da0625485a4dc6917a244d580/mllib/src/main/scala/org/apache/spark/ml/source/image/ImageFileFormat.scala#L87
   
   So, previously, the read would fail midway with `javax.imageio.IIOException`.
   After this change, the image (bytes) will be stored in the returned row as an invalid image.
   
   The test at https://github.com/apache/spark/blob/master/python/pyspark/ml/tests/test_image.py#L27-L32 actually reads https://github.com/apache/spark/blob/master/data/mllib/images/origin/kittens/not-image.txt, which is not an image file (so I think the code should be able to handle `javax.imageio.IIOException` anyway).
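
A minimal sketch of the behaviour described above, not Spark's actual implementation. `ImageRecord`, `PermissiveReadSketch`, `invalid` and `read` are hypothetical names introduced for illustration, and the sketch assumes that `dropInvalid = true` drops undecodable files while `dropInvalid = false` keeps a placeholder row carrying the file's origin:

```scala
import java.io.ByteArrayInputStream
import javax.imageio.ImageIO

// Hypothetical record type for illustration; Spark's real image schema has more fields.
final case class ImageRecord(origin: String, width: Int, height: Int)

object PermissiveReadSketch {
  // Placeholder for a file that could not be decoded: origin kept, dimensions set to -1.
  def invalid(origin: String): ImageRecord = ImageRecord(origin, -1, -1)

  def read(origin: String, bytes: Array[Byte], dropInvalid: Boolean): Option[ImageRecord] = {
    val img = try {
      // Returns null when no registered reader recognises the bytes.
      ImageIO.read(new ByteArrayInputStream(bytes))
    } catch {
      // Any failure while decoding in-memory bytes is treated as "not a valid image".
      case _: Throwable => null
    }
    if (img != null) Some(ImageRecord(origin, img.getWidth, img.getHeight))
    else if (dropInvalid) None
    else Some(invalid(origin))
  }
}
```

With this shape, a text file such as `not-image.txt` yields a placeholder record instead of failing the whole job, or is skipped when `dropInvalid` is set.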
   
   




[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937760289


   **[Test build #143989 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143989/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935301463


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48375/
   




[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937881313


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48458/
   




[GitHub] [spark] SparkQA removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937676711








[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937721501


   **[Test build #143974 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143974/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).




[GitHub] [spark] codecov-commenter edited a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994


   # [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#34187](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e9b698) into [master](https://codecov.io/gh/apache/spark/commit/38d39812c176e4b52a08397f7936f87ea32930e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38d3981) will **decrease** coverage by `8.15%`.
   > The diff coverage is `89.50%`.
   
   > :exclamation: Current head 2e9b698 differs from pull request most recent head 90957ff. Consider uploading reports for the commit 90957ff to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/spark/pull/34187/graphs/tree.svg?width=650&height=150&src=pr&token=R9pHLWgWi8&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #34187      +/-   ##
   ==========================================
   - Coverage   90.23%   82.08%   -8.16%     
   ==========================================
     Files         288      249      -39     
     Lines       61113    55769    -5344     
     Branches     8994     8505     -489     
   ==========================================
   - Hits        55146    45778    -9368     
   - Misses       4625     8812    +4187     
   + Partials     1342     1179     -163     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests | `82.06% <89.50%> (-8.15%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [python/pyspark/sql/functions.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2Z1bmN0aW9ucy5weQ==) | `90.39% <ø> (-1.18%)` | :arrow_down: |
   | [python/pyspark/pandas/namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL25hbWVzcGFjZS5weQ==) | `77.17% <63.63%> (-0.30%)` | :arrow_down: |
   | [python/pyspark/pandas/tests/test\_dataframe.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL3Rlc3RzL3Rlc3RfZGF0YWZyYW1lLnB5) | `94.75% <78.57%> (-0.16%)` | :arrow_down: |
   | [python/pyspark/sql/session.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3Nlc3Npb24ucHk=) | `81.25% <83.65%> (-1.65%)` | :arrow_down: |
   | [python/pyspark/sql/catalog.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2NhdGFsb2cucHk=) | `90.29% <90.90%> (-1.30%)` | :arrow_down: |
   | [python/pyspark/sql/window.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3dpbmRvdy5weQ==) | `92.50% <92.00%> (-1.95%)` | :arrow_down: |
   | [python/pyspark/pandas/accessors.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2FjY2Vzc29ycy5weQ==) | `92.01% <100.00%> (-0.14%)` | :arrow_down: |
   | [python/pyspark/pandas/frame.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2ZyYW1lLnB5) | `96.17% <100.00%> (-0.01%)` | :arrow_down: |
   | [python/pyspark/pandas/groupby.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2dyb3VwYnkucHk=) | `94.30% <100.00%> (-0.02%)` | :arrow_down: |
   | [python/pyspark/pandas/indexes/multi.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2luZGV4ZXMvbXVsdGkucHk=) | `93.98% <100.00%> (+0.05%)` | :arrow_up: |
   | ... and [93 more](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d6786e0...90957ff](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   




[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935216250


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48375/
   




[GitHub] [spark] SparkQA removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937721501


   **[Test build #143974 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143974/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937759908


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143966/
   




[GitHub] [spark] WeichenXu123 commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
WeichenXu123 commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937668269


   According to the code here:
   https://github.com/apache/spark/blob/2e9b698d3100f18595dadbf35abb06502c3d6123/mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala#L134
   
   `ImageIO.read(new ByteArrayInputStream(bytes))` reads `bytes` from an in-memory array, so can a real IO exception happen at all? It seems impossible. If so, we can catch every kind of exception raised by `ImageIO.read` and regard all of them as invalid-image cases. @srowen @HyukjinKwon What do you think?
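
As a rough illustration of this proposal, here is a small self-contained sketch. `DecodeSketch` and `tryDecode` are hypothetical names rather than Spark's API. It only shows the idea that any failure while decoding in-memory bytes is mapped to "no image":

```scala
import java.awt.image.BufferedImage
import java.io.ByteArrayInputStream
import javax.imageio.ImageIO

object DecodeSketch {
  // Hypothetical helper: None means the bytes could not be decoded as an image.
  def tryDecode(bytes: Array[Byte]): Option[BufferedImage] = {
    val img = try {
      // The source is an in-memory array, so no filesystem access is involved here.
      // ImageIO.read returns null when no registered reader claims the input.
      ImageIO.read(new ByteArrayInputStream(bytes))
    } catch {
      // Includes javax.imageio.IIOException raised for unsupported or corrupt images.
      case _: Throwable => null
    }
    Option(img)
  }
}
```

`scala.util.control.NonFatal` would be a narrower pattern than `Throwable`; the sketch uses `Throwable` only to mirror the change under discussion.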
   




[GitHub] [spark] SparkQA removed a comment on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935164968


   **[Test build #143862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143862/testReport)** for PR 34187 at commit [`ec82a1f`](https://github.com/apache/spark/commit/ec82a1fd0f8d5a35e5efa4023cea01aea2e44c2f).




[GitHub] [spark] WeichenXu123 commented on a change in pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
WeichenXu123 commented on a change in pull request #34187:
URL: https://github.com/apache/spark/pull/34187#discussion_r724113748



##########
File path: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala
##########
@@ -133,9 +133,7 @@ object ImageSchema {
     val img = try {
       ImageIO.read(new ByteArrayInputStream(bytes))
     } catch {
-      // Catch runtime exception because `ImageIO` may throw unexpected `RuntimeException`.
-      // But do not catch the declared `IOException` (regarded as FileSystem failure)
-      case _: RuntimeException => null
+      case _: Throwable => null

Review comment:
       Let's add a comment saying that because we read from in-memory bytes, no real IO exception can happen, so we can treat all exceptions as invalid-image errors. Also mention that `IIOException` may be raised when an invalid image is hit.
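
A minimal sketch of the kind of commented catch clause this review asks for, not the code that was merged. The object `DecodeSketch` and the helper `decodeOrNone` are hypothetical names, and the sketch swaps the diff's `case _: Throwable` for `scala.util.control.NonFatal` so that fatal JVM errors still propagate; that swap is an assumption of this example, not something the PR does.

```scala
import java.awt.image.BufferedImage
import java.io.ByteArrayInputStream
import javax.imageio.ImageIO
import scala.util.control.NonFatal

object DecodeSketch {
  // Hypothetical helper (not part of Spark): returns None when the bytes
  // cannot be decoded as an image.
  def decodeOrNone(bytes: Array[Byte]): Option[BufferedImage] = {
    try {
      // ImageIO.read returns null when no registered reader recognises the
      // data, so wrap the result in Option to turn null into None.
      Option(ImageIO.read(new ByteArrayInputStream(bytes)))
    } catch {
      // The bytes are already in memory, so no real file system I/O happens
      // here. Any exception, including javax.imageio.IIOException (for
      // example "Unsupported Image Type"), is treated as an invalid image.
      case NonFatal(_) => None
    }
  }
}
```

With this shape, the caller only has to handle one "invalid image" outcome, whether the decoder returned `null` or threw.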






[GitHub] [spark] codecov-commenter removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
codecov-commenter removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994


   # [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#34187](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e9b698) into [master](https://codecov.io/gh/apache/spark/commit/38d39812c176e4b52a08397f7936f87ea32930e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38d3981) will **decrease** coverage by `8.15%`.
   > The diff coverage is `89.50%`.
   
   > :exclamation: Current head 2e9b698 differs from pull request most recent head 90957ff. Consider uploading reports for the commit 90957ff to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/spark/pull/34187/graphs/tree.svg?width=650&height=150&src=pr&token=R9pHLWgWi8&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #34187      +/-   ##
   ==========================================
   - Coverage   90.23%   82.08%   -8.16%     
   ==========================================
     Files         288      249      -39     
     Lines       61113    55769    -5344     
     Branches     8994     8505     -489     
   ==========================================
   - Hits        55146    45778    -9368     
   - Misses       4625     8812    +4187     
   + Partials     1342     1179     -163     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests | `82.06% <89.50%> (-8.15%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [python/pyspark/sql/functions.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2Z1bmN0aW9ucy5weQ==) | `90.39% <ø> (-1.18%)` | :arrow_down: |
   | [python/pyspark/pandas/namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL25hbWVzcGFjZS5weQ==) | `77.17% <63.63%> (-0.30%)` | :arrow_down: |
   | [python/pyspark/pandas/tests/test\_dataframe.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL3Rlc3RzL3Rlc3RfZGF0YWZyYW1lLnB5) | `94.75% <78.57%> (-0.16%)` | :arrow_down: |
   | [python/pyspark/sql/session.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3Nlc3Npb24ucHk=) | `81.25% <83.65%> (-1.65%)` | :arrow_down: |
   | [python/pyspark/sql/catalog.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2NhdGFsb2cucHk=) | `90.29% <90.90%> (-1.30%)` | :arrow_down: |
   | [python/pyspark/sql/window.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3dpbmRvdy5weQ==) | `92.50% <92.00%> (-1.95%)` | :arrow_down: |
   | [python/pyspark/pandas/accessors.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2FjY2Vzc29ycy5weQ==) | `92.01% <100.00%> (-0.14%)` | :arrow_down: |
   | [python/pyspark/pandas/frame.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2ZyYW1lLnB5) | `96.17% <100.00%> (-0.01%)` | :arrow_down: |
   | [python/pyspark/pandas/groupby.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2dyb3VwYnkucHk=) | `94.30% <100.00%> (-0.02%)` | :arrow_down: |
   | [python/pyspark/pandas/indexes/multi.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2luZGV4ZXMvbXVsdGkucHk=) | `93.98% <100.00%> (+0.05%)` | :arrow_up: |
   | ... and [93 more](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d6786e0...90957ff](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937881357


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48458/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937759908


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143966/
   




[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937740011


   **[Test build #143966 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143966/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937867149


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143989/
   




[GitHub] [spark] codecov-commenter edited a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994






[GitHub] [spark] SparkQA removed a comment on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937676711


   **[Test build #143966 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143966/testReport)** for PR 34187 at commit [`90957ff`](https://github.com/apache/spark/commit/90957ff093a7dff6fbaeb0c5fae90c5c6cd8bff4).




[GitHub] [spark] codecov-commenter commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937685994


   # [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#34187](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2e9b698) into [master](https://codecov.io/gh/apache/spark/commit/38d39812c176e4b52a08397f7936f87ea32930e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (38d3981) will **decrease** coverage by `17.74%`.
   > The diff coverage is `88.80%`.
   
   > :exclamation: Current head 2e9b698 differs from pull request most recent head 90957ff. Consider uploading reports for the commit 90957ff to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/spark/pull/34187/graphs/tree.svg?width=650&height=150&src=pr&token=R9pHLWgWi8&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #34187       +/-   ##
   ===========================================
   - Coverage   90.23%   72.48%   -17.75%     
   ===========================================
     Files         288      240       -48     
     Lines       61113    46182    -14931     
     Branches     8994     7669     -1325     
   ===========================================
   - Hits        55146    33477    -21669     
   - Misses       4625    11352     +6727     
   - Partials     1342     1353       +11     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests | `72.48% <88.80%> (-17.73%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [python/pyspark/sql/functions.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2Z1bmN0aW9ucy5weQ==) | `89.68% <ø> (-1.88%)` | :arrow_down: |
   | [python/pyspark/pandas/frame.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2ZyYW1lLnB5) | `33.70% <16.66%> (-62.48%)` | :arrow_down: |
   | [python/pyspark/pandas/namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL25hbWVzcGFjZS5weQ==) | `76.43% <63.63%> (-1.03%)` | :arrow_down: |
   | [python/pyspark/sql/session.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3Nlc3Npb24ucHk=) | `81.25% <83.65%> (-1.65%)` | :arrow_down: |
   | [python/pyspark/sql/catalog.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL2NhdGFsb2cucHk=) | `90.29% <90.90%> (-1.30%)` | :arrow_down: |
   | [python/pyspark/sql/window.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3Bhcmsvc3FsL3dpbmRvdy5weQ==) | `92.50% <92.00%> (-1.95%)` | :arrow_down: |
   | [python/pyspark/pandas/accessors.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2FjY2Vzc29ycy5weQ==) | `88.65% <100.00%> (-3.50%)` | :arrow_down: |
   | [python/pyspark/pandas/groupby.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2dyb3VwYnkucHk=) | `75.47% <100.00%> (-18.85%)` | :arrow_down: |
   | [python/pyspark/pandas/indexes/multi.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL2luZGV4ZXMvbXVsdGkucHk=) | `53.48% <100.00%> (-40.45%)` | :arrow_down: |
   | [python/pyspark/pandas/tests/test\_namespace.py](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cHl0aG9uL3B5c3BhcmsvcGFuZGFzL3Rlc3RzL3Rlc3RfbmFtZXNwYWNlLnB5) | `98.87% <100.00%> (+0.10%)` | :arrow_up: |
   | ... and [122 more](https://codecov.io/gh/apache/spark/pull/34187/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d6786e0...90957ff](https://codecov.io/gh/apache/spark/pull/34187?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   




[GitHub] [spark] HyukjinKwon commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935183053


   BTW, just for extra clarification, the input data is already fully read from the file at that point: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/source/image/ImageFileFormat.scala#L77-L83
   
   So it is unlikely to be a failure from reading the actual FS. Yes, it is still possible that the disk holds corrupt data, but I don't think there is anything else we can do in that case. Other sources such as CSV, Text, or JSON are unlikely to be able to handle that either.
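   
   To illustrate the point about the two phases, here is a minimal sketch, not the actual `ImageFileFormat` code. `TwoPhaseRead` and `readThenDecode` are hypothetical names, and `Files.readAllBytes` stands in for however the source reads the file. Filesystem errors surface in the first phase, while decoding works purely on in-memory bytes.
   
   ```scala
   import java.awt.image.BufferedImage
   import java.io.ByteArrayInputStream
   import java.nio.file.{Files, Path}
   import javax.imageio.ImageIO
   import scala.util.Try
   
   // Hypothetical helper for illustration; not the ImageFileFormat code.
   object TwoPhaseRead {
     def readThenDecode(path: Path): Option[BufferedImage] = {
       // Phase 1: real filesystem I/O. Any IOException here propagates to the caller.
       val bytes: Array[Byte] = Files.readAllBytes(path)
       // Phase 2: decoding from memory. A non-fatal failure here means the bytes are
       // not a usable image, so it becomes None. ImageIO.read can also return null
       // when no registered reader claims the data, hence the Option(_) wrap.
       Try(ImageIO.read(new ByteArrayInputStream(bytes))).toOption.flatMap(Option(_))
     }
   }
   ```
   
   With this split, a missing or unreadable file still fails loudly, and only undecodable content is treated as an invalid image.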




[GitHub] [spark] AmplabJenkins commented on pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937881357


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48458/
   




[GitHub] [spark] HyukjinKwon commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937671010


   Yeah, actually that's what I thought too. That's probably better.




[GitHub] [spark] WeichenXu123 commented on a change in pull request #34187: [SPARK-29871][ML] Catch all exceptions for handling invalid images in image source

Posted by GitBox <gi...@apache.org>.
WeichenXu123 commented on a change in pull request #34187:
URL: https://github.com/apache/spark/pull/34187#discussion_r724113748



##########
File path: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala
##########
@@ -133,9 +133,7 @@ object ImageSchema {
     val img = try {
       ImageIO.read(new ByteArrayInputStream(bytes))
     } catch {
-      // Catch runtime exception because `ImageIO` may throw unexpected `RuntimeException`.
-      // But do not catch the declared `IOException` (regarded as FileSystem failure)
-      case _: RuntimeException => null
+      case _: Throwable => null

Review comment:
       Let's add a comment saying that, because we read from in-memory bytes, no real IO exceptions will happen, so we can treat all exceptions as invalid-image exceptions. Also mention that `IIOException` may be raised when we hit an invalid image.
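   
   A commented version along these lines might look like the sketch below. This is an illustration only, not the final `ImageSchema` code: `DecodeSketch` and `decodeOrNone` are hypothetical names, the result is an `Option` instead of `null`, and `NonFatal` is used instead of the `Throwable` in the diff so that JVM errors such as `OutOfMemoryError` still propagate.
   
   ```scala
   import java.awt.image.BufferedImage
   import java.io.ByteArrayInputStream
   import javax.imageio.ImageIO
   import scala.util.control.NonFatal
   
   // Hypothetical helper for illustration; not the actual ImageSchema code.
   object DecodeSketch {
     def decodeOrNone(bytes: Array[Byte]): Option[BufferedImage] = {
       try {
         // ImageIO.read returns null when no registered reader claims the data,
         // so wrap the result in Option to cover that case as well.
         Option(ImageIO.read(new ByteArrayInputStream(bytes)))
       } catch {
         // The bytes are already in memory, so an exception here (for example
         // javax.imageio.IIOException) indicates invalid image data rather than
         // a filesystem failure. Treat it as an invalid image.
         case NonFatal(_) => None
       }
     }
   }
   ```
   
   For example, `DecodeSketch.decodeOrNone("not an image".getBytes("UTF-8"))` yields `None`, while the bytes of a valid PNG yield a defined `BufferedImage`.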






[GitHub] [spark] SparkQA commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935285977


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48375/
   




[GitHub] [spark] HyukjinKwon edited a comment on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
HyukjinKwon edited a comment on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-937360888


   Let me get this in in a few more days if there are no more comments. I believe this PR fixes the source to follow the original intention (reading a malformed file permissively so that the result shows which image is malformed, or dropping it when the `dropInvalid` option is on), and it won't silently ignore exceptions from the filesystem, since the data has already been read at this point.
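   
   As a rough model of that intention, assuming my reading of the `dropInvalid` option is right: `ImageRow` and `DropInvalidSketch` below are hypothetical stand-ins in plain Scala, not Spark's actual row schema or `ImageFileFormat` logic.
   
   ```scala
   // Hypothetical model of one image-source row; not Spark's actual schema.
   final case class ImageRow(origin: String, data: Option[Array[Byte]])
   
   object DropInvalidSketch {
     // `decoded` stands in for the result of decoding the bytes read from `origin`:
     // Some(pixels) for a valid image, None for a malformed one.
     def rowsFor(origin: String, decoded: Option[Array[Byte]], dropInvalid: Boolean): Seq[ImageRow] =
       decoded match {
         // Valid image: always emitted.
         case Some(pixels) => Seq(ImageRow(origin, Some(pixels)))
         // Malformed image with dropInvalid on: no row at all.
         case None if dropInvalid => Seq.empty
         // Malformed image with dropInvalid off: a placeholder row that still
         // carries the origin, so the user can see which file was bad.
         case None => Seq(ImageRow(origin, None))
       }
   }
   ```
   
   Either way, a malformed file no longer aborts the whole job; it is either skipped or reported as a row without image data.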




[GitHub] [spark] srowen commented on pull request #34187: [SPARK-29871][ML] Catch IIOException for signaling run-time failure of reading in image source

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #34187:
URL: https://github.com/apache/spark/pull/34187#issuecomment-935162666


   Tough one. I assume we want to return 'null' when trying to read an invalid or unsupported file, but fail outright in case of an I/O problem. This sounds like a type of "I/O problem", but it does appear to arise when the data was read just fine but isn't supported or doesn't make sense: https://github.com/frohoff/jdk8u-dev-jdk/blob/master/src/share/classes/com/sun/imageio/plugins/jpeg/JPEGImageReader.java#L1068 So yes, this seems reasonable.
   
   Yes, it's weird that it happens only sometimes in a test, though. Maybe a corrupted checkout on a worker?
   
   After this, we'd unexpectedly get a null for some image in this test. Is that just irrelevant for the test?
   

