Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/09/07 11:14:44 UTC

[GitHub] [spark] ELHoussineT opened a new pull request, #37817: Avoid Numpy deprecation warning

ELHoussineT opened a new pull request, #37817:
URL: https://github.com/apache/spark/pull/37817

   Using `np.bool` generates this warning: 
   
   ```
   UserWarning: toPandas attempted Arrow optimization because 'spark.sql.execution.arrow.pyspark.enabled' is set to true, but has reached the error below and can not continue. Note that 'spark.sql.execution.arrow.pyspark.fallback.enabled' does not have an effect on failures in the middle of computation.
   3070E                     `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
   3071E                   Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
   ```
   
   
   ### What changes were proposed in this pull request?
   
   Use `bool` instead of `np.bool`, because `np.bool` is deprecated as of NumPy 1.20 (see: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations)
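
A minimal sketch of why the swap is behavior-preserving, assuming only that NumPy is installed (the array `flags` is an illustrative name, not something from the patch): the builtin `bool` and the NumPy scalar type `np.bool_` resolve to the same dtype, so code that passes `bool` gets the same array type as before.

```python
import numpy as np

# The builtin `bool` and NumPy's scalar type `np.bool_` map to one dtype,
# so passing `bool` where `np.bool` was used selects the same array type.
assert np.dtype(bool) == np.dtype(np.bool_)

# Illustrative data only: casting integers with the builtin `bool`.
flags = np.array([1, 0, 2]).astype(bool)
print(flags.dtype)     # bool
print(flags.tolist())  # [True, False, True]
```

The check deliberately avoids touching `np.bool` itself, so it should run the same way whether or not the installed NumPy still provides that name.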
   
   
   ### Why are the changes needed?
   `np.bool` is deprecated as of NumPy 1.20: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations.
   
   
   ### Does this PR introduce _any_ user-facing change?
   The warning will be suppressed
   
   
   ### How was this patch tested?
   As per Numpy's recommendation. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ELHoussineT commented on pull request #37817: [SPARK-40376] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
ELHoussineT commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1239433373

   > Seems fine, though is this just an alias for previous versions of numpy that are currently supported too?
   
   Correct. 




[GitHub] [spark] itholic commented on pull request #37817: [SPARK-40376] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
itholic commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1240140742

   Would you check the "Workflow run detection failed" message in https://github.com/apache/spark/pull/37817/checks?check_run_id=8226916282 and enable GitHub Actions?




[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1239398099

   Seems fine, though is this just an alias for previous versions of numpy that are currently supported too?




[GitHub] [spark] srowen closed pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
srowen closed pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning
URL: https://github.com/apache/spark/pull/37817




[GitHub] [spark] joaoleveiga commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "joaoleveiga (via GitHub)" <gi...@apache.org>.
joaoleveiga commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1439944324

   Hello all. Is this only going to be released in PySpark 3.4?
   
   I looked at the branch-3.3 code and failed to see this change, if I saw correctly.
   
   Thanks




[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1444561437

   Well, I think we're talking about numpy 1.20 here, not >1.20. You're correct that you therefore could not use the latest versions of numpy with Spark 3.3, but they would work with 3.4. If that presents a significant problem during the lifetime of Spark 3.3, sure, I think that's a decent argument to back-port. Do you know which version of numpy actually removed this? If it is not a recent removal, yeah, I think we should back-port this simple change.




[GitHub] [spark] itholic commented on pull request #37817: [SPARK-40376] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
itholic commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1240145197

   The PR description usually starts with "What changes were proposed in this pull request?"
   
   So, can we put the description
   
   """
   Using np.bool generates this warning:
   
   ```
   UserWarning: toPandas attempted Arrow optimization because 'spark.sql.execution.arrow.pyspark.enabled' is set to true, but has reached the error below and can not continue. Note that 'spark.sql.execution.arrow.pyspark.fallback.enabled' does not have an effect on failures in the middle of computation.
   3070E                     `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
   3071E                   Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
   ```
   """
   
   into "What changes were proposed in this pull request?" ??
   
   e.g.
   
   ### What changes were proposed in this pull request?
   
   Using np.bool generates this warning:
   
   ```
   UserWarning: toPandas attempted Arrow optimization because 'spark.sql.execution.arrow.pyspark.enabled' is set to true, but has reached the error below and can not continue. Note that 'spark.sql.execution.arrow.pyspark.fallback.enabled' does not have an effect on failures in the middle of computation.
   3070E                     `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
   3071E                   Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
   ```
   
   Use bool instead of np.bool as np.bool will be deprecated (see: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations)
   
   ### Why are the changes needed?
   
   ...
   




[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1440141233

   It's in 3.4, not 3.3, yes: https://github.com/apache/spark/blob/v3.4.0-rc1/python/pyspark/sql/pandas/conversion.py#L301
   See the JIRA




[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1244796187

   Merged to master




[GitHub] [spark] itholic commented on pull request #37817: [SPARK-40376] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
itholic commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1240145633

   Looks fine otherwise, thanks for your first contribution!




[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1444533942

   This is just a deprecation warning, not an error, right? I don't see a particular urgency here.
   I don't think this is related to Databricks, particularly, either - Databricks can do what it likes with patches, etc. It will have a runtime based on 3.4 shortly after it's released.




[GitHub] [spark] AmplabJenkins commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1241842534

   Can one of the admins verify this patch?




[GitHub] [spark] itholic commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
itholic commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1244825691

   Cool !




[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1243681870

   Ping @ELHoussineT 




[GitHub] [spark] itholic commented on pull request #37817: [SPARK-40376] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
itholic commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1240139388

   Can we add a `[PYTHON]` tag for the title ?
   
   Also check the https://spark.apache.org/contributing.html out when you find some time.




[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1444598032

   Looks like 1.22 removed it actually. That's still not recent. Yeah I think this is worth back porting




[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "srowen (via GitHub)" <gi...@apache.org>.
srowen commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1445157668

   Also merged to 3.3




[GitHub] [spark] aimtsou commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "aimtsou (via GitHub)" <gi...@apache.org>.
aimtsou commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1445189813

   Thank you @srowen, really appreciated




[GitHub] [spark] aimtsou commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "aimtsou (via GitHub)" <gi...@apache.org>.
aimtsou commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1444586606

   Yes, we agree that users can limit their numpy installation to < 1.20.0 if they use Spark 3.3.
   
   I will have to check and test the different versions, but according to the numpy notes I believe it dates from numpy 1.20.0 (see [1](https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations), [2](https://github.com/numpy/numpy/pull/14882)). I will have to verify it to be sure, though.
   
   numpy 1.20.0 was released in 01/2021, which makes it 2 years old, but the final decision is up to you.




[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1241962781

   Oh, remove the type ignore comment:
   
   ```
   annotations failed mypy checks:
   python/pyspark/sql/pandas/conversion.py:298: error: unused "type: ignore" comment
   Found 1 error in 1 file (checked 339 source files)
   ```
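
A rough sketch of what the mypy message is asking for. The function below is a hypothetical, heavily simplified stand-in for the PySpark helper named in the error (its name, its string-based argument, and the `"double"` branch are assumptions made for illustration): once the return value is the builtin `bool`, there is no attribute lookup on `numpy` left for mypy to complain about, so the old `# type: ignore` comment has nothing to suppress.

```python
from typing import Optional, Type

import numpy as np


def corrected_pandas_type(spark_type_name: str) -> Optional[Type]:
    """Hypothetical, simplified stand-in for the real type-correction helper."""
    if spark_type_name == "boolean":
        # Before the patch, this branch returned the deprecated alias and
        # carried an ignore comment for mypy's attr-defined check.
        # The builtin needs no such comment, so it is simply dropped.
        return bool
    if spark_type_name == "double":
        return np.float64
    # Types that need no correction.
    return None


print(corrected_pandas_type("boolean"))  # <class 'bool'>
```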




[GitHub] [spark] itholic commented on pull request #37817: [SPARK-40376] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
itholic commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1240141787

   I think `How was this patch tested?` in the PR description should describe how we verify this patch within the Apache Spark code base.
   
   In this case, we can simply mention something like `Using the existing test`, for example.




[GitHub] [spark] ELHoussineT commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
ELHoussineT commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1243834129

   @srowen Sorry for the late reply. 
   
   Updated, let's see if it will go through. 
   




[GitHub] [spark] aimtsou commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "aimtsou (via GitHub)" <gi...@apache.org>.
aimtsou commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1444515612

   @srowen: This is causing an issue, though:
   
   If you try to build your own Docker image of Spark including pyspark while staying compliant with Databricks, you will observe that Databricks Runtime 12.1 and 12.2 (which is currently in beta) both officially support Spark only up to 3.3.1 (while the current version is 3.3.2). 
   
   None of the LTS versions in the [support matrix](https://docs.databricks.com/release-notes/runtime/releases.html) are EOLed, and numpy 1.20.0 was released in 01/2021, which means that most Spark-compliant versions carry this bug. If you try to use Pandas via toPandas(), you end up with the numpy error and are consequently blocked from upgrading your Spark versions. 
   
   Is there any chance of back-porting this commit into previous pyspark versions?




[GitHub] [spark] aimtsou commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "aimtsou (via GitHub)" <gi...@apache.org>.
aimtsou commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1444549904

   Hi @srowen,
   
   Thank you for your very prompt reply.
   
   That is not quite right about the error: after 1.20.0 it raises an AttributeError:
   ```
             if attr in __former_attrs__:
   >           raise AttributeError(__former_attrs__[attr])
   E           AttributeError: module 'numpy' has no attribute 'bool'.
   E           `np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
   E           The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
   E               https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
   
   /usr/local/lib/python3.9/site-packages/numpy/__init__.py:305: AttributeError
   ```
   
   This is the end of the traceback, raised after calling toPandas() from my tests:
   
   ```
   /usr/local/lib/python3.9/site-packages/<my-pkg>/unit/test_case_runner.py:26: in run_test
       self.assert_df_are_equal(expected_df, actual)
   /usr/local/lib/python3.9/site-packages/<my-pkg>/unit/test_case_runner.py:58: in assert_df_are_equal
       self.handler.compare_df(result, expected, config=self.compare_config)
   /usr/local/lib/python3.9/site-packages/<my-pkg>/spark_test_handler.py:38: in compare_df
       actual_pd = actual.toPandas().sort_values(by=sort_columns, ignore_index=True)
   /usr/local/lib/python3.9/site-packages/pyspark/sql/pandas/conversion.py:216: in toPandas
       pandas_type = PandasConversionMixin._to_corrected_pandas_type(field.dataType)
   /usr/local/lib/python3.9/site-packages/pyspark/sql/pandas/conversion.py:298: in _to_corrected_pandas_type
       return np.bool  # type: ignore[attr-defined]
   ```
   
   And the error does not come from the numpy in the system but from the `np.bool` reference inside pyspark.
   
   I agree with the comments on Databricks, but as shown above this does not work on Spark 3.3.1, regardless of whether you want to be compliant with Databricks.
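
Whether an unpatched PySpark warns or fails at that line depends on how the installed NumPy treats the name `np.bool`. A small probe along the following lines can tell the cases apart without involving Spark; `np_bool_status` is a hypothetical helper name, and the three labels are this sketch's own, not anything NumPy or PySpark defines.

```python
import warnings

import numpy as np


def np_bool_status() -> str:
    """Classify how the installed NumPy treats the name `np.bool`."""
    with warnings.catch_warnings():
        # Releases that still ship the deprecated alias warn on access.
        warnings.simplefilter("ignore")
        alias = getattr(np, "bool", None)
    if alias is None:
        # Attribute access fails, as in the traceback above.
        return "missing"
    if alias is bool:
        # Legacy alias for the builtin `bool`.
        return "builtin alias"
    # The name exists but refers to something other than the builtin.
    return "other"


print(np.__version__, np_bool_status())
```

Only the `"missing"` case corresponds to the AttributeError shown in the traceback; the patched code returns the builtin `bool` and so does not depend on the outcome at all.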




[GitHub] [spark] joaoleveiga commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by "joaoleveiga (via GitHub)" <gi...@apache.org>.
joaoleveiga commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1445218024

   > Also merged to 3.3
   
   Thank you so much! Here I was assuming I would pick up this thread on Monday, but you delivered it 😄 
   
   Cheers




[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1241324000

   Hm, try pushing an empty commit?




[GitHub] [spark] ELHoussineT commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

Posted by GitBox <gi...@apache.org>.
ELHoussineT commented on PR #37817:
URL: https://github.com/apache/spark/pull/37817#issuecomment-1240446402

   @itholic 
   
   > Can we add a [PYTHON] tag for the title ?
   
   Done 
   
   > Also check the https://spark.apache.org/contributing.html out when you find some time.
   
   I am sorry, you are right. 
   
   > Would you check the "Workflow run detection failed" in https://github.com/apache/spark/pull/37817/checks?check_run_id=8226916282 for enabling Github Actions ??
   
   I did that and the build in my fork went through: https://github.com/ELHoussineT/spark/actions/workflows/build_main.yml
   But the actions in the PR are still red. Thoughts? 
   
   > So, can we put the description into "What changes were proposed in this pull request?" ??
   
   Done
   
   > Thanks for the your first contribution to Apache Spark!
   
   It's a tiny one! You're welcome :)
   
   

