You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "AlenkaF (via GitHub)" <gi...@apache.org> on 2023/03/21 13:09:11 UTC

[GitHub] [arrow] AlenkaF opened a new pull request, #34658: GH-21761: [Python] accept pyarrow values / scalars in constructor functions ?

AlenkaF opened a new pull request, #34658:
URL: https://github.com/apache/arrow/pull/34658

   ### Rationale for this change
   
   Currently, `pyarrow.array` doesn't accept list of pyarrow Scalars and this PR adds a check to allow that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] danepitkin commented on a diff in pull request #34658: GH-21761: [Python] accept pyarrow values / scalars in constructor functions ?

Posted by "danepitkin (via GitHub)" <gi...@apache.org>.
danepitkin commented on code in PR #34658:
URL: https://github.com/apache/arrow/pull/34658#discussion_r1172794081


##########
python/pyarrow/array.pxi:
##########
@@ -318,11 +318,29 @@ def array(object obj, type=None, mask=None, size=None, from_pandas=None,
             if pandas_api.have_pandas:
                 values, type = pandas_api.compat.get_datetimetz_type(
                     values, obj.dtype, type)
-            result = _ndarray_to_array(values, mask, type, c_from_pandas, safe,
-                                       pool)
+            try:
+                result = _ndarray_to_array(values, mask, type, c_from_pandas, safe,
+                                           pool)
+            except ArrowInvalid as err:
+                if "Scalar" in str(err):

Review Comment:
   Relying on a substring within the error message seems a bit hacky. Is there any other way to check by chance?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] jorisvandenbossche commented on pull request #34658: GH-21761: [Python] accept pyarrow values / scalars in constructor functions ?

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on PR #34658:
URL: https://github.com/apache/arrow/pull/34658#issuecomment-1486466657

   This seems to work? 
   It won't work for generators of pyarrow scalars (because they will already be consumed the first time we call `_sequence_to_array`. I am not sure we have to worry about that case (since this PR just adding some extra convenience anyway). But we should maybe at least test those cases to ensure it errors (and eg not silently return an empty array)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] AlenkaF commented on a diff in pull request #34658: GH-21761: [Python] accept pyarrow values / scalars in constructor functions ?

Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF commented on code in PR #34658:
URL: https://github.com/apache/arrow/pull/34658#discussion_r1172823110


##########
python/pyarrow/array.pxi:
##########
@@ -318,11 +318,29 @@ def array(object obj, type=None, mask=None, size=None, from_pandas=None,
             if pandas_api.have_pandas:
                 values, type = pandas_api.compat.get_datetimetz_type(
                     values, obj.dtype, type)
-            result = _ndarray_to_array(values, mask, type, c_from_pandas, safe,
-                                       pool)
+            try:
+                result = _ndarray_to_array(values, mask, type, c_from_pandas, safe,
+                                           pool)
+            except ArrowInvalid as err:
+                if "Scalar" in str(err):

Review Comment:
   Oh yeah, it's very hacky! =)
   I have tried multiple ways of catching the specific `ArrowInvalid` error as others are expected here also, but all of them have some issue. Can try to remember them all and list them, but Joris advised me to go into the Python C++ code and use `ConvertToSequenceAndInferSize` so this will be changed. I should convert this PR into a draft to make this clear.
   
   Thank you for reviewing!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] AlenkaF commented on pull request #34658: GH-21761: [Python] accept pyarrow values / scalars in constructor functions ?

Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF commented on PR #34658:
URL: https://github.com/apache/arrow/pull/34658#issuecomment-1486505102

   > This seems to work?
   
   Yes, it is working for sequences with `pa.Scalars`.
   
   > It won't work for generators of pyarrow scalars (because they will already be consumed the first time we call `_sequence_to_array`.
   
   Oh yes, that is true (missed it).
   
   > I am not sure we have to worry about that case (since this PR just adding some extra convenience anyway). But we should maybe at least test those cases to ensure it errors (and eg not silently return an empty array)
   
   Agree, will add a test 👍 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] AlenkaF commented on pull request #34658: GH-21761: [Python] accept pyarrow values / scalars in constructor functions ?

Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF commented on PR #34658:
URL: https://github.com/apache/arrow/pull/34658#issuecomment-1596929749

   Closing this PR in favour of https://github.com/apache/arrow/pull/36162.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34658: GH-21761: [Python] accept pyarrow values / scalars in constructor functions ?

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34658:
URL: https://github.com/apache/arrow/pull/34658#issuecomment-1477811009

   * Closes: #21761


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #34658: GH-21761: [Python] accept pyarrow values / scalars in constructor functions ?

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on code in PR #34658:
URL: https://github.com/apache/arrow/pull/34658#discussion_r1150255232


##########
python/pyarrow/tests/test_array.py:
##########
@@ -3409,6 +3409,17 @@ def test_array_accepts_pyarrow_array():
     assert arr == result
 
 
+def test_array_accepts_pyarrow_scalar():

Review Comment:
   In `test_convert_builtin.py`, there is a `parametrize_with_collections_types` that can be used to test with multiple sequence types. 
   (maybe move the test there to make it easier to use that, while scalars are not "built-in" python objects, it's still converting from a built-in sequence type of scalars ... )



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] AlenkaF closed pull request #34658: GH-21761: [Python] accept pyarrow values / scalars in constructor functions ?

Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF closed pull request #34658: GH-21761: [Python] accept pyarrow values / scalars in constructor functions ?
URL: https://github.com/apache/arrow/pull/34658


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org