
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/21 09:39:09 UTC

[GitHub] [arrow] AlenkaF opened a new pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

AlenkaF opened a new pull request #12010:
URL: https://github.com/apache/arrow/pull/12010


   Add `to_pylist` and `from_pylist` to `Table` and `RecordBatch`.
   
   `to_pylist` returns a list of dicts.
   `from_pylist` builds a Table/RecordBatch from a list of dicts (the parameter is named `mapping` in the code).
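
To make the two shapes concrete, here is a minimal plain-Python sketch of the conversion between a column-oriented dict and a list of row dicts. It does not call pyarrow; `columns_to_rows` and `rows_to_columns` are hypothetical helper names used only for illustration, and the row-per-dict layout is the one the review below converges on.

```python
def columns_to_rows(columns):
    """Turn {'name': [v0, v1, ...]} into [{'name': v0}, {'name': v1}, ...]."""
    names = list(columns)
    num_rows = len(columns[names[0]]) if names else 0
    return [{name: columns[name][i] for name in names}
            for i in range(num_rows)]


def rows_to_columns(rows):
    """Turn a list of row dicts into a dict of column lists.

    Column names come from the first row; a key missing from a
    later row is filled with None.
    """
    names = list(rows[0]) if rows else []
    return {name: [row.get(name) for row in rows] for name in names}


columns = {"int": [1, 2], "str": ["a", "b"]}
rows = columns_to_rows(columns)          # one dict per row
round_trip = rows_to_columns(rows)       # back to one list per column
```

The round trip only preserves the data when every row carries the same keys as the first one.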


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] jorisvandenbossche commented on pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#issuecomment-1009970810


   Thanks @AlenkaF !



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r781224600



##########
File path: python/pyarrow/table.pxi
##########
@@ -2442,6 +2589,46 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from list of rows / dictionaries.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts of rows
+        A mapping of strings to row values.
+    schema : Schema, default None
+        If not passed, will be inferred from the first row of the
+        mapping values.
+    metadata : dict or Mapping, default None
+        Optional metadata for the schema (if inferred).
+
+    Returns
+    -------
+    Table/RecordBatch
+    """
+
+    arrays = []
+    if schema is None:
+        names = []
+        if mapping:
+            names = list(mapping[0].keys())
+        for n in names:
+            v = [i[n] if n in i else None for i in mapping]
+            arrays.append(v)
+        return cls.from_arrays(arrays, names, metadata=metadata)
+    else:
+        if isinstance(schema, Schema):
+            for n in schema.names:
+                v = [i[n] if n in i else None for i in mapping]

Review comment:
       ```suggestion
                   v = [row[n] if n in row else None for row in mapping]
   ```





[GitHub] [arrow] ursabot edited a comment on pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

ursabot edited a comment on pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#issuecomment-1010075009


   Benchmark runs are scheduled for baseline = 7a0141a8cc867e5b406ed97e5decc227923eb3f5 and contender = ccffcea3fd383c448aa9da292baf2d0805ecab4d. ccffcea3fd383c448aa9da292baf2d0805ecab4d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/424d64b6b5e34533bc7f1c63332e9d3c...b05a94a3e4ed4e4ba18a4eb9b2dedc9b/)
   [Failed] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/697f12ce4d4f48a5a2b79bb42c7fbfb4...b512989219cc4aa8a8b220f739994144/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/0bd1e5b6de5c4a08b2109f8f98951dfd...a254108559fa4cc88bca8b69c9cd951f/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   



[GitHub] [arrow] amol- commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

amol- commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r773251719



##########
File path: python/pyarrow/table.pxi
##########
@@ -1016,6 +1058,21 @@ cdef class RecordBatch(_PandasConvertible):
             entries.append((name, column))
         return ordered_dict(entries)
 
+    def to_pylist(self):
+        """
+        Convert the RecordBatch to a list of dictionaries.
+
+        Returns
+        -------
+        list
+        """
+        entries = []
+        for i in range(self.batch.num_columns()):
+            name = bytes(self.batch.column_name(i)).decode('utf8')
+            column = self[i].to_pylist()
+            entries.append({name: column})

Review comment:
       Given that this might involve thousands of columns, I think there is value in building it with a list comprehension; it is generally faster than multiple appends.
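
A small plain-Python sketch of the suggestion: the two functions below build the same list, once with repeated appends and once with a comprehension. `columns` is a hypothetical stand-in for the batch's (name, values) pairs, and no timing is measured here.

```python
def build_with_append(columns):
    # Mirrors the loop in the quoted diff: one append per column.
    entries = []
    for name, values in columns:
        entries.append({name: values})
    return entries


def build_with_comprehension(columns):
    # Same result, expressed as a single list comprehension.
    return [{name: values} for name, values in columns]


columns = [("int", [1, 2]), ("str", ["a", "b"])]
same_result = build_with_append(columns) == build_with_comprehension(columns)
```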





[GitHub] [arrow] amol- commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

amol- commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r778175902



##########
File path: python/pyarrow/table.pxi
##########
@@ -2442,6 +2602,46 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from list of dictionary of rows.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts of rows
+        A mapping of strings to row values.
+    schema : Schema, default None
+        If not passed, will be inferred from the Mapping values.
+    metadata : dict or Mapping, default None
+        Optional metadata for the schema (if inferred).
+
+    Returns
+    -------
+    Table/RecordBatch
+    """
+
+    arrays = []
+    if schema is None:
+        names = []
+        if mapping:
+            names = list(mapping[0].keys())
+        for n in names:
+            v = [i[n] if n in i else None for i in mapping]
+            arrays.append(asarray(v))
+        return cls.from_arrays(arrays, names, metadata=metadata)
+    else:
+        if isinstance(schema, Schema):
+            for n in schema.names:
+                v = [i[n] if n in i else None for i in mapping]
+                n_type = schema.types[schema.get_field_index(n)]
+                arrays.append(asarray(v, type=n_type))

Review comment:
       I think this will actually crash when `v` is `None`: `asarray` seems to crash when invoked as `asarray(None)`.





[GitHub] [arrow] github-actions[bot] commented on pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#issuecomment-998622442


   https://issues.apache.org/jira/browse/ARROW-6001



[GitHub] [arrow] amol- commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

amol- commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r778812702



##########
File path: python/pyarrow/table.pxi
##########
@@ -2442,6 +2602,46 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from list of dictionary of rows.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts of rows
+        A mapping of strings to row values.
+    schema : Schema, default None
+        If not passed, will be inferred from the Mapping values.
+    metadata : dict or Mapping, default None
+        Optional metadata for the schema (if inferred).
+
+    Returns
+    -------
+    Table/RecordBatch
+    """
+
+    arrays = []
+    if schema is None:
+        names = []
+        if mapping:
+            names = list(mapping[0].keys())
+        for n in names:
+            v = [i[n] if n in i else None for i in mapping]
+            arrays.append(asarray(v))
+        return cls.from_arrays(arrays, names, metadata=metadata)
+    else:
+        if isinstance(schema, Schema):
+            for n in schema.names:
+                v = [i[n] if n in i else None for i in mapping]
+                n_type = schema.types[schema.get_field_index(n)]
+                arrays.append(asarray(v, type=n_type))

Review comment:
       Right, my misunderstanding was that we could end up with `v=None`, but rereading the code block I see that we would actually end up with `v=[None, None, None]` which is fine.
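
The point can be checked in plain Python by running the same comprehension as the quoted diff on sample data. `rows` and `schema_names` are made-up inputs for illustration; a name that appears in no row yields a list of `None` values rather than a bare `None`.

```python
rows = [{"a": 1}, {"a": 2}, {"a": 3}]
schema_names = ["a", "b"]  # "b" does not appear in any row

# Same expression as in the diff, with `row` in place of `i`.
columns = {
    n: [row[n] if n in row else None for row in rows]
    for n in schema_names
}
missing_column = columns["b"]  # a list with one None per row
```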





[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r781008612



##########
File path: python/pyarrow/table.pxi
##########
@@ -2442,6 +2602,46 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from list of dictionary of rows.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts of rows
+        A mapping of strings to row values.
+    schema : Schema, default None
+        If not passed, will be inferred from the Mapping values.

Review comment:
       Maybe you can specify here that it will be inferred from the _first_ row (also in the actual user-facing docstrings above).
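
A minimal sketch of why the "first row" wording matters, following the `schema is None` branch of the quoted diff in plain Python. `infer_names` is a hypothetical helper, and the merged implementation may differ; the sketch only shows that a key first seen in a later row is not picked up.

```python
def infer_names(rows):
    """Take column names from the first row only, as the quoted diff does."""
    return list(rows[0].keys()) if rows else []


rows = [{"a": 1}, {"a": 2, "b": "x"}]  # "b" first appears in the second row
names = infer_names(rows)
columns = [[row[n] if n in row else None for row in rows] for n in names]
```

With this input only column `a` is built, so the value stored under `b` is silently left out.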

##########
File path: python/pyarrow/table.pxi
##########
@@ -671,13 +671,61 @@ cdef class RecordBatch(_PandasConvertible):
         Returns
         -------
         RecordBatch
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> pydict = {'int': [1, 2], 'str': ['a', 'b']}
+        >>> pa.RecordBatch.from_pydict(pydict)
+        pyarrow.RecordBatch
+        int: int64
+        str: string
         """
 
         return _from_pydict(cls=RecordBatch,
                             mapping=mapping,
                             schema=schema,
                             metadata=metadata)
 
+    @staticmethod
+    def from_pylist(mapping, schema=None, metadata=None):
+        """
+        Construct a RecordBatch from list of dictionary of rows.

Review comment:
       ```suggestion
           Construct a RecordBatch from list of rows / dictionaries.
   ```
   
   Each dictionary represents a row, so "dictionary of rows" sounds a bit strange ("dictionary of row values" could, strictly speaking, be more correct, but I still don't find that very clear).

##########
File path: python/pyarrow/table.pxi
##########
@@ -1016,6 +1064,28 @@ cdef class RecordBatch(_PandasConvertible):
             entries.append((name, column))
         return ordered_dict(entries)
 
+    def to_pylist(self, index=None):
+        """
+        Convert the RecordBatch to a list of dictionaries of rows.
+
+        Parameters
+        ----------
+        index: list
+            A list of column names to index.

Review comment:
       Is this `index` keyword needed? (It selects a subset of columns to export.) E.g. `to_pydict` doesn't have it (we should probably add it there as well if we want to keep it).
   
   Nowadays we have the `select()` method, so it is relatively straightforward to do `table.select([...]).to_pylist()` instead of `table.to_pylist(index=[...])`. 
   Or if we keep it, I would call it something other than `index`, for example `columns=[..]`.

##########
File path: python/pyarrow/table.pxi
##########
@@ -2442,6 +2602,46 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from list of dictionary of rows.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts of rows
+        A mapping of strings to row values.
+    schema : Schema, default None
+        If not passed, will be inferred from the Mapping values.
+    metadata : dict or Mapping, default None
+        Optional metadata for the schema (if inferred).
+
+    Returns
+    -------
+    Table/RecordBatch
+    """
+
+    arrays = []
+    if schema is None:
+        names = []
+        if mapping:
+            names = list(mapping[0].keys())
+        for n in names:
+            v = [i[n] if n in i else None for i in mapping]
+            arrays.append(asarray(v))
+        return cls.from_arrays(arrays, names, metadata=metadata)
+    else:
+        if isinstance(schema, Schema):
+            for n in schema.names:
+                v = [i[n] if n in i else None for i in mapping]
+                n_type = schema.types[schema.get_field_index(n)]
+                arrays.append(asarray(v, type=n_type))

Review comment:
       The `asarray` with the type from the schema also gets done inside `from_arrays`, so it might be unnecessary to do it here as well





[GitHub] [arrow] amol- commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

amol- commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r773255388



##########
File path: python/pyarrow/table.pxi
##########
@@ -1838,6 +1953,36 @@ cdef class Table(_PandasConvertible):
 
         return ordered_dict(entries)
 
+    def to_pylist(self):
+        """
+        Convert the Table to a list of dictionaries.
+
+        Returns
+        -------
+        list
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> table = pa.table([
+        ...     pa.array([1, 2]),
+        ...     pa.array(["a", "b"])
+        ... ], names=["int", "str"])
+        >>> table.to_pylist()
+        [{'int': [1, 2]}, {'str': ['a', 'b']}]
+        """
+        cdef:
+            size_t i
+            size_t num_columns = self.table.num_columns()
+            list entries = []
+            ChunkedArray column
+
+        for i in range(num_columns):
+            column = self.column(i)
+            entries.append({self.field(i).name: column.to_pylist()})

Review comment:
       `self.itercolumns` should do the trick for you; the returned columns also have a `_name` attribute, so you don't need to look into the field.
   
   Lastly, I suggest we use a list comprehension to speed up the building.





[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r781223239



##########
File path: python/pyarrow/table.pxi
##########
@@ -1016,6 +1065,21 @@ cdef class RecordBatch(_PandasConvertible):
             entries.append((name, column))
         return ordered_dict(entries)
 
+    def to_pylist(self):
+        """
+        Convert the RecordBatch to a list of rows / dictionaries.
+
+        Returns
+        -------
+        list
+        """
+
+        pydict = self.to_pydict()
+        names = self.schema.names
+        pylist = [{column: pydict[column][row] if column in pydict else None for column in names}
+                  for row in range(self.num_rows)]

Review comment:
       ```suggestion
           pylist = [{column: pydict[column][row] for column in names} 
                     for row in range(self.num_rows)]
   ```
   
   I think this can be simplified now, as each `column` value of `names` is guaranteed to be in the `pydict`, since `names` is coming from the schema.
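
The equivalence can be sketched in plain Python. `pydict`, `names` and `num_rows` below are stand-ins for `self.to_pydict()`, `self.schema.names` and `self.num_rows`; the sketch assumes, as the comment does, that every name is a key of `pydict`.

```python
pydict = {"int": [1, 2], "str": ["a", "b"]}
names = list(pydict)   # stand-in for self.schema.names
num_rows = 2           # stand-in for self.num_rows

# Original form, with a guard that can never trigger here.
guarded = [{column: pydict[column][row] if column in pydict else None
            for column in names}
           for row in range(num_rows)]

# Suggested form without the guard.
simplified = [{column: pydict[column][row] for column in names}
              for row in range(num_rows)]
```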





[GitHub] [arrow] jorisvandenbossche closed pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche closed pull request #12010:
URL: https://github.com/apache/arrow/pull/12010


   



[GitHub] [arrow] ursabot commented on pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

ursabot commented on pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#issuecomment-1010075009


   Benchmark runs are scheduled for baseline = 7a0141a8cc867e5b406ed97e5decc227923eb3f5 and contender = ccffcea3fd383c448aa9da292baf2d0805ecab4d. ccffcea3fd383c448aa9da292baf2d0805ecab4d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/424d64b6b5e34533bc7f1c63332e9d3c...b05a94a3e4ed4e4ba18a4eb9b2dedc9b/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/697f12ce4d4f48a5a2b79bb42c7fbfb4...b512989219cc4aa8a8b220f739994144/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/0bd1e5b6de5c4a08b2109f8f98951dfd...a254108559fa4cc88bca8b69c9cd951f/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   



[GitHub] [arrow] AlenkaF commented on pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

AlenkaF commented on pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#issuecomment-1004884636


   I have corrected the code so that `pylist` is meant to be structured as a list of dicts, one dict per row. If the column name is missing from the schema or from the data, `None` is put in.
   
   If the schema is not given in `from_pylist()`, the keys from the first dictionary are used to define the names of the columns in the Table/RecordBatch.
   
   Also I saw that there are only a few examples added in the [docstrings](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table). I think they are a great way to understand the codebase as they are easy to locate and to use. I would be happy to add them as a part of the [Documentation Improvements](https://issues.apache.org/jira/browse/ARROW-13407) or as a separate JIRA issue.



[GitHub] [arrow] AlenkaF commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

AlenkaF commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r778695741



##########
File path: python/pyarrow/table.pxi
##########
@@ -2442,6 +2602,46 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from list of dictionary of rows.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts of rows
+        A mapping of strings to row values.
+    schema : Schema, default None
+        If not passed, will be inferred from the Mapping values.
+    metadata : dict or Mapping, default None
+        Optional metadata for the schema (if inferred).
+
+    Returns
+    -------
+    Table/RecordBatch
+    """
+
+    arrays = []
+    if schema is None:
+        names = []
+        if mapping:
+            names = list(mapping[0].keys())
+        for n in names:
+            v = [i[n] if n in i else None for i in mapping]
+            arrays.append(asarray(v))
+        return cls.from_arrays(arrays, names, metadata=metadata)
+    else:
+        if isinstance(schema, Schema):
+            for n in schema.names:
+                v = [i[n] if n in i else None for i in mapping]
+                n_type = schema.types[schema.get_field_index(n)]
+                arrays.append(asarray(v, type=n_type))

Review comment:
       The test for the case where schema names are missing from the pylist is added in this PR and it passes:
   https://github.com/apache/arrow/blob/19f212f4f7af0f7720d79a927c4850849da77678/python/pyarrow/tests/test_table.py#L1525-L1538





[GitHub] [arrow] jorisvandenbossche commented on pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#issuecomment-1009767550


   @AlenkaF there is also a linter error (probably due to my suggestions)



[GitHub] [arrow] AlenkaF commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

AlenkaF commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r781171676



##########
File path: python/pyarrow/table.pxi
##########
@@ -671,13 +671,61 @@ cdef class RecordBatch(_PandasConvertible):
         Returns
         -------
         RecordBatch
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> pydict = {'int': [1, 2], 'str': ['a', 'b']}
+        >>> pa.RecordBatch.from_pydict(pydict)
+        pyarrow.RecordBatch
+        int: int64
+        str: string
         """
 
         return _from_pydict(cls=RecordBatch,
                             mapping=mapping,
                             schema=schema,
                             metadata=metadata)
 
+    @staticmethod
+    def from_pylist(mapping, schema=None, metadata=None):
+        """
+        Construct a RecordBatch from list of dictionary of rows.

Review comment:
       Totally agree. +1 for the suggestion.





[GitHub] [arrow] amol- commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

amol- commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r774506470



##########
File path: python/pyarrow/table.pxi
##########
@@ -671,13 +671,55 @@ cdef class RecordBatch(_PandasConvertible):
         Returns
         -------
         RecordBatch
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> pydict = {'int': [1, 2], 'str': ['a', 'b']}
+        >>> pa.RecordBatch.from_pydict(pydict)
+        pyarrow.RecordBatch
+        int: int64
+        str: string
         """
 
         return _from_pydict(cls=RecordBatch,
                             mapping=mapping,
                             schema=schema,
                             metadata=metadata)
 
+    @staticmethod
+    def from_pylist(mapping, schema=None, metadata=None):
+        """
+        Construct a RecordBatch from Arrow arrays or columns.
+
+        Parameters
+        ----------
+        mapping : list of dicts or Mappings
+            A mapping of strings to Arrays or Python lists.
+        schema : Schema, default None
+            If not passed, will be inferred from the Mapping values.
+        metadata : dict or Mapping, default None
+            Optional metadata for the schema (if inferred).
+
+        Returns
+        -------
+        RecordBatch
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> pylist = [{'int': [1, 2]}, {'str': ['a', 'b']}]
+        >>> pa.RecordBatch.from_pylist(pylist)

Review comment:
       Agreed. As a user, I would personally find it more obvious that it accepts a list of rows when building a RecordBatch or Table.





[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r774453135



##########
File path: python/pyarrow/table.pxi
##########
@@ -671,13 +671,55 @@ cdef class RecordBatch(_PandasConvertible):
         Returns
         -------
         RecordBatch
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> pydict = {'int': [1, 2], 'str': ['a', 'b']}
+        >>> pa.RecordBatch.from_pydict(pydict)
+        pyarrow.RecordBatch
+        int: int64
+        str: string
         """
 
         return _from_pydict(cls=RecordBatch,
                             mapping=mapping,
                             schema=schema,
                             metadata=metadata)
 
+    @staticmethod
+    def from_pylist(mapping, schema=None, metadata=None):
+        """
+        Construct a RecordBatch from Arrow arrays or columns.
+
+        Parameters
+        ----------
+        mapping : list of dicts or Mappings
+            A mapping of strings to Arrays or Python lists.
+        schema : Schema, default None
+            If not passed, will be inferred from the Mapping values.
+        metadata : dict or Mapping, default None
+            Optional metadata for the schema (if inferred).
+
+        Returns
+        -------
+        RecordBatch
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> pylist = [{'int': [1, 2]}, {'str': ['a', 'b']}]
+        >>> pa.RecordBatch.from_pylist(pylist)

Review comment:
       This is the same example as for `from_pydict`, that's maybe a copy paste leftover?

##########
File path: python/pyarrow/table.pxi
##########
@@ -671,13 +671,55 @@ cdef class RecordBatch(_PandasConvertible):
         Returns
         -------
         RecordBatch
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> pydict = {'int': [1, 2], 'str': ['a', 'b']}
+        >>> pa.RecordBatch.from_pydict(pydict)
+        pyarrow.RecordBatch
+        int: int64
+        str: string
         """
 
         return _from_pydict(cls=RecordBatch,
                             mapping=mapping,
                             schema=schema,
                             metadata=metadata)
 
+    @staticmethod
+    def from_pylist(mapping, schema=None, metadata=None):
+        """
+        Construct a RecordBatch from Arrow arrays or columns.
+
+        Parameters
+        ----------
+        mapping : list of dicts or Mappings
+            A mapping of strings to Arrays or Python lists.
+        schema : Schema, default None
+            If not passed, will be inferred from the Mapping values.
+        metadata : dict or Mapping, default None
+            Optional metadata for the schema (if inferred).
+
+        Returns
+        -------
+        RecordBatch
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> pylist = [{'int': [1, 2]}, {'str': ['a', 'b']}]
+        >>> pa.RecordBatch.from_pylist(pylist)

Review comment:
       Actually, I see the difference now (this is a list with a separate dict per column instead of a single dict). But I think this is not exactly what we want here. Or at least, my expectation was to be able to handle an input like:
   
   ```
   pylist = [{'int': 1, 'str': 'a'}, {'int': 2, 'str': 'b'}]
   ```
   
   So also a list of dicts, but organized differently (one dict per row instead of one dict per column).
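
The two layouts discussed in this comment can be related with a small pure-Python sketch. The helper name `rows_to_columns` is hypothetical and not part of pyarrow; it assumes that column names are taken from the first row, and it only illustrates how a row-oriented list of dicts maps onto the column-oriented dict that `from_pydict` accepts.

```python
def rows_to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists.

    Hypothetical helper for illustration only; not a pyarrow function.
    """
    # Assumption of this sketch: the first row defines the column names.
    names = list(rows[0].keys()) if rows else []
    # Keys missing from a later row are filled with None.
    return {name: [row.get(name) for row in rows] for name in names}


# One dict per row, as requested in the review comment.
pylist = [{'int': 1, 'str': 'a'}, {'int': 2, 'str': 'b'}]

# One list per column, the shape used in the from_pydict example.
pydict = rows_to_columns(pylist)
print(pydict)  # {'int': [1, 2], 'str': ['a', 'b']}
```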





[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r781224257



##########
File path: python/pyarrow/table.pxi
##########
@@ -2442,6 +2589,46 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from list of rows / dictionaries.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts of rows
+        A mapping of strings to row values.
+    schema : Schema, default None
+        If not passed, will be inferred from the first row of the
+        mapping values.
+    metadata : dict or Mapping, default None
+        Optional metadata for the schema (if inferred).
+
+    Returns
+    -------
+    Table/RecordBatch
+    """
+
+    arrays = []
+    if schema is None:
+        names = []
+        if mapping:
+            names = list(mapping[0].keys())
+        for n in names:
+            v = [i[n] if n in i else None for i in mapping]

Review comment:
       ```suggestion
               v = [row[n] if n in row else None for row in mapping]
   ```
   
   (small suggestion that maybe makes it a bit more readable; in general I think one-letter variable names are best avoided, except for general counting iterator variables like `i` in a loop (but here `i` is not an integer count))
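
As a rough illustration of what the suggested expression does, the sketch below applies it in plain Python to rows with uneven keys. The helper `build_column` is hypothetical; the only parts taken from the diff are the comprehension itself and the choice to read column names from the first row.

```python
def build_column(mapping, name):
    """Collect one column from a list of row dicts (hypothetical helper)."""
    # Same expression as in the suggestion: a missing key becomes None.
    return [row[name] if name in row else None for row in mapping]


mapping = [
    {'int': 1, 'str': 'a'},
    {'int': 2},                              # 'str' is missing in this row
    {'int': 3, 'str': 'c', 'extra': True},   # 'extra' is not in the first row
]

# As in the diff, names are inferred from the first row only.
names = list(mapping[0].keys())
columns = {name: build_column(mapping, name) for name in names}
print(columns)  # {'int': [1, 2, 3], 'str': ['a', None, 'c']}
```

Under this inference rule, a key that first appears in a later row (here `'extra'`) is silently dropped, which is worth keeping in mind when no schema is passed.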





[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r781327109



##########
File path: python/pyarrow/table.pxi
##########
@@ -1844,6 +1967,30 @@ cdef class Table(_PandasConvertible):
 
         return ordered_dict(entries)
 
+    def to_pylist(self):
+        """
+        Convert the Table to a list of rows / dictionaries.
+
+        Returns
+        -------
+        list
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> table = pa.table([
+        ...     pa.array([1, 2]),
+        ...     pa.array(["a", "b"])
+        ... ], names=["int", "str"])
+        >>> table.to_pylist()
+        [{'int': 1, 'str': 'a'}, {'int': 2, 'str': 'b'}]
+        """
+        pydict = self.to_pydict()
+        names = self.schema.names
+        pylist = [{column: pydict[column][row] if column in pydict else None for column in names}

Review comment:
       Ah, I missed one: the same simplification can be done here as well.
   ```suggestion
           pylist = [{column: pydict[column][row] for column in names}
   ```
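
A plain-Python sketch of the simplified comprehension follows. It assumes, as seems to be the reasoning behind the suggestion, that `names` comes from the same schema that produced `pydict`, so every name is guaranteed to be a key and the `if column in pydict else None` fallback is never used. The function name `columns_to_rows` is hypothetical.

```python
def columns_to_rows(pydict, names):
    """Turn a dict of equal-length column lists into a list of row dicts.

    Hypothetical helper for illustration only; not a pyarrow function.
    """
    # All columns are assumed to have the same length.
    num_rows = len(pydict[names[0]]) if names else 0
    # Every name is a key of pydict, so no membership check is needed.
    return [{column: pydict[column][row] for column in names}
            for row in range(num_rows)]


pydict = {'int': [1, 2], 'str': ['a', 'b']}
pylist = columns_to_rows(pydict, ['int', 'str'])
print(pylist)  # [{'int': 1, 'str': 'a'}, {'int': 2, 'str': 'b'}]
```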





[GitHub] [arrow] AlenkaF commented on pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

AlenkaF commented on pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#issuecomment-1009768805


   Will pull the changes and correct, thanks for the ping!



[GitHub] [arrow] AlenkaF commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

AlenkaF commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r781106452



##########
File path: python/pyarrow/table.pxi
##########
@@ -1016,6 +1064,28 @@ cdef class RecordBatch(_PandasConvertible):
             entries.append((name, column))
         return ordered_dict(entries)
 
+    def to_pylist(self, index=None):
+        """
+        Convert the RecordBatch to a list of dictionaries of rows.
+
+        Parameters
+        ----------
+        index: list
+            A list of column names to index.

Review comment:
       Thanks for all the info! Will remove it - I thought it would be good to have it before =)





[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r781224257



##########
File path: python/pyarrow/table.pxi
##########
@@ -2442,6 +2589,46 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from list of rows / dictionaries.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts of rows
+        A mapping of strings to row values.
+    schema : Schema, default None
+        If not passed, will be inferred from the first row of the
+        mapping values.
+    metadata : dict or Mapping, default None
+        Optional metadata for the schema (if inferred).
+
+    Returns
+    -------
+    Table/RecordBatch
+    """
+
+    arrays = []
+    if schema is None:
+        names = []
+        if mapping:
+            names = list(mapping[0].keys())
+        for n in names:
+            v = [i[n] if n in i else None for i in mapping]

Review comment:
       ```suggestion
               v = [row[n] if n in row else None for row in mapping]
   ```
   
   (small suggestion that makes it maybe a bit more readable)





[GitHub] [arrow] amol- commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

amol- commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r773257529



##########
File path: python/pyarrow/table.pxi
##########
@@ -2436,6 +2581,52 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from Arrow arrays or columns.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts or Mappings
+        A mapping of strings to Arrays or Python lists.
+    schema : Schema, default None
+        If not passed, will be inferred from the Mapping values.
+    metadata : dict or Mapping, default None
+        Optional metadata for the schema (if inferred).
+
+    Returns
+    -------
+    Table/RecordBatch
+    """
+
+    arrays = []
+    if schema is None:
+        names = []
+        for item in mapping:
+            name = list(item.keys())[0]
+            names.append(name)
+            arrays.append(asarray(item[name]))
+        return cls.from_arrays(arrays, names, metadata=metadata)
+    elif isinstance(schema, Schema):
+        for field in schema:
+            value = [v.get(field.name) for v in mapping]
+            v = next((i for i in value if i is not None), None)
+            if v is None:
+                present = [list(v.keys())[0] for v in mapping]
+                missing = [n for n in schema.names if n not in present]
+                raise KeyError(
+                    "The passed mapping doesn't contain the "
+                    "following field(s) of the schema: {}".
+                    format(', '.join(missing))
+                )

Review comment:
       I think this is a check that we should do in `from_arrays` if we want to do it. 





[GitHub] [arrow] AlenkaF commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

AlenkaF commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r773716559



##########
File path: python/pyarrow/table.pxi
##########
@@ -2436,6 +2581,52 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from Arrow arrays or columns.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts or Mappings
+        A mapping of strings to Arrays or Python lists.
+    schema : Schema, default None
+        If not passed, will be inferred from the Mapping values.
+    metadata : dict or Mapping, default None
+        Optional metadata for the schema (if inferred).
+
+    Returns
+    -------
+    Table/RecordBatch
+    """
+
+    arrays = []
+    if schema is None:
+        names = []
+        for item in mapping:
+            name = list(item.keys())[0]
+            names.append(name)
+            arrays.append(asarray(item[name]))
+        return cls.from_arrays(arrays, names, metadata=metadata)
+    elif isinstance(schema, Schema):
+        for field in schema:
+            value = [v.get(field.name) for v in mapping]
+            v = next((i for i in value if i is not None), None)
+            if v is None:
+                present = [list(v.keys())[0] for v in mapping]
+                missing = [n for n in schema.names if n not in present]
+                raise KeyError(
+                    "The passed mapping doesn't contain the "
+                    "following field(s) of the schema: {}".
+                    format(', '.join(missing))
+                )

Review comment:
       In `_sanitize_arrays`, which is called from `from_arrays`, there is a check that the lengths of `schema` and `arrays` match. So an error would be raised in any case, but I think it is good to keep this one, the same as in [`_from_pydict`](https://github.com/apache/arrow/blob/14c04906fdf143d893f8c9d484cb0cd2b6f3f42b/python/pyarrow/table.pxi#L2572-L2576), as it is more informative.
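
To make the "more informative" argument concrete, here is a minimal stand-alone sketch of such a check. The function `check_schema_fields` is hypothetical and works on plain lists of names rather than a `pyarrow.Schema`; only the error message wording is taken from the diff above.

```python
def check_schema_fields(mapping, schema_names):
    """Raise a KeyError naming the schema fields absent from the input.

    Hypothetical helper for illustration only; not a pyarrow function.
    """
    # Collect every key that appears in any dict of the input list.
    present = set()
    for item in mapping:
        present.update(item.keys())

    missing = [name for name in schema_names if name not in present]
    if missing:
        # Naming the missing fields is more helpful than a bare
        # length mismatch between schema and arrays.
        raise KeyError(
            "The passed mapping doesn't contain the "
            "following field(s) of the schema: {}".format(', '.join(missing))
        )


# Passes silently: both schema fields are present.
check_schema_fields([{'int': 1, 'str': 'a'}], ['int', 'str'])
```

Calling it with a schema name that never occurs in the input, for example `check_schema_fields([{'int': 1}], ['int', 'str'])`, raises a `KeyError` whose message lists `str`.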





[GitHub] [arrow] AlenkaF commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

AlenkaF commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r774511196



##########
File path: python/pyarrow/table.pxi
##########
@@ -671,13 +671,55 @@ cdef class RecordBatch(_PandasConvertible):
         Returns
         -------
         RecordBatch
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> pydict = {'int': [1, 2], 'str': ['a', 'b']}
+        >>> pa.RecordBatch.from_pydict(pydict)
+        pyarrow.RecordBatch
+        int: int64
+        str: string
         """
 
         return _from_pydict(cls=RecordBatch,
                             mapping=mapping,
                             schema=schema,
                             metadata=metadata)
 
+    @staticmethod
+    def from_pylist(mapping, schema=None, metadata=None):
+        """
+        Construct a RecordBatch from Arrow arrays or columns.
+
+        Parameters
+        ----------
+        mapping : list of dicts or Mappings
+            A mapping of strings to Arrays or Python lists.
+        schema : Schema, default None
+            If not passed, will be inferred from the Mapping values.
+        metadata : dict or Mapping, default None
+            Optional metadata for the schema (if inferred).
+
+        Returns
+        -------
+        RecordBatch
+
+        Examples
+        --------
+        >>> import pyarrow as pa
+        >>> pylist = [{'int': [1, 2]}, {'str': ['a', 'b']}]
+        >>> pa.RecordBatch.from_pylist(pylist)

Review comment:
       Thanks for the additional explanation! Will correct (in 2022, I think =)





[GitHub] [arrow] amol- commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

amol- commented on a change in pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#discussion_r778175902



##########
File path: python/pyarrow/table.pxi
##########
@@ -2442,6 +2602,46 @@ def _from_pydict(cls, mapping, schema, metadata):
         raise TypeError('Schema must be an instance of pyarrow.Schema')
 
 
+def _from_pylist(cls, mapping, schema, metadata):
+    """
+    Construct a Table/RecordBatch from list of dictionary of rows.
+
+    Parameters
+    ----------
+    cls : Class Table/RecordBatch
+    mapping : list of dicts of rows
+        A mapping of strings to row values.
+    schema : Schema, default None
+        If not passed, will be inferred from the Mapping values.
+    metadata : dict or Mapping, default None
+        Optional metadata for the schema (if inferred).
+
+    Returns
+    -------
+    Table/RecordBatch
+    """
+
+    arrays = []
+    if schema is None:
+        names = []
+        if mapping:
+            names = list(mapping[0].keys())
+        for n in names:
+            v = [i[n] if n in i else None for i in mapping]
+            arrays.append(asarray(v))
+        return cls.from_arrays(arrays, names, metadata=metadata)
+    else:
+        if isinstance(schema, Schema):
+            for n in schema.names:
+                v = [i[n] if n in i else None for i in mapping]
+                n_type = schema.types[schema.get_field_index(n)]
+                arrays.append(asarray(v, type=n_type))

Review comment:
       I think this will actually crash when `v` is `None`. `asarray` seems to crash when invoked as `asarray(None)`.
   
   Guess the same applies to https://github.com/apache/arrow/pull/12010/files#diff-cede36e8e2e0eb6e6e1ee21745db9687174527f463520c6e6d8b9e8f957bf304R2631





[GitHub] [arrow] ursabot edited a comment on pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

Posted by GitBox <gi...@apache.org>.

ursabot edited a comment on pull request #12010:
URL: https://github.com/apache/arrow/pull/12010#issuecomment-1010075009


   Benchmark runs are scheduled for baseline = 7a0141a8cc867e5b406ed97e5decc227923eb3f5 and contender = ccffcea3fd383c448aa9da292baf2d0805ecab4d. ccffcea3fd383c448aa9da292baf2d0805ecab4d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/424d64b6b5e34533bc7f1c63332e9d3c...b05a94a3e4ed4e4ba18a4eb9b2dedc9b/)
   [Failed] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/697f12ce4d4f48a5a2b79bb42c7fbfb4...b512989219cc4aa8a8b220f739994144/)
   [Finished :arrow_down:0.22% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/0bd1e5b6de5c4a08b2109f8f98951dfd...a254108559fa4cc88bca8b69c9cd951f/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   

