Posted to issues@arrow.apache.org by "Antoine Pitrou (JIRA)" <ji...@apache.org> on 2019/06/03 12:13:00 UTC

[jira] [Updated] (ARROW-5471) [C++][Gandiva]Array offset is ignored in Gandiva projector

     [ https://issues.apache.org/jira/browse/ARROW-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou updated ARROW-5471:
----------------------------------
    Component/s: C++ - Gandiva

> [C++][Gandiva]Array offset is ignored in Gandiva projector
> ----------------------------------------------------------
>
>                 Key: ARROW-5471
>                 URL: https://issues.apache.org/jira/browse/ARROW-5471
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++ - Gandiva
>            Reporter: Zeyuan Shang
>            Priority: Major
>
> I ran the test case in [https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_gandiva.py#L25] and hit an issue when slicing the record batch with {{input_batch[1:]}}: the Gandiva projector appears to ignore the array offset.
> {code:python}
> import pyarrow as pa
> import pyarrow.gandiva as gandiva
>
> # Build the expression: res = a if a > b else b
> builder = gandiva.TreeExprBuilder()
> field_a = pa.field('a', pa.int32())
> field_b = pa.field('b', pa.int32())
> schema = pa.schema([field_a, field_b])
> field_result = pa.field('res', pa.int32())
> node_a = builder.make_field(field_a)
> node_b = builder.make_field(field_b)
> condition = builder.make_function("greater_than", [node_a, node_b],
>                                   pa.bool_())
> if_node = builder.make_if(condition, node_a, node_b, pa.int32())
> expr = builder.make_expression(if_node, field_result)
> projector = gandiva.make_projector(
>     schema, [expr], pa.default_memory_pool())
>
> a = pa.array([10, 12, -20, 5], type=pa.int32())
> b = pa.array([5, 15, 15, 17], type=pa.int32())
> e = pa.array([10, 15, 15, 17], type=pa.int32())  # expected result for the full batch
> input_batch = pa.RecordBatch.from_arrays([a, b], names=['a', 'b'])
>
> # Evaluate on a slice that starts at row 1, i.e. with a non-zero offset
> r, = projector.evaluate(input_batch[1:])
> print(r)
> {code}
> With the full record batch {{input_batch}}, the expected output is {{[10, 15, 15, 17]}}, so with {{input_batch[1:]}} the expected output is {{[15, 15, 17]}}. However, the script returns {{[10, 15, 15]}}. It seems that the projector ignores the offset and always reads from index 0.
>  
> A corresponding issue has also been filed on GitHub: [https://github.com/apache/arrow/issues/4420]
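
The reported behaviour can be reproduced with a small pure-Python model, which may help when reasoning about the bug without a Gandiva build. This is only a sketch: the helper project_if_greater and its offset/length parameters are hypothetical and are not part of pyarrow or Gandiva. It assumes a slice is described by an (offset, length) pair over the parent buffers, and that the faulty code path keeps the slice length but reads from offset 0.

```python
def project_if_greater(a, b, offset=0, length=None):
    """Hypothetical reference model of the issue's expression.

    Computes `a[i] if a[i] > b[i] else b[i]` for every row in the
    window [offset, offset + length) of the parent columns.
    """
    if length is None:
        length = len(a) - offset
    return [a[i] if a[i] > b[i] else b[i]
            for i in range(offset, offset + length)]


# Same data as in the report.
a = [10, 12, -20, 5]
b = [5, 15, 15, 17]

# Full batch: offset 0, length 4.
full = project_if_greater(a, b)

# input_batch[1:] evaluated correctly: offset 1, length 3.
sliced = project_if_greater(a, b, offset=1)

# Assumed faulty behaviour: slice length kept (3), offset dropped (0).
ignored = project_if_greater(a, b, offset=0, length=3)

print(full, sliced, ignored)
```

Under these assumptions, `sliced` corresponds to the output the reporter expects and `ignored` to the output the script actually returns, which is consistent with the projector reading from index 0 while still using the sliced length.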



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)