Posted to jira@arrow.apache.org by "Dominik Moritz (Jira)" <ji...@apache.org> on 2021/05/08 16:54:00 UTC

[jira] [Updated] (ARROW-10450) [JS] Table.fromStruct() silently truncates vectors to the first chunk

     [ https://issues.apache.org/jira/browse/ARROW-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominik Moritz updated ARROW-10450:
-----------------------------------
    Summary: [JS] Table.fromStruct() silently truncates vectors to the first chunk  (was: [Javascript] Table.fromStruct() silently truncates vectors to the first chunk)

> [JS] Table.fromStruct() silently truncates vectors to the first chunk
> ---------------------------------------------------------------------
>
>                 Key: ARROW-10450
>                 URL: https://issues.apache.org/jira/browse/ARROW-10450
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: JavaScript
>    Affects Versions: 2.0.0
>            Reporter: David Saslawsky
>            Priority: Minor
>
> Table.fromStruct() only uses the first chunk from the input vector.
> {code:javascript}
> import { Bool, Field, Int32, Struct, Table, Vector } from "apache-arrow";
> const myStruct = new Struct([
>   Field.new({ name: "over", type: new Int32() }),
>   Field.new({ name: "out", type: new Bool() })
> ]);
> const data = [];
> for (let i = 0; i < 1500; i++) {
>   data.push({ over: i, out: i % 2 === 0 });
> }
> // create a vector with two chunks
> const victor = Vector.from({
>   type: myStruct,
>   /*highWaterMark: Infinity,*/
>   values: data
> });
> console.log(victor.length);  // 1500 
> const table = Table.fromStruct(victor);
> console.log(table.length);   // 1000
> {code}
> The workaround is to set highWaterMark to Infinity.
>
> Table.new() works as expected:
> {code:javascript}
> const int32Array = new Int32Array(1500);
> for (let i = 0; i < 1500; i++) int32Array[i] = i;
> const intVector = Vector.from({ type: new Int32(), values: int32Array });
> console.log(intVector.length);  // 1500
> const intTable = Table.new({ intColumn: intVector });
> console.log(intTable.length);   // 1500
> {code}
>  
> The cause seems to be in Chunked.data(), but I don't understand the code well enough to propose a fix.
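
For readers unfamiliar with chunked vectors, the reported symptom can be modelled without the library. This is only a sketch under stated assumptions: chunkValues, firstChunkOnly, and allChunks are hypothetical helpers, not apache-arrow APIs, plain arrays stand in for Arrow chunks, and the chunk size of 1000 is inferred from the lengths reported above rather than taken from the library source.

```javascript
// Hypothetical model of a chunked vector: plain arrays stand in for Arrow chunks.
// The default chunk size of 1000 is an assumption inferred from the reported lengths.
function chunkValues(values, chunkSize = 1000) {
  const chunks = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    chunks.push(values.slice(i, i + chunkSize));
  }
  return chunks;
}

// Mirrors the reported symptom: only the first chunk is read.
function firstChunkOnly(chunks) {
  return chunks.length > 0 ? chunks[0] : [];
}

// What a caller would expect: rows from every chunk, in order.
function allChunks(chunks) {
  return chunks.flat();
}

const data = [];
for (let i = 0; i < 1500; i++) {
  data.push({ over: i, out: i % 2 === 0 });
}

const chunks = chunkValues(data);
console.log(chunks.length);                  // 2
console.log(firstChunkOnly(chunks).length);  // 1000
console.log(allChunks(chunks).length);       // 1500

// With an unbounded chunk size (analogous to highWaterMark: Infinity in the report),
// all rows land in a single chunk, so reading only the first chunk loses nothing.
const singleChunk = chunkValues(data, Infinity);
console.log(firstChunkOnly(singleChunk).length); // 1500
```

The model suggests why the workaround helps: it does not change how the first chunk is read, it only ensures there is a single chunk to read.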



--
This message was sent by Atlassian Jira
(v8.3.4#803005)