Posted to issues@arrow.apache.org by "David Saslawsky (Jira)" <ji...@apache.org> on 2020/10/31 09:25:00 UTC
[jira] [Created] (ARROW-10450) Table.fromStruct() silently truncates vectors to the first chunk
David Saslawsky created ARROW-10450:
---------------------------------------
Summary: Table.fromStruct() silently truncates vectors to the first chunk
Key: ARROW-10450
URL: https://issues.apache.org/jira/browse/ARROW-10450
Project: Apache Arrow
Issue Type: Bug
Components: JavaScript
Affects Versions: 2.0.0
Reporter: David Saslawsky
Table.fromStruct() uses only the first chunk of the input vector, so a vector with more than one chunk produces a table that is shorter than the vector.
{code:javascript}
import { Bool, Field, Int32, Struct, Table, Vector } from "apache-arrow";

const myStruct = new Struct([
  Field.new({ name: "over", type: new Int32() }),
  Field.new({ name: "out", type: new Bool() })
]);

const data = [];
for (let i = 0; i < 1500; i++) {
  data.push({ over: i, out: i % 2 === 0 });
}

// create a vector with two chunks
const victor = Vector.from({
  type: myStruct,
  /*highWaterMark: Infinity,*/
  values: data
});
console.log(victor.length); // 1500

const table = Table.fromStruct(victor);
console.log(table.length); // 1000
{code}
The workaround is to set highWaterMark to Infinity when building the vector, so that it ends up as a single chunk.
Table.new() works as expected:
{code:javascript}
const int32Array = new Int32Array(1500);
for (let i = 0; i < 1500; i++) int32Array[i] = i;
const intVector = Vector.from({ type: new Int32(), values: int32Array });
console.log(intVector.length); // 1500
const intTable = Table.new({ intColumn:intVector });
console.log(intTable.length); // 1500
{code}
The origin seems to be in Chunked.data(), but I don't understand the code well enough to propose a fix.
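To make the symptom concrete, here is a toy model in plain JavaScript. It is not Arrow's code and uses no Arrow API: buildChunks, totalLength and firstChunkLength are hypothetical helpers, and the chunk size of 1000 is an assumption inferred from the table.length reported above. It only illustrates how reading the first chunk instead of all chunks would give 1000 rows out of 1500, and why a highWaterMark of Infinity would hide the problem.

```javascript
// Toy model only: NOT Arrow's implementation, and no Arrow API is used.
// A "chunked vector" is modelled as an array of plain arrays (chunks).

// Hypothetical helper: split values into chunks of at most highWaterMark rows.
// With highWaterMark = Infinity the loop runs once and yields a single chunk.
function buildChunks(values, highWaterMark) {
  const chunks = [];
  for (let i = 0; i < values.length; i += highWaterMark) {
    chunks.push(values.slice(i, i + highWaterMark));
  }
  return chunks;
}

// Length a consumer should see: the sum over every chunk.
const totalLength = (chunks) => chunks.reduce((n, c) => n + c.length, 0);

// Length a consumer sees if it only looks at the first chunk.
const firstChunkLength = (chunks) => (chunks.length > 0 ? chunks[0].length : 0);

const data = Array.from({ length: 1500 }, (_, i) => ({ over: i, out: i % 2 === 0 }));

// Assumed chunk size of 1000, matching the table.length observed in the report.
const chunked = buildChunks(data, 1000);
// Workaround case: everything lands in one chunk.
const single = buildChunks(data, Infinity);

console.log(chunked.length);            // 2
console.log(totalLength(chunked));      // 1500
console.log(firstChunkLength(chunked)); // 1000
console.log(firstChunkLength(single));  // 1500
```

If the real cause is along these lines, a fix would need to take every chunk of the struct vector into account rather than only the first one.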
--
This message was sent by Atlassian Jira
(v8.3.4#803005)