Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2019/07/25 17:23:00 UTC
[jira] [Commented] (ARROW-6006) [C++] Empty IPC streams containing a dictionary are corrupt
[ https://issues.apache.org/jira/browse/ARROW-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892974#comment-16892974 ]
Wes McKinney commented on ARROW-6006:
-------------------------------------
That sounds like a bug in the Java implementation.
> [C++] Empty IPC streams containing a dictionary are corrupt
> -----------------------------------------------------------
>
> Key: ARROW-6006
> URL: https://issues.apache.org/jira/browse/ARROW-6006
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Steven Fackler
> Priority: Major
>
>
> {code:java}
> #include <arrow/api.h>
> #include <arrow/ipc/api.h>
> #include <arrow/io/api.h>
> void check(arrow::Status status) {
>   if (!status.ok()) {
>     status.Abort();
>   }
> }
> int main() {
>   auto type = arrow::dictionary(arrow::int8(), arrow::utf8());
>   auto f0 = arrow::field("f0", type);
>   auto schema = arrow::schema({f0});
>   std::shared_ptr<arrow::io::BufferOutputStream> os;
>   check(arrow::io::BufferOutputStream::Create(0, arrow::default_memory_pool(), &os));
>   std::shared_ptr<arrow::ipc::RecordBatchWriter> writer;
>   check(arrow::ipc::RecordBatchStreamWriter::Open(&*os, schema, &writer));
>   check(writer->Close());
>   std::shared_ptr<arrow::Buffer> buffer;
>   check(os->Finish(&buffer));
>   arrow::io::BufferReader is(buffer);
>   std::shared_ptr<arrow::ipc::RecordBatchReader> reader;
>   check(arrow::ipc::RecordBatchStreamReader::Open(&is, &reader));
>   std::shared_ptr<arrow::RecordBatch> batch;
>   check(reader->ReadNext(&batch));
> }
> {code}
>
> {noformat}
> -- Arrow Fatal Error --
> Invalid: Expected message in stream, was null or length 0
> {noformat}
> It seems like this was caused by [https://github.com/apache/arrow/commit/e68ca7f9aed876a1afcad81a417afb87c94ee951], which moved the dictionary values from the DataType to the array itself.
> I initially thought I could work around this by writing a zero-length table, but that does not seem to work either.
>