Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2019/07/25 17:23:00 UTC

[jira] [Commented] (ARROW-6006) [C++] Empty IPC streams containing a dictionary are corrupt

    [ https://issues.apache.org/jira/browse/ARROW-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892974#comment-16892974 ] 

Wes McKinney commented on ARROW-6006:
-------------------------------------

That sounds like a bug in the Java implementation. 

> [C++] Empty IPC streams containing a dictionary are corrupt
> -----------------------------------------------------------
>
>                 Key: ARROW-6006
>                 URL: https://issues.apache.org/jira/browse/ARROW-6006
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Steven Fackler
>            Priority: Major
>
>  
> {code:cpp}
> #include <arrow/api.h>
> #include <arrow/ipc/api.h>
> #include <arrow/io/api.h>
> // Abort with the error message if an Arrow call did not succeed.
> void check(arrow::Status status) {
>     if (!status.ok()) {
>         status.Abort();
>     }
> }
> int main() {
>     // Schema with a single dictionary-encoded string field.
>     auto type = arrow::dictionary(arrow::int8(), arrow::utf8());
>     auto f0 = arrow::field("f0", type);
>     auto schema = arrow::schema({f0});
>     // Write an IPC stream that contains the schema but no record batches.
>     std::shared_ptr<arrow::io::BufferOutputStream> os;
>     check(arrow::io::BufferOutputStream::Create(0, arrow::default_memory_pool(), &os));
>     std::shared_ptr<arrow::ipc::RecordBatchWriter> writer;
>     check(arrow::ipc::RecordBatchStreamWriter::Open(&*os, schema, &writer));
>     check(writer->Close());
>     std::shared_ptr<arrow::Buffer> buffer;
>     check(os->Finish(&buffer));
>     // Read the stream back from the in-memory buffer.
>     arrow::io::BufferReader is(buffer);
>     std::shared_ptr<arrow::ipc::RecordBatchReader> reader;
>     check(arrow::ipc::RecordBatchStreamReader::Open(&is, &reader));
>     std::shared_ptr<arrow::RecordBatch> batch;
>     check(reader->ReadNext(&batch));
> }
> {code}
>  
> {noformat}
> -- Arrow Fatal Error --
> Invalid: Expected message in stream, was null or length 0
> {noformat}
> It seems like this was caused by [https://github.com/apache/arrow/commit/e68ca7f9aed876a1afcad81a417afb87c94ee951], which moved the dictionary values from the DataType to the array itself.
> I initially thought I could work around this by writing a zero-length table, but that doesn't seem to work either.
>  
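
The suspected cause described above can be illustrated with a toy model. The sketch below is not Arrow code: `MessageKind`, `Stream`, `write_stream`, `read_strict`, and `read_tolerant` are hypothetical names invented for illustration. It assumes that the writer can only emit dictionary messages once it has a record batch to take the dictionary values from, and that the reader expects one dictionary message per dictionary-encoded field right after the schema. Under those assumptions, a stream with a schema and no batches ends before the reader finds the message it is waiting for.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical message kinds, loosely modeled on an IPC stream layout.
enum class MessageKind { Schema, DictionaryBatch, RecordBatch };

// Toy stream: an ordered queue of messages. An empty queue means end of stream.
using Stream = std::deque<MessageKind>;

enum class ReadResult { Ok, MissingMessage };

// Toy writer. The schema is written immediately. Dictionary messages are
// written just before the first record batch, because (by assumption) the
// dictionary values are only known once an array is available.
Stream write_stream(std::size_t num_dictionary_fields, std::size_t num_batches) {
    Stream stream;
    stream.push_back(MessageKind::Schema);
    for (std::size_t b = 0; b < num_batches; ++b) {
        if (b == 0) {
            for (std::size_t d = 0; d < num_dictionary_fields; ++d) {
                stream.push_back(MessageKind::DictionaryBatch);
            }
        }
        stream.push_back(MessageKind::RecordBatch);
    }
    return stream;
}

// Consume the schema message; report an error if it is absent.
static bool pop_schema(Stream& stream) {
    if (stream.empty() || stream.front() != MessageKind::Schema) return false;
    stream.pop_front();
    return true;
}

// Consume one dictionary message per dictionary-encoded field.
static ReadResult pop_dictionaries(Stream& stream, std::size_t num_dictionary_fields) {
    for (std::size_t d = 0; d < num_dictionary_fields; ++d) {
        if (stream.empty() || stream.front() != MessageKind::DictionaryBatch) {
            return ReadResult::MissingMessage;  // "expected message in stream"
        }
        stream.pop_front();
    }
    return ReadResult::Ok;
}

// Strict reader: always insists on the dictionary messages after the schema.
ReadResult read_strict(Stream stream, std::size_t num_dictionary_fields) {
    if (!pop_schema(stream)) return ReadResult::MissingMessage;
    return pop_dictionaries(stream, num_dictionary_fields);
}

// Tolerant reader: treats end of stream right after the schema as a valid
// stream with zero record batches. This is one possible reader-side policy,
// not a description of any actual fix.
ReadResult read_tolerant(Stream stream, std::size_t num_dictionary_fields) {
    if (!pop_schema(stream)) return ReadResult::MissingMessage;
    if (stream.empty()) return ReadResult::Ok;
    return pop_dictionaries(stream, num_dictionary_fields);
}
```

In this model, `read_strict(write_stream(1, 0), 1)` returns `ReadResult::MissingMessage`, which mirrors the reported error, while the same call with at least one batch, or with `read_tolerant`, returns `ReadResult::Ok`. Whether the real reader or the real writer should change is a separate question that this sketch does not answer.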


