Posted to jira@arrow.apache.org by "helmi (Jira)" <ji...@apache.org> on 2022/09/20 16:34:00 UTC
[jira] [Commented] (ARROW-17644) Exception when reading binary arrow file (Value cannot be null. (Parameter 'name'))
[ https://issues.apache.org/jira/browse/ARROW-17644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607296#comment-17607296 ]
helmi commented on ARROW-17644:
-------------------------------
While debugging the issue with a local build, I found that the C# reader reports `FixedSizeList` as unsupported. Any thoughts?
{code:java}
Unhandled exception: System.IO.InvalidDataException: Arrow primitive 'FixedSizeList' is unsupported.
at Apache.Arrow.Ipc.MessageSerializer.GetFieldArrowType(Field field, Field[] childFields) in /home/helmi/workspace/arrow/csharp/src/Apache.Arrow/Ipc/MessageSerializer.cs:line 199
at Apache.Arrow.Ipc.MessageSerializer.FieldFromFlatbuffer(Field flatbufField, DictionaryMemo& dictionaryMemo) in /home/helmi/workspace/arrow/csharp/src/Apache.Arrow/Ipc/MessageSerializer.cs:line 86
at Apache.Arrow.Ipc.MessageSerializer.GetSchema(Schema schema, DictionaryMemo& dictionaryMemo) in /home/helmi/workspace/arrow/csharp/src/Apache.Arrow/Ipc/MessageSerializer.cs:line 62
at Apache.Arrow.Ipc.ArrowStreamReaderImplementation.<ReadSchema>b__11_0(Memory`1 buff) in /home/helmi/workspace/arrow/csharp/src/Apache.Arrow/Ipc/ArrowStreamReaderImplementation.cs:line 173
at Apache.Arrow.Ipc.ArrowStreamReaderImplementation.ReadSchema() in /home/helmi/workspace/arrow/csharp/src/Apache.Arrow/Ipc/ArrowStreamReaderImplementation.cs:line 167
at Apache.Arrow.Ipc.ArrowStreamReaderImplementation.ReadRecordBatch() in /home/helmi/workspace/arrow/csharp/src/Apache.Arrow/Ipc/ArrowStreamReaderImplementation.cs:line 99
at Apache.Arrow.Ipc.ArrowStreamReaderImplementation.ReadNextRecordBatch() in /home/helmi/workspace/arrow/csharp/src/Apache.Arrow/Ipc/ArrowStreamReaderImplementation.cs:line 53
at Apache.Arrow.Ipc.ArrowStreamReader.ReadNextRecordBatch() in /home/helmi/workspace/arrow/csharp/src/Apache.Arrow/Ipc/ArrowStreamReader.cs:line 87
{code}
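As a sanity check that the attachment really is in the IPC streaming format (which is what `ArrowStreamReader` and Go's `ipc.NewReader` both expect) rather than the random-access file format, the leading bytes can be inspected. This is only a minimal sketch using the Go standard library. It assumes the layout described in the Arrow format documentation: the file format starts with the magic string `ARROW1`, and a current streaming-format message starts with the `0xFFFFFFFF` continuation marker. `detectArrowFormat` is a hypothetical helper, not part of any Arrow library.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"os"
)

// detectArrowFormat is a hypothetical helper (not part of any Arrow library).
// It guesses the IPC container format from the leading bytes of data.
func detectArrowFormat(data []byte) string {
	// The IPC file format begins with the 6-byte magic string "ARROW1".
	if bytes.HasPrefix(data, []byte("ARROW1")) {
		return "file"
	}
	// A streaming-format message begins with the continuation marker
	// 0xFFFFFFFF, followed by a 4-byte little-endian metadata length.
	if len(data) >= 8 && binary.LittleEndian.Uint32(data[:4]) == 0xFFFFFFFF {
		return "stream"
	}
	return "unknown"
}

func main() {
	// Path taken from the samples in the quoted report.
	data, err := os.ReadFile("./inputs/sample.arrow")
	if err != nil {
		fmt.Println("could not read file:", err)
		return
	}
	fmt.Println("detected format:", detectArrowFormat(data))
}
```

Streams written by very old Arrow versions (before the continuation marker was introduced) would show up as "unknown" here, so that result does not by itself mean the file is corrupt.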
> Exception when reading binary arrow file (Value cannot be null. (Parameter 'name'))
> -----------------------------------------------------------------------------------
>
> Key: ARROW-17644
> URL: https://issues.apache.org/jira/browse/ARROW-17644
> Project: Apache Arrow
> Issue Type: Bug
> Components: C#
> Reporter: helmi
> Priority: Major
> Attachments: sample.arrow
>
>
> Hi everyone,
> I'm trying to read a binary file using the C# Apache Arrow library v9.0.0 and I'm hitting this exception:
> {code:java}
> Unhandled exception. System.AggregateException: One or more errors occurred. (Value cannot be null. (Parameter 'name'))
> ---> System.ArgumentNullException: Value cannot be null. (Parameter 'name')
> at Apache.Arrow.Field..ctor(String name, IArrowType dataType, Boolean nullable)
> at Apache.Arrow.Ipc.MessageSerializer.FieldFromFlatbuffer(Field flatbufField, DictionaryMemo& dictionaryMemo)
> at Apache.Arrow.Ipc.MessageSerializer.FieldFromFlatbuffer(Field flatbufField, DictionaryMemo& dictionaryMemo)
> at Apache.Arrow.Ipc.MessageSerializer.GetSchema(Schema schema, DictionaryMemo& dictionaryMemo)
> at Apache.Arrow.Ipc.ArrowStreamReaderImplementation.<ReadSchemaAsync>b__10_0(Memory`1 buff)
> at Apache.Arrow.ArrayPoolExtensions.RentReturnAsync(ArrayPool`1 pool, Int32 length, Func`2 action)
> at Apache.Arrow.Ipc.ArrowStreamReaderImplementation.ReadSchemaAsync()
> at Apache.Arrow.Ipc.ArrowStreamReaderImplementation.ReadRecordBatchAsync(CancellationToken cancellationToken)
> at Apache.Arrow.Ipc.ArrowStreamReaderImplementation.ReadNextRecordBatchAsync(CancellationToken cancellationToken) {code}
> As far as I understand, the library is complaining that a field name is null. I'm not sure that is really the case, since reading the same file with the Apache Arrow Go library works without issue.
> Please find the `sample.arrow` file attached.
> Below is the sample code I'm using to read this Arrow file:
> * C# sample (fails)
> {code:java}
> using System;
> using System.IO;
> using System.Threading.Tasks;
> using Apache.Arrow;
> using Apache.Arrow.Ipc;
>
> namespace arrow_csharp_issue
> {
>     class Program
>     {
>         static async Task AsyncMain()
>         {
>             byte[] bytes = File.ReadAllBytes("./inputs/sample.arrow");
>             using (var memoryStream = new MemoryStream(bytes))
>             using (var reader = new ArrowStreamReader(memoryStream))
>             {
>                 RecordBatch record = await reader.ReadNextRecordBatchAsync();
>                 Console.WriteLine(record);
>             }
>         }
>
>         static void Main(string[] args)
>         {
>             AsyncMain().Wait();
>         }
>     }
> }
> {code}
> * Golang sample (works)
> {code:java}
> package main
>
> import (
>     "bytes"
>     "fmt"
>     "os"
>
>     "github.com/apache/arrow/go/v9/arrow/ipc"
> )
>
> func main() {
>     data, err := os.ReadFile("./inputs/sample.arrow")
>     if err != nil {
>         panic(err)
>     }
>     reader, err := ipc.NewReader(bytes.NewReader(data))
>     if err != nil {
>         panic(err)
>     }
>     defer reader.Release()
>     reader.Next()
>     record := reader.Record()
>     fmt.Println(record)
> }
> {code}
>
> Thank you
--
This message was sent by Atlassian Jira
(v8.20.10#820010)