Posted to issues@arrow.apache.org by "bryaan (via GitHub)" <gi...@apache.org> on 2023/03/31 00:19:20 UTC

[GitHub] [arrow] bryaan opened a new issue, #34809: C# to Python or Julia doesn't work. Indicates footer issue.

bryaan opened a new issue, #34809:
URL: https://github.com/apache/arrow/issues/34809

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   I am writing a struct array in C# using the following code:
   
               var structField = new StructType(
                   new []
                   {
                       new Field("field1", new StringType(), nullable: false),
                       new Field("field2", new Int64Type(), nullable: false)
                   });
               
               StringArray stringArray = new StringArray.Builder()
                   .AppendRange(new[] {"angel", "bobby", "charlie"})
                   .Build();
               Int64Array intArray = new Int64Array.Builder()
                   .AppendRange(new[] { 1L,2,3 })
                   .Build();
               
               StructArray structs = new StructArray(structField, 3, new IArrowArray[] { stringArray, intArray }, ArrowBuffer.Empty, nullCount: 0);
   
               var recordBatch = new Apache.Arrow.RecordBatch.Builder()
                   .Append("col1", false, structs)
                   .Build();
   
               using (var stream = File.OpenWrite("test.arrow"))
               using (var writer = new Apache.Arrow.Ipc.ArrowFileWriter(stream, recordBatch.Schema, true))
               {
                   await writer.WriteRecordBatchAsync(recordBatch);
                   await writer.WriteEndAsync();
               }
   
   But when I try to read it in Julia, I get an error:
   
       using Arrow
       table = Arrow.Table("test.arrow")
       @show table
       @show table[1]
   
       ERROR: LoadError: MethodError: no method matching iterate(::Nothing)
       Closest candidates are:
         iterate(::Union{LinRange, StepRangeLen}) at range.jl:872
         iterate(::Union{LinRange, StepRangeLen}, ::Integer) at range.jl:872
         iterate(::T) where T<:Union{Base.KeySet{<:Any, <:Dict}, Base.ValueIterator{<:Dict}} at dict.jl:712
         ...
       Stacktrace:
        [1] getdictionaries!(dictencoded::Dict{Int64, Arrow.Flatbuf.Field}, field::Arrow.Flatbuf.Field)
          @ Arrow ~/.julia/packages/Arrow/P0wVk/src/table.jl:409
        [2] getdictionaries!(dictencoded::Dict{Int64, Arrow.Flatbuf.Field}, field::Arrow.Flatbuf.Field)
          @ Arrow ~/.julia/packages/Arrow/P0wVk/src/table.jl:410
        [3] macro expansion
          @ ~/.julia/packages/Arrow/P0wVk/src/table.jl:339 [inlined]
        [4] macro expansion
          @ ./task.jl:454 [inlined]
        [5] Arrow.Table(blobs::Vector{Arrow.ArrowBlob}; convert::Bool)
          @ Arrow ~/.julia/packages/Arrow/P0wVk/src/table.jl:321
        [6] Table
          @ ~/.julia/packages/Arrow/P0wVk/src/table.jl:295 [inlined]
        [7] #Table#98
          @ ~/.julia/packages/Arrow/P0wVk/src/table.jl:290 [inlined]
        [8] Table (repeats 2 times)
          @ ~/.julia/packages/Arrow/P0wVk/src/table.jl:290 [inlined]
        [9] top-level scope
   
   Similar error for:
   
       stream = Arrow.Stream("test.arrow")
       for d in stream
           @show d
       end
   
   I've also tested it with Python:
   
       import pyarrow.feather as feather

       table = feather.read_table("test.arrow")
       print(table[0])
   
       OSError: Verification of flatbuffer-encoded Footer failed.
   
   So the problem seems to be with the footer, which the C# writer may not be writing correctly.
   
   
   ### Component(s)
   
   C#


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on issue #34809: C# to Python or Julia doesn't work. Indicates footer issue.

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on issue #34809:
URL: https://github.com/apache/arrow/issues/34809#issuecomment-1492245840

   I'm unable to reproduce this. I'm attaching the `test.arrow` file that gets created when I run your code (zipped so that GitHub will allow it). This file reads fine in Python:
   
   ```
   >>> import pyarrow.ipc as ipc
   >>> with ipc.RecordBatchFileReader("test.arrow") as f:
   ...   f.read_all()
   ... 
   pyarrow.Table
   col1: struct<field1: string not null, field2: int64 not null> not null
     child 0, field1: string not null
     child 1, field2: int64 not null
   ----
   col1: [
     -- is_valid: all not null
     -- child 0 type: string
   ["angel","bobby","charlie"]
     -- child 1 type: int64
   [1,2,3]]
   >>> import pyarrow.feather as feather
   >>> feather.read_feather("test.arrow")
                                    col1
   0    {'field1': 'angel', 'field2': 1}
   1    {'field1': 'bobby', 'field2': 2}
   2  {'field1': 'charlie', 'field2': 3}
   ```
   
   [test.arrow.zip](https://github.com/apache/arrow/files/11124084/test.arrow.zip)


[GitHub] [arrow] westonpace commented on issue #34809: C# to Python or Julia doesn't work. Indicates footer issue.

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on issue #34809:
URL: https://github.com/apache/arrow/issues/34809#issuecomment-1498172174

   This is quite strange.  It's almost like the footer is written and then the last part of the file is rewritten.  In other words, if the file was:
   
   ABCDEFGHIJKLMNOP
   
   you have:
   
   ABCDEFGHIJKLMNOP**MNOP**
   
   ![image](https://user-images.githubusercontent.com/1696093/230212938-7d0fbef8-059f-4424-91ae-cb4fb412a8f5.png)
   
   Do you know if there is anything that could be modifying the file after you have written it?
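
One way a file can end up in this shape, offered as a guess rather than anything confirmed in this thread: the writer opens an existing, longer file without truncating it. .NET's `File.OpenWrite` is documented to behave that way (it overwrites from offset 0 and leaves any remaining old bytes in place), whereas `File.Create` or `FileMode.Create` starts from an empty file. The Python sketch below reproduces the `ABCDEFGHIJKLMNOP` / `MNOP` example with plain bytes. The function names and the contents of the "older" file are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def write_keep_tail(path, data):
    """Overwrite from offset 0 without truncating (no O_TRUNC flag)."""
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    with os.fdopen(fd, "wb") as f:
        f.write(data)


def write_truncating(path, data):
    """Mode "wb" empties the file before writing."""
    with open(path, "wb") as f:
        f.write(data)


def demo():
    older = b"0123ABCDEFGHIJKLMNOP"  # hypothetical earlier, longer output (20 bytes)
    newer = b"ABCDEFGHIJKLMNOP"      # current output (16 bytes)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "test.bin")

        # Case 1: the new output is written over the old file, old tail survives.
        write_truncating(path, older)
        write_keep_tail(path, newer)
        with open(path, "rb") as f:
            stale = f.read()

        # Case 2: the file is truncated first, only the new output remains.
        write_truncating(path, older)
        write_truncating(path, newer)
        with open(path, "rb") as f:
            clean = f.read()
    return stale, clean


if __name__ == "__main__":
    stale, clean = demo()
    print(stale)
    print(clean)
```

If this is what happened, deleting `test.arrow` before rerunning the C# program, or opening the stream with a truncating mode, should produce a readable file.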


[GitHub] [arrow] bryaan commented on issue #34809: C# to Python or Julia doesn't work. Indicates footer issue.

Posted by "bryaan (via GitHub)" <gi...@apache.org>.
bryaan commented on issue #34809:
URL: https://github.com/apache/arrow/issues/34809#issuecomment-1492738160

   [test.arrow.zip](https://github.com/apache/arrow/files/11127941/test.arrow.zip)
   


[GitHub] [arrow] bryaan commented on issue #34809: C# to Python or Julia doesn't work. Indicates footer issue.

Posted by "bryaan (via GitHub)" <gi...@apache.org>.
bryaan commented on issue #34809:
URL: https://github.com/apache/arrow/issues/34809#issuecomment-1492518497

   I can't get it to work, which is strange. I am running Python 3.9 with arrow ^11.0.0 on an M1 MacBook.


[GitHub] [arrow] CurtHagenlocher commented on issue #34809: C# to Python or Julia doesn't work. Indicates footer issue.

Posted by "CurtHagenlocher (via GitHub)" <gi...@apache.org>.
CurtHagenlocher commented on issue #34809:
URL: https://github.com/apache/arrow/issues/34809#issuecomment-1501887019

   It might be interesting to write the batch to a MemoryStream instead and then inspect its contents.
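
As a rough aid for that kind of inspection, here is a small Python sketch that summarizes the framing bytes of an Arrow IPC file already held in memory. It relies on the layout in the Arrow IPC file format description: the file starts with the magic string `ARROW1` (padded to 8 bytes) and ends with the footer, a little-endian int32 footer length, and `ARROW1` again. The function name `describe_arrow_file` and the returned keys are hypothetical. It does not parse the footer flatbuffer itself.

```python
import struct

MAGIC = b"ARROW1"


def describe_arrow_file(data):
    """Summarize the framing bytes of an Arrow IPC file held in memory."""
    # Collect every offset at which the magic string occurs.
    offsets = []
    pos = data.find(MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(MAGIC, pos + 1)

    ends_with_magic = data.endswith(MAGIC)
    footer_size = None
    if ends_with_magic and len(data) >= 10:
        # The 4 bytes before the trailing magic hold the footer length.
        (footer_size,) = struct.unpack("<i", data[-10:-6])

    return {
        "length": len(data),
        "starts_with_magic": data.startswith(MAGIC),
        "ends_with_magic": ends_with_magic,
        "footer_size": footer_size,
        "magic_offsets": offsets,
    }


if __name__ == "__main__":
    with open("test.arrow", "rb") as f:
        print(describe_arrow_file(f.read()))
```

More than two magic offsets, or a footer size that does not fit inside the file, would hint at stray bytes. Column data can legitimately contain the text `ARROW1`, so treat the offsets as a hint only.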


[GitHub] [arrow] westonpace commented on issue #34809: C# to Python or Julia doesn't work. Indicates footer issue.

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on issue #34809:
URL: https://github.com/apache/arrow/issues/34809#issuecomment-1492549538

   Can you upload the file that gets created?


[GitHub] [arrow] bryaan commented on issue #34809: C# to Python or Julia doesn't work. Indicates footer issue.

Posted by "bryaan (via GitHub)" <gi...@apache.org>.
bryaan commented on issue #34809:
URL: https://github.com/apache/arrow/issues/34809#issuecomment-1501188772

   No, that was the entire script.
   


[GitHub] [arrow] bryaan commented on issue #34809: C# to Python or Julia doesn't work. Indicates footer issue.

Posted by "bryaan (via GitHub)" <gi...@apache.org>.
bryaan commented on issue #34809:
URL: https://github.com/apache/arrow/issues/34809#issuecomment-1503932972

   FYI, https://gist.github.com/amoeba/1883b2823fe597a5921a20d9af7baa47

