Posted to user@arrow.apache.org by Saloni Udani <sa...@gmail.com> on 2020/11/04 08:43:09 UTC

Arrow java implementation: Compatible IO streams.

Hello,
I have a use case where I want to write an Arrow batch to my existing
output stream (a custom stream extending java.io.OutputStream) and read it
back from my existing input stream (a custom stream extending
java.io.InputStream). I used ArrowStreamWriter and ArrowStreamReader, but
on the reader side I get an NPE while getting the VectorSchemaRoot.

Caused by: java.lang.NullPointerException: null
at org.apache.arrow.vector.types.pojo.Field.mutateOriginalNameIfNeeded(Field.java:188)
at org.apache.arrow.vector.types.pojo.Field.convertField(Field.java:172)
at org.apache.arrow.vector.types.pojo.Schema.convertSchema(Schema.java:87)
at org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeSchema(MessageSerializer.java:201)
at org.apache.arrow.vector.ipc.ArrowStreamReader.readSchema(ArrowStreamReader.java:174)
at org.apache.arrow.vector.ipc.ArrowReader.initialize(ArrowReader.java:170)
at org.apache.arrow.vector.ipc.ArrowReader.ensureInitialized(ArrowReader.java:161)
at org.apache.arrow.vector.ipc.ArrowReader.getVectorSchemaRoot(ArrowReader.java:63)

Could you point out what I am doing wrong? I believe this is an issue with
the IO streams. Are there any prerequisites for the IO streams that can be
used with the Arrow reader and writer?

Also, how can I validate that the output generated by the Arrow writer is in
a valid format?


Regards
Saloni Udani

Re: Arrow java implementation: Compatible IO streams.

Posted by Saloni Udani <sa...@gmail.com>.
Thanks, Micah.
The issue here was that I did not provide a field name for the nested child
field, which caused this error. Based on the documentation I thought the name
was optional, but it does not work like that.
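
For readers who hit the same stack trace, here is a minimal sketch of the fix
described above, assuming the nested field is the child of a list column. The
field names "values" and "item" and the class name are hypothetical, and
ByteArrayOutputStream/ByteArrayInputStream stand in for the custom streams. It
assumes arrow-vector and an Arrow memory implementation (for example
arrow-memory-netty) are on the classpath. Only the schema is written and read
back, since the reported NPE occurs while the reader deserializes the schema.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Collections;

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.ipc.ArrowStreamReader;
import org.apache.arrow.vector.ipc.ArrowStreamWriter;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.FieldType;
import org.apache.arrow.vector.types.pojo.Schema;

final class NamedChildFieldExample {

  /** Writes a schema with a list column and returns the child name seen by the reader. */
  static String roundTripChildName() throws IOException {
    // The nested child field gets an explicit, non-null name ("item" is hypothetical).
    Field child = new Field("item", FieldType.nullable(new ArrowType.Int(32, true)), null);
    Field list = new Field("values", FieldType.nullable(ArrowType.List.INSTANCE),
        Collections.singletonList(child));
    Schema schema = new Schema(Collections.singletonList(list));

    try (BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE)) {
      // Stand-in for the custom OutputStream.
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      try (VectorSchemaRoot root = VectorSchemaRoot.create(schema, allocator);
           ArrowStreamWriter writer = new ArrowStreamWriter(root, null, out)) {
        writer.start(); // writes the schema message
        writer.end();   // writes the end-of-stream marker (no record batches here)
      }

      // Stand-in for the custom InputStream.
      ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
      try (ArrowStreamReader reader = new ArrowStreamReader(in, allocator)) {
        // getVectorSchemaRoot() is the call that failed in the stack trace above.
        Schema readBack = reader.getVectorSchemaRoot().getSchema();
        return readBack.getFields().get(0).getChildren().get(0).getName();
      }
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("child field name read back: " + roundTripChildName());
  }
}
```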

On Fri, Nov 13, 2020 at 12:12 PM Micah Kornfield <em...@gmail.com>
wrote:

> Hi Saloni,
> Sample code showing how you are using the Arrow APIs would be useful.
>
>> Also, how can I validate that the output generated by the Arrow writer is
>> in a valid format?
>
> Being able to read it back is probably the best mechanism here. Are you
> thinking of unit tests or of production use? Or something else?
>
> Thanks,
> Micah

Re: Arrow java implementation: Compatible IO streams.

Posted by Micah Kornfield <em...@gmail.com>.
Hi Saloni,
Sample code showing how you are using the Arrow APIs would be useful.

> Also how can I validate if the output generated by Arrow writer is in valid
> format?

Being able to read it back is probably the best mechanism here. Are you thinking
of unit tests or of production use? Or something else?

Thanks,
Micah
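
The read-back check suggested above could be sketched roughly as follows, for
example inside a unit test. This is an illustration under assumptions, not code
from the thread: the column name "id", the class name, and the method name are
hypothetical, and ByteArrayOutputStream/ByteArrayInputStream stand in for the
custom streams. It assumes arrow-vector and an Arrow memory implementation are
on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Collections;

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.ipc.ArrowStreamReader;
import org.apache.arrow.vector.ipc.ArrowStreamWriter;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.FieldType;
import org.apache.arrow.vector.types.pojo.Schema;

final class RoundTripCheck {

  /** Writes one batch of ints in the Arrow stream format and reads it back. */
  static int[] roundTrip(int[] values) throws IOException {
    Field id = new Field("id", FieldType.nullable(new ArrowType.Int(32, true)), null);
    Schema schema = new Schema(Collections.singletonList(id));

    try (BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE)) {
      // Stand-in for the custom OutputStream.
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      try (VectorSchemaRoot root = VectorSchemaRoot.create(schema, allocator);
           ArrowStreamWriter writer = new ArrowStreamWriter(root, null, out)) {
        IntVector vector = (IntVector) root.getVector("id");
        vector.allocateNew(values.length);
        for (int i = 0; i < values.length; i++) {
          vector.setSafe(i, values[i]);
        }
        root.setRowCount(values.length);

        writer.start();      // schema
        writer.writeBatch(); // one record batch with the current contents of root
        writer.end();        // end-of-stream marker
      }

      // Stand-in for the custom InputStream.
      ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
      try (ArrowStreamReader reader = new ArrowStreamReader(in, allocator)) {
        VectorSchemaRoot readRoot = reader.getVectorSchemaRoot();
        if (!reader.loadNextBatch()) {
          throw new IllegalStateException("expected one record batch in the stream");
        }
        IntVector readVector = (IntVector) readRoot.getVector("id");
        int[] result = new int[readRoot.getRowCount()];
        for (int i = 0; i < result.length; i++) {
          result[i] = readVector.get(i);
        }
        return result;
      }
    }
  }
}
```

Comparing the returned array with the input (for instance with
java.util.Arrays.equals) gives a simple pass/fail signal that the bytes written
to the stream can be decoded again.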
