Posted to user@arrow.apache.org by Joris Peeters <jo...@gmail.com> on 2021/05/04 10:38:59 UTC

[Java JDBC adapter] non-nullable fields?

I'm looking to use the Java JDBC adapter for loading tables from SQL Server
into Arrow record batches.

At first glance the Arrow JDBC adapter seems to work well but, unless I'm
mistaken, it simply makes every vector nullable, irrespective of whether
the corresponding SQL column is nullable or not.

I think the line

final FieldType fieldType = new FieldType(true, arrowType, /* dictionary encoding */ null, metadata);

in
https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java#L158
might be the cause here.

Is my interpretation correct, or am I missing a setting of sorts? If it is
correct, is there a fundamental reason the nullability is not transferred, or
is this something I could contribute in a PR (which I'd be happy to do)? I
guess it's just a matter of inspecting the result set metadata.

Cheers,
-J
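
For illustration, the metadata inspection suggested above might look roughly
like the following sketch. It uses only the JDK: `ResultSetMetaData.isNullable`
is the standard JDBC call, while the `isColumnNullable` helper and the
`NullabilitySketch` class are hypothetical names, not the adapter's actual
code. It assumes that a column whose nullability the driver cannot determine
(`columnNullableUnknown`) should conservatively stay nullable.
`RowSetMetaDataImpl` stands in for real result set metadata so that the
example runs without a database.

```java
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import javax.sql.rowset.RowSetMetaDataImpl;

public class NullabilitySketch {

    // Hypothetical helper: report a column as nullable unless the driver
    // positively states that it is NOT NULL. "Unknown" is treated as nullable.
    static boolean isColumnNullable(ResultSetMetaData metaData, int column) throws SQLException {
        return metaData.isNullable(column) != ResultSetMetaData.columnNoNulls;
    }

    public static void main(String[] args) throws SQLException {
        // Stand-in for the metadata of a real query result (JDBC columns are 1-based).
        RowSetMetaDataImpl metaData = new RowSetMetaDataImpl();
        metaData.setColumnCount(3);
        metaData.setNullable(1, ResultSetMetaData.columnNoNulls);
        metaData.setNullable(2, ResultSetMetaData.columnNullable);
        metaData.setNullable(3, ResultSetMetaData.columnNullableUnknown);

        for (int i = 1; i <= metaData.getColumnCount(); i++) {
            System.out.println("column " + i + " nullable: " + isColumnNullable(metaData, i));
        }
    }
}
```

The boolean returned by such a helper is what would be passed as the first
argument of `new FieldType(...)` in place of the hard-coded `true`.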

Re: [Java JDBC adapter] non-nullable fields?

Posted by Fan Liya <li...@gmail.com>.

Thanks for your effort.
I'd like to help with the code review.

Best,
Liya Fan


On Fri, May 7, 2021 at 5:20 PM Joris Peeters <jo...@gmail.com>
wrote:

> https://issues.apache.org/jira/browse/ARROW-12679

Re: [Java JDBC adapter] non-nullable fields?

Posted by Joris Peeters <jo...@gmail.com>.

https://issues.apache.org/jira/browse/ARROW-12679

On Fri, May 7, 2021 at 8:54 AM Joris Peeters <jo...@gmail.com>
wrote:

> Fair enough.
> I have this data moving through a few different servers and clients, in
> IPC streaming format, consumed on various platforms/languages. The
> nullability in the schema is often used in "language-friendly" clients,
> e.g. to build a `std::vector<bool>` or `std::vector<std::optional<bool>>`
> depending on whether the bit column is nullable, so preserving this
> information is quite important, even if locally in Java it makes little
> difference.
>
> I've worked around it for now by fudging the VectorSchemaRoot's schema
> myself, but I'll open a JIRA to track, and I'll assign it to myself and
> provide a fix.
>
> Cheers!
> -Joris.

Re: [Java JDBC adapter] non-nullable fields?

Posted by Joris Peeters <jo...@gmail.com>.

Fair enough.
I have this data moving through a few different servers and clients, in IPC
streaming format, consumed on various platforms/languages. The nullability
in the schema is often used in "language-friendly" clients, e.g. to build a
`std::vector<bool>` or `std::vector<std::optional<bool>>` depending on
whether the bit column is nullable, so preserving this information is quite
important, even if locally in Java it makes little difference.

I've worked around it for now by fudging the VectorSchemaRoot's schema
myself, but I'll open a JIRA to track, and I'll assign it to myself and
provide a fix.

Cheers!
-Joris.


On Fri, May 7, 2021 at 3:22 AM Fan Liya <li...@gmail.com> wrote:

> Hi Joris,
>
> I think you are right.
>
> We only use the nullability information in the consumers, because it makes
> a difference in performance.
>
> The nullability information in the schema is not accurate, as you have
> observed.
> However, such information is not well-used in the Java implementation
> (IMHO). For example, the validity buffer is allocated even if the vector is
> non-nullable.
>
> That said, I think it would be better to keep the nullability information
> in sync.
> So maybe we can open a JIRA to track it?
>
> Best,
> Liya Fan

Re: [Java JDBC adapter] non-nullable fields?

Posted by Fan Liya <li...@gmail.com>.

Hi Joris,

I think you are right.

We only use the nullability information in the consumers, because it makes
a difference in performance.

The nullability information in the schema is not accurate, as you have
observed.
However, such information is not well-used in the Java implementation
(IMHO). For example, the validity buffer is allocated even if the vector is
non-nullable.

That said, I think it would be better to keep the nullability information
in sync.
So maybe we can open a JIRA to track it?

Best,
Liya Fan


On Thu, May 6, 2021 at 3:09 PM Joris Peeters <jo...@gmail.com>
wrote:

> Hello Fan,
>
> Yes, but it seems that code path only affects the consumers, and whether
> they set a value in the vector or not, see e.g.
> https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/consumer/DoubleConsumer.java#L57
> However, the VectorSchemaRoot's schema, defined I believe at
> https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowVectorIterator.java#L59,
> does not appear to use this info, and just sets every column's nullability
> to true (as per the link in my original email).
>
> Note that we are indeed using the ArrowVectorIterator, and it's when
> iterating over the iterator and inspecting the schema of the elements
> (VectorSchemaRoot) that I notice this.
> Maybe all this needs is an `isColumnNullable(i, ..)` instead of `true` in
> `final FieldType fieldType = new FieldType(true, arrowType, /* dictionary encoding */ null, metadata);`.
>
> Cheers,
> -J

Re: [Java JDBC adapter] non-nullable fields?

Posted by Joris Peeters <jo...@gmail.com>.

Hello Fan,

Yes, but it seems that code path only affects the consumers, and whether
they set a value in the vector or not, see e.g.
https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/consumer/DoubleConsumer.java#L57
However, the VectorSchemaRoot's schema, defined I believe at
https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowVectorIterator.java#L59,
does not appear to use this info, and just sets every column's nullability
to true (as per the link in my original email).

Note that we are indeed using the ArrowVectorIterator, and it's when
iterating over the iterator and inspecting the schema of the elements
(VectorSchemaRoot) that I notice this.
Maybe all this needs is an `isColumnNullable(i, ..)` instead of `true` in
`final FieldType fieldType = new FieldType(true, arrowType, /* dictionary encoding */ null, metadata);`.

Cheers,
-J
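
The distinction drawn here, between nullability used by the consumers and
nullability recorded in the schema, can be pictured with a simplified model.
The sketch below is not the adapter's code: `DoubleColumn`, `ColumnConsumer`,
and `consumerFor` are hypothetical names, a boxed `null` stands in for SQL
NULL, and a `BitSet` stands in for a validity buffer. It only shows the idea
of picking a specialised consumer once per column, so that the non-nullable
path skips the per-row null check.

```java
import java.util.BitSet;

public class ConsumerSketch {

    // Hypothetical stand-in for a vector: a value array plus a validity bitmap.
    static final class DoubleColumn {
        final double[] values;
        final BitSet validity = new BitSet();

        DoubleColumn(int rows) {
            values = new double[rows];
        }
    }

    // Hypothetical consumer interface; a null value models SQL NULL.
    interface ColumnConsumer {
        void consume(Double value, int row, DoubleColumn target);
    }

    // Nullable path: one extra branch per row so that SQL NULLs are left unset.
    static final ColumnConsumer NULLABLE = (value, row, target) -> {
        if (value != null) {
            target.values[row] = value;
            target.validity.set(row);
        }
    };

    // Non-nullable path: no null check, every row is written and marked valid.
    static final ColumnConsumer NON_NULLABLE = (value, row, target) -> {
        target.values[row] = value;
        target.validity.set(row);
    };

    // The consumer is chosen once per column rather than once per row.
    static ColumnConsumer consumerFor(boolean nullable) {
        return nullable ? NULLABLE : NON_NULLABLE;
    }

    static DoubleColumn load(Double[] source, boolean nullable) {
        DoubleColumn target = new DoubleColumn(source.length);
        ColumnConsumer consumer = consumerFor(nullable);
        for (int row = 0; row < source.length; row++) {
            consumer.consume(source[row], row, target);
        }
        return target;
    }

    public static void main(String[] args) {
        DoubleColumn sparse = load(new Double[] {1.5, null, 3.0}, true);
        DoubleColumn dense = load(new Double[] {1.5, 2.5, 3.0}, false);
        System.out.println(sparse.validity.cardinality()); // prints 2
        System.out.println(dense.validity.cardinality());  // prints 3
    }
}
```

In this model the choice of consumer affects only how values are written. The
nullable flag that downstream clients see would still have to be set
separately on the schema, which is the gap described in this message.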

On Thu, May 6, 2021 at 5:53 AM Fan Liya <li...@gmail.com> wrote:

> Hi Joris,
>
> Thanks for reporting the problem.
>
> We make use of the nullable information in ArrowVectorIterator#initialize.
> (Details can be found in
> https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowVectorIterator.java#L73
> )
>
> Please note that the ArrowVectorIterator is our recommended way of using
> the JDBC adapter.
>
> Best,
> Liya Fan

Re: [Java JDBC adapter] non-nullable fields?

Posted by Fan Liya <li...@gmail.com>.

Hi Joris,

Thanks for reporting the problem.

We make use of the nullable information in ArrowVectorIterator#initialize.
(Details can be found in
https://github.com/apache/arrow/blob/master/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowVectorIterator.java#L73
)

Please note that the ArrowVectorIterator is our recommended way of using
the JDBC adapter.

Best,
Liya Fan


On Wed, May 5, 2021 at 1:42 PM Micah Kornfield <em...@gmail.com>
wrote:

> I would need to look further, but I thought we handled null vs not null.
> At least I thought we had specialized conversion code to avoid branches.
> If this isn't the case it seems reasonable to contribute a patch.

Re: [Java JDBC adapter] non-nullable fields?

Posted by Micah Kornfield <em...@gmail.com>.

I would need to look further, but I thought we handled null vs not null.
At least I thought we had specialized conversion code to avoid branches.
If this isn't the case it seems reasonable to contribute a patch.
