You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/17 13:13:40 UTC

[GitHub] [arrow] Alfred-Mountfield opened a new pull request #12451: [ARROW-15705] Allowing appending null on children in a StructBuilder

Alfred-Mountfield opened a new pull request #12451:
URL: https://github.com/apache/arrow/pull/12451


   When trying to add a null value to a StructBuilder, the code previously only modified the null-bitmap. 
   I believe this breaks the spec as it results in the child arrays being a different length to the Struct array (and the child arrays should have the null value appended to their values).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810650513



##########
File path: js/src/builder/struct.ts
##########
@@ -25,11 +25,18 @@ import { Struct, TypeMap } from '../type.js';
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
     public setValue(index: number, value: Struct<T>['TValue']) {
         const children = this.children;
-        switch (Array.isArray(value) || value.constructor) {
-            case true: return this.type.children.forEach((_, i) => children[i].set(index, value[i]));
-            case Map: return this.type.children.forEach((f, i) => children[i].set(index, value.get(f.name)));
-            default: return this.type.children.forEach((f, i) => children[i].set(index, value[f.name]));
+            switch (Array.isArray(value) || value.constructor) {
+                case true: return this.type.children.forEach((_, i) => children[i].set(index, value[i]));
+                case Map: return this.type.children.forEach((f, i) => children[i].set(index, value.get(f.name)));
+                default: return this.type.children.forEach((f, i) => children[i].set(index, value[f.name]));
+            }

Review comment:
       That was sloppy, apologies, had whitespace changes off on the diff and didn't notice. 
   I've fixed it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] trxcllnt commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
trxcllnt commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810364151



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }
+
+    public setValue(index: number, value: Struct<T>['TValue'] | TNull) {
         const children = this.children;
-        switch (Array.isArray(value) || value.constructor) {
-            case true: return this.type.children.forEach((_, i) => children[i].set(index, value[i]));
-            case Map: return this.type.children.forEach((f, i) => children[i].set(index, value.get(f.name)));
-            default: return this.type.children.forEach((f, i) => children[i].set(index, value[f.name]));
+        if (this.isValid(value) && 'constructor' in value) {
+            switch (Array.isArray(value) || value.constructor) {
+                case true: return this.type.children.forEach((_, i) => children[i].set(index, value[i]));
+                case Map: return this.type.children.forEach((f, i) => children[i].set(index, value.get(f.name)));
+                default: return this.type.children.forEach((f, i) => children[i].set(index, value[f.name]));
+            }
+        } else { // Is a null value
+            return this.type.children.forEach((_, i) => children[i].set(index, value));

Review comment:
       If the above is applied, we can revert these changes. This function signature was deliberately designed to only accept valid values.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] trxcllnt commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
trxcllnt commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810394714



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }

Review comment:
       > I would have thought that doesn't work for more complex nested types such as a struct within a struct which would need the fields of that inner-struct to also have a null value appended.
   
   @Alfred-Mountfield Ah good point. In that case, I think the change needs to be not to override `set()` or change the `setValue()` implementation, but to also override `setValid()` and make it set nulls into each child. I'll update my suggestion here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] trxcllnt commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
trxcllnt commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810394714



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }

Review comment:
       > I would have thought that doesn't work for more complex nested types such as a struct within a struct which would need the fields of that inner-struct to also have a null value appended.
   
   @Alfred-Mountfield Ah good point. In that case, I think the change needs to be not to override `set()` or change the `setValue()` implementation, but to also override `setValid()` and make it set nulls into each child. I'll update my suggestion in a new thread.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810402923



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }

Review comment:
       I've just tested it locally and that seems to work :) although with `valid` instead of value` in `if (!super.setValid(index, value)`
   
   There doesn't seem to be documentation on the `setValid` method on `Builder` however, so I'm not sure what doc it's inheriting




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1057428774


   > Can you show me a code snippet that produces different results @Alfred-Mountfield?
   
   I apologise, I seem to have conflated two issues and accidentally implied that this fixes both. That's what I meant by:
   
   > The problem goes a bit deeper, as I've raised in a different [Jira ticket](https://issues.apache.org/jira/browse/ARROW-15651)
   
   The fix I've pushed in this branch stops an error when reading the Array back in Rust. I believe there's an assert within the rust-rs code that checks the children's lengths are equal to the Struct node itself. I hadn't checked to see if it actually changed the values of the latter snippet, but I'm sad to hear it doesn't :( Seems like there's another underlying issue.
   
   I happened to notice the discrepancies with the Builder and makeVector methods when trying to debug and check if we were building things incorrectly in the JavaScript side of our app.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #12451: [ARROW-15705] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1042952699


   <!--
     Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements.  See the NOTICE file
     distributed with this work for additional information
     regarding copyright ownership.  The ASF licenses this file
     to you under the Apache License, Version 2.0 (the
     "License"); you may not use this file except in compliance
     with the License.  You may obtain a copy of the License at
   
       http://www.apache.org/licenses/LICENSE-2.0
   
     Unless required by applicable law or agreed to in writing,
     software distributed under the License is distributed on an
     "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     KIND, either express or implied.  See the License for the
     specific language governing permissions and limitations
     under the License.
   -->
   
   Thanks for opening a pull request!
   
   If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW
   
   Opening JIRAs ahead of time contributes to the [Openness](http://theapacheway.com/open/#:~:text=Openness%20allows%20new%20users%20the,must%20happen%20in%20the%20open.) of the Apache Arrow project.
   
   Then could you also rename pull request title in the following format?
   
       ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
   
   or
   
       MINOR: [${COMPONENT}] ${SUMMARY}
   
   See also:
   
     * [Other pull requests](https://github.com/apache/arrow/pulls/)
     * [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1044837282


   CC @domoritz 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810379309



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }

Review comment:
       Is `inheritdoc` correct here? I believe the inherited documentation would be misleading as it wouldn't mention calling the `setValid` function on the children.
   
   Also I'll admit I'm very unfamiliar with this codebase, is calling `setValid` good enough, I would have thought that doesn't work for more complex nested types such as a struct within a struct which would need the fields _of that inner-struct_ to also have a null value appended. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810402923



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }

Review comment:
       I've just tested it locally and that seems to work :) although with `valid` instead of value` in `if (!super.setValid(index, value)`
   
   There doesn't seem to be documentation on the `setValid` method on `Builder` however, so I'm not sure what doc it's inheriting
   
   I've pushed the update




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1060572661


   Apologies for the late response. Thank you for pushing for more concrete proof that this PR achieves its intention.
   
   While trying to create a better test-case for you I realise we have a large quantity of errors within our uses of the library that have only been exposed to me while trying to create a test-case within the arrow project itself in an IDE with proper intellisense. 
   
   I think the code examples I have given (i.e. the StackBlitz links) aren't valid due to errors such as these:
   <img width="1119" alt="image" src="https://user-images.githubusercontent.com/25749103/157021673-7e14f2ae-ca97-4321-8fae-2bf64dcf8745.png">
   <img width="537" alt="image" src="https://user-images.githubusercontent.com/25749103/157021284-5a0c8df4-c424-437a-87ff-944d80a22975.png">
   <img width="699" alt="image" src="https://user-images.githubusercontent.com/25749103/157021782-25003004-84fc-4a97-b0a6-711ff7fbf0c9.png">
   
   Unfortunately, using the JS Arrow packages don't really give any indication that doing things like that are incorrect. Not trying to restate the obvious benefits of using TypeScript, but I do worry a bit that the arrow package lends itself to so many errors that fail somewhat silently in plain JS because of things like abstract classes. I think having better documentation showing more use-cases and trying to convey convention would help a lot with this. Is there somewhere I can raise this properly (as this PR doesn't feel like the right place to start that discussion), should I just open a Jira ticket or will that likely languish?
   
   Going back to this PR, I'm going to investigate the methods a bit more deeply, and try and understand the library's implementation a bit better when it comes to things like `setValid` etc. I worry that the spec is a little too vague when it comes to the specifics of the lengths of buffers in things like `fixedSizeList`s and `Struct`s with null elements, and it's possible the different language impls have conflicting interpretations. I'll do my investigation in TS (rather than JS as I was before) and post again afterwards :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1046288157


   >Since you have a local test case, please add a unit test so we can ensure not to repeat this issue.
   
   @domoritz 
   The local code that exposes the issue is a system that uses both the Rust Arrow implementation and the JavaScript implementation. I actually am not 100% sure how to write a unit-test for this within this repository. 
   
   The problem goes a bit deeper, as I've raised in a different [Jira ticket](https://issues.apache.org/jira/browse/ARROW-15651), and I'm afraid I'm nowhere near familiar enough with the implementation within this repository to be able to contribute effectively at this point. I have found it difficult to understand the correct way of doing things with this package, as the docs are less beginner-friendly than the other language implementations, and the lack of examples exacerbates this.
   
   I'm open to trying to contribute further, but I do not know how much time I can allocate to learning the specifics of this implementation (hence me missing the purposes of the `setValid` method). I did have a look around the testing for the builders and such but there seems to be a rather complex technique to generating test cases which would take a while to understand how to expand it to cover these quite specific edge-cases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] trxcllnt commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
trxcllnt commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810364151



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }
+
+    public setValue(index: number, value: Struct<T>['TValue'] | TNull) {
         const children = this.children;
-        switch (Array.isArray(value) || value.constructor) {
-            case true: return this.type.children.forEach((_, i) => children[i].set(index, value[i]));
-            case Map: return this.type.children.forEach((f, i) => children[i].set(index, value.get(f.name)));
-            default: return this.type.children.forEach((f, i) => children[i].set(index, value[f.name]));
+        if (this.isValid(value) && 'constructor' in value) {
+            switch (Array.isArray(value) || value.constructor) {
+                case true: return this.type.children.forEach((_, i) => children[i].set(index, value[i]));
+                case Map: return this.type.children.forEach((f, i) => children[i].set(index, value.get(f.name)));
+                default: return this.type.children.forEach((f, i) => children[i].set(index, value[f.name]));
+            }
+        } else { // Is a null value
+            return this.type.children.forEach((_, i) => children[i].set(index, value));

Review comment:
       If the above is applied, we can revert these changes.

##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }

Review comment:
       ```suggestion
       /** @inheritdoc */
       public set(index: number, value: Struct<T>['TValue'] | TNull) {
           if (this.setValid(index, this.isValid(value))) {
               this.setValue(index, value);
           } else {
             this.children.forEach((child) => child.setValid(index, false));
           }
           return this;
       }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810377591



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }

Review comment:
       Is `inheritdoc` correct here? I believe the inherited documentation would be misleading as it wouldn't mention calling the `setValid` function on the children.
   
   Also I'll admit I'm very unfamiliar with this codebase, is passing `false` good enough, I would have thought you need to pass `value`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1046273260


   With regards to changing it in the `UnionBuilder` I think it'd be best to do it in another PR to avoid polluting this one and the associated Jira issue. Also, I had code locally that exposed this problem so I was able to test these changes fixed those cases. I don't have the same for the UnionBuilder. 
   
   It's probably worth making another Jira issue for that and maybe removing the ambiguity in the specification as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
domoritz commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1052815099


   I'm having a hard time finding the time right now. If someone else can write the test case, that would be great. Otherwise I'll do it eventually. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] trxcllnt commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
trxcllnt commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810363957



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }

Review comment:
       ```suggestion
       /** @inheritdoc */
       public setValid(index: number, valid: boolean) {
           if (!super.setValid(index, value)) {
               this.children.forEach((child) => child.setValid(index, valid));
           }
           return valid;
       }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
domoritz commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1046293032


   I totally understand. Thank you for sending the pull request and explaining your background. 
   
   I can help with adding the test code if you can send a code snippet and describe the expectation before and after the fix. Would that work? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz edited a comment on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
domoritz edited a comment on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1057305963


   I agree that the behavior right now is odd. My replication is at https://stackblitz.com/edit/node-sep8dg?file=index.js. I seem to be getting the same results in this branch, though.
   
   ```
   $ node --experimental-vm-modules ./test.js
   Builder
   [ null ]
   1
   1
   1
   makeVector
   []
   0
   1
   0
   ```
   
   Can you show me a code snippet that produces different results @Alfred-Mountfield?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
domoritz commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1057495505


   Okay. We can't use the rust code in the unit tests so we need an example that changes behaves differently in this branch. Can you write one for this pull request so we can make it into a unit test? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] trxcllnt commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
trxcllnt commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1045473457


   @Alfred-Mountfield I think we have the same issue in the [UnionBuilder](https://github.com/apache/arrow/blob/master/js/src/builder/union.ts) (but only when the type is `SparseUnion`). We can either add an override to that builder in this PR, or we can do it in a follow up PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1045479718


   > add an override to that builder in this PR
   
   @trxcllnt If it's the same fix then I'm happy to add the changes at some point tomorrow, otherwise we could just do it in the follow-up as you say


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
domoritz commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1057305963


   I agree that the behavior right now is odd. My replication is at https://stackblitz.com/edit/node-sep8dg?file=index.js. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1042967279






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] trxcllnt commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
trxcllnt commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810394714



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }

Review comment:
       > I would have thought that doesn't work for more complex nested types such as a struct within a struct which would need the fields of that inner-struct to also have a null value appended.
   
   @Alfred-Mountfield Ah good point. In that case, I think the change should not be to override `set()` or change the `setValue()` implementation, but to also override `setValid()` and make it set nulls into each child. I'll update my suggestion here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] trxcllnt commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
trxcllnt commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1045469973


   @Alfred-Mountfield looks good to me. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810403088



##########
File path: js/src/builder/struct.ts
##########
@@ -23,12 +23,31 @@ import { Struct, TypeMap } from '../type.js';
 
 /** @ignore */
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
-    public setValue(index: number, value: Struct<T>['TValue']) {
+    /**
+     * Write a value (or null-value sentinel) at the supplied index.
+     * If the value matches one of the null-value representations, a 1-bit is
+     * written to the null `BitmapBufferBuilder`, otherwise, a 0 is written. The
+     * value is then passed to `Builder.prototype.setValue()`.
+     * @param {number} index The index of the value to write.
+     * @param {T['TValue'] | TNull } value The value to write at the supplied index.
+     * @returns {this} The updated `Builder` instance.
+     */
+    public set(index: number, value: Struct<T>['TValue'] | TNull) {
+        this.setValid(index, this.isValid(value));
+        this.setValue(index, value);
+        return this;
+    }
+
+    public setValue(index: number, value: Struct<T>['TValue'] | TNull) {
         const children = this.children;
-        switch (Array.isArray(value) || value.constructor) {
-            case true: return this.type.children.forEach((_, i) => children[i].set(index, value[i]));
-            case Map: return this.type.children.forEach((f, i) => children[i].set(index, value.get(f.name)));
-            default: return this.type.children.forEach((f, i) => children[i].set(index, value[f.name]));
+        if (this.isValid(value) && 'constructor' in value) {
+            switch (Array.isArray(value) || value.constructor) {
+                case true: return this.type.children.forEach((_, i) => children[i].set(index, value[i]));
+                case Map: return this.type.children.forEach((f, i) => children[i].set(index, value.get(f.name)));
+                default: return this.type.children.forEach((f, i) => children[i].set(index, value[f.name]));
+            }
+        } else { // Is a null value
+            return this.type.children.forEach((_, i) => children[i].set(index, value));

Review comment:
       I have removed the changes




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] domoritz commented on a change in pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
domoritz commented on a change in pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#discussion_r810560073



##########
File path: js/src/builder/struct.ts
##########
@@ -25,11 +25,18 @@ import { Struct, TypeMap } from '../type.js';
 export class StructBuilder<T extends TypeMap = any, TNull = any> extends Builder<Struct<T>, TNull> {
     public setValue(index: number, value: Struct<T>['TValue']) {
         const children = this.children;
-        switch (Array.isArray(value) || value.constructor) {
-            case true: return this.type.children.forEach((_, i) => children[i].set(index, value[i]));
-            case Map: return this.type.children.forEach((f, i) => children[i].set(index, value.get(f.name)));
-            default: return this.type.children.forEach((f, i) => children[i].set(index, value[f.name]));
+            switch (Array.isArray(value) || value.constructor) {
+                case true: return this.type.children.forEach((_, i) => children[i].set(index, value[i]));
+                case Map: return this.type.children.forEach((f, i) => children[i].set(index, value.get(f.name)));
+                default: return this.type.children.forEach((f, i) => children[i].set(index, value[f.name]));
+            }

Review comment:
       Please un-indent the block so it's aligned with the previous line. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Alfred-Mountfield commented on pull request #12451: ARROW-15705: [JavaScript] Allowing appending null on children in a StructBuilder

Posted by GitBox <gi...@apache.org>.
Alfred-Mountfield commented on pull request #12451:
URL: https://github.com/apache/arrow/pull/12451#issuecomment-1046294849


   That'd be great, thank you :)
   
   I believe the code snippet in [that last Jira](https://issues.apache.org/jira/browse/ARROW-15651) I sent should probably be about right.
   
   I believe the problem is that when a struct vector is made (either through the `Builder` or through the `makeVector` methods), the underlying metadata is incorrect when you have a null struct within that vector. In the simplest case, a vector of a single null struct, with one field, e.g.
   ```javascript
   let vec = makeVector({
       data: [null],
       type: struct_field.type,
       nullable: true,
   });
   ```
   
   or 
   ```javascript
   let builder = new Builder({
       type: struct_field.type,
       nullValues: [null, undefined],
   });
   builder.append(null);
   let vec = builder.toVector();
   ```
   
   Then the `children`, `nullCount`, and `length`s are not what I'd expect them to be. I _believe_ (although I'm still not 100% sure I'm reading the spec correctly), that all three of those should be equal to 1.
   
   Where the underlying memory layout will have a null-bitmap buffer of `00000001`, and one child array (which in also has a null value, [as shown in the specification](https://arrow.apache.org/docs/format/Columnar.html#struct-layout))
   
   As the code-snippet (and [StackBlitz](https://stackblitz.com/edit/node-kpgovc?file=index.js)) shows, all three of those methods seem to have differing values and I'm not really sure why. 
   
   It's possible that those three values are not adequate for testing that it is correct though, we might need to check that the underlying buffers are initialised for more complex field types on the struct, etc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org