Posted to issues@arrow.apache.org by "almackenzie (via GitHub)" <gi...@apache.org> on 2023/03/22 12:51:44 UTC

[GitHub] [arrow] almackenzie opened a new issue, #34682: Data.setValid results in incorrect nullCount

almackenzie opened a new issue, #34682:
URL: https://github.com/apache/arrow/issues/34682

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Version: apache-arrow v11.0.0
   
   I have some code that sets values in a nullable vector to null by calling `setValid(index: number, value: boolean)` on the vector's underlying `data` instance.
   
   However, I think there is a bug in the Arrow JS library: nullCount is _incremented_ when an item is set valid and _decremented_ when an item is set null, which is the wrong way round.
   
   I think the correct behaviour is to _increment_ nullCount when setting an item null, and _decrement_ when setting valid.
   
   Note: the nullBitmap appears to be modified correctly to reflect the set/unset; only the count goes the wrong way.
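
   As a minimal sketch of the bookkeeping described above (not the library's actual code: `ValidityTracker`, `bitmap` and the constructor are hypothetical names used only for illustration), the counter would move up when a valid slot becomes null, down when a null slot becomes valid, and stay put when the bit does not change:

   ```typescript
   // Hypothetical stand-in for a validity bitmap plus a null counter.
   // This is NOT Arrow's Data class; it only illustrates the expected direction of the count.
   class ValidityTracker {
       readonly bitmap: Uint8Array;
       nullCount = 0;

       constructor(readonly length: number) {
           // Start with every slot valid (all bits set), so nullCount begins at 0.
           this.bitmap = new Uint8Array(Math.ceil(length / 8)).fill(0xff);
       }

       getValid(index: number): boolean {
           return (this.bitmap[index >> 3] & (1 << (index % 8))) !== 0;
       }

       setValid(index: number, valid: boolean): boolean {
           const byte = index >> 3;
           const mask = 1 << (index % 8);
           const wasValid = (this.bitmap[byte] & mask) !== 0;
           if (valid && !wasValid) {
               this.bitmap[byte] |= mask;  // null -> valid: one fewer null
               this.nullCount--;
           } else if (!valid && wasValid) {
               this.bitmap[byte] &= ~mask; // valid -> null: one more null
               this.nullCount++;
           }
           return valid;
       }
   }
   ```

   In this sketch, setting the same slot to null twice only counts once, because the counter changes only when the bit actually flips.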
   
   The code in question is at
   https://github.com/apache/arrow/blob/928827515bfbd99060d04ccb754646a76a20f7ec/js/src/data.ts#L141-L149
   
   NB I think the comment there is wrong too: the code does what the comment says, but what the comment describes is the inverted behaviour.
   
   Here's a jest test that I believe illustrates the issue:
   
   ```typescript
   import { Float, Precision, makeBuilder } from "apache-arrow";

   it("simple repro of nullCount issue", () => {
       const builder = makeBuilder({
           type: new Float(Precision.DOUBLE),
           nullValues: [null]
       });

       builder.append(1.0);
       builder.append(null);

       const vector = builder.toVector();
       expect(vector.get(0)).toEqual(1.0); // OK
       expect(vector.get(1)).toBeNull(); // OK

       // Mark index 0 as null via the underlying Data instance.
       vector.data[0].setValid(0, false);

       // Both values should be null now, but neither is returned as null: internally nullCount
       // dropped to 0, so data.getValid(index) returns true because it checks for nullCount > 0.
       expect(vector.get(0)).toBeNull(); // <-- Error: Received: 1
       expect(vector.get(1)).toBeNull(); // <-- Error: Received: 0
   });
   
   ```
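
   The failure mode in the test comments can be modelled in isolation. The following is only a sketch under assumptions: `InvertedCountModel` is a hypothetical class, and its `getValid` shortcut (consult the bitmap only when nullCount > 0) is taken from the description in the test comments, not copied from the library source.

   ```typescript
   // Hypothetical model of the symptom described in this issue; not the library's code.
   class InvertedCountModel {
       // Two slots: index 0 valid, index 1 null (bit 0 set, bit 1 clear).
       bitmap = new Uint8Array([0b01]);
       nullCount = 1;

       // Assumed shortcut: the bitmap is consulted only when nullCount > 0.
       getValid(index: number): boolean {
           if (this.nullCount > 0) {
               return (this.bitmap[index >> 3] & (1 << (index % 8))) !== 0;
           }
           return true;
       }

       // The bitmap is updated correctly, but the counter moves in the wrong direction.
       setValid(index: number, valid: boolean): void {
           const mask = 1 << (index % 8);
           if (valid) {
               this.bitmap[index >> 3] |= mask;
               this.nullCount++; // inverted: should decrement
           } else {
               this.bitmap[index >> 3] &= ~mask;
               this.nullCount--; // inverted: should increment
           }
       }
   }

   const model = new InvertedCountModel();
   model.setValid(0, false);
   console.log(model.nullCount);   // 0
   console.log(model.getValid(0)); // true
   console.log(model.getValid(1)); // true
   ```

   With the counter at 0 the shortcut never reads the bitmap, so both slots are reported as valid even though both bits are now clear.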
   
   I am of course not very familiar with this library so apologies if I've misconstrued what the expected behaviour is here.
   
   
   
   
   
   ### Component(s)
   
   JavaScript


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org