Posted to issues@arrow.apache.org by GitBox <gi...@apache.org> on 2022/08/05 18:58:00 UTC

[GitHub] [arrow-nanoarrow] paleolimbot opened a new pull request, #14: Owning/mutable `struct ArrowArray`

paleolimbot opened a new pull request, #14:
URL: https://github.com/apache/arrow-nanoarrow/pull/14

   Fixes #5 by implementing an Array whose buffer lifecycle is handled by `struct ArrowBuffer`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #14: Owning/mutable `struct ArrowArray`

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #14:
URL: https://github.com/apache/arrow-nanoarrow/pull/14#discussion_r940252598


##########
src/nanoarrow/nanoarrow.h:
##########
@@ -503,6 +456,35 @@ static inline void ArrowBitmapReset(struct ArrowBitmap* bitmap);
 
 /// }@
 
+/// \defgroup nanoarrow-array Array producer helpers
+/// These functions allocate, copy, and destroy ArrowArray structures
+
+/// \brief Initialize the fields of an array
+///
+/// Initializes the fields and release callback of array. Caller
+/// is responsible for calling the array->release callback if
+/// NANOARROW_OK is returned.
+ArrowErrorCode ArrowArrayInit(struct ArrowArray* array, enum ArrowType storage_type);
+
+/// \brief Allocate the array->children array
+///
+/// Includes the memory for each child struct ArrowArray.
+/// array must have been allocated using ArrowArrayInit.
+ArrowErrorCode ArrowArrayAllocateChildren(struct ArrowArray* array, int64_t n_children);
+
+/// \brief Allocate the array->dictionary member
+///
+/// array must have been allocated using ArrowArrayInit

Review Comment:
   They don't have to in order to avoid leaking memory (each member is initially marked as released), but functionally that's probably what most people should do (or create the child arrays separately and move them). I added both of those points to the documentation.
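
To illustrate the point above, here is a minimal sketch of why freshly allocated children can be released safely without further initialization. It does not use nanoarrow: `struct ArrowArray` follows the Arrow C data interface layout, while `SketchInit`, `SketchAllocateChildren`, `SketchMove`, and `SketchRelease` are hypothetical stand-ins written for this example, not the functions added in the PR.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Layout as defined by the Arrow C data interface.
struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

// Hypothetical release callback: release children that are still live,
// then free the memory the parent owns for them.
static void SketchRelease(struct ArrowArray* array) {
  for (int64_t i = 0; i < array->n_children; i++) {
    struct ArrowArray* child = array->children[i];
    if (child->release != NULL) child->release(child);
    free(child);
  }
  free(array->children);
  array->release = NULL;  // mark released
}

// Hypothetical initializer: zero all fields and install the callback.
static void SketchInit(struct ArrowArray* array) {
  memset(array, 0, sizeof(*array));
  array->release = &SketchRelease;
}

// Hypothetical child allocator: every child starts out marked as
// released (release == NULL), so releasing the parent is always safe.
static int SketchAllocateChildren(struct ArrowArray* array, int64_t n_children) {
  assert(n_children >= 0 && array->children == NULL);
  if (n_children == 0) return 0;
  array->children =
      (struct ArrowArray**)calloc((size_t)n_children, sizeof(struct ArrowArray*));
  if (array->children == NULL) return ENOMEM;
  for (int64_t i = 0; i < n_children; i++) {
    struct ArrowArray* child = (struct ArrowArray*)calloc(1, sizeof(struct ArrowArray));
    if (child == NULL) return ENOMEM;  // parent release() frees earlier children
    child->release = NULL;
    array->children[i] = child;
    array->n_children = i + 1;
  }
  return 0;
}

// Move semantics of the C data interface: shallow-copy the struct,
// then mark the source as released so only dst owns the data.
static void SketchMove(struct ArrowArray* src, struct ArrowArray* dst) {
  memcpy(dst, src, sizeof(*dst));
  src->release = NULL;
}
```

With these helpers, a parent whose children were never initialized can be released directly, and a child built separately can be moved into a slot with `SketchMove(&child, parent.children[1])` before `parent.release(&parent)` is called.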



##########
src/nanoarrow/array.c:
##########
@@ -0,0 +1,220 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "nanoarrow.h"
+
+static void ArrowArrayRelease(struct ArrowArray* array) {
+  // Release buffers held by this array
+  struct ArrowArrayPrivateData* data = (struct ArrowArrayPrivateData*)array->private_data;
+  if (data != NULL) {
+    ArrowBitmapReset(&data->bitmap);
+    ArrowBufferReset(&data->buffers[0]);
+    ArrowBufferReset(&data->buffers[1]);
+    ArrowFree(data);
+  }
+
+  // This object owns the memory for all the children, but those
+  // children may have been generated elsewhere and might have
+  // their own release() callback.
+  if (array->children != NULL) {
+    for (int64_t i = 0; i < array->n_children; i++) {
+      if (array->children[i] != NULL) {
+        if (array->children[i]->release != NULL) {
+          array->children[i]->release(array->children[i]);
+        }
+
+        ArrowFree(array->children[i]);
+      }
+    }
+
+    ArrowFree(array->children);
+  }
+
+  // This object owns the memory for the dictionary but it
+  // may have been generated somewhere else and have its own
+  // release() callback.
+  if (array->dictionary != NULL) {
+    if (array->dictionary->release != NULL) {
+      array->dictionary->release(array->dictionary);
+    }
+
+    ArrowFree(array->dictionary);
+  }
+
+  // Mark released
+  array->release = NULL;
+}
+
+ArrowErrorCode ArrowArraySetStorageType(struct ArrowArray* array,
+                                        enum ArrowType storage_type) {
+  switch (storage_type) {
+    case NANOARROW_TYPE_UNINITIALIZED:
+    case NANOARROW_TYPE_NA:
+      array->n_buffers = 0;
+      break;
+
+    case NANOARROW_TYPE_LIST:
+    case NANOARROW_TYPE_LARGE_LIST:
+    case NANOARROW_TYPE_STRUCT:
+    case NANOARROW_TYPE_MAP:
+    case NANOARROW_TYPE_SPARSE_UNION:

Review Comment:
   Totally! (Done)





[GitHub] [arrow-nanoarrow] paleolimbot merged pull request #14: Owning/mutable `struct ArrowArray`

Posted by GitBox <gi...@apache.org>.
paleolimbot merged PR #14:
URL: https://github.com/apache/arrow-nanoarrow/pull/14




[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #14: Owning/mutable `struct ArrowArray`

Posted by GitBox <gi...@apache.org>.
lidavidm commented on code in PR #14:
URL: https://github.com/apache/arrow-nanoarrow/pull/14#discussion_r939166568


##########
src/nanoarrow/typedefs_inline.h:
##########
@@ -165,6 +212,20 @@ struct ArrowBitmap {
   int64_t size_bits;
 };
 
+/// \brief A structure used as the private data member for ArrowArrays allocated here

Review Comment:
   nit: does this need to be in the public header?





[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #14: Owning/mutable `struct ArrowArray`

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #14:
URL: https://github.com/apache/arrow-nanoarrow/pull/14#discussion_r940170048


##########
src/nanoarrow/typedefs_inline.h:
##########
@@ -165,6 +212,20 @@ struct ArrowBitmap {
   int64_t size_bits;
 };
 
+/// \brief A structure used as the private data member for ArrowArrays allocated here

Review Comment:
   The definition needs to be visible for the appenders to be inlined (I removed the documentation comments and gave it a scarier name of `ArrowArrayPrivateData` to hopefully make that clearer).
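
As a rough sketch of why visibility matters here: a `static inline` function defined in a header can only read the fields behind `private_data` if the struct definition is complete at that point. The names below (`SketchPrivateData`, `SketchArray`, `SketchAppendInt32`) and the growth policy are assumptions made for this example and are not taken from the PR.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

// ---- Everything below would have to live in the public header ----

// Hypothetical private data. With only a forward declaration
// (struct SketchPrivateData;) the inline appender below would not
// compile, because it accesses the members directly.
struct SketchPrivateData {
  int32_t* values;
  int64_t size;
  int64_t capacity;
};

// Hypothetical array that hides its state behind an opaque pointer.
struct SketchArray {
  void* private_data;
};

// Inline appender: the common path (no reallocation) is a couple of
// loads and one store, which the compiler can inline at each call site.
static inline int SketchAppendInt32(struct SketchArray* array, int32_t value) {
  struct SketchPrivateData* data = (struct SketchPrivateData*)array->private_data;
  assert(data != NULL);
  if (data->size == data->capacity) {
    // Assumed growth policy: start at 8 elements, then double.
    int64_t new_capacity = data->capacity == 0 ? 8 : data->capacity * 2;
    int32_t* new_values =
        (int32_t*)realloc(data->values, (size_t)new_capacity * sizeof(int32_t));
    if (new_values == NULL) return ENOMEM;
    data->values = new_values;
    data->capacity = new_capacity;
  }
  data->values[data->size++] = value;
  return 0;
}
```

The trade-off is the one discussed in the review: the struct becomes visible to every consumer of the header, so its name has to signal that it is not part of the supported API.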





[GitHub] [arrow-nanoarrow] codecov-commenter commented on pull request #14: Owning/mutable `struct ArrowArray`

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #14:
URL: https://github.com/apache/arrow-nanoarrow/pull/14#issuecomment-1208146419

   # [Codecov](https://codecov.io/gh/apache/arrow-nanoarrow/pull/14?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#14](https://codecov.io/gh/apache/arrow-nanoarrow/pull/14?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4f427fc) into [main](https://codecov.io/gh/apache/arrow-nanoarrow/commit/51e5052ddd08fb424d8c20c86f9d5ea7d7b4ff51?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (51e5052) will **decrease** coverage by `0.35%`.
   > The diff coverage is `87.50%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##             main      #14      +/-   ##
   ==========================================
   - Coverage   91.97%   91.62%   -0.36%     
   ==========================================
     Files           5        8       +3     
     Lines         798      931     +133     
     Branches       30       35       +5     
   ==========================================
   + Hits          734      853     +119     
   - Misses         41       51      +10     
   - Partials       23       27       +4     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-nanoarrow/pull/14?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [src/nanoarrow/array.c](https://codecov.io/gh/apache/arrow-nanoarrow/pull/14/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3JjL25hbm9hcnJvdy9hcnJheS5j) | `87.50% <87.50%> (ø)` | |
   | [src/nanoarrow/buffer\_inline.h](https://codecov.io/gh/apache/arrow-nanoarrow/pull/14/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3JjL25hbm9hcnJvdy9idWZmZXJfaW5saW5lLmg=) | `100.00% <0.00%> (ø)` | |
   | [src/nanoarrow/bitmap\_inline.h](https://codecov.io/gh/apache/arrow-nanoarrow/pull/14/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3JjL25hbm9hcnJvdy9iaXRtYXBfaW5saW5lLmg=) | `100.00% <0.00%> (ø)` | |
   




[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #14: Owning/mutable `struct ArrowArray`

Posted by GitBox <gi...@apache.org>.
lidavidm commented on code in PR #14:
URL: https://github.com/apache/arrow-nanoarrow/pull/14#discussion_r940193353


##########
src/nanoarrow/array.c:
##########
@@ -0,0 +1,220 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "nanoarrow.h"
+
+static void ArrowArrayRelease(struct ArrowArray* array) {
+  // Release buffers held by this array
+  struct ArrowArrayPrivateData* data = (struct ArrowArrayPrivateData*)array->private_data;
+  if (data != NULL) {
+    ArrowBitmapReset(&data->bitmap);
+    ArrowBufferReset(&data->buffers[0]);
+    ArrowBufferReset(&data->buffers[1]);
+    ArrowFree(data);
+  }
+
+  // This object owns the memory for all the children, but those
+  // children may have been generated elsewhere and might have
+  // their own release() callback.
+  if (array->children != NULL) {
+    for (int64_t i = 0; i < array->n_children; i++) {
+      if (array->children[i] != NULL) {
+        if (array->children[i]->release != NULL) {
+          array->children[i]->release(array->children[i]);
+        }
+
+        ArrowFree(array->children[i]);
+      }
+    }
+
+    ArrowFree(array->children);
+  }
+
+  // This object owns the memory for the dictionary but it
+  // may have been generated somewhere else and have its own
+  // release() callback.
+  if (array->dictionary != NULL) {
+    if (array->dictionary->release != NULL) {
+      array->dictionary->release(array->dictionary);
+    }
+
+    ArrowFree(array->dictionary);
+  }
+
+  // Mark released
+  array->release = NULL;
+}
+
+ArrowErrorCode ArrowArraySetStorageType(struct ArrowArray* array,
+                                        enum ArrowType storage_type) {
+  switch (storage_type) {
+    case NANOARROW_TYPE_UNINITIALIZED:
+    case NANOARROW_TYPE_NA:
+      array->n_buffers = 0;
+      break;
+
+    case NANOARROW_TYPE_LIST:
+    case NANOARROW_TYPE_LARGE_LIST:
+    case NANOARROW_TYPE_STRUCT:
+    case NANOARROW_TYPE_MAP:
+    case NANOARROW_TYPE_SPARSE_UNION:

Review Comment:
   Shouldn't fixed-size list also go here?



##########
src/nanoarrow/nanoarrow.h:
##########
@@ -503,6 +456,35 @@ static inline void ArrowBitmapReset(struct ArrowBitmap* bitmap);
 
 /// }@
 
+/// \defgroup nanoarrow-array Array producer helpers
+/// These functions allocate, copy, and destroy ArrowArray structures
+
+/// \brief Initialize the fields of an array
+///
+/// Initializes the fields and release callback of array. Caller
+/// is responsible for calling the array->release callback if
+/// NANOARROW_OK is returned.
+ArrowErrorCode ArrowArrayInit(struct ArrowArray* array, enum ArrowType storage_type);
+
+/// \brief Allocate the array->children array
+///
+/// Includes the memory for each child struct ArrowArray.
+/// array must have been allocated using ArrowArrayInit.
+ArrowErrorCode ArrowArrayAllocateChildren(struct ArrowArray* array, int64_t n_children);
+
+/// \brief Allocate the array->dictionary member
+///
+/// array must have been allocated using ArrowArrayInit

Review Comment:
   I suppose afterwards, the dictionary/each child should be in turn initialized with ArrowArrayInit?


