You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/25 02:09:48 UTC

[GitHub] [arrow] wesm opened a new pull request #7264: ARROW-8926: [C++] Improve arrow/compute/*.h comments, correct typos and outdated language

wesm opened a new pull request #7264:
URL: https://github.com/apache/arrow/pull/7264


   I also fixed the `Arity` ctor to be explicit which I thought I'd done in #7240 but it is taken care of here.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kiszk commented on a change in pull request #7264: ARROW-8926: [C++] Improve arrow/compute/*.h comments, correct typos and outdated language

Posted by GitBox <gi...@apache.org>.
kiszk commented on a change in pull request #7264:
URL: https://github.com/apache/arrow/pull/7264#discussion_r429998394



##########
File path: cpp/src/arrow/compute/kernel.h
##########
@@ -345,13 +388,13 @@ class ARROW_EXPORT KernelSignature {
                                                bool is_varargs = false);
 
   /// \brief Return true if the signature if compatible with the list of input
-  /// value descriptors
+  /// value descriptorsl

Review comment:
       Is this change necessary?

##########
File path: cpp/src/arrow/compute/kernel.h
##########
@@ -345,13 +388,13 @@ class ARROW_EXPORT KernelSignature {
                                                bool is_varargs = false);
 
   /// \brief Return true if the signature if compatible with the list of input
-  /// value descriptors
+  /// value descriptorsl

Review comment:
       nit: Is this change necessary?

##########
File path: cpp/src/arrow/compute/kernel.h
##########
@@ -386,47 +435,75 @@ struct SimdLevel {
   enum type { NONE, SSE4_2, AVX, AVX2, AVX512, NEON };
 };
 
+/// \brief The strategy to use for propagating or otherwise populating the
+/// validity bitmap of a kernel output.
 struct NullHandling {
   enum type {
     /// Compute the output validity bitmap by intersecting the validity bitmaps
-    /// of the arguments. Kernel does not do anything with the bitmap
+    /// of the arguments. Kernel generally need not touch the bitmap
+    /// thereafter, but a kernel's exec function is permitted to alter the
+    /// bitmap after the null intersection is computed if it needs to.
     INTERSECTION,
 
-    /// Kernel expects a pre-allocated buffer to write the result bitmap into
+    /// Kernel expects a pre-allocated buffer to write the result bitmap
+    /// into. The preallocated memory is not zeroed (except for the last byte),
+    /// so the kernel should ensure to completely populate the bitmap.
     COMPUTED_PREALLOCATE,
 
-    /// Kernel allocates and populates the validity bitmap of the output
+    /// Kernel allocates and set the validity bitmap of the output.

Review comment:
       nit: `set` -> `sets`

##########
File path: cpp/src/arrow/compute/exec.h
##########
@@ -146,22 +158,53 @@ class ARROW_EXPORT SelectionVector {
 /// Array and Scalar values and an optional SelectionVector indicating that
 /// there is an unmaterialized filter that either must be materialized, or (if
 /// the kernel supports it) pushed down into the kernel implementation.
+///
+/// ExecBatch is semantically similar to RecordBatch in that in a SQL context
+/// it represents a collection of records, but constant "columns" are
+/// represented by Scalar values rather than having to be converted into arrays
+/// with repeated values.
+///
+/// TOOD: Datum uses arrow/util/variant.h which may be a bit heavier-weight

Review comment:
       nit: `TOOD` -> `TODO`

##########
File path: cpp/src/arrow/compute/kernel.h
##########
@@ -324,16 +365,18 @@ class ARROW_EXPORT OutputType {
   // For FIXED resolution
   std::shared_ptr<DataType> type_;
 
+  /// \brief The shape of the output type to return when using Resolve. If ANY
+  /// will promote the input shapes.
   ValueDescr::Shape shape_ = ValueDescr::ANY;
 
   // For COMPUTED resolution
   Resolver resolver_;
 };
 
-/// \brief Holds the input types and output type of the kernel
+/// \brief Holds the input types and output type of the kernel.
 ///
-/// Varargs functions should pass a single input type to be used to validate
-/// the the input types of a function invocation
+/// VarArgs functions should pass a single input type to be used to validate
+/// the the input types of a function invocation.

Review comment:
       nit: `the the` -> `the`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm closed pull request #7264: ARROW-8926: [C++] Improve arrow/compute/*.h comments, correct typos and outdated language

Posted by GitBox <gi...@apache.org>.
wesm closed pull request #7264:
URL: https://github.com/apache/arrow/pull/7264


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on a change in pull request #7264: ARROW-8926: [C++] Improve arrow/compute/*.h comments, correct typos and outdated language

Posted by GitBox <gi...@apache.org>.
wesm commented on a change in pull request #7264:
URL: https://github.com/apache/arrow/pull/7264#discussion_r430008693



##########
File path: cpp/src/arrow/compute/exec.h
##########
@@ -60,29 +60,38 @@ static constexpr int64_t kDefaultExecChunksize = UINT16_MAX;
 /// function evaluation
 class ARROW_EXPORT ExecContext {
  public:
-  // If no function registry passed, the default is used
+  // If no function registry passed, the default is used.
   explicit ExecContext(MemoryPool* pool = default_memory_pool(),
                        FunctionRegistry* func_registry = NULLPTR);
 
+  /// \brief The MemoryPool used for allocations, default is
+  /// default_memory_pool().
   MemoryPool* memory_pool() const { return pool_; }
 
   ::arrow::internal::CpuInfo* cpu_info() const;
 
+  /// \brief The FunctionRegistry for looking up functions by name and
+  /// selecting kernels for execution. Defaults to the library-global function
+  /// registry provided by GetFunctionRegistry.
   FunctionRegistry* func_registry() const { return func_registry_; }
 
-  // \brief Set maximum length unit of work for kernel execution. Larger inputs
-  // will be split into smaller chunks, and, if desired, processed in
-  // parallel. Set to -1 for no limit
+  // \brief Set maximum length unit of work for kernel execution. Larger
+  // contiguous array inputs will be split into smaller chunks, and, if
+  // possible and enable, processed in parallel. The default chunksize is

Review comment:
       enabled

##########
File path: cpp/src/arrow/compute/kernel.h
##########
@@ -345,13 +388,13 @@ class ARROW_EXPORT KernelSignature {
                                                bool is_varargs = false);
 
   /// \brief Return true if the signature if compatible with the list of input
-  /// value descriptors
+  /// value descriptorsl

Review comment:
       typo. fixed

##########
File path: cpp/src/arrow/compute/kernel.h
##########
@@ -386,47 +435,75 @@ struct SimdLevel {
   enum type { NONE, SSE4_2, AVX, AVX2, AVX512, NEON };
 };
 
+/// \brief The strategy to use for propagating or otherwise populating the
+/// validity bitmap of a kernel output.
 struct NullHandling {
   enum type {
     /// Compute the output validity bitmap by intersecting the validity bitmaps
-    /// of the arguments. Kernel does not do anything with the bitmap
+    /// of the arguments. Kernel generally need not touch the bitmap
+    /// thereafter, but a kernel's exec function is permitted to alter the
+    /// bitmap after the null intersection is computed if it needs to.

Review comment:
       Yes. I'll clarify that it means a bitwise-and operation with all of the respective validity bitmaps




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #7264: ARROW-8926: [C++] Improve arrow/compute/*.h comments, correct typos and outdated language

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7264:
URL: https://github.com/apache/arrow/pull/7264#issuecomment-633343370


   https://issues.apache.org/jira/browse/ARROW-8926


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #7264: ARROW-8926: [C++] Improve arrow/compute/*.h comments, correct typos and outdated language

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #7264:
URL: https://github.com/apache/arrow/pull/7264#issuecomment-635608773


   Rebasing. I'm going to merge this so avoid further rebase conflicts, but please continue to leave comments and I will address them


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7264: ARROW-8926: [C++] Improve arrow/compute/*.h comments, correct typos and outdated language

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #7264:
URL: https://github.com/apache/arrow/pull/7264#discussion_r430269461



##########
File path: cpp/src/arrow/compute/kernel.h
##########
@@ -386,47 +435,75 @@ struct SimdLevel {
   enum type { NONE, SSE4_2, AVX, AVX2, AVX512, NEON };
 };
 
+/// \brief The strategy to use for propagating or otherwise populating the
+/// validity bitmap of a kernel output.
 struct NullHandling {
   enum type {
     /// Compute the output validity bitmap by intersecting the validity bitmaps
-    /// of the arguments. Kernel does not do anything with the bitmap
+    /// of the arguments. Kernel generally need not touch the bitmap
+    /// thereafter, but a kernel's exec function is permitted to alter the
+    /// bitmap after the null intersection is computed if it needs to.

Review comment:
       From this explanation, I find it not fully clear what "intersecting" means. 
   (is it where one of the arguments is null, or all or the argument is null, that the output validity indicates null?)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org