You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/07/07 15:39:34 UTC

[GitHub] [arrow] romainfrancois commented on pull request #7660: ARROW-9291 [R]: Support fixed size binary/list types

romainfrancois commented on pull request #7660:
URL: https://github.com/apache/arrow/pull/7660#issuecomment-654949268


   I think we need a function for the `raws <- vctrs::new_list_of(...)` part, and probably an equivalent function for the other binary types, but `binary()`, `large_binary()` and `fixed_size_binary()` which would be the obvious choice is already taken. 
   
   I think that in this function: 
   
   ```
   template <>
   std::shared_ptr<arrow::DataType> InferArrowTypeFromVector<VECSXP>(SEXP x) {
     if (Rf_inherits(x, "data.frame") || Rf_inherits(x, "POSIXlt")) {
       return InferArrowTypeFromDataFrame(x);
     } else {
       if (Rf_inherits(x, "arrow_fixed_size_binary")) {
         SEXP byte_width = Rf_getAttrib(x, symbols::byte_width);
         if (Rf_isNull(byte_width) || TYPEOF(byte_width) != INTSXP ||
             XLENGTH(byte_width) != 1) {
           Rcpp::stop("malformed arrow_fixed_size_binary object");
         }
         return arrow::fixed_size_binary(INTEGER(byte_width)[0]);
       }
   
       SEXP ptype = Rf_getAttrib(x, symbols::ptype);
       if (Rf_isNull(ptype)) {
         if (XLENGTH(x) == 0) {
           Rcpp::stop(
               "Requires at least one element to infer the values' type of a list vector");
         }
   
         ptype = VECTOR_ELT(x, 0);
       }
   
       // special case list(raw()) -> BinaryArray
       if (TYPEOF(ptype) == RAWSXP) {
         return arrow::binary();
       }
   
       return arrow::list(InferArrowType(ptype));
     }
   }
   ```
   
   we should rather dispatch based on an r type for binary and large binary, as it's done here for fixed size binary, and forget the magic conversion of a `vctrs_list_of<raw>` to a `BinaryArray`, as we might actually want to create an Array of type `List<uint8>`: 
   
   ``` r
   arrow::list_of(arrow::uint8())
   #> ListType
   #> list<item: uint8>
   ```
   
   <sup>Created on 2020-07-07 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0.9001)</sup>
   
   but type deduction currently short circuits it.  
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org