Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/10/13 05:22:02 UTC

[GitHub] [arrow] PalGal2 opened a new issue #8454: [R Package] Error: Filter expression not supported for Arrow Datasets (substr, grepl, str_detect)

PalGal2 opened a new issue #8454:
URL: https://github.com/apache/arrow/issues/8454


   Hi,
   
   Some expressions, such as substr(), grepl(), and str_detect(), are not supported when filtering after open_dataset(). Specifically, the code below:
   
   ```
   library(dplyr)
   library(arrow)
   data = data.frame(a = c("a", "a2", "a3"))
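   # Create the target directory first; write_parquet() does not create it
   dir.create("Test_filter", showWarnings = FALSE)
   write_parquet(data, "Test_filter/data.parquet")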
   
   ds <- open_dataset("Test_filter/")
   
   data_flt <- ds %>% 
     filter(substr(a, 1, 1) == "a")
   ```
   
   gives this error:
   
   ```
   Error: Filter expression not supported for Arrow Datasets: substr(a, 1, 1) == "a"
   Call collect() first to pull data into R.
   ```
   These expressions would be very helpful, if not necessary, for filtering a very large dataset before collecting it. Is there anything that can be done to implement this feature?
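
   For reference, a minimal sketch of the workaround the error message points to: call collect() first, then filter in R. It assumes the collected data fits in memory, which is exactly what this request hopes to avoid for large datasets. The temporary directory and the "b3" value are illustrative choices (not part of the example above), used so that the filter visibly drops a row.

   ```r
   library(dplyr)
   library(arrow)

   # Illustrative setup: write a small Parquet file into a temporary directory
   dir <- tempfile("Test_filter")
   dir.create(dir)
   data <- data.frame(a = c("a", "a2", "b3"), stringsAsFactors = FALSE)
   write_parquet(data, file.path(dir, "data.parquet"))

   ds <- open_dataset(dir)

   # collect() pulls the whole dataset into R as a data frame,
   # so the substr() filter is then evaluated by dplyr, not by Arrow
   data_flt <- ds %>%
     collect() %>%
     filter(substr(a, 1, 1) == "a")
   ```

   Because the filter runs only after collect(), no rows are skipped while reading; the entire dataset is loaded before any are dropped.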
   
   Thank you. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] PalGal2 commented on issue #8454: [R Package] Error: Filter expression not supported for Arrow Datasets (substr, grepl, str_detect)

Posted by GitBox <gi...@apache.org>.
PalGal2 commented on issue #8454:
URL: https://github.com/apache/arrow/issues/8454#issuecomment-708183731


   Thank you @nealrichardson. I opened a new issue at https://issues.apache.org/jira/browse/ARROW-10305.





[GitHub] [arrow] nealrichardson commented on issue #8454: [R Package] Error: Filter expression not supported for Arrow Datasets (substr, grepl, str_detect)

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on issue #8454:
URL: https://github.com/apache/arrow/issues/8454#issuecomment-707808808


   Please direct feature requests to https://issues.apache.org/jira/projects/ARROW/issues. This particular one should be covered in ARROW-9856 but feel free to provide details on your use cases there.





[GitHub] [arrow] nealrichardson closed issue #8454: [R Package] Error: Filter expression not supported for Arrow Datasets (substr, grepl, str_detect)

Posted by GitBox <gi...@apache.org>.
nealrichardson closed issue #8454:
URL: https://github.com/apache/arrow/issues/8454


   

