Posted to issues@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2020/04/02 22:42:00 UTC
[jira] [Resolved] (ARROW-8216) [R][C++][Dataset] Filtering returns all-missing rows where the filtering column is missing
[ https://issues.apache.org/jira/browse/ARROW-8216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neal Richardson resolved ARROW-8216.
------------------------------------
Resolution: Fixed
Issue resolved by pull request 6732
[https://github.com/apache/arrow/pull/6732]
> [R][C++][Dataset] Filtering returns all-missing rows where the filtering column is missing
> ------------------------------------------------------------------------------------------
>
> Key: ARROW-8216
> URL: https://issues.apache.org/jira/browse/ARROW-8216
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Affects Versions: 0.16.0
> Environment: R 3.6.3, Windows 10
> Reporter: Sam Albers
> Assignee: Ben Kietzman
> Priority: Minor
> Labels: pull-request-available
> Fix For: 0.17.0
>
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
>
> I have just noticed some slightly odd behaviour with the filter method for Dataset.
>
> {code:java}
> library(arrow)
> library(dplyr)
> packageVersion("arrow")
> #> [1] '0.16.0.20200323'
> ## Make sample parquet
> starwars$hair_color[starwars$hair_color == "brown"] <- ""
> dir <- tempdir()
> fpath <- file.path(dir, "data.parquet")
> write_parquet(starwars, fpath)
> ## df in memory
> df_mem <- starwars %>%
>   filter(hair_color == "")
> ## reading from the parquet
> df_parquet <- read_parquet(fpath) %>%
>   filter(hair_color == "")
> ## using open_dataset
> df_dataset <- open_dataset(dir) %>%
>   filter(hair_color == "") %>%
>   collect()
> identical(df_mem, df_parquet)
> #> [1] TRUE
> identical(df_mem, df_dataset)
> #> [1] FALSE
> {code}
>
>
> I'm pretty sure all these should return the same data.frame. Am I missing something?
>
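For readers unfamiliar with the symptom named in the issue title, here is a minimal sketch of the semantics the reporter expected. It is not taken from the issue or from the fix; the toy data frame `df` and the variable names `kept` and `with_na` are made up for illustration. It only shows that `dplyr::filter()` drops rows where the condition evaluates to `NA`, whereas base R logical indexing with an `NA` in the index produces an all-missing row, which resembles what the report describes. Nothing here claims this is how the Dataset code path behaved internally.

```r
library(dplyr)

# Hypothetical toy data: one empty string, one missing value, one other value.
df <- data.frame(
  name = c("a", "b", "c"),
  hair_color = c("", NA, "black"),
  stringsAsFactors = FALSE
)

# dplyr::filter() keeps only rows where the condition is TRUE,
# so the row with a missing hair_color is dropped.
kept <- df %>% filter(hair_color == "")

# Base logical indexing keeps an NA index entry as an all-NA row.
with_na <- df[df$hair_color == "", ]

# Count rows in which every column is missing.
n_all_missing <- sum(rowSums(is.na(with_na)) == ncol(with_na))

print(nrow(kept))
print(nrow(with_na))
print(n_all_missing)
```

Applied to the reproduction above, a similar check on `df_dataset` (counting rows where every column is `NA`) would be one way to confirm whether the extra rows are entirely missing.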