Posted to issues@arrow.apache.org by "csgillespie (via GitHub)" <gi...@apache.org> on 2023/09/17 20:56:36 UTC

[GitHub] [arrow] csgillespie opened a new issue, #37762: Column names that are empty strings

csgillespie opened a new issue, #37762:
URL: https://github.com/apache/arrow/issues/37762

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   The following fails:
   ```
   library(arrow)
   library(dplyr)
   write.csv(sleep, "sleep.csv", row.names = TRUE)
   open_dataset("sleep.csv", format = "csv") |>
     mutate(group = group + 1) |>
     collect()
   # Error in env_bind0(env, data) : attempt to use zero-length variable name
   ```
   This is because the first column (the row names written by `write.csv(..., row.names = TRUE)`) has an empty string as its column name:
   ```
   open_dataset("sleep.csv", format = "csv") |>
      head() |>
      collect()
   # A tibble: 6 × 4
        `` extra group    ID
     <int> <dbl> <int> <int>
   1     1   0.7     1     1
   2     2  -1.6     1     2
   ```
   
   ### Component(s)
   
   R


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic commented on issue #37762: [R] Column names that are empty strings

Posted by "thisisnic (via GitHub)" <gi...@apache.org>.
thisisnic commented on issue #37762:
URL: https://github.com/apache/arrow/issues/37762#issuecomment-1723225540

   Thanks for reporting this @csgillespie! 




[GitHub] [arrow] nealrichardson commented on issue #37762: [R] Column names that are empty strings

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on issue #37762:
URL: https://github.com/apache/arrow/issues/37762#issuecomment-1724002604

   FWIW if you use `read.csv()` or `readr::read_csv()` on that file, both will fill in a non-empty name for the first column (`"X"` and `"...1"`, respectively). Not saying we should copy that, but that would be one reason they would not error if you tried the same on a data.frame version of this.
   
   Not sure where exactly we should check this, since an empty column name is technically not invalid in Arrow. Unfortunately, it's also not trivial to fix once you've read the file in: `dplyr::rename()` doesn't seem to let you rename an empty name, and `names<-.Dataset` is not implemented, though it could be. You can do `names(ds$schema)[1] <- "not_empty"` and that does work, though it is clearly suboptimal.
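   
   A minimal sketch of the `read.csv()` point above, using only base R. It is illustrative rather than a proposed fix; `csv_path` is a hypothetical temporary path that is not part of the original report. It writes the same file, shows the empty name in the header line, and checks what name `read.csv()` fills in:
   
   ```r
   # Assumption: csv_path is a throwaway temp file standing in for "sleep.csv".
   csv_path <- tempfile(fileext = ".csv")
   write.csv(sleep, csv_path, row.names = TRUE)
   
   # The row names are written as a first column whose header is an empty string.
   header <- readLines(csv_path, n = 1)
   print(header)
   
   # read.csv() replaces the empty header with a syntactically valid name.
   df <- read.csv(csv_path)
   print(names(df))
   
   # With a non-empty first column name, the transformation from the report
   # runs on the data.frame without error.
   df$group <- df$group + 1
   ```
   
   Since the data.frame never carries a zero-length name, the "attempt to use zero-length variable name" error has nothing to trigger on in this path.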

