Posted to issues@arrow.apache.org by "kmkramer23 (via GitHub)" <gi...@apache.org> on 2023/03/03 17:05:38 UTC

[GitHub] [arrow] kmkramer23 opened a new issue, #34434: read_csv_arrow stops reading the rest of the file during a conversion error

kmkramer23 opened a new issue, #34434:
URL: https://github.com/apache/arrow/issues/34434

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   When reading a CSV file in R using read_csv_arrow with a schema, the program stops and does not finish reading the whole file if it encounters a conversion error. Is there a setting that forces all the records to be read even if a conversion error is encountered?
   
   ### Component(s)
   
   R


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic commented on issue #34434: [R] read_csv_arrow stops reading the rest of the file during a conversion error

Posted by "thisisnic (via GitHub)" <gi...@apache.org>.
thisisnic commented on issue #34434:
URL: https://github.com/apache/arrow/issues/34434#issuecomment-1479432355

   @kmkramer23 I'm going to close this now, as the proposed solution should solve the problem you're having, but if not, feel free to reopen.




[GitHub] [arrow] thisisnic commented on issue #34434: read_csv_arrow stops reading the rest of the file during a conversion error

Posted by "thisisnic (via GitHub)" <gi...@apache.org>.
thisisnic commented on issue #34434:
URL: https://github.com/apache/arrow/issues/34434#issuecomment-1453916081

   How about this? You can specify alternative NA values using the `na` parameter:
   
   ``` r
   tf <- tempfile()
   df <- tibble::tibble(x = c("1", "2", "3", "null value"))
   arrow::write_csv_arrow(df, tf)
   arrow::read_csv_arrow(tf, na = c("", "NA", "null value"))
   #> # A tibble: 4 × 1
   #>       x
   #>   <int>
   #> 1     1
   #> 2     2
   #> 3     3
   #> 4    NA
   ```
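
Since the original question reads the file with a schema, the same `na` idea can be combined with one. The following is a minimal sketch rather than a fix confirmed in the thread: the column name `x`, the `int32()` type, and the `"null value"` marker are assumptions carried over from the example above, and `skip = 1` assumes the file has a header row that the schema's names replace.

```r
library(arrow)

# Same illustrative file as above: integers plus one text marker for missing data.
tf <- tempfile(fileext = ".csv")
write_csv_arrow(
  data.frame(x = c("1", "2", "3", "null value"), stringsAsFactors = FALSE),
  tf
)

# The schema supplies column names and types, so the header row is skipped.
# Values listed in `na` are treated as missing rather than as conversion errors.
with_schema <- read_csv_arrow(
  tf,
  schema = schema(x = int32()),
  skip = 1,
  na = c("", "NA", "null value")
)
```

Any marker not listed in `na` would still raise a conversion error, so this only helps when the set of missing-data codes is known in advance.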




[GitHub] [arrow] thisisnic closed issue #34434: [R] read_csv_arrow stops reading the rest of the file during a conversion error

Posted by "thisisnic (via GitHub)" <gi...@apache.org>.
thisisnic closed issue #34434: [R] read_csv_arrow stops reading the rest of the file during a conversion error
URL: https://github.com/apache/arrow/issues/34434




[GitHub] [arrow] thisisnic commented on issue #34434: read_csv_arrow stops reading the rest of the file during a conversion error

Posted by "thisisnic (via GitHub)" <gi...@apache.org>.
thisisnic commented on issue #34434:
URL: https://github.com/apache/arrow/issues/34434#issuecomment-1453848992

   Hi @kmkramer23. There isn't a setting for this, as the conversion is part of the reading process. If you could share a bit more information about the error you're encountering, we may be able to help you fix it.




[GitHub] [arrow] kmkramer23 commented on issue #34434: read_csv_arrow stops reading the rest of the file during a conversion error

Posted by "kmkramer23 (via GitHub)" <gi...@apache.org>.
kmkramer23 commented on issue #34434:
URL: https://github.com/apache/arrow/issues/34434#issuecomment-1453857333

   Some files we receive have a character value in one of the integer fields to represent missing data. Since the files come from an outside organization, I cannot have this fixed at the source. I just need to read the file in and convert the field to numeric, setting any records with those missing-data codes to NA. I can read everything in as character and do some preprocessing, but it would have been nice if read_csv_arrow could force the non-integer values to NA and avoid the extra step.
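
The "read everything in as character and preprocess" workaround described above might look roughly like the sketch below. The file contents, the column name `x`, and the `"null value"` code are hypothetical stand-ins; the real files would have their own columns and codes.

```r
library(arrow)

# Hypothetical sample file: an integer-like column containing a stray text code.
tf <- tempfile(fileext = ".csv")
write_csv_arrow(
  data.frame(x = c("1", "2", "3", "null value"), stringsAsFactors = FALSE),
  tf
)

# Read the column as a string so that no conversion error can occur.
raw <- read_csv_arrow(tf, col_types = schema(x = string()))

# Convert afterwards; anything that does not parse as a number becomes NA.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
raw$x <- suppressWarnings(as.integer(raw$x))
```

Note that `as.integer()` also truncates values such as `"1.5"`, so a stricter check may be needed if the field can contain non-integer numbers.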

