Posted to issues@arrow.apache.org by "Jack Howard (Jira)" <ji...@apache.org> on 2022/07/19 16:07:00 UTC

[jira] [Created] (ARROW-17130) Enable multiple character delimiters in read_csv

Jack Howard created ARROW-17130:
-----------------------------------

             Summary: Enable multiple character delimiters in read_csv
                 Key: ARROW-17130
                 URL: https://issues.apache.org/jira/browse/ARROW-17130
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Format
    Affects Versions: 8.0.1
            Reporter: Jack Howard


The ParseOptions for read_csv allow only a single-character delimiter. A single-character delimiter is highly likely to also occur within the data being loaded, which prevents it from working as a delimiter.

If a two-character delimiter is supplied, the current single-character limit raises the error "only single character unicode strings can be converted to Py_UCS4, got length 2"
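
Until multi-character delimiters are supported, one possible workaround is to rewrite the delimiter before parsing. The sketch below is only an illustration, not part of Arrow: the helper names replace_delimiter and read_multi_delim_csv are hypothetical, and the choice of the ASCII unit separator (0x1F) as the stand-in delimiter is an assumption that holds only if that byte never occurs in the data. It also ignores quoting, so a delimiter sequence inside a quoted field would be rewritten too.

```python
import io


def replace_delimiter(raw: bytes, multi: bytes, single: bytes = b"\x1f") -> bytes:
    """Swap a multi-character delimiter for one byte the data does not contain.

    Hypothetical helper: `single` defaults to the ASCII unit separator, which
    is an assumption about the data, so its absence is checked up front.
    """
    if len(single) != 1:
        raise ValueError("replacement delimiter must be exactly one byte")
    if single in raw:
        raise ValueError("replacement delimiter already occurs in the data")
    return raw.replace(multi, single)


def read_multi_delim_csv(raw: bytes, multi: bytes, single: bytes = b"\x1f"):
    """Parse CSV bytes that use a multi-character delimiter with pyarrow.

    Hypothetical wrapper: pyarrow is imported lazily so that the helper above
    stays usable without it.
    """
    from pyarrow import csv  # requires pyarrow to be installed

    cleaned = replace_delimiter(raw, multi, single)
    # ParseOptions accepts only a one-character delimiter, so pass the stand-in.
    parse_options = csv.ParseOptions(delimiter=single.decode("ascii"))
    return csv.read_csv(io.BytesIO(cleaned), parse_options=parse_options)
```

For example, read_multi_delim_csv(b"a||b\n1||x|y\n", b"||") would hand pyarrow a buffer in which only the "||" pairs have been replaced, leaving the lone "|" inside the second value untouched. The whole input is held in memory, so this approach suits small files rather than streaming reads.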



--
This message was sent by Atlassian Jira
(v8.20.10#820010)