Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/03/31 21:21:25 UTC

[jira] [Resolved] (SPARK-14260) Increase default value for maxCharsPerColumn

     [ https://issues.apache.org/jira/browse/SPARK-14260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-14260.
-------------------------------
    Resolution: Won't Fix

Yeah, I think that would be a very rare case. I also suggest we not increase the default limit. This was motivated, I think, by SPARK-14103, but I'm not yet sure the cause there is a long line. (Or if it is, the solution is to raise the limit.)
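
For anyone who does hit the limit on a particular dataset, the workaround is to raise it for that read rather than change the default. A rough sketch, assuming {{spark}} is a {{SparkSession}} and that the built-in CSV reader (or the spark-csv package) passes the option through to the underlying univocity parser; the path and value are illustrative:

{code}
// Raise maxCharsPerColumn for this read only; the global default is untouched.
// 10000000 characters (~9.5MB per field) is an illustrative value, not a recommendation.
val df = spark.read
  .option("header", "true")
  .option("maxCharsPerColumn", "10000000")
  .csv("/path/to/file-with-long-fields.csv")
{code}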

> Increase default value for maxCharsPerColumn
> --------------------------------------------
>
>                 Key: SPARK-14260
>                 URL: https://issues.apache.org/jira/browse/SPARK-14260
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Hyukjin Kwon
>            Priority: Trivial
>
> I guess the default value of the option {{maxCharsPerColumn}} looks relatively small: 1000000 characters, which is roughly 976KB.
> It looks like some users have run into this problem and ended up setting the value manually.
> https://github.com/databricks/spark-csv/issues/295
> https://issues.apache.org/jira/browse/SPARK-14103
> According to the [univocity API|http://docs.univocity.com/parsers/2.0.0/com/univocity/parsers/common/CommonSettings.html#setMaxCharsPerColumn(int)], this limit exists to avoid {{OutOfMemoryErrors}}.
> If this does not harm performance, then I think it would be better to make the default value much bigger (e.g. 10MB or 100MB) so that users do not have to worry about the length of each field in a CSV file.
> Apparently the Apache CSV parser does not have such a limit.
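
As a rough illustration of the setting the description links to, this is how the limit can be configured when using the univocity parser directly; only {{setMaxCharsPerColumn}} is named by the linked javadoc, and the value and sample input here are made up:

{code}
import java.io.StringReader
import com.univocity.parsers.csv.{CsvParser, CsvParserSettings}

val settings = new CsvParserSettings()
// The limit discussed in this ticket: maximum characters allowed in a single field.
settings.setMaxCharsPerColumn(10 * 1024 * 1024)  // e.g. 10MB, as the description suggests

val parser = new CsvParser(settings)
val rows = parser.parseAll(new StringReader("a,b\n1,2"))  // java.util.List[Array[String]]
{code}

Exceeding the limit makes the parse fail rather than let the field buffer grow without bound, which is the {{OutOfMemoryError}} protection the javadoc describes.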



