Posted to issues@tajo.apache.org by "Hyunsik Choi (JIRA)" <ji...@apache.org> on 2015/10/15 06:58:07 UTC

[jira] [Resolved] (TAJO-1339) Incorrect handling of tables with custom delimiter when their data contain '|'

     [ https://issues.apache.org/jira/browse/TAJO-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyunsik Choi resolved TAJO-1339.
--------------------------------
    Resolution: Duplicate

The underlying problem was fixed in TAJO-1340.

> Incorrect handling of tables with custom delimiter when their data contain '|'
> ------------------------------------------------------------------------------
>
>                 Key: TAJO-1339
>                 URL: https://issues.apache.org/jira/browse/TAJO-1339
>             Project: Tajo
>          Issue Type: Bug
>            Reporter: Keuntae Park
>            Assignee: Hyunsik Choi
>
> With the table data
> {code}
> 1;a;1.1
> 2;a|b;2.4
> 3;b|c|d;3.2
> {code}
> and the external table declaration
> {code}
> create external table delimiter (id int, name text, score float) using csv
> with ('csvfile.delimiter'=';') location 'xxx';
> {code}
> I got the following incorrect result for the query 'select name, score from delimiter':
> {code}
> name,score
> -------------------------------
> a,1.1
> a,null
> b,null
> {code}
> It looks like the '|' in the name column is being treated as a delimiter.
> From inspecting the code, table meta information such as 'csvfile.delimiter' is only honored by the leaf scan operation; all other operations (including writing intermediate data and materializing the query result) assume that the delimiter is DEFAULT_FIELD_DELIMITER, which is '|'.
> Hence, if the plan includes a step that writes intermediate data,
> '|' in the data is handled as a delimiter even though it is not one.
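
The behavior described above can be imitated with a small, self-contained Python sketch. It is not Tajo code: every function name here is hypothetical, and the assumption that a non-numeric score field is read back as null is inferred from the reported output rather than from the Tajo source. The sketch scans the rows with the table's ';' delimiter, then writes and re-reads the projected columns with the default '|' delimiter.

```python
# Illustrative simulation only; none of these functions are Tajo APIs.
DEFAULT_FIELD_DELIMITER = "|"  # the default named in the report


def leaf_scan(lines, delimiter):
    """Split raw rows using the table's own delimiter (this step is correct)."""
    return [line.split(delimiter) for line in lines]


def write_intermediate(rows):
    """Serialize rows with the default delimiter, ignoring the table meta."""
    return [DEFAULT_FIELD_DELIMITER.join(row) for row in rows]


def to_float(text):
    """Return a float, or None (shown as null) when the field is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def read_intermediate(lines):
    """Re-split on the default delimiter and keep the two expected columns."""
    result = []
    for line in lines:
        fields = line.split(DEFAULT_FIELD_DELIMITER)
        result.append((fields[0], to_float(fields[1])))
    return result


def select_name_score(raw_lines, table_delimiter):
    """Mimic 'select name, score' with an intermediate write/read step."""
    rows = leaf_scan(raw_lines, table_delimiter)
    projected = [[name, score] for _id, name, score in rows]
    return read_intermediate(write_intermediate(projected))


data = ["1;a;1.1", "2;a|b;2.4", "3;b|c|d;3.2"]
for name, score in select_name_score(data, ";"):
    print(name, score)
```

Under these assumptions, the second row is written as 'a|b|2.4' and read back as name 'a' with a non-numeric score 'b', which matches the 'a,null' line in the reported result.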


