Posted to dev@hive.apache.org by "Matt Pestritto (JIRA)" <ji...@apache.org> on 2009/09/09 16:04:57 UTC

[jira] Created: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

Describe Extended Line Breaks When Delimiter is \n
--------------------------------------------------

                 Key: HIVE-820
                 URL: https://issues.apache.org/jira/browse/HIVE-820
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Query Processor
            Reporter: Matt Pestritto
            Priority: Minor


For tables defined with \t as the field delimiter and \n as the line delimiter, the output of describe extended is not contiguous.

line.delim outputs an actual \n, which breaks the display output, so when using the hiveservice you have to do another FetchOne to get the rest of the line.

For example:

Original Output:
Detailed Table Information    Table(tableName:cobra_merchandise, dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, type:string, comment:null), FieldSchema(name:client_merch_type_tid, type:string, comment:null), FieldSchema(name:description, type:string, comment:null), FieldSchema(name:client_description, type:string, comment:null), FieldSchema(name:price, type:string, comment:null), FieldSchema(name:cost, type:string, comment:null), FieldSchema(name:start_date, type:string, comment:null), FieldSchema(name:end_date, type:string, comment:null)], location:hdfs://mustique:9000/user/hive/warehouse/m, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=9,line.delim=
,field.delim=    }), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], parameters:{})   

Proposed Output:
Detailed Table Information    Table(tableName:cobra_merchandise, dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid, type:string, comment:null), FieldSchema(name:client_merch_type_tid, type:string, comment:null), FieldSchema(name:description, type:string, comment:null), FieldSchema(name:client_description, type:string, comment:null), FieldSchema(name:price, type:string, comment:null), FieldSchema(name:cost, type:string, comment:null), FieldSchema(name:start_date, type:string, comment:null), FieldSchema(name:end_date, type:string, comment:null)], location:hdfs://mustique:9000/user/hive/warehouse/m, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=9,line.delim=<LF>,field.delim=<TAB>}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)], parameters:{})   
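
To illustrate the effect (a rough sketch, not the Hive service code): any consumer that treats \n as a row terminator sees the single describe extended row as two rows, while the proposed form stays on one. The class name RowSplitDemo and the shortened sample string are made up for this example.

```java
final class RowSplitDemo {
    // Counts how many rows a consumer sees if it splits its input on '\n'.
    // The -1 limit keeps trailing empty pieces so nothing is silently dropped.
    static int countRows(String text) {
        return text.split("\n", -1).length;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the serde parameters in the output above.
        String raw = "parameters:{serialization.format=9,line.delim=\n,field.delim=\t}";
        // Proposed form: visible placeholders instead of the raw characters.
        String escaped = raw.replace("\n", "<LF>").replace("\t", "<TAB>");

        System.out.println(countRows(raw));     // 2
        System.out.println(countRows(escaped)); // 1
    }
}
```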

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792214#action_12792214 ] 

Namit Jain commented on HIVE-820:
---------------------------------

blocker for 0.5

> Describe Extended Line Breaks When Delimiter is \n
> --------------------------------------------------
>
>                 Key: HIVE-820
>                 URL: https://issues.apache.org/jira/browse/HIVE-820
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.5.0
>            Reporter: Matt Pestritto
>            Assignee: Matt Pestritto
>            Priority: Minor
>             Fix For: 0.5.0
>
>         Attachments: hive_820.patch


[jira] Updated: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-820:
----------------------------

    Fix Version/s:     (was: 0.5.0)



[jira] Commented: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797289#action_12797289 ] 

Namit Jain commented on HIVE-820:
---------------------------------

We should be consistent across different fields.

serialization.format=9,line.delim= ,field.delim= 

We should use the same format for all of them. We can choose the decimal format for all of them. Since it is an existing problem, this need not be a blocker for 0.5.
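
As a sketch of what a uniform decimal format could look like (serialization.format=9 already appears to be the decimal code for tab), each delimiter character can be printed as its numeric code. DecimalDelimiter and toDecimal are hypothetical names, not Hive code.

```java
final class DecimalDelimiter {
    // Renders every character of a delimiter as its decimal character code,
    // separated by spaces when the delimiter has more than one character.
    static String toDecimal(String delim) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < delim.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append((int) delim.charAt(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("line.delim=" + toDecimal("\n"));  // line.delim=10
        System.out.println("field.delim=" + toDecimal("\t")); // field.delim=9
    }
}
```

With this, the parameters above would read serialization.format=9,line.delim=10,field.delim=9.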





[jira] Commented: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

Posted by "Matt Pestritto (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797150#action_12797150 ] 

Matt Pestritto commented on HIVE-820:
-------------------------------------

All -

Do we have a decision on what you want the output to show? A few different ideas were being thrown around.

I would rather replace only the characters that would break the output (tab, \n) with something meaningful than, as Edward stated, always show the octal representation, which would require an ASCII table to figure out what the delimiter is. If something is | (pipe) delimited, I would always need to look it up even though it is a printable character.
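
A minimal sketch of that selective approach, assuming only tab and \n need special handling. SelectiveEscaper is a hypothetical helper name, not something in the patch.

```java
final class SelectiveEscaper {
    // Replaces only the two characters that break single-line output.
    // Printable delimiters such as '|' or ',' pass through unchanged.
    static String escape(String s) {
        return s.replace("\n", "<LF>").replace("\t", "<TAB>");
    }

    public static void main(String[] args) {
        // A pipe delimiter stays readable; the newline becomes a placeholder.
        System.out.println(escape("line.delim=\n,field.delim=|"));
        // prints: line.delim=<LF>,field.delim=|
    }
}
```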

I'll wait for feedback from the FB team and make the changes.

Thanks.



[jira] Updated: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghotham Murthy updated HIVE-820:
----------------------------------

    Affects Version/s: 0.5.0
                       0.4.0
                       0.3.2
                       0.3.1
                       0.2.0
                       0.3.0
        Fix Version/s: 0.5.0
             Assignee: Matt Pestritto

Looks good. Will commit if tests pass.



[jira] Commented: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

Posted by "Matt Pestritto (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753629#action_12753629 ] 

Matt Pestritto commented on HIVE-820:
-------------------------------------

Edward - 

I made the suggested change and it did not work. For the LF, the output still breaks and two fetches have to be done to get the extended plan. The 054 did not display anything.

I also tried escaping the backslash, and just 054 and 012 were printed. Would you prefer that notation? 054 and 012 with no \
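
A possible explanation for both results, assuming the change kept String.replaceAll as in the patch line quoted in this thread: in Java source, "\012" and "\054" are octal escapes, so they compile to the newline and comma characters themselves; and in a replaceAll replacement string a backslash escapes the next character, so "\\012" comes out as 012. Using String.replace, or wrapping the replacement in Matcher.quoteReplacement, keeps the backslash. The class and method names below are made up for illustration.

```java
import java.util.regex.Matcher;

final class ReplacementEscapes {
    // The Java literal "\\012" is the four characters \ 0 1 2. As a regex
    // replacement string, the backslash escapes the '0', so it is dropped.
    static String viaReplaceAll(String s) {
        return s.replaceAll("\n", "\\012");
    }

    // quoteReplacement makes the replacement literal, so the backslash survives.
    static String viaQuoteReplacement(String s) {
        return s.replaceAll("\n", Matcher.quoteReplacement("\\012"));
    }

    // String.replace never interprets the replacement, so this also works.
    static String viaReplace(String s) {
        return s.replace("\n", "\\012");
    }

    public static void main(String[] args) {
        System.out.println(viaReplaceAll("a\nb"));       // a012b
        System.out.println(viaQuoteReplacement("a\nb")); // a\012b
        System.out.println(viaReplace("a\nb"));          // a\012b
    }
}
```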

Thanks
-Matt





[jira] Commented: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753654#action_12753654 ] 

Edward Capriolo commented on HIVE-820:
--------------------------------------

Matt,

It is a tough call. 

I believe you can issue
{noformat}
 FIELDS TERMINATED BY ',' " +
{noformat}

As well as 
{noformat}
 FIELDS TERMINATED BY '\054' " +
{noformat}

in a create table statement. It is stored as its ASCII/Unicode value. I believe Hive will restrict delimiters above ASCII 128.

If someone is issuing 'show tables', showing them '054' as opposed to ',' might be a pain, as they will need an ASCII table to figure out what the delimiter is. However, I think showing them the octal/hex/decimal value is the best way, as the output is consistent.

We can also do <LF>, but then we should do replacements for all non-printable characters.
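
A sketch of what replacing all non-printable characters might look like, using the standard ASCII control-code abbreviations (with TAB rather than HT for code 9, to match the <TAB> used in this issue). ControlCharNames is a hypothetical helper, not part of the patch.

```java
final class ControlCharNames {
    // Standard ASCII abbreviations for codes 0 through 31.
    private static final String[] NAMES = {
        "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
        "BS",  "TAB", "LF",  "VT",  "FF",  "CR",  "SO",  "SI",
        "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
        "CAN", "EM",  "SUB", "ESC", "FS",  "GS",  "RS",  "US"
    };

    // Replaces every ASCII control character with <NAME>; all other
    // characters are copied through unchanged.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < NAMES.length) {
                out.append('<').append(NAMES[c]).append('>');
            } else if (c == 127) {
                out.append("<DEL>");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("line.delim=\n,field.delim=\t"));
        // prints: line.delim=<LF>,field.delim=<TAB>
    }
}
```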


I think 054 is fine, but maybe someone wants to chime in and speak about what the delimiters could be down the road. Guys?



[jira] Commented: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753218#action_12753218 ] 

Edward Capriolo commented on HIVE-820:
--------------------------------------

Can I drop a late comment in....

-            outStream.writeBytes(tbl.getTTable().toString());
+            outStream.writeBytes(tbl.getTTable().toString().replaceAll("\n", "<LF>").replaceAll("\t", "<TAB>"));

We should do this in a uniform format. There are lots of non-printable characters we use, US (Unit Separator) for example.

http://web.cs.mun.ca/~michael/c/ascii-table.html
Why not output in the same format the create table would specify?

{noformat}
FIELDS TERMINATED BY '\054'
LINES TERMINATED BY '\012'
{noformat}
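
One way to read that suggestion, as a rough sketch only and not the attached patch: instead of special-casing \n and \t, every control character in the describe output could be rewritten as the same three-digit octal escape that a create table statement accepts. The class and method names here (DelimiterEscaper, escapeControlChars) are hypothetical and are not part of Hive.

```java
public class DelimiterEscaper {

    // Hypothetical helper, not a Hive API: rewrites each ISO control
    // character as a backslash followed by three octal digits, so that
    // a newline becomes \012 and a tab becomes \011.
    public static String escapeControlChars(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isISOControl(c)) {
                out.append(String.format("\\%03o", (int) c));
            } else {
                // Printable characters (including ',') pass through unchanged.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample text shaped like the serde parameters in the report;
        // "\037" is the US (Unit Separator) character.
        String raw = "line.delim=\n,field.delim=\t,other.delim=\037";
        System.out.println(escapeControlChars(raw));
        // prints: line.delim=\012,field.delim=\011,other.delim=\037
    }
}
```

Under that assumption, the change in the diff above would wrap tbl.getTTable().toString() in a single call to such a helper rather than chaining replaceAll calls, and any delimiter would come out on one line.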

> Describe Extended Line Breaks When Delimiter is \n
> --------------------------------------------------
>
>                 Key: HIVE-820
>                 URL: https://issues.apache.org/jira/browse/HIVE-820
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.5.0
>            Reporter: Matt Pestritto
>            Assignee: Matt Pestritto
>            Priority: Minor
>             Fix For: 0.5.0
>
>         Attachments: hive_820.patch
>
>



[jira] Updated: (HIVE-820) Describe Extended Line Breaks When Delimiter is \n

Posted by "Matt Pestritto (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Pestritto updated HIVE-820:
--------------------------------

    Attachment: hive_820.patch

Patch Attached.

