Posted to user@hive.apache.org by Matt Pestritto <ma...@pestritto.com> on 2009/09/09 01:50:16 UTC

Describe Extended - Replace Tab and LF

Hi.

I was wondering if you could replace the Tab and LF characters with the
strings <TAB> and <LF> in the describe extended output?
My tables are defined with fields delimited by \t and lines terminated by \n,
so the output of describe extended is not contiguous.

Minor patch below.  Feel free to use if you want to.

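For example, the first output below is what the patch produces and the second
is the current behavior.  Note that line.delim outputs an actual \n, which
breaks the display output, so when using the hiveservice you have to do
another FetchOne to get the rest of the line.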

Detailed Table Information    Table(tableName:cobra_merchandise,
dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0,
retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid,
type:string, comment:null), FieldSchema(name:client_merch_type_tid,
type:string, comment:null), FieldSchema(name:description, type:string,
comment:null), FieldSchema(name:client_description, type:string,
comment:null), FieldSchema(name:price, type:string, comment:null),
FieldSchema(name:cost, type:string, comment:null),
FieldSchema(name:start_date, type:string, comment:null),
FieldSchema(name:end_date, type:string, comment:null)],
location:hdfs://mustique:9000/user/hive/warehouse/m,
inputFormat:org.apache.hadoop.mapred.TextInputFormat,
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{serialization.format=9,*line.delim=<LF>,field.delim=<TAB>*}),
bucketCols:[], sortCols:[], parameters:{}),
partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)],
parameters:{})

Detailed Table Information    Table(tableName:cobra_merchandise,
dbName:default, owner:hive, createTime:1248726291, lastAccessTime:0,
retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:merchandise_tid,
type:string, comment:null), FieldSchema(name:client_merch_type_tid,
type:string, comment:null), FieldSchema(name:description, type:string,
comment:null), FieldSchema(name:client_description, type:string,
comment:null), FieldSchema(name:price, type:string, comment:null),
FieldSchema(name:cost, type:string, comment:null),
FieldSchema(name:start_date, type:string, comment:null),
FieldSchema(name:end_date, type:string, comment:null)],
location:hdfs://mustique:9000/user/hive/warehouse/m,
inputFormat:org.apache.hadoop.mapred.TextInputFormat,
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{serialization.format=9,*line.delim=
,field.delim=*    }), bucketCols:[], sortCols:[], parameters:{}),
partitionKeys:[FieldSchema(name:client_tid, type:int, comment:null)],
parameters:{})
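
To show why the second output needs an extra fetch, here is a small sketch of a client that consumes the result one line at a time. It is not Hive or hiveservice code; the class name LineSplitDemo, the method fetchLines, and the shortened row string are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class LineSplitDemo {
    // Splits text on '\n', the way a line-at-a-time client would see it.
    // Each element stands for one fetch. (Hypothetical helper, not a Hive API.)
    static List<String> fetchLines(String text) {
        return Arrays.asList(text.split("\n", -1));
    }

    public static void main(String[] args) {
        // Shortened stand-in for one row of describe extended output,
        // with a real LF and a real TAB embedded in the serde parameters.
        String row = "parameters:{serialization.format=9,line.delim=\n,field.delim=\t}";

        List<String> fetched = fetchLines(row);
        System.out.println(fetched.size() + " fetches needed");
        for (String part : fetched) {
            System.out.println("[" + part + "]");
        }
    }
}
```

The one logical row comes back as two pieces because the LF stored in line.delim is indistinguishable from a row terminator.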


Patch File:

Index: ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
===================================================================
--- ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java    (revision 812724)
+++ ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java    (working copy)
@@ -588,7 +588,7 @@
             // show table information
             outStream.writeBytes("Detailed Table Information");
             outStream.write(separator);
-            outStream.writeBytes(tbl.getTTable().toString());
+            outStream.writeBytes(tbl.getTTable().toString().replaceAll("\n", "<LF>").replaceAll("\t", "<TAB>"));
             outStream.write(separator);
             // comment column is empty
             outStream.write(terminator);
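
The substitution in the patch can be tried outside of Hive with a standalone sketch like the one below. The class EscapeDemo and the helper escapeControlChars are hypothetical names used only for this illustration. The sketch uses String.replace instead of replaceAll, since both patterns are literal characters and no regex is needed; the result is the same.

```java
class EscapeDemo {
    // Hypothetical helper, not part of Hive: turns LF and TAB characters
    // into the visible tokens <LF> and <TAB>.
    static String escapeControlChars(String s) {
        // String.replace does literal (non-regex) replacement of every occurrence.
        return s.replace("\n", "<LF>").replace("\t", "<TAB>");
    }

    public static void main(String[] args) {
        // Shortened stand-in for the serde parameters in the table description.
        String raw = "line.delim=\n,field.delim=\t}";

        // prints: line.delim=<LF>,field.delim=<TAB>}
        System.out.println(escapeControlChars(raw));
    }
}
```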


Thanks
-Matt

Re: Describe Extended - Replace Tab and LF

Posted by Raghu Murthy <rm...@facebook.com>.
Hi Matt,

Thanks for the patch. It would be great if you could file a JIRA and submit
it there. :)

raghu

