Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2010/01/21 00:12:54 UTC

[jira] Created: (HIVE-1076) CreateTime is reset to 0 when a partition is overwritten

CreateTime is reset to 0 when a partition is overwritten
--------------------------------------------------------

                 Key: HIVE-1076
                 URL: https://issues.apache.org/jira/browse/HIVE-1076
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: Zheng Shao
            Assignee: Paul Yang


Hive should keep "CreateTime" when a partition is overwritten. The "CreateTime" should be the first time the partition is created.

{code}
hive> describe extended zshao_ttp;
OK
d       string
ds      string

Detailed Table Information      Table(tableName:zshao_ttp, dbName:default, owner:zshao, createTime:1264027720, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:ds, type:string, comment:null)], parameters:{transient_lastDdlTime=1264027720})
Time taken: 3.062 seconds
hive> describe extended zshao_ttp partition(ds='2010-01-01');
OK
d       string
ds      string

Detailed Partition Information  Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, createTime:1264027788, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp/ds=2010-01-01, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), parameters:{transient_lastDdlTime=1264027788})
Time taken: 0.436 seconds


hive> insert overwrite table zshao_ttp partition (ds='2010-01-01') select d from zshao_ttp where ds = '2010-01-01';
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_200912262300_1111, Tracking URL = http://jobtracker:50030/jobdetails.jsp?jobid=job_200912262300_1111
Kill Command = /mnt/vol/hive/sites/prod.latest/hadoop/bin/../bin/hadoop job  -Dmapred.job.tracker=jobtracker:50029 -kill job_200912262300_1111
2010-01-20 15:04:15,272 Stage-1 map = 0%,  reduce = 0%
2010-01-20 15:05:16,895 Stage-1 map = 0%,  reduce = 0%
2010-01-20 15:06:16,768 Stage-1 map = 100%,  reduce = 0%
2010-01-20 15:06:43,929 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_200912262300_1111
Launching Job 2 out of 2
Moving data to: hdfs://hdfs:9000/tmp/hive-zshao/262641680/10000
Loading data to table zshao_ttp partition {ds=2010-01-01}
Moved to trash: /user/hive/zshao_ttp/ds=2010-01-01
2 Rows loaded to zshao_ttp
OK
Time taken: 187.049 seconds


hive> describe extended zshao_ttp partition(ds='2010-01-01');
OK
d       string
ds      string

Detailed Partition Information  Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, createTime:0, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp/ds=2010-01-01, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), parameters:{lastQueryTime=1264028626290,archiveFlag=false,transient_lastDdlTime=1264028626})
Time taken: 0.283 seconds
{code}


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-1076) CreateTime is reset to 0 when a partition is overwritten

Posted by "Paul Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Yang updated HIVE-1076:
----------------------------

    Attachment: HIVE-1076.1.patch

This occurs because a custom pre-execute hook (in use at Facebook) calls alterTable on the metastore to set a flag. The pre-execute hook gets the values to set for the partition from the WriteEntity object generated by SemanticAnalyzer:genFileSinkPlan(). The partition information (stored in the Partition object) for the WriteEntity object is populated by the tableSpec class in BaseSemanticAnalyzer.

When creating the Partition object, tableSpec does not query the metastore and instead just creates the Partition object using the constructor. Consequently, create time is initialized to the default value of 0. For cases where the partition did not exist before, there is no problem as there is nothing to alter when the pre-execute hook runs. However, for partitions that already exist, the alter partition call in the pre-execute hook would overwrite the partition info with creation time = 0.

To fix this, the patch queries the metastore in tableSpec to get the correct partition information.
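
The mechanism described above can be sketched as a small simulation. This is not Hive's Java code: the Partition and Metastore classes, the function names, and the rule that the hook skips partitions that do not exist yet are all hypothetical stand-ins, written in Python only to show why a partition built without a metastore lookup ends up overwriting createTime with 0.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Partition:
    """Hypothetical stand-in for a metastore partition record."""
    table: str
    values: tuple
    create_time: int = 0  # a freshly constructed object defaults to 0
    parameters: dict = field(default_factory=dict)


class Metastore:
    """Hypothetical in-memory metastore keyed by (table, partition values)."""

    def __init__(self):
        self._partitions = {}

    def add_partition(self, part):
        self._partitions[(part.table, part.values)] = part

    def get_partition(self, table, values):
        return self._partitions.get((table, values))

    def alter_partition(self, part):
        # Replaces the stored record wholesale, create_time included.
        self._partitions[(part.table, part.values)] = part


def spec_without_lookup(store, table, values):
    # Behaviour before the patch, as described: use the constructor only.
    return Partition(table, values)


def spec_with_lookup(store, table, values):
    # Behaviour after the patch, as described: prefer the metastore's copy.
    existing = store.get_partition(table, values)
    return existing if existing is not None else Partition(table, values)


def pre_execute_hook(store, part):
    # Assumption: nothing to alter when the partition does not exist yet.
    if store.get_partition(part.table, part.values) is None:
        return
    flagged = replace(part, parameters={**part.parameters, "archiveFlag": "false"})
    store.alter_partition(flagged)


def create_time_after_overwrite(build_spec):
    """Run the hook against an existing partition and report its create_time."""
    store = Metastore()
    store.add_partition(Partition("zshao_ttp", ("2010-01-01",), create_time=1264027788))
    part = build_spec(store, "zshao_ttp", ("2010-01-01",))
    pre_execute_hook(store, part)
    return store.get_partition("zshao_ttp", ("2010-01-01",)).create_time
```

Calling create_time_after_overwrite(spec_without_lookup) reproduces the reset to 0, while create_time_after_overwrite(spec_with_lookup) keeps the original value.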

Currently, it's difficult to create a test for this issue because DESCRIBE EXTENDED prints all of the partition attributes on a single line. Because that line probably contains 'file:/', diff will ignore the whole line.
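
Until a proper test exists, one way to check the value by hand is to pull createTime out of that single line. The following is only a sketch in Python, not part of Hive or its test harness: the function name is hypothetical, and the sample string is an abbreviated copy of the DESCRIBE EXTENDED output shown in this issue.

```python
import re
from datetime import datetime, timezone


def extract_create_time(describe_output):
    """Return createTime (epoch seconds) from the 'Detailed ... Information' line, or None."""
    for line in describe_output.splitlines():
        if line.startswith(("Detailed Partition Information", "Detailed Table Information")):
            match = re.search(r"\bcreateTime:(\d+)", line)
            if match:
                return int(match.group(1))
    return None


# Abbreviated sample; the real line carries many more fields.
sample = (
    "Detailed Partition Information  Partition(values:[2010-01-01], "
    "dbName:default, tableName:zshao_ttp, createTime:1264027788, lastAccessTime:0)"
)

create_time = extract_create_time(sample)
# createTime is in seconds since the epoch; show it as a UTC timestamp too.
print(create_time, datetime.fromtimestamp(create_time, tz=timezone.utc).isoformat())
```

A value of 0 after an INSERT OVERWRITE on an existing partition would indicate the bug reported here.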

Zheng - how quickly do you need the fix? Fixing that and adding a test case will take some additional time. A quick workaround would be to stop using that pre-execute hook.

> CreateTime is reset to 0 when a partition is overwritten
> --------------------------------------------------------
>
>                 Key: HIVE-1076
>                 URL: https://issues.apache.org/jira/browse/HIVE-1076
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>            Assignee: Paul Yang
>         Attachments: HIVE-1076.1.patch



[jira] Commented: (HIVE-1076) CreateTime is reset to 0 when a partition is overwritten

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803996#action_12803996 ] 

Zheng Shao commented on HIVE-1076:
----------------------------------

When I committed, the "svn commit" message was set to HIVE-1072 by mistake.
So please look for HIVE-1072 when searching the svn log or CHANGES.txt.



> CreateTime is reset to 0 when a partition is overwritten
> --------------------------------------------------------
>
>                 Key: HIVE-1076
>                 URL: https://issues.apache.org/jira/browse/HIVE-1076
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>            Assignee: Paul Yang
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: HIVE-1076.1.patch



[jira] Updated: (HIVE-1076) CreateTime is reset to 0 when a partition is overwritten

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1076:
---------------------------------

      Component/s: Metastore
    Fix Version/s:     (was: 0.6.0)

> CreateTime is reset to 0 when a partition is overwritten
> --------------------------------------------------------
>
>                 Key: HIVE-1076
>                 URL: https://issues.apache.org/jira/browse/HIVE-1076
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Zheng Shao
>            Assignee: Paul Yang
>             Fix For: 0.5.0
>
>         Attachments: HIVE-1076.1.patch



[jira] Updated: (HIVE-1076) CreateTime is reset to 0 when a partition is overwritten

Posted by "Paul Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Yang updated HIVE-1076:
----------------------------

    Description: 
Hive should keep "CreateTime" when a partition is overwritten. The "CreateTime" should be the first time the partition is created.

{code}
hive> describe extended zshao_ttp;
OK
d       string
ds      string

Detailed Table Information      Table(tableName:zshao_ttp, dbName:default, owner:zshao, createTime:1264027720, 
lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], 
location:hdfs://hdfs:9000/user/hive/zshao_ttp, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, 
serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:
{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:ds, type:string, 
comment:null)], parameters:{transient_lastDdlTime=1264027720})
Time taken: 3.062 seconds
hive> describe extended zshao_ttp partition(ds='2010-01-01');
OK
d       string
ds      string

Detailed Partition Information  Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, createTime:1264027788, 
lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000
/user/hive/zshao_ttp/ds=2010-01-01, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, 
serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:
{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), parameters:{transient_lastDdlTime=1264027788})
Time taken: 0.436 seconds


hive> insert overwrite table zshao_ttp partition (ds='2010-01-01') select d from zshao_ttp where ds = '2010-01-01';
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_200912262300_1111, Tracking URL = http://jobtracker:50030/jobdetails.jsp?jobid=job_200912262300_1111
Kill Command = hadoop/bin/../bin/hadoop job  -Dmapred.job.tracker=jobtracker:50029 -kill job_200912262300_1111
2010-01-20 15:04:15,272 Stage-1 map = 0%,  reduce = 0%
2010-01-20 15:05:16,895 Stage-1 map = 0%,  reduce = 0%
2010-01-20 15:06:16,768 Stage-1 map = 100%,  reduce = 0%
2010-01-20 15:06:43,929 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_200912262300_1111
Launching Job 2 out of 2
Moving data to: hdfs://hdfs:9000/tmp/hive-zshao/262641680/10000
Loading data to table zshao_ttp partition {ds=2010-01-01}
Moved to trash: /user/hive/zshao_ttp/ds=2010-01-01
2 Rows loaded to zshao_ttp
OK
Time taken: 187.049 seconds


hive> describe extended zshao_ttp partition(ds='2010-01-01');
OK
d       string
ds      string

Detailed Partition Information  Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, createTime:0, 
lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000
/user/hive/zshao_ttp/ds=2010-01-01, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, 
serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:
{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), parameters:
{lastQueryTime=1264028626290,archiveFlag=false,transient_lastDdlTime=1264028626})
Time taken: 0.283 seconds
{code}


  was:
Hive should keep "CreateTime" when a partition is overwritten. The "CreateTime" should be the first time the partition is created.

{code}
hive> describe extended zshao_ttp;
OK
d       string
ds      string

Detailed Table Information      Table(tableName:zshao_ttp, dbName:default, owner:zshao, createTime:1264027720, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:ds, type:string, comment:null)], parameters:{transient_lastDdlTime=1264027720})
Time taken: 3.062 seconds
hive> describe extended zshao_ttp partition(ds='2010-01-01');
OK
d       string
ds      string

Detailed Partition Information  Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, createTime:1264027788, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp/ds=2010-01-01, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), parameters:{transient_lastDdlTime=1264027788})
Time taken: 0.436 seconds


hive> insert overwrite table zshao_ttp partition (ds='2010-01-01') select d from zshao_ttp where ds = '2010-01-01';
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_200912262300_1111, Tracking URL = http://jobtracker:50030/jobdetails.jsp?jobid=job_200912262300_1111
Kill Command = hadoop/bin/../bin/hadoop job  -Dmapred.job.tracker=jobtracker:50029 -kill job_200912262300_1111
2010-01-20 15:04:15,272 Stage-1 map = 0%,  reduce = 0%
2010-01-20 15:05:16,895 Stage-1 map = 0%,  reduce = 0%
2010-01-20 15:06:16,768 Stage-1 map = 100%,  reduce = 0%
2010-01-20 15:06:43,929 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_200912262300_1111
Launching Job 2 out of 2
Moving data to: hdfs://hdfs:9000/tmp/hive-zshao/262641680/10000
Loading data to table zshao_ttp partition {ds=2010-01-01}
Moved to trash: /user/hive/zshao_ttp/ds=2010-01-01
2 Rows loaded to zshao_ttp
OK
Time taken: 187.049 seconds


hive> describe extended zshao_ttp partition(ds='2010-01-01');
OK
d       string
ds      string

Detailed Partition Information  Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, createTime:0, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp/ds=2010-01-01, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), parameters:{lastQueryTime=1264028626290,archiveFlag=false,transient_lastDdlTime=1264028626})
Time taken: 0.283 seconds
{code}



> CreateTime is reset to 0 when a partition is overwritten
> --------------------------------------------------------
>
>                 Key: HIVE-1076
>                 URL: https://issues.apache.org/jira/browse/HIVE-1076
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>            Assignee: Paul Yang
>         Attachments: HIVE-1076.1.patch



[jira] Commented: (HIVE-1076) CreateTime is reset to 0 when a partition is overwritten

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803598#action_12803598 ] 

Zheng Shao commented on HIVE-1076:
----------------------------------

Agree. The code looks good. Let me commit this fix first.

Let's open a new JIRA to pretty-print the Detailed Partition/Table Information. Once that gets done, let's add a test for this one.
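
Until the pretty-printed output exists, a test could compare the createTime field in the raw "describe extended" dump before and after the overwrite. The following is a minimal sketch under stated assumptions, not the test that was eventually added: the helper names extract_create_time and create_time_preserved are hypothetical, and the sample strings are shortened copies of the Partition(...) lines quoted in this issue.

```python
import re

# Hypothetical helper (not part of Hive): find the createTime field in the
# single-line Table(...)/Partition(...) dump printed by `describe extended`.
_CREATE_TIME = re.compile(r"\bcreateTime:(\d+)")


def extract_create_time(describe_output: str) -> int:
    """Return the createTime value (seconds since epoch) from a describe dump."""
    match = _CREATE_TIME.search(describe_output)
    if match is None:
        raise ValueError("no createTime field found")
    return int(match.group(1))


def create_time_preserved(before: str, after: str) -> bool:
    """True when the overwrite left createTime unchanged."""
    return extract_create_time(before) == extract_create_time(after)


# Shortened copies of the two Partition(...) lines from the issue description.
before = ("Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, "
          "createTime:1264027788, lastAccessTime:0)")
after_buggy = ("Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, "
               "createTime:0, lastAccessTime:0)")

if __name__ == "__main__":
    # With the bug present, the check fails because createTime dropped to 0.
    print(create_time_preserved(before, after_buggy))
```

Matching on the field name rather than on position keeps the check independent of how the dump is wrapped or later pretty-printed, as long as the "createTime:<digits>" token survives.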


> CreateTime is reset to 0 when a partition is overwritten
> --------------------------------------------------------
>
>                 Key: HIVE-1076
>                 URL: https://issues.apache.org/jira/browse/HIVE-1076
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>            Assignee: Paul Yang
>         Attachments: HIVE-1076.1.patch
>
>
> Hive should keep "CreateTime" when a partition is overwritten. The "CreateTime" should be the first time the partition is created.



[jira] Resolved: (HIVE-1076) CreateTime is reset to 0 when a partition is overwritten

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao resolved HIVE-1076.
------------------------------

       Resolution: Fixed
    Fix Version/s: 0.6.0
                   0.5.0
     Release Note: HIVE-1076. Keep CreateTime when a partition is overwritten. (Paul Yang via zshao)

Committed to trunk and branch-0.5. Thanks Paul!
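
For readers of the archive, the behaviour described in the release note can be illustrated with a toy model. This is a sketch under stated assumptions and is not the contents of HIVE-1076.1.patch, which is not reproduced in this thread: the metastore is modelled as a plain dict, and overwrite_partition is a hypothetical function name. The only point it shows is that an overwrite carries createTime over from the existing partition, while transient_lastDdlTime still moves forward.

```python
import time


def overwrite_partition(metastore, key, new_location, now=None):
    """Replace a partition entry while keeping its original createTime.

    `metastore` is a plain dict standing in for the real metastore
    (an assumption made for illustration only).
    """
    now = int(time.time()) if now is None else now
    old = metastore.get(key)
    # Keep the first creation time if the partition already exists;
    # otherwise this call is the creation, so use the current time.
    create_time = old["createTime"] if old is not None else now
    new = {
        "createTime": create_time,
        "location": new_location,
        "parameters": {"transient_lastDdlTime": str(now)},
    }
    metastore[key] = new
    return new


if __name__ == "__main__":
    store = {}
    key = ("default", "zshao_ttp", "ds=2010-01-01")
    location = "hdfs://hdfs:9000/user/hive/zshao_ttp/ds=2010-01-01"
    overwrite_partition(store, key, location, now=1264027788)
    second = overwrite_partition(store, key, location, now=1264028626)
    print(second["createTime"], second["parameters"]["transient_lastDdlTime"])
```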

> CreateTime is reset to 0 when a partition is overwritten
> --------------------------------------------------------
>
>                 Key: HIVE-1076
>                 URL: https://issues.apache.org/jira/browse/HIVE-1076
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>            Assignee: Paul Yang
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: HIVE-1076.1.patch
>
>
> Hive should keep "CreateTime" when a partition is overwritten. The "CreateTime" should be the first time the partition is created.



[jira] Updated: (HIVE-1076) CreateTime is reset to 0 when a partition is overwritten

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1076:
-----------------------------

    Description: 
Hive should keep "CreateTime" when a partition is overwritten. The "CreateTime" should be the first time the partition is created.

{code}
hive> describe extended zshao_ttp;
OK
d       string
ds      string

Detailed Table Information      Table(tableName:zshao_ttp, dbName:default, owner:zshao, createTime:1264027720, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:ds, type:string, comment:null)], parameters:{transient_lastDdlTime=1264027720})
Time taken: 3.062 seconds
hive> describe extended zshao_ttp partition(ds='2010-01-01');
OK
d       string
ds      string

Detailed Partition Information  Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, createTime:1264027788, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp/ds=2010-01-01, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), parameters:{transient_lastDdlTime=1264027788})
Time taken: 0.436 seconds


hive> insert overwrite table zshao_ttp partition (ds='2010-01-01') select d from zshao_ttp where ds = '2010-01-01';
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_200912262300_1111, Tracking URL = http://jobtracker:50030/jobdetails.jsp?jobid=job_200912262300_1111
Kill Command = hadoop/bin/../bin/hadoop job  -Dmapred.job.tracker=jobtracker:50029 -kill job_200912262300_1111
2010-01-20 15:04:15,272 Stage-1 map = 0%,  reduce = 0%
2010-01-20 15:05:16,895 Stage-1 map = 0%,  reduce = 0%
2010-01-20 15:06:16,768 Stage-1 map = 100%,  reduce = 0%
2010-01-20 15:06:43,929 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_200912262300_1111
Launching Job 2 out of 2
Moving data to: hdfs://hdfs:9000/tmp/hive-zshao/262641680/10000
Loading data to table zshao_ttp partition {ds=2010-01-01}
Moved to trash: /user/hive/zshao_ttp/ds=2010-01-01
2 Rows loaded to zshao_ttp
OK
Time taken: 187.049 seconds


hive> describe extended zshao_ttp partition(ds='2010-01-01');
OK
d       string
ds      string

Detailed Partition Information  Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, createTime:0, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp/ds=2010-01-01, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), parameters:{lastQueryTime=1264028626290,archiveFlag=false,transient_lastDdlTime=1264028626})
Time taken: 0.283 seconds
{code}


  was:
Hive should keep "CreateTime" when a partition is overwritten. The "CreateTime" should be the first time the partition is created.

{code}
hive> describe extended zshao_ttp;
OK
d       string
ds      string

Detailed Table Information      Table(tableName:zshao_ttp, dbName:default, owner:zshao, createTime:1264027720, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:ds, type:string, comment:null)], parameters:{transient_lastDdlTime=1264027720})
Time taken: 3.062 seconds
hive> describe extended zshao_ttp partition(ds='2010-01-01');
OK
d       string
ds      string

Detailed Partition Information  Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, createTime:1264027788, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp/ds=2010-01-01, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), parameters:{transient_lastDdlTime=1264027788})
Time taken: 0.436 seconds


hive> insert overwrite table zshao_ttp partition (ds='2010-01-01') select d from zshao_ttp where ds = '2010-01-01';
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_200912262300_1111, Tracking URL = http://jobtracker:50030/jobdetails.jsp?jobid=job_200912262300_1111
Kill Command = /mnt/vol/hive/sites/prod.latest/hadoop/bin/../bin/hadoop job  -Dmapred.job.tracker=jobtracker:50029 -kill job_200912262300_1111
2010-01-20 15:04:15,272 Stage-1 map = 0%,  reduce = 0%
2010-01-20 15:05:16,895 Stage-1 map = 0%,  reduce = 0%
2010-01-20 15:06:16,768 Stage-1 map = 100%,  reduce = 0%
2010-01-20 15:06:43,929 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_200912262300_1111
Launching Job 2 out of 2
Moving data to: hdfs://hdfs:9000/tmp/hive-zshao/262641680/10000
Loading data to table zshao_ttp partition {ds=2010-01-01}
Moved to trash: /user/hive/zshao_ttp/ds=2010-01-01
2 Rows loaded to zshao_ttp
OK
Time taken: 187.049 seconds


hive> describe extended zshao_ttp partition(ds='2010-01-01');
OK
d       string
ds      string

Detailed Partition Information  Partition(values:[2010-01-01], dbName:default, tableName:zshao_ttp, createTime:0, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:d, type:string, comment:null)], location:hdfs://hdfs:9000/user/hive/zshao_ttp/ds=2010-01-01, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), parameters:{lastQueryTime=1264028626290,archiveFlag=false,transient_lastDdlTime=1264028626})
Time taken: 0.283 seconds
{code}



> CreateTime is reset to 0 when a partition is overwritten
> --------------------------------------------------------
>
>                 Key: HIVE-1076
>                 URL: https://issues.apache.org/jira/browse/HIVE-1076
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>            Assignee: Paul Yang
>
> Hive should keep "CreateTime" when a partition is overwritten. The "CreateTime" should be the first time the partition is created.
