Posted to dev@hive.apache.org by vic0777 <vi...@163.com> on 2014/12/03 03:40:58 UTC

Where is the base directory of a transaction table?

Hi All,


I know this probably should not be posted here. I posted it to the user mailing list without getting any response, so I moved it here. Thanks in advance for any help.



I am trying to use the new transaction feature in Hive 0.14. According to its documentation, every transactional table has a base directory and one delta directory per transaction in HDFS for data storage. But I cannot find the base directory under the data warehouse directory in HDFS; there are only delta directories. Even the initial data is stored in a delta directory. The following are the commands I used.

create table test_txn (id int ,name string ) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');
insert into table test_txn select * from test_text;
update test_txn set name="liu" where id = 10;

P.S. I have configured the parameters required by the transaction feature:
  hive.support.concurrency,
  hive.enforce.bucketing,
  hive.exec.dynamic.partition.mode,
  hive.txn.manager,
  hive.compactor.initiator.on,
  hive.compactor.worker.threads.
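
For readers following along, the settings listed above might look like the following. This is only a sketch: the original message does not give the values, so the ones here are assumptions based on what the Hive transactions documentation describes for the 0.14 line. The two compactor settings are read by the metastore service, so they normally belong in its hive-site.xml rather than in a client session; they are shown as comments for that reason.

```sql
-- Client-side settings (values are assumptions, not taken from the original post).
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Metastore-side settings, usually placed in the metastore's hive-site.xml
-- (again assumed values; the worker thread count only needs to be positive):
--   hive.compactor.initiator.on=true
--   hive.compactor.worker.threads=1
```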

Although I cannot find the base directory in HDFS, all SELECT, UPDATE and DELETE statements work fine and the data in the table is correct. I am wondering where the base directory is.

Any help is appreciated.

Thanks,
Wantao

Re: Where is the base directory of a transaction table?

Posted by Eugene Koifman <ek...@hortonworks.com>.
I think the base directory will show up after a major compaction has run.
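
If it helps, a major compaction can also be requested by hand rather than waiting for the initiator to schedule one. A minimal sketch, assuming the test_txn table from the original message and a default warehouse location (the path is an assumption; adjust it to your hive.metastore.warehouse.dir):

```sql
-- Queue a major compaction for the table from the original post.
ALTER TABLE test_txn COMPACT 'major';

-- Check the state of queued and running compactions.
SHOW COMPACTIONS;

-- From the Hive CLI, list the table directory once the compaction has finished.
-- The path below is an assumed default warehouse location.
dfs -ls /user/hive/warehouse/test_txn;
```

The ALTER TABLE statement only queues the request; the work itself is done by the compactor worker threads on the metastore, so a base directory should appear only after a worker has picked the job up.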





-- 

Thanks,
Eugene
