Posted to dev@hive.apache.org by pengcheng xiong <px...@hortonworks.com> on 2016/06/09 23:16:42 UTC

Review Request 48520: Use multi-threaded approach to listing files for msck

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48520/
-----------------------------------------------------------

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
-------

HIVE-13984


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java 10fa561 

Diff: https://reviews.apache.org/r/48520/diff/


Testing
-------


Thanks,

pengcheng xiong


Re: Review Request 48520: Use multi-threaded approach to listing files for msck

Posted by pengcheng xiong <px...@hortonworks.com>.

(Updated June 10, 2016, 11:58 p.m.)


Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
-------

HIVE-13984


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 285caa3 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java 10fa561 

Diff: https://reviews.apache.org/r/48520/diff/


Testing
-------


Thanks,

pengcheng xiong


Re: Review Request 48520: Use multi-threaded approach to listing files for msck

Posted by pengcheng xiong <px...@hortonworks.com>.

> On June 10, 2016, 12:24 a.m., Hari Sankar Sivarama Subramaniyan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java, line 379
> > <https://reviews.apache.org/r/48520/diff/1/?file=1414178#file1414178line379>
> >
> >     nit: Is it possible to make allDirs a synchronized set so that nobody misuses it in the future?

It is already synchronized; see: Set<Path> dirSet = Collections.synchronizedSet(new HashSet<Path>());
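
For readers following along, here is a minimal sketch of what that wrapper provides. It is not the patch itself: the class name, the collect helper, and the path strings are hypothetical, and String stands in for Hadoop's Path so the example has no dependencies.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not taken from HiveMetaStoreChecker.
final class SynchronizedDirSetSketch {

    // Each of `threads` workers adds `perThread` distinct names to one shared set.
    static Set<String> collect(int threads, int perThread) throws InterruptedException {
        // Same wrapper as quoted in the reply: every add() runs under the set's lock.
        final Set<String> dirSet = Collections.synchronizedSet(new HashSet<String>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int worker = t;
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    // Made-up directory names, unique per worker and index.
                    dirSet.add("/warehouse/t/part=" + worker + "_" + i);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return dirSet;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(collect(4, 1000).size());
    }
}
```

Individual calls such as add() are safe here, but iterating over the set still requires synchronizing on it manually, which is one way a later change could misuse it.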


- pengcheng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48520/#review136931
-----------------------------------------------------------




Re: Review Request 48520: Use multi-threaded approach to listing files for msck

Posted by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48520/#review136931
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java (line 379)
<https://reviews.apache.org/r/48520/#comment202051>

    nit: Is it possible to make allDirs a synchronized set so that nobody misuses it in the future?



ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java (line 385)
<https://reviews.apache.org/r/48520/#comment202037>

    Can you please update this parameter's description in HiveConf?



ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java (line 390)
<https://reviews.apache.org/r/48520/#comment202046>

    nit: It would be fine to use a Void return type and return null instead of always returning true.
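
A minimal sketch of that suggestion, assuming the task is submitted to an ExecutorService as a Callable. The class name, the runTasks helper, and the counter are hypothetical and only make the example observable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not taken from HiveMetaStoreChecker.
final class VoidCallableSketch {

    // Submits `taskCount` tasks that carry no result and returns how many ran.
    static int runTasks(int taskCount) throws Exception {
        final AtomicInteger done = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<Void>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                futures.add(pool.submit(new Callable<Void>() {
                    @Override
                    public Void call() {
                        done.incrementAndGet();
                        return null; // null is the only value a Void method can return
                    }
                }));
            }
            for (Future<Void> f : futures) {
                f.get(); // a failed task surfaces here as an ExecutionException
            }
            return done.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTasks(5));
    }
}
```

Callable<Void> signals that the future is used only for completion and error propagation, so callers are not tempted to inspect a Boolean that is always true.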



ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java (line 394)
<https://reviews.apache.org/r/48520/#comment202052>

    This will effectively be a serial operation if the directory structure is skewed (though that is very rare, and possibly never happens in practice?).
    
    Also, HIVE_MOVE_FILES_THREAD_COUNT supports a value of 0, which runs everything in serial mode. If you reuse that configuration, you will have to keep the serial code path; otherwise, you need to introduce a new parameter to avoid a conflict.
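
A minimal sketch of keeping the serial code path when the configured thread count is 0. The class name and the visitAll helper are hypothetical, the thread count is passed as a plain int rather than read from HiveConf, and the per-directory work is reduced to recording the name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; not taken from HiveMetaStoreChecker.
final class SerialFallbackSketch {

    // Visits every directory, serially when threadCount is 0, otherwise on a pool.
    static Set<String> visitAll(List<String> dirs, int threadCount) throws Exception {
        final Set<String> visited = Collections.synchronizedSet(new HashSet<String>());
        if (threadCount <= 0) {
            // Serial path: Executors.newFixedThreadPool(0) would throw
            // IllegalArgumentException, so do the work on the calling thread.
            for (String dir : dirs) {
                visited.add(dir);
            }
            return visited;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (final String dir : dirs) {
                tasks.add(() -> {
                    visited.add(dir);
                    return null;
                });
            }
            // invokeAll blocks until every task has finished.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get(); // propagate any task failure
            }
        } finally {
            pool.shutdown();
        }
        return visited;
    }

    public static void main(String[] args) throws Exception {
        List<String> dirs = Arrays.asList("a", "b", "c");
        System.out.println(visitAll(dirs, 0).equals(visitAll(dirs, 4)));
    }
}
```

Both paths produce the same result, so a value of 0 keeps its existing meaning without a second parameter.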


- Hari Sankar Sivarama Subramaniyan

