Posted to issues@hive.apache.org by "liubangchen (JIRA)" <ji...@apache.org> on 2018/02/01 10:58:00 UTC

[jira] [Comment Edited] (HIVE-18582) MSCK REPAIR TABLE Throw MetaException

    [ https://issues.apache.org/jira/browse/HIVE-18582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348219#comment-16348219 ] 

liubangchen edited comment on HIVE-18582 at 2/1/18 10:57 AM:
-------------------------------------------------------------

Throw an exception when a bad directory is detected.

Add a method that checks whether an existing partition's data location directory is valid.

Code diff is here :

 https://reviews.apache.org/r/65458/diff/1#index_header


was (Author: liubangchen):
Add a method to check exsits partition data location dir is valid and throw on bad directories.
Code diff is here :
https://reviews.apache.org/r/65458/diff/1#index_header

>  MSCK REPAIR TABLE Throw MetaException
> --------------------------------------
>
>                 Key: HIVE-18582
>                 URL: https://issues.apache.org/jira/browse/HIVE-18582
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 2.1.1
>            Reporter: liubangchen
>            Assignee: liubangchen
>            Priority: Major
>         Attachments: HIVE-18582-1.patch, HIVE-18582.patch
>
>
> While executing the query MSCK REPAIR TABLE tablename, I got this exception:
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Expected 1 components, got 2 (log_date=2015121309/vgameid=lyjt))
> at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1847)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:402)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> --
> Caused by: MetaException(message:Expected 1 components, got 2 (log_date=2015121309/vgameid=lyjt))
> at org.apache.hadoop.hive.metastore.Warehouse.makeValsFromName(Warehouse.java:385)
> at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1845)
> {code}
> The table is PARTITIONED BY (log_date, vgameid).
> The data directories on HDFS are:
>  
> {code:java}
> /usr/hive/warehouse/a.db/tablename/log_date=2015063023
> drwxr-xr-x - root supergroup 0 2018-01-26 09:41 /usr/hive/warehouse/a.db/tablename/log_date=2015121309/vgameid=lyjt
> {code}
> The directory log_date=2015063023 is empty; it has no subdirectory.
> If I set hive.msck.path.validation=ignore, MSCK REPAIR TABLE executes successfully.
> Then I found code like this:
> {code:java}
> private int msck(Hive db, MsckDesc msckDesc) {
>   CheckResult result = new CheckResult();
>   List<String> repairOutput = new ArrayList<String>();
>   try {
>     HiveMetaStoreChecker checker = new HiveMetaStoreChecker(db);
>     String[] names = Utilities.getDbTableName(msckDesc.getTableName());
>     checker.checkMetastore(names[0], names[1], msckDesc.getPartSpecs(), result);
>     List<CheckResult.PartitionResult> partsNotInMs = result.getPartitionsNotInMs();
>     if (msckDesc.isRepairPartitions() && !partsNotInMs.isEmpty()) {
>       // I think the bug is here
>       AbstractList<String> vals = null;
>       String settingStr = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_MSCK_PATH_VALIDATION);
>       boolean doValidate = !("ignore".equals(settingStr));
>       boolean doSkip = doValidate && "skip".equals(settingStr);
>       // The default setting is "throw"; assume doValidate && !doSkip means throw.
>       if (doValidate) {
>         // Validate that we can add partition without escaping. Escaping was originally intended
>         // to avoid creating invalid HDFS paths; however, if we escape the HDFS path (that we
>         // deem invalid but HDFS actually supports - it is possible to create HDFS paths with
>         // unprintable characters like ASCII 7), metastore will create another directory instead
>         // of the one we are trying to "repair" here.
>         Iterator<CheckResult.PartitionResult> iter = partsNotInMs.iterator();
>         while (iter.hasNext()) {
>           CheckResult.PartitionResult part = iter.next();
>           try {
>             vals = Warehouse.makeValsFromName(part.getPartitionName(), vals);
>           } catch (MetaException ex) {
>             throw new HiveException(ex);
>           }
>           for (String val : vals) {
>             String escapedPath = FileUtils.escapePathName(val);
>             assert escapedPath != null;
>             if (escapedPath.equals(val)) continue;
>             String errorMsg = "Repair: Cannot add partition " + msckDesc.getTableName()
>                 + ':' + part.getPartitionName() + " due to invalid characters in the name";
>             if (doSkip) {
>               repairOutput.add(errorMsg);
>               iter.remove();
>             } else {
>               throw new HiveException(errorMsg);
>             }
>           }
>         }
>       }
> {code}
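> I think the declaration AbstractList<String> vals = null; must be placed after "while (iter.hasNext()) {" for this to work. Because vals is declared outside the loop, the list produced for the first partition name is passed back into Warehouse.makeValsFromName for every later name, which appears to be why a name with a different number of components fails with "Expected 1 components, got 2".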
>  
>  
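The reuse of vals described above can be illustrated with a minimal, self-contained sketch. It does not use the Hive classes: makeValsFromName below is a simplified stand-in written for this example (an assumption about how Warehouse.makeValsFromName treats a non-null list, inferred from the error message), and the class name MsckValsReuseDemo, the two helper methods, and the order of the partition names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MsckValsReuseDemo {

    /** Simplified stand-in for Warehouse.makeValsFromName (an assumption, not the Hive source). */
    public static List<String> makeValsFromName(String name, List<String> result) {
        String[] parts = name.split("/");
        if (result == null) {
            // No list supplied: create one sized to this partition name.
            result = new ArrayList<>();
            for (int i = 0; i < parts.length; i++) {
                result.add(null);
            }
        } else if (parts.length != result.size()) {
            // A reused list pins the expected depth to that of the first name seen.
            throw new IllegalArgumentException("Expected " + result.size()
                + " components, got " + parts.length + " (" + name + ")");
        }
        for (int i = 0; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Unexpected component " + parts[i]);
            }
            result.set(i, parts[i].substring(eq + 1));
        }
        return result;
    }

    /** Mirrors the quoted loop: vals is declared once, outside the loop. */
    public static String parseWithSharedBuffer(List<String> partitionNames) {
        List<String> vals = null;
        try {
            for (String name : partitionNames) {
                vals = makeValsFromName(name, vals);
            }
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    /** The change suggested in the description: vals starts as null on every iteration. */
    public static String parseWithFreshBuffer(List<String> partitionNames) {
        try {
            for (String name : partitionNames) {
                List<String> vals = null;
                vals = makeValsFromName(name, vals);
            }
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Assumed order: the one-level directory is seen before the two-level one.
        List<String> names = Arrays.asList(
            "log_date=2015063023",
            "log_date=2015121309/vgameid=lyjt");
        System.out.println(parseWithSharedBuffer(names));
        System.out.println(parseWithFreshBuffer(names));
    }
}
```

In this sketch the shared list reproduces the message from the stack trace, while the per-iteration list parses both names. Note that resetting the list only avoids the exception; it does not by itself reject a directory whose depth differs from the table's partition columns, which is what the directory validity check in the comment is meant to address.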


