Posted to issues@carbondata.apache.org by manishgupta88 <gi...@git.apache.org> on 2017/07/08 10:22:56 UTC

[GitHub] carbondata pull request #1147: [WIP][CARBONDATA-1277] Dictionary generation ...

GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/carbondata/pull/1147

    [WIP][CARBONDATA-1277] Dictionary generation failure if there is failure in closing output stream in HDFS

    Analysis: If closing the output stream of a dictionary file in HDFS fails, dictionary generation fails on the next data load, update, or insert-into operation. The dictionary file is opened in append mode, and when we request an output stream for that file, HDFS throws an exception stating that the lease is already held by another client.
    
    Fix: Recover the lease from CarbonData code when the exception indicates a lease failure.
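
A minimal sketch of the retry idea described in the fix, for illustration only. The `LeaseRecoverer` interface, the class name, and the method names are hypothetical stand-ins (in the patch the underlying call is `dfs.recoverLease(path)` on a `DistributedFileSystem`); only the "Failed to APPEND_FILE" message check is taken from the diff in this thread.

```java
import java.io.IOException;

public class LeaseRecoveryRetrySketch {

  /** Hypothetical stand-in for a call such as DistributedFileSystem.recoverLease(path). */
  @FunctionalInterface
  public interface LeaseRecoverer {
    boolean recoverLease(String filePath) throws IOException;
  }

  /** Same check as in the quoted diff: a failed append suggests a lease still held elsewhere. */
  public static boolean isLeaseFailure(String message) {
    return message != null && message.contains("Failed to APPEND_FILE");
  }

  /**
   * Tries to recover the lease up to maxAttempts times, sleeping between attempts.
   * Returns true as soon as one attempt succeeds. If the final attempt threw an
   * IOException, that exception is rethrown; otherwise false is returned.
   */
  public static boolean recoverWithRetry(LeaseRecoverer recoverer, String filePath,
      int maxAttempts, long retryIntervalMillis) throws IOException, InterruptedException {
    IOException lastFailure = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      lastFailure = null;
      try {
        if (recoverer.recoverLease(filePath)) {
          return true;
        }
      } catch (IOException e) {
        lastFailure = e;
      }
      // Wait before the next attempt, but not after the last one.
      if (attempt < maxAttempts) {
        Thread.sleep(retryIntervalMillis);
      }
    }
    if (lastFailure != null) {
      throw lastFailure;
    }
    return false;
  }
}
```

A caller would presumably invoke `recoverWithRetry` only when `isLeaseFailure(e.getMessage())` is true for the exception raised while opening the dictionary file in append mode, and then retry the append.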

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/carbondata hdfs_lease_recovery_exception

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1147.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1147
    
----
commit 6ffcf18068b914a86bb83241300c7d8e7ba44c07
Author: manishgupta88 <to...@gmail.com>
Date:   2017-07-08T10:16:25Z

    Problem: Dictionary generation failure if there is failure in closing output stream in HDFS
    
    Analysis: If closing the output stream of a dictionary file in HDFS fails, dictionary generation fails on the next data load, update, or insert-into operation. The dictionary file is opened in append mode, and when we request an output stream for that file, HDFS throws an exception stating that the lease is already held by another client.
    
    Fix: Recover the lease from CarbonData code when the exception indicates a lease failure.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1147: [WIP][CARBONDATA-1277] Dictionary generation failure...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1147
  
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/371/




[GitHub] carbondata issue #1147: [WIP][CARBONDATA-1277] Dictionary generation failure...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1147
  
    Can one of the admins verify this patch?



[GitHub] carbondata issue #1147: [WIP][CARBONDATA-1277] Dictionary generation failure...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1147
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2959/




[GitHub] carbondata issue #1147: [WIP][CARBONDATA-1277] Dictionary generation failure...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1147
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2958/




[GitHub] carbondata pull request #1147: [CARBONDATA-1277] Dictionary generation failu...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1147#discussion_r126415422
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
    @@ -1287,6 +1287,12 @@
     
       public static final String CARBON_BAD_RECORDS_ACTION_DEFAULT = "FORCE";
     
    +  @CarbonProperty
    +  public static final String CARBON_LEASE_RECOVERY_RETRY_COUNT =
    +      "carbon.lease.recovery.retry.count";
    +  public static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL =
    --- End diff --
    
    Add the @CarbonProperty annotation for this constant as well.



[GitHub] carbondata issue #1147: [CARBONDATA-1277] Dictionary generation failure if t...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on the issue:

    https://github.com/apache/carbondata/pull/1147
  
    LGTM



[GitHub] carbondata pull request #1147: [CARBONDATA-1277] Dictionary generation failu...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1147#discussion_r126420489
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/util/path/HDFSUtils.java ---
    @@ -0,0 +1,188 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.util.path;
    +
    +import java.io.FileNotFoundException;
    +import java.io.IOException;
    +
    +import org.apache.carbondata.common.logging.LogService;
    +import org.apache.carbondata.common.logging.LogServiceFactory;
    +import org.apache.carbondata.core.constants.CarbonCommonConstants;
    +import org.apache.carbondata.core.datastore.impl.FileFactory;
    +import org.apache.carbondata.core.util.CarbonProperties;
    +
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.hdfs.DistributedFileSystem;
    +import org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException;
    +
    +/**
    + * Implementation for HDFS utility methods
    + */
    +public class HDFSUtils {
    +
    +  private static final int CARBON_LEASE_RECOVERY_RETRY_COUNT_MIN = 1;
    +  private static final int CARBON_LEASE_RECOVERY_RETRY_COUNT_MAX = 5;
    +  private static final String CARBON_LEASE_RECOVERY_RETRY_COUNT_DEFAULT = "3";
    +  private static final int CARBON_LEASE_RECOVERY_RETRY_INTERVAL_MIN = 100;
    +  private static final int CARBON_LEASE_RECOVERY_RETRY_INTERVAL_MAX = 10000;
    +  private static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL_DEFAULT = "1000";
    +
    +  /**
    +   * LOGGER
    +   */
    +  private static final LogService LOGGER =
    +      LogServiceFactory.getLogService(HDFSUtils.class.getName());
    +
    +  /**
    +   * This method will validate whether the exception thrown is for lease recovery from HDFS
    +   *
    +   * @param message
    +   * @return
    +   */
    +  public static boolean checkExceptionMessageForLeaseRecovery(String message) {
    +    // depending on the scenario few more cases can be added for validating lease recovery exception
    +    if (null != message && message.contains("Failed to APPEND_FILE")) {
    +      return true;
    +    }
    +    return false;
    +  }
    +
    +  /**
    +   * This method will make attempts to recover lease on a file using the
    +   * distributed file system utility.
    +   *
    +   * @param filePath
    +   * @return
    +   * @throws IOException
    +   */
    +  public static boolean recoverFileLease(String filePath) throws IOException {
    +    LOGGER.info("Trying to recover lease on file: " + filePath);
    +    FileFactory.FileType fileType = FileFactory.getFileType(filePath);
    +    switch (fileType) {
    +      case ALLUXIO:
    +      case HDFS:
    +      case VIEWFS:
    +        DistributedFileSystem dfs = null;
    +        Path path = FileFactory.getPath(filePath);
    +        FileSystem fs = FileFactory.getFileSystem(path);
    +        dfs = (DistributedFileSystem) fs;
    +        int maxAttempts = getLeaseRecoveryRetryCount();
    +        int retryInterval = getLeaseRecoveryRetryInterval();
    +        boolean leaseRecovered = false;
    +        IOException ioException = null;
    +        for (int retryCount = 1; retryCount <= maxAttempts; retryCount++) {
    +          try {
    +            leaseRecovered = dfs.recoverLease(path);
    --- End diff --
    
    Check the viewfs lease recovery mechanism.
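
One possible reading of this comment (an assumption; the thread does not spell it out) is that the file system returned for a viewfs path may not be a `DistributedFileSystem`, in which case the unconditional cast in the diff would fail at runtime. The sketch below illustrates a type guard before the cast. All types in it are hypothetical stand-ins, not the real Hadoop classes.

```java
public class LeaseCastGuardSketch {

  // Hypothetical stand-ins for a file system class hierarchy; not Hadoop's classes.
  public static class FileSystemStub {}

  // Stand-in for a file system that supports lease recovery.
  public static class DistributedStub extends FileSystemStub {
    public boolean recoverLease(String path) {
      return true;
    }
  }

  // Stand-in for a file system that does not expose lease recovery directly.
  public static class ViewStub extends FileSystemStub {}

  /**
   * Attempts lease recovery only when the file system supports it.
   * Returns false (instead of throwing ClassCastException) for other types.
   */
  public static boolean tryRecover(FileSystemStub fs, String path) {
    if (fs instanceof DistributedStub) {
      return ((DistributedStub) fs).recoverLease(path);
    }
    return false;
  }
}
```

Whether returning false, logging, or resolving the underlying file system is the right behavior for viewfs is a design decision the review comment leaves open.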



[GitHub] carbondata pull request #1147: [CARBONDATA-1277] Dictionary generation failu...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1147#discussion_r126419166
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/util/path/HDFSUtils.java ---
    @@ -0,0 +1,188 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.util.path;
    +
    +import java.io.FileNotFoundException;
    +import java.io.IOException;
    +
    +import org.apache.carbondata.common.logging.LogService;
    +import org.apache.carbondata.common.logging.LogServiceFactory;
    +import org.apache.carbondata.core.constants.CarbonCommonConstants;
    +import org.apache.carbondata.core.datastore.impl.FileFactory;
    +import org.apache.carbondata.core.util.CarbonProperties;
    +
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.hdfs.DistributedFileSystem;
    +import org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException;
    +
    +/**
    + * Implementation for HDFS utility methods
    + */
    +public class HDFSUtils {
    +
    +  private static final int CARBON_LEASE_RECOVERY_RETRY_COUNT_MIN = 1;
    +  private static final int CARBON_LEASE_RECOVERY_RETRY_COUNT_MAX = 5;
    --- End diff --
    
    Make the max 50.
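
For illustration, here is a minimal sketch of how a configured retry count might be validated against such min/max bounds. The quoted diff is cut off before the validation code, so the fallback-to-default behavior, the class name, and the method name are assumptions rather than the patch's implementation. The bounds 1 and 50 and the default 3 follow the constants in the diff and this review comment.

```java
public class RetryConfigSketch {

  /**
   * Parses a configured value and returns it if it lies within [min, max].
   * Falls back to defaultValue when the value is missing, not a number,
   * or out of range (assumed behavior, not confirmed by the thread).
   */
  public static int parseWithinRange(String configured, int min, int max, int defaultValue) {
    if (configured == null) {
      return defaultValue;
    }
    try {
      int value = Integer.parseInt(configured.trim());
      return (value < min || value > max) ? defaultValue : value;
    } catch (NumberFormatException e) {
      return defaultValue;
    }
  }
}
```

With the maximum raised to 50, a call such as `parseWithinRange("50", 1, 50, 3)` would accept the configured value, while `"51"` or a non-numeric string would fall back to the default of 3.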



[GitHub] carbondata issue #1147: [CARBONDATA-1277] Dictionary generation failure if t...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1147
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3001/




[GitHub] carbondata issue #1147: [CARBONDATA-1277] Dictionary generation failure if t...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1147
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2999/




[GitHub] carbondata pull request #1147: [CARBONDATA-1277] Dictionary generation failu...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1147#discussion_r126418899
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/util/path/HDFSUtils.java ---
    @@ -0,0 +1,188 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.util.path;
    +
    +import java.io.FileNotFoundException;
    +import java.io.IOException;
    +
    +import org.apache.carbondata.common.logging.LogService;
    +import org.apache.carbondata.common.logging.LogServiceFactory;
    +import org.apache.carbondata.core.constants.CarbonCommonConstants;
    +import org.apache.carbondata.core.datastore.impl.FileFactory;
    +import org.apache.carbondata.core.util.CarbonProperties;
    +
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.hdfs.DistributedFileSystem;
    +import org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException;
    +
    +/**
    + * Implementation for HDFS utility methods
    + */
    +public class HDFSUtils {
    --- End diff --
    
    Rename it to hdfsLeaseUtils.



[GitHub] carbondata issue #1147: [CARBONDATA-1277] Dictionary generation failure if t...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1147
  
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/410/




[GitHub] carbondata issue #1147: [WIP][CARBONDATA-1277] Dictionary generation failure...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1147
  
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/370/




[GitHub] carbondata issue #1147: [CARBONDATA-1277] Dictionary generation failure if t...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1147
  
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/412/




[GitHub] carbondata pull request #1147: [CARBONDATA-1277] Dictionary generation failu...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1147

