Posted to common-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/11/04 16:58:00 UTC

[jira] [Work logged] (HADOOP-17990) Failing concurrent FS.initialize commands when fs.azure.createRemoteFileSystemDuringInitialization is enabled on hadoop-azure ABFS

     [ https://issues.apache.org/jira/browse/HADOOP-17990?focusedWorklogId=676561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-676561 ]

ASF GitHub Bot logged work on HADOOP-17990:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Nov/21 16:57
            Start Date: 04/Nov/21 16:57
    Worklog Time Spent: 10m 
      Work Description: majdyz opened a new pull request #3619:
URL: https://github.com/apache/hadoop/pull/3619


   …
   
   
   ### Description of PR
   
   When fs.azure.createRemoteFileSystemDuringInitialization is enabled, the initialize method creates the container if it does not already exist. The current container-creation flow fails when several initialize calls run concurrently: only one request can create the container, and the rest fail instead of moving on. This PR fixes the issue by also catching the org.apache.hadoop.fs.FileAlreadyExistsException thrown by the createFileSystem call.
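
The race and the tolerant handling described above can be illustrated with a minimal, self-contained sketch. It does not use the real hadoop-azure classes and is not the actual patch: `ConcurrentInitSketch`, `FakeStore`, `AlreadyExistsException` and the `tolerateExisting` flag are hypothetical stand-ins, and the container name "project" is taken only from the stack trace in the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-ins only; none of these types are real hadoop-azure classes.
class ConcurrentInitSketch {

  /** Stand-in for org.apache.hadoop.fs.FileAlreadyExistsException. */
  static class AlreadyExistsException extends Exception {
    AlreadyExistsException(String message) {
      super(message);
    }
  }

  /** In-memory stand-in for the remote store: only one create per name succeeds. */
  static class FakeStore {
    private final Set<String> containers = ConcurrentHashMap.newKeySet();

    void createContainer(String name) throws AlreadyExistsException {
      // add() is atomic on this set, so exactly one concurrent caller sees true.
      if (!containers.add(name)) {
        throw new AlreadyExistsException("The specified filesystem already exists.");
      }
    }
  }

  /** Mimics an initialize() that creates the container when it is missing. */
  static void initialize(FakeStore store, String name, boolean tolerateExisting)
      throws AlreadyExistsException {
    try {
      store.createContainer(name);
    } catch (AlreadyExistsException e) {
      if (!tolerateExisting) {
        throw e; // behaviour from the bug report: the caller that loses the race fails
      }
      // Another initializer created the container first; it exists, so carry on.
    }
  }

  /** Runs `callers` concurrent initialize calls and returns how many of them failed. */
  static int countFailures(int callers, boolean tolerateExisting) {
    FakeStore store = new FakeStore();
    ExecutorService pool = Executors.newFixedThreadPool(callers);
    CountDownLatch start = new CountDownLatch(1);
    List<Future<?>> results = new ArrayList<>();
    for (int i = 0; i < callers; i++) {
      results.add(pool.submit(() -> {
        start.await(); // hold every caller until all have been submitted
        initialize(store, "project", tolerateExisting);
        return null;
      }));
    }
    start.countDown();
    int failures = 0;
    try {
      for (Future<?> result : results) {
        try {
          result.get();
        } catch (ExecutionException e) {
          failures++; // this caller's initialize threw
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return failures;
  }

  public static void main(String[] args) {
    System.out.println("failures without tolerance: " + countFailures(2, false));
    System.out.println("failures with tolerance: " + countFailures(2, true));
  }
}
```

In this sketch, two concurrent callers without the tolerant catch always produce one failure, because only one `createContainer` call can succeed; with the catch in place, every caller completes.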
   
   ### How was this patch tested?
   
   A new test is introduced in ITestAzureBlobFileSystemInitAndCreate; it failed before the fix.
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 676561)
    Remaining Estimate: 0h
            Time Spent: 10m

> Failing concurrent FS.initialize commands when fs.azure.createRemoteFileSystemDuringInitialization is enabled on hadoop-azure ABFS
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-17990
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17990
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>    Affects Versions: 3.3.1
>            Reporter: Zamil Majdy
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Bug description:*
> When {{fs.azure.createRemoteFileSystemDuringInitialization}} is enabled, the {{initialize}} method creates the container if it does not already exist. The current container-creation flow fails when several {{initialize}} calls run concurrently: only one request can create the container, and the rest fail instead of moving on. This happens because the `checkException` method does not catch the Hadoop `FileAlreadyExistsException`.
> Stacktrace:
> {{Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: Operation failed: "The specified filesystem already exists.", 409, PUT, https://<REDACTED>.dfs.core.windows.net/project?resource=filesystem, FilesystemAlreadyExists, "The specified filesystem already exists. RequestId:<REDACTED> Time:2021-10-18T13:46:05.7504906Z"}}
> {{ at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:1182)}}
> {{ at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.createFileSystem(AzureBlobFileSystem.java:1067)}}
> {{ at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:126)}}
> {{ at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)}}
> *To reproduce:*
>  * Set `fs.azure.createRemoteFileSystemDuringInitialization` to `true`
>  * Run two concurrent `initialize` calls against the root of a non-existent container/filesystem.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
For additional commands, e-mail: common-issues-help@hadoop.apache.org