Posted to dev@pig.apache.org by "Chao Wang (JIRA)" <ji...@apache.org> on 2009/10/28 20:08:59 UTC

[jira] Created: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

[Zebra] Zebra does not support concurrent deletions of column groups now.
-------------------------------------------------------------------------

                 Key: PIG-1057
                 URL: https://issues.apache.org/jira/browse/PIG-1057
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.4.0
            Reporter: Chao Wang
            Assignee: Chao Wang
             Fix For: 0.6.0


Zebra does not currently support concurrent deletions of column groups.  As a result, the TestDropColumnGroup test case can fail intermittently.
In this test case, multiple threads are launched together, each deleting one particular column group.  The following exception can be thrown (with call stack):


/*************************************************************************************************************************/
... 
java.io.FileNotFoundException: File /.../pig-trunk/build/contrib/zebra/test/data/DropCGTest/CG02 does not exist.
  at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
  at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:290)
  at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:716)
  at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:741)
  at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:465)
  at org.apache.hadoop.zebra.io.BasicTable$SchemaFile.setCGDeletedFlags(BasicTable.java:1610)
  at org.apache.hadoop.zebra.io.BasicTable$SchemaFile.readSchemaFile(BasicTable.java:1593)
  at org.apache.hadoop.zebra.io.BasicTable$SchemaFile.<init>(BasicTable.java:1416)
  at org.apache.hadoop.zebra.io.BasicTable.dropColumnGroup(BasicTable.java:133)
  at org.apache.hadoop.zebra.io.TestDropColumnGroup$DropThread.run(TestDropColumnGroup.java:772)
...
/*************************************************************************************************************************/

We plan to fix this in Zebra so that column groups can be deleted concurrently. The root cause is that a thread or process reads stale file system information (e.g., it sees /CG0 in a listing) and then fails when it later tries to access /CG0, which another thread or process may have deleted in the meantime.  We therefore plan to add retry logic: a thread dropping a column group may retry its deletion up to n times, where n is the total number of column groups.

Note that we do NOT try to resolve the more general issue of column group deletions running concurrently with reads. If a process is reading data that another process deletes, it can fail, and this is expected.
Here we only address concurrent column group deletions: if multiple threads or processes delete column groups at the same time, they should all succeed.
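
The retry idea described above can be sketched roughly as follows. This is only an illustration, not the code in patch_1057: the class RetryingDrop, the DropAttempt interface, and the dropWithRetry method are hypothetical names, the sketch does not call any Zebra or Hadoop API, and it assumes that n counts the total number of attempts (the description does not say whether the first attempt is included).

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical sketch of the retry logic; none of these names come from Zebra.
final class RetryingDrop {

    /** Stand-in for one attempt at dropping a column group (re-reads state, then deletes). */
    interface DropAttempt {
        void run() throws IOException;
    }

    /**
     * Runs the attempt up to maxAttempts times. A FileNotFoundException is treated as a
     * sign that a previously listed directory was deleted by someone else, so the attempt
     * is simply run again. Returns the number of attempts that were needed.
     */
    static int dropWithRetry(DropAttempt attempt, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        FileNotFoundException last = null;
        for (int tries = 1; tries <= maxAttempts; tries++) {
            try {
                attempt.run();
                return tries;
            } catch (FileNotFoundException e) {
                last = e; // stale listing assumed; fall through and try again
            }
        }
        throw last; // every attempt failed; report the most recent failure
    }

    public static void main(String[] args) throws IOException {
        int totalColumnGroups = 3; // "n" in the description above
        int[] calls = {0};
        // Simulated attempt: the first call fails as if another thread had just
        // deleted a directory that this thread had already listed.
        int used = dropWithRetry(() -> {
            if (++calls[0] == 1) {
                throw new FileNotFoundException("simulated: column group directory vanished");
            }
        }, totalColumnGroups);
        System.out.println("succeeded after " + used + " attempt(s)");
    }
}
```

Only FileNotFoundException is retried in this sketch; any other IOException propagates immediately, on the assumption that it does not indicate a stale listing.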




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771679#action_12771679 ] 

Yan Zhou commented on PIG-1057:
-------------------------------

Patch reviewed. +1.

This patch will address the concern Raghu had in PIG-993.


[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

Posted by "Chao Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Wang updated PIG-1057:
---------------------------

    Attachment: patch_1057


[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

Posted by "Chao Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Wang updated PIG-1057:
---------------------------

    Attachment: patch_1057


[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

Posted by "Chao Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Wang updated PIG-1057:
---------------------------

    Status: Patch Available  (was: Open)


[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

Posted by "Chao Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Wang updated PIG-1057:
---------------------------

    Attachment:     (was: patch_1057)


[jira] Commented: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771714#action_12771714 ] 

Hadoop QA commented on PIG-1057:
--------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12423611/patch_1057
  against trunk revision 831051.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/130/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/130/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/130/console

This message is automatically generated.


[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-1057:
----------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Patch checked in.
