Posted to common-dev@hadoop.apache.org by "Brian Bockelman (JIRA)" <ji...@apache.org> on 2008/10/11 19:05:44 UTC

[jira] Created: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

fuse-dfs causes corruptions on multi-threaded access
----------------------------------------------------

                 Key: HADOOP-4397
                 URL: https://issues.apache.org/jira/browse/HADOOP-4397
             Project: Hadoop Core
          Issue Type: Bug
          Components: contrib/fuse-dfs
    Affects Versions: 0.18.1
            Reporter: Brian Bockelman
             Fix For: 0.18.2


If multiple threads in the same process perform file system reads, then fuse-dfs causes various problems due to the per-context buffer.  I've seen this reflected in segmentation violations and corruptions.

I'll attach a proposed patch which takes the "easy way" out: I surround all calls to dfs_read with a mutex.  You will obviously get performance degradation through thrashing if the threads are reading different parts of the file (but for our application, multi-threaded reads are very, very infrequent).
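
As a rough illustration of the "easy way" (this is not the attached patch), the sketch below serialises every read that goes through one shared buffer behind a single global mutex. The names shared_ctx, ctx_fill, ctx_read and CTX_BUF_SIZE are hypothetical stand-ins for the fuse-dfs context, and an in-memory array stands in for the file; only the pthread calls are real API.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a read buffer shared by all threads. */
#define CTX_BUF_SIZE 16

struct shared_ctx {
    char   buf[CTX_BUF_SIZE]; /* cached bytes of the backing data      */
    size_t buf_start;         /* offset in the data that buf[0] holds  */
    size_t buf_len;           /* number of valid bytes in buf          */
};

/* One global lock: every reader is serialised, whatever it reads. */
static pthread_mutex_t read_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Refill the shared buffer at `offset`; the caller must hold read_mutex
 * and must guarantee offset < data_len. */
static void ctx_fill(struct shared_ctx *ctx, const char *data,
                     size_t data_len, size_t offset)
{
    size_t n = data_len - offset;
    if (n > CTX_BUF_SIZE)
        n = CTX_BUF_SIZE;
    memcpy(ctx->buf, data + offset, n);
    ctx->buf_start = offset;
    ctx->buf_len = n;
}

/* Copy up to `size` bytes starting at `offset` into `out`.
 * Returns the number of bytes copied (0 at or past end of data). */
size_t ctx_read(struct shared_ctx *ctx, const char *data, size_t data_len,
                char *out, size_t size, size_t offset)
{
    size_t n = 0;

    assert(ctx != NULL && out != NULL);
    if (offset >= data_len)
        return 0;

    pthread_mutex_lock(&read_mutex);
    /* A reader at a different offset forces a refill: this is the
     * thrashing cost mentioned above. */
    if (offset < ctx->buf_start || offset >= ctx->buf_start + ctx->buf_len)
        ctx_fill(ctx, data, data_len, offset);
    n = ctx->buf_start + ctx->buf_len - offset;
    if (n > size)
        n = size;
    memcpy(out, ctx->buf + (offset - ctx->buf_start), n);
    pthread_mutex_unlock(&read_mutex);
    return n;
}
```

Without the lock, one thread could refill buf while another is still copying out of it, which is the kind of race that can show up as corrupted reads.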

If we want to have fuse-dfs writes/reads in 0.19 or 0.20, we'll probably need to do the same thing with writes.

This patch could be easily integrated as it stands, or a more elaborate approach could be taken - per-thread buffers maybe?
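
One possible shape for the per-thread buffer idea, sketched with POSIX thread-specific data. This is only an assumption about how it might look: thread_buffer, make_buf_key and THREAD_BUF_SIZE are hypothetical names, and nothing here is taken from fuse-dfs itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Arbitrary size chosen for this sketch. */
#define THREAD_BUF_SIZE 4096

static pthread_key_t  buf_key;
static pthread_once_t buf_key_once = PTHREAD_ONCE_INIT;

/* Runs exactly once per process; free() is registered as the destructor,
 * so each thread's buffer is released when that thread exits. */
static void make_buf_key(void)
{
    int rc = pthread_key_create(&buf_key, free);
    assert(rc == 0);
    (void)rc;
}

/* Return the calling thread's private buffer, allocating it on first use.
 * Returns NULL if the allocation or the key binding fails. */
char *thread_buffer(void)
{
    char *buf;

    pthread_once(&buf_key_once, make_buf_key);
    buf = pthread_getspecific(buf_key);
    if (buf == NULL) {
        buf = malloc(THREAD_BUF_SIZE);
        if (buf != NULL && pthread_setspecific(buf_key, buf) != 0) {
            free(buf);
            buf = NULL;
        }
    }
    return buf;
}
```

Because each thread only ever touches its own buffer, the read path would need no lock around the buffer itself, at the cost of one buffer per reading thread.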

Thanks as always for looking into this,

Brian

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640397#action_12640397 ] 

Pete Wyckoff commented on HADOOP-4397:
--------------------------------------

this is ready to commit.




[jira] Commented: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639156#action_12639156 ] 

Pete Wyckoff commented on HADOOP-4397:
--------------------------------------

Looking at the code, I actually think the others are safe: they all either use local variables or, in the case of getattr, the place where it sets globalFS = hdfsConnect ... is an impossible condition to hit, since globalFs is initialized in dfs_init and never set again.

The only issue is that the mutex is global, whereas the problem is specific to individual file handles, so it is more restrictive than it needs to be.  This would be a problem when many files are being read at once, but may be OK for now in practice.
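
For comparison, a less restrictive variant would keep one mutex per open file handle, so that readers of different files never wait on each other. The following is a minimal sketch under that assumption; struct file_handle, handle_open, handle_close and handle_store are hypothetical and do not reflect the actual fuse-dfs file handle.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-open-file state. */
struct file_handle {
    pthread_mutex_t lock;     /* protects buf and buf_len of this handle only */
    char            buf[64];
    size_t          buf_len;
};

/* Allocate a handle with its own mutex; returns NULL on failure. */
struct file_handle *handle_open(void)
{
    struct file_handle *fh = calloc(1, sizeof *fh);
    if (fh == NULL)
        return NULL;
    if (pthread_mutex_init(&fh->lock, NULL) != 0) {
        free(fh);
        return NULL;
    }
    return fh;
}

/* Destroy the handle; no thread may still be using it. */
void handle_close(struct file_handle *fh)
{
    if (fh == NULL)
        return;
    pthread_mutex_destroy(&fh->lock);
    free(fh);
}

/* Store up to sizeof fh->buf bytes in this handle's buffer, holding only
 * this handle's lock. Returns the number of bytes stored. */
size_t handle_store(struct file_handle *fh, const char *src, size_t len)
{
    assert(fh != NULL && src != NULL);
    if (len > sizeof fh->buf)
        len = sizeof fh->buf;
    pthread_mutex_lock(&fh->lock);
    memcpy(fh->buf, src, len);
    fh->buf_len = len;
    pthread_mutex_unlock(&fh->lock);
    return len;
}
```

Two threads sharing one handle still serialise on that handle's lock, but a thread working on a different handle proceeds independently.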

+1 on this patch with the caveat of the above problem

pete







[jira] Updated: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HADOOP-4397:
---------------------------------

    Attachment: HADOOP-4397.3.txt

sorry - this one includes the TestFuseDFS.java in the right place.




[jira] Updated: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HADOOP-4397:
---------------------------------

    Attachment: HADOOP-4397.4.txt

fixes the comments for some critical sections.



[jira] Updated: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-4397:
-------------------------------------

    Status: Patch Available  (was: Open)

I will commit it to the 0.18 and 0.19 branches as well as trunk.



[jira] Updated: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HADOOP-4397:
---------------------------------

    Release Note: must add -Dlibhdfs.noperms=1 when compiling. i.e., ant compile-contrib -Dlibhdfs=1 -Dfusedfs=1 -Dlibhdfs.noperms=1
    Hadoop Flags: [Incompatible change]



[jira] Commented: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Brian Bockelman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639041#action_12639041 ] 

Brian Bockelman commented on HADOOP-4397:
-----------------------------------------

Hey Pete,

I didn't protect all writes to the dfs context because I only had time to look at the one case. There is no reason other than that; it probably needs to be done.

Brian



[jira] Updated: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HADOOP-4397:
---------------------------------

    Affects Version/s: 0.18.0
             Assignee: Pete Wyckoff



[jira] Commented: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640393#action_12640393 ] 

Zheng Shao commented on HADOOP-4397:
------------------------------------

+1
Talked with Pete. Looks good to me.



[jira] Updated: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HADOOP-4397:
---------------------------------

    Attachment: HADOOP-4397.2.txt

I suggest we use this patch, which fixes both bugs and is the version I intend to submit to trunk. It needs to be compiled with the extra flag -Dlibhdfs.noperms=1 so that the new 0.19 permission functions in libhdfs are not used.  I actually use this version of the code with 0.17 in production.

of course the unit tests fail for chown and chmod (since they are not implemented) with this but all else passes.

and i hope it fixes all the concurrency problems including system calls that return static structs and so need to be protected.
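
To illustrate the static-struct hazard in general terms: calls such as getpwuid return a pointer to storage that may be shared between threads, so they either need a lock around them or can be replaced by their reentrant variants. The thread does not say which calls the patch protects or how; the sketch below simply assumes a user-name lookup and uses the POSIX getpwuid_r, which writes into caller-owned storage. lookup_user_name is a hypothetical helper.

```c
#include <assert.h>
#include <pwd.h>
#include <string.h>
#include <sys/types.h>

/* Copy the login name for `uid` into `name` (always NUL-terminated,
 * truncated if necessary). Returns 0 on success, -1 if the user is
 * unknown or the lookup fails. */
int lookup_user_name(uid_t uid, char *name, size_t name_len)
{
    struct passwd  pwd;
    struct passwd *result = NULL;
    char           scratch[4096]; /* caller-owned storage; size is an
                                     arbitrary choice for this sketch */

    assert(name != NULL && name_len > 0);
    if (getpwuid_r(uid, &pwd, scratch, sizeof scratch, &result) != 0 ||
        result == NULL)
        return -1;

    strncpy(name, pwd.pw_name, name_len - 1);
    name[name_len - 1] = '\0';
    return 0;
}
```

Since every caller supplies its own struct passwd and scratch buffer, concurrent lookups do not overwrite each other's results.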






[jira] Commented: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640919#action_12640919 ] 

dhruba borthakur commented on HADOOP-4397:
------------------------------------------

This looks strikingly similar to the patch in HADOOP-4399. Is this a duplicate?



[jira] Updated: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Brian Bockelman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Bockelman updated HADOOP-4397:
------------------------------------

    Attachment: hadoop-4397.patch

The attached file provides a patch which solves the problem at hand, but does not solve the "larger issue".

Feel free to use this patch directly or as food for thought for something more elaborate.



[jira] Updated: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-4397:
-------------------------------------

    Resolution: Duplicate
        Status: Resolved  (was: Patch Available)

Duplicate of HADOOP-4399



[jira] Updated: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HADOOP-4397:
---------------------------------

    Priority: Blocker  (was: Major)



[jira] Updated: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HADOOP-4397:
---------------------------------

    Attachment: hadoop-4397.out

     [exec] +1 overall.  

     [exec]     +1 @author.  The patch does not contain any @author tags.

     [exec]     +1 tests included.  The patch appears to include 21 new or modified tests.

     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.

     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.



Patch is good.  This is C code that does not affect anything else, so I am not attaching ant test.




[jira] Commented: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638927#action_12638927 ] 

Pete Wyckoff commented on HADOOP-4397:
--------------------------------------

thanks Brian.

Let's get this patch into 0.18.2 and something like this into 0.19 (if possible as this should be a blocker).

And then, right, work on something better.

One question though: it doesn't look like you protected all writes to the dfs context in other operations. Also, do you have a test case for this? I know it's nearly impossible :) .

We should probably work on the 0.19 patch with the highest priority, since 0.19 may be released soon (or maybe already has been?).

pete

