Posted to user@hadoop.apache.org by Colin McCabe <cm...@alumni.cmu.edu> on 2012/10/01 05:26:31 UTC

Re: DFSClient may read wrong data in local read

I'm going to post a patch to HDFS-347 shortly.  From the user's point
of view, the important thing about the new implementation of local
reads is that it will work with security, and will work on files that
are in the process of being written.

cheers,
Colin


On Thu, Sep 27, 2012 at 9:30 PM, jlei liu <li...@gmail.com> wrote:
> Hi Colin, thanks for your reply.
>
> Where can I see the new design for BlockReaderLocal class?
>
> Thanks,
>
> LiuLei
>
>
>
> 2012/9/28 Colin McCabe <cm...@alumni.cmu.edu>
>>
>> We don't make very strong guarantees about what happens when clients
>> read from a deleted file.  DFSClients definitely may read data from a
>> deleted file even if local reads are not enabled.
>>
>> Incidentally, BlockReaderLocal is being redesigned to pass file
>> descriptors rather than paths, which will be more secure and fix some
>> corner cases surrounding append and local reads.
>>
>> cheers,
>> Colin
>>
>>
>> On Wed, Sep 26, 2012 at 11:19 PM, jlei liu <li...@gmail.com> wrote:
>> > In local read, the BlockReaderLocal class uses the "static Map<Integer,
>> > LocalDatanodeInfo> localDatanodeInfoMap" property to store the local block
>> > file path and local meta file path. When I stop the HDFS cluster, or kill
>> > the local DataNode, and delete a file with the "./hadoop dfs -rm path"
>> > command, the DFSClient can still read the data from the local file. I
>> > think that may lead the DFSClient to read wrong data.
>> >
>> > I think we should fix the problem.
>> >
>> >
>> > Thanks,
>> >
>> > LiuLei
>
>
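
To illustrate the point about passing file descriptors rather than paths:
on POSIX systems, a descriptor that is already open keeps working after the
path is deleted, which is one reason a client can still read data from a
deleted file. The following is a minimal Java sketch of that operating-system
behavior only, not HDFS code. The class name OpenDescriptorDemo, the method
readAfterDelete, and the temporary file standing in for a block file are all
hypothetical, and the sketch assumes a POSIX file system (on Windows the
delete would likely fail while the stream is open).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class OpenDescriptorDemo {

    // Writes `contents` to a temporary file (a hypothetical stand-in for a
    // block file), opens it, deletes the path, and then reads the data back
    // through the stream that was opened before the delete.
    static String readAfterDelete(String contents) {
        try {
            Path blockFile = Files.createTempFile("blk_demo", ".data");
            Files.write(blockFile, contents.getBytes(StandardCharsets.UTF_8));

            try (InputStream in = Files.newInputStream(blockFile)) {
                // Remove the path. The open descriptor still refers to the data.
                Files.delete(blockFile);

                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterDelete("data read after delete"));
    }
}
```

A reader that holds only a path, by contrast, has to reopen the file for each
access, so it depends on whatever currently lives at that path.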

Re: DFSClient may read wrong data in local read

Posted by jlei liu <li...@gmail.com>.
Hi Colin,

We can use a Unix domain socket to implement local reads.  After the local
DataNode dies, a client reading data from the local DataNode will get an
exception, so I think using a Unix domain socket to implement local reads is
more reliable.
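
The failure mode described here can be sketched in Java as follows. This is
only an illustration under stated assumptions, not the HDFS-347 design: it
assumes Java 16 or later (for java.net.UnixDomainSocketAddress) and a
platform that supports Unix domain sockets, and the class name
DeadDataNodeSocketDemo, the method canConnect, and the socket path are
hypothetical. It shows that connecting to a socket path with no listening
process fails with an IOException instead of silently succeeding.

```java
import java.io.IOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;

public class DeadDataNodeSocketDemo {

    // Returns true if a process is accepting connections on the Unix domain
    // socket at `socketPath`, and false if the connect attempt fails.
    static boolean canConnect(Path socketPath) {
        try (SocketChannel channel = SocketChannel.open(StandardProtocolFamily.UNIX)) {
            channel.connect(UnixDomainSocketAddress.of(socketPath));
            return true;
        } catch (IOException e) {
            // No listener (for example, the server process is gone).
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical path; nothing is expected to be listening here.
        Path socketPath = Path.of(System.getProperty("java.io.tmpdir"), "dn_demo.sock");
        System.out.println("connected: " + canConnect(socketPath));
    }
}
```
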
2012/10/1 lei liu <li...@gmail.com>

> Hi Colin, Thanks for your reply.
>
> What does it mean that the patch will work on files that are in the
> process of being written?
>
> Thanks,
> LiuLei
>
>
> 2012/10/1 Colin McCabe <cm...@alumni.cmu.edu>
>
>> I'm going to post a patch to HDFS-347 shortly.  From the user's point
>> of view, the important thing about the new implementation of local
>> reads is that it will work with security, and will work on files that
>> are in the process of being written.
>>
>> cheers,
>> Colin
>>
>>
>> On Thu, Sep 27, 2012 at 9:30 PM, jlei liu <li...@gmail.com> wrote:
>> > Hi Colin, thanks for your reply.
>> >
>> > Where can I see the new design for BlockReaderLocal class?
>> >
>> > Thanks,
>> >
>> > LiuLei
>> >
>> >
>> >
>> > 2012/9/28 Colin McCabe <cm...@alumni.cmu.edu>
>> >>
>> >> We don't make very strong guarantees about what happens when clients
>> >> read from a deleted file.  DFSClients definitely may read data from a
>> >> deleted file even if local reads are not enabled.
>> >>
>> >> Incidentally, BlockReaderLocal is being redesigned to pass file
>> >> descriptors rather than paths, which will be more secure and fix some
>> >> corner cases surrounding append and local reads.
>> >>
>> >> cheers,
>> >> Colin
>> >>
>> >>
>> >> On Wed, Sep 26, 2012 at 11:19 PM, jlei liu <li...@gmail.com> wrote:
>> >> > In local read, the BlockReaderLocal class uses the "static Map<Integer,
>> >> > LocalDatanodeInfo> localDatanodeInfoMap" property to store the local
>> >> > block file path and local meta file path. When I stop the HDFS cluster,
>> >> > or kill the local DataNode, and delete a file with the
>> >> > "./hadoop dfs -rm path" command, the DFSClient can still read the data
>> >> > from the local file. I think that may lead the DFSClient to read wrong
>> >> > data.
>> >> >
>> >> > I think we should fix the problem.
>> >> >
>> >> >
>> >> > Thanks,
>> >> >
>> >> > LiuLei
>> >
>> >
>>
>
>

Re: DFSClient may read wrong data in local read

Posted by jlei liu <li...@gmail.com>.
Hi Colin, Thanks for your reply.

What does it mean that the patch will work on files that are in the
process of being written?

Thanks,
LiuLei


2012/10/1 Colin McCabe <cm...@alumni.cmu.edu>

> I'm going to post a patch to HDFS-347 shortly.  From the user's point
> of view, the important thing about the new implementation of local
> reads is that it will work with security, and will work on files that
> are in the process of being written.
>
> cheers,
> Colin
>
>
> On Thu, Sep 27, 2012 at 9:30 PM, jlei liu <li...@gmail.com> wrote:
> > Hi Colin, thanks for your reply.
> >
> > Where can I see the new design for BlockReaderLocal class?
> >
> > Thanks,
> >
> > LiuLei
> >
> >
> >
> > 2012/9/28 Colin McCabe <cm...@alumni.cmu.edu>
> >>
> >> We don't make very strong guarantees about what happens when clients
> >> read from a deleted file.  DFSClients definitely may read data from a
> >> deleted file even if local reads are not enabled.
> >>
> >> Incidentally, BlockReaderLocal is being redesigned to pass file
> >> descriptors rather than paths, which will be more secure and fix some
> >> corner cases surrounding append and local reads.
> >>
> >> cheers,
> >> Colin
> >>
> >>
> >> On Wed, Sep 26, 2012 at 11:19 PM, jlei liu <li...@gmail.com> wrote:
> >> > In local read, the BlockReaderLocal class uses the "static Map<Integer,
> >> > LocalDatanodeInfo> localDatanodeInfoMap" property to store the local
> >> > block file path and local meta file path. When I stop the HDFS cluster,
> >> > or kill the local DataNode, and delete a file with the
> >> > "./hadoop dfs -rm path" command, the DFSClient can still read the data
> >> > from the local file. I think that may lead the DFSClient to read wrong
> >> > data.
> >> >
> >> > I think we should fix the problem.
> >> >
> >> >
> >> > Thanks,
> >> >
> >> > LiuLei
> >
> >
>
