Posted to hdfs-user@hadoop.apache.org by jlei liu <li...@gmail.com> on 2012/10/08 12:06:31 UTC

HDFS-347 and HDFS-2246 issues different

Both issues let the DFSClient directly open data blocks that happen to be
on the same machine.  What are the advantages of HDFS-347?

Thanks,

LiuLei

Re: HDFS-347 and HDFS-2246 issues different

Posted by Colin McCabe <cm...@alumni.cmu.edu>.
Would it make more sense to post these comments on the JIRA?
This list is more for user issues.

In general, I don't see why the DataNode being "dead" has anything to
do with whether we are reading the wrong data by reading the files
that it used to manage.  Files in HDFS are append-only, and our
consistency semantics are not POSIX.  There's nothing "wrong" about
the current or proposed semantics of local reads versus traditional
HDFS semantics. As far as I can tell.
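
To illustrate the point above, here is a minimal Python sketch (not HDFS
code) of a direct local read that checks per-chunk checksums without any
server process being involved. The names write_block and local_read, the
512-byte chunk size, and the use of zlib.crc32 are assumptions made for
this illustration only; they are not the DFSClient API.

```python
import os
import tempfile
import zlib

CHUNK = 512  # bytes covered by one checksum (illustrative value)


def write_block(path, data):
    """Hypothetical stand-in for a DataNode finalizing a block file.

    Writes the data and returns one CRC per CHUNK-sized piece.
    """
    with open(path, "wb") as f:
        f.write(data)
    return [zlib.crc32(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]


def local_read(path, crcs):
    """Read the block file directly and verify every chunk.

    Nothing here talks to the process that wrote the file, so the read
    works the same whether or not that process is still running.
    """
    with open(path, "rb") as f:
        data = f.read()
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    if len(chunks) != len(crcs):
        raise IOError("block length does not match checksum count")
    for n, (chunk, crc) in enumerate(zip(chunks, crcs)):
        if zlib.crc32(chunk) != crc:
            raise IOError("checksum mismatch in chunk %d" % n)
    return data


if __name__ == "__main__":
    block_path = os.path.join(tempfile.mkdtemp(), "blk_1")
    checksums = write_block(block_path, b"x" * 1300)
    # The "DataNode" is gone at this point; the bytes on disk are unchanged.
    print(len(local_read(block_path, checksums)))
```

In this sketch a corrupted chunk raises an error instead of returning
wrong bytes, which is the property the reader depends on, rather than on
the writer still being alive.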

cheers,
Colin


On Mon, Oct 8, 2012 at 11:52 PM, jlei liu <li...@gmail.com> wrote:
> Hi Harsh, thank you for your reply.
>
> In HDFS-2246, when the local DataNode is dead, the DFSClient can still read
> data from the local file. I think that may lead the DFSClient to read wrong
> data. HDFS-347 uses a domain socket to implement local reads; when the local
> DataNode is dead, can the DFSClient still read data from the local file?
>
> Thanks,
>
> LiuLei
>
>
>
>
> 2012/10/8 Harsh J <ha...@cloudera.com>
>>
>> As HDFS-2246 itself states in one comment:
>>
>> """
>> HDFS-347 discusses ways to optimize reads for local clients. A clean
>> design is fairly involved. A shortcut has been proposed where the
>> client access the hdfs file blocks directly; this works if the client
>> is the same user/group as the DN daemon. This is non-invasive and is a
>> good short term solution till HDFS-347 is completed.
>> """
>>
>> HDFS-347, when done, would not have the user/group limitation (you may
>> call that a security/permissions problem as well), among other things.
>> Going over the JIRA's comments should give you more insight :)
>>
>> On Mon, Oct 8, 2012 at 3:36 PM, jlei liu <li...@gmail.com> wrote:
>> > Both issues let the DFSClient directly open data blocks that happen
>> > to be on the same machine.  What are the advantages of HDFS-347?
>> >
>> > Thanks,
>> >
>> > LiuLei
>>
>>
>>
>> --
>> Harsh J
>
>

Re: HDFS-347 and HDFS-2246 issues different

Posted by jlei liu <li...@gmail.com>.
Hi Harsh, thank you for your reply.

In HDFS-2246, when the local DataNode is dead, the DFSClient can still read
data from the local file. I think that may lead the DFSClient to read wrong
data. HDFS-347 uses a domain socket to implement local reads; when the local
DataNode is dead, can the DFSClient still read data from the local file?

Thanks,

LiuLei



2012/10/8 Harsh J <ha...@cloudera.com>

> As HDFS-2246 itself states in one comment:
>
> """
> HDFS-347 discusses ways to optimize reads for local clients. A clean
> design is fairly involved. A shortcut has been proposed where the
> client access the hdfs file blocks directly; this works if the client
> is the same user/group as the DN daemon. This is non-invasive and is a
> good short term solution till HDFS-347 is completed.
> """
>
> HDFS-347, when done, would not have the user/group limitation (you may
> call that a security/permissions problem as well), among other things.
> Going over the JIRA's comments should give you more insight :)
>
> On Mon, Oct 8, 2012 at 3:36 PM, jlei liu <li...@gmail.com> wrote:
> > Both issues let the DFSClient directly open data blocks that happen
> > to be on the same machine.  What are the advantages of HDFS-347?
> >
> > Thanks,
> >
> > LiuLei
>
>
>
> --
> Harsh J
>

Re: HDFS-347 and HDFS-2246 issues different

Posted by Harsh J <ha...@cloudera.com>.
As HDFS-2246 itself states in one comment:

"""
HDFS-347 discusses ways to optimize reads for local clients. A clean
design is fairly involved. A shortcut has been proposed where the
client access the hdfs file blocks directly; this works if the client
is the same user/group as the DN daemon. This is non-invasive and is a
good short term solution till HDFS-347 is completed.
"""

HDFS-347, when done, would not have the user/group limitation (you may
call that a security/permissions problem as well), among other things.
Going over the JIRA's comments should give you more insight :)

On Mon, Oct 8, 2012 at 3:36 PM, jlei liu <li...@gmail.com> wrote:
> Both issues let the DFSClient directly open data blocks that happen to be
> on the same machine.  What are the advantages of HDFS-347?
>
> Thanks,
>
> LiuLei



-- 
Harsh J
