Posted to common-dev@hadoop.apache.org by Chris Nauroth <cn...@apache.org> on 2022/05/10 16:15:38 UTC

Re: [DISCUSS] Hadoop on Windows

+1 from me as well for HADOOP-13223 (replacement of winutils.exe with a JNI
implementation). Thank you for your work on this, Gautham.

Regarding HDFS-16466 (permissions), winutils.exe does contain an
implementation for mapping between POSIX and Windows permissions. I believe
that code has been stable for a long time. Is the intent of HDFS-16466 to
port this logic over to the new library?

Chris Nauroth


On Fri, Apr 29, 2022 at 3:11 AM Gautham Banasandra <ga...@apache.org>
wrote:

> Hi Steve,
>
> Yes, there won't be a need for winutils.exe once we provide a JNI
> implementation under a common file system interface.
>
> Thanks,
> --Gautham
>
> On Fri, 29 Apr 2022 at 00:54, Steve Loughran <st...@cloudera.com> wrote:
>
> >
> >
> > On Sun, 20 Feb 2022 at 18:42, Gautham Banasandra <ga...@apache.org>
> > wrote:
> >
> >> Hi all,
> >>
> >> I've been working on getting Hadoop to build on Windows for quite some
> >> time now. We're now at a stage where we can parallelize the effort and
> >> complete this sooner. I've outlined the parts that are remaining. Please
> >> get in touch with me if anyone wishes to join hands in realizing this goal.
> >>
> >> *Why do we need Hadoop to run on Windows?*
> >> Windows has a very large user base. The modern software alternatives to
> >> Hadoop (like Kubernetes) are cross-platform by design. We have to
> >> acknowledge that it isn't easy to get Hadoop running on Windows. The
> >> reason we haven't seen much adoption of Hadoop on Windows is probably
> >> issues like compilation, which requires workarounds at every step of the
> >> way. If we were to nail these issues, I believe it would tremendously
> >> expand the usage of Hadoop.
> >>
> >>
> >> *Phase 3: Resolving systemic issues*
> >> 1. [HADOOP-13223] winutils.exe is a bug nexus and should be killed with
> >> an axe. - ASF JIRA (apache.org)
> >> <https://issues.apache.org/jira/browse/HADOOP-13223>
> >> The Hadoop environment is modeled closer to that of Linux than Windows.
> >> Thus, we see a lot of functional gaps between running Hadoop on Linux
> >> vs. Windows, which have become a source of bugs when running Hadoop on
> >> Windows. One such issue is winutils.exe. We can aim to address issues
> >> like these in this phase. I plan to provide a JNI implementation for
> >> each platform and unify these under a common file system interface, so
> >> that we get stack traces for exceptions thrown in these layers and,
> >> mainly, so that we don't have any disparity between the platforms.
> >>
> >>
> > i for one endorse this jira.
> >
> > given a lot of it is for fs permissions, maybe whatever you do can
> > downgrade, so that running spark local on a windows laptop becomes easy.
> > those people do not need the posix permissions model
> >
> >
>

Re: [DISCUSS] Hadoop on Windows

Posted by Gautham Banasandra <ga...@apache.org>.
Hi Chris,

Yes, the JNI layer will reuse most parts of winutils. HDFS-16466 will only
provide the cross-platform equivalent of the permission flags such as
S_IXGRP and S_IROTH.
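
For illustration only (this is not code from HDFS-16466): a minimal Java
sketch of what platform-independent permission flags could look like. The
class name PosixModeBits and the helper toSymbolic are hypothetical; the
octal values are the conventional ones from POSIX <sys/stat.h>.

```java
public final class PosixModeBits {
    // Conventional POSIX permission bits (octal), as found in <sys/stat.h>.
    public static final int S_IRUSR = 0400; // owner read
    public static final int S_IWUSR = 0200; // owner write
    public static final int S_IXUSR = 0100; // owner execute
    public static final int S_IRGRP = 0040; // group read
    public static final int S_IWGRP = 0020; // group write
    public static final int S_IXGRP = 0010; // group execute
    public static final int S_IROTH = 0004; // others read
    public static final int S_IWOTH = 0002; // others write
    public static final int S_IXOTH = 0001; // others execute

    private static final int[] BITS = {
        S_IRUSR, S_IWUSR, S_IXUSR,
        S_IRGRP, S_IWGRP, S_IXGRP,
        S_IROTH, S_IWOTH, S_IXOTH
    };
    private static final char[] SYMBOLS = {
        'r', 'w', 'x', 'r', 'w', 'x', 'r', 'w', 'x'
    };

    private PosixModeBits() {
        // Constants holder; not meant to be instantiated.
    }

    // Hypothetical helper: renders the low nine mode bits as "rwxrwxrwx" style text.
    public static String toSymbolic(int mode) {
        StringBuilder sb = new StringBuilder(BITS.length);
        for (int i = 0; i < BITS.length; i++) {
            sb.append((mode & BITS[i]) != 0 ? SYMBOLS[i] : '-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int mode = S_IRUSR | S_IWUSR | S_IRGRP | S_IXGRP | S_IROTH;
        System.out.println(toSymbolic(mode)); // prints "rw-r-xr--"
    }
}
```

Defining the flags once in shared code, rather than relying on each
platform's headers, is one way the same names could be used on Windows,
where not all of these macros are available natively.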

Thanks,
--Gautham

On Tue, 10 May 2022 at 21:46, Chris Nauroth <cn...@apache.org> wrote:

> +1 from me as well for HADOOP-13223 (replacement of winutils.exe with a
> JNI implementation). Thank you for your work on this, Gautham.
>
> Regarding HDFS-16466 (permissions), winutils.exe does contain an
> implementation for mapping between POSIX and Windows permissions. I believe
> that code has been stable for a long time. Is the intent of HDFS-16466 to
> port this logic over to the new library?
>
> Chris Nauroth
>
>
> On Fri, Apr 29, 2022 at 3:11 AM Gautham Banasandra <ga...@apache.org>
> wrote:
>
>> Hi Steve,
>>
>> Yes, there won't be a need for winutils.exe once we provide a JNI
>> implementation under a common file system interface.
>>
>> Thanks,
>> --Gautham
>>
>> On Fri, 29 Apr 2022 at 00:54, Steve Loughran <st...@cloudera.com> wrote:
>>
>> >
>> >
>> > On Sun, 20 Feb 2022 at 18:42, Gautham Banasandra <ga...@apache.org>
>> > wrote:
>> >
>> >> Hi all,
>> >>
>> >> I've been working on getting Hadoop to build on Windows for quite some
>> >> time now. We're now at a stage where we can parallelize the effort and
>> >> complete this sooner. I've outlined the parts that are remaining. Please
>> >> get in touch with me if anyone wishes to join hands in realizing this
>> >> goal.
>> >>
>> >> *Why do we need Hadoop to run on Windows?*
>> >> Windows has a very large user base. The modern software alternatives to
>> >> Hadoop (like Kubernetes) are cross-platform by design. We have to
>> >> acknowledge that it isn't easy to get Hadoop running on Windows. The
>> >> reason we haven't seen much adoption of Hadoop on Windows is probably
>> >> issues like compilation, which requires workarounds at every step of the
>> >> way. If we were to nail these issues, I believe it would tremendously
>> >> expand the usage of Hadoop.
>> >>
>> >>
>> >> *Phase 3: Resolving systemic issues*
>> >> 1. [HADOOP-13223] winutils.exe is a bug nexus and should be killed
>> >> with an axe. - ASF JIRA (apache.org)
>> >> <https://issues.apache.org/jira/browse/HADOOP-13223>
>> >> The Hadoop environment is modeled closer to that of Linux than Windows.
>> >> Thus, we see a lot of functional gaps between running Hadoop on Linux
>> >> vs. Windows, which have become a source of bugs when running Hadoop on
>> >> Windows. One such issue is winutils.exe. We can aim to address issues
>> >> like these in this phase. I plan to provide a JNI implementation for
>> >> each platform and unify these under a common file system interface, so
>> >> that we get stack traces for exceptions thrown in these layers and,
>> >> mainly, so that we don't have any disparity between the platforms.
>> >>
>> >>
>> > i for one endorse this jira.
>> >
>> > given a lot of it is for fs permissions, maybe whatever you do can
>> > downgrade, so that running spark local on a windows laptop becomes easy.
>> > those people do not need the posix permissions model
>> >
>> >
>>
>