Posted to common-dev@hadoop.apache.org by Wangda Tan <wh...@gmail.com> on 2019/08/13 08:36:52 UTC

Re: Any thoughts making Submarine a separate Apache project?

Hi folks,

I just drafted a proposal that I intend to send to the PMC list and the board
for their thoughts. Thanks to Xun Liu for providing thoughts about future
directions/architecture, and to Keqiu Hu for reviews.

Title: "Apache Submarine for Apache Top-Level Project"

https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit

I plan to send it to the PMC list/board next Monday, so any
comments/suggestions are welcome.

Thanks,
Wangda


On Tue, Jul 30, 2019 at 6:01 PM 俊平堵 <du...@gmail.com> wrote:

> Thanks Vinod for these great suggestions. I agree with most of your comments
> above.
>  "For the Apache Hadoop community, this will be treated simply as
> code-change and so need a committer +1?". IIUC, this should be treated as a
> feature branch merge, so maybe 3 committer +1s are needed here according to
> https://hadoop.apache.org/bylaws.html?
>
> bq. Can somebody who have cycles and been on the ASF lists for a while
> look into the process here?
> I can check with ASF members who have experience with this if no one has
> yet.
>
> Thanks,
>
> Junping
>
> On Mon, Jul 29, 2019 at 9:46 PM Vinod Kumar Vavilapalli <vi...@apache.org> wrote:
>
>> Looks like there's a meaningful push behind this.
>>
>> Given the desire is to fork off Apache Hadoop, you'd want to make sure
>> this enthusiasm turns into building a real, independent and, more
>> importantly, sustainable community.
>>
>> Given that there were two official releases off the Apache Hadoop
>> project, I doubt you'd need to go through the incubator process. Instead
>> you can directly propose a new TLP to the ASF board. The most recent time
>> this happened was with ORC, and long before that with Hive, HBase etc. Can
>> somebody who has cycles and has been on the ASF lists for a while look into
>> the process here?
>>
>> For the Apache Hadoop community, this will be treated simply as a
>> code change and so needs a committer +1? You can be more gentle by formally
>> holding a vote once a process doc is written down.
>>
>> Back to the sustainable community point, as part of drafting this
>> proposal, you'd definitely want to make sure all of the Apache Hadoop
>> PMC/Committers can exercise their will to join this new project as
>> PMC/Committers respectively without any additional constraints.
>>
>> Thanks
>> +Vinod
>>
>> > On Jul 25, 2019, at 1:31 PM, Wangda Tan <wh...@gmail.com> wrote:
>> >
>> > Thanks everybody for sharing your thoughts. I saw positive feedback from
>> > 20+ contributors!
>> >
>> > So I think we should move it forward. Any suggestions about what we
>> > should do?
>> >
>> > Best,
>> > Wangda
>> >
>> > On Mon, Jul 22, 2019 at 5:36 PM neo <ne...@pingcap.com> wrote:
>> >
>> >> +1. This is neo from the TiDB & TiKV community.
>> >> Thanks Xun for bringing this up.
>> >>
>> >> In TiKV, our open source distributed KV storage system and a CNCF
>> >> project, Hadoop Submarine's machine learning engine helps us optimize
>> >> data storage and solve some problems with data hotspots and data
>> >> shuffles.
>> >>
>> >> We are also ready to use the Hadoop Submarine machine learning engine to
>> >> improve the performance of TiDB, our open source distributed relational
>> >> database.
>> >>
>> >> I think if Submarine can be independent, it will develop faster and
>> >> better.
>> >> Thanks to the hadoop community for developing submarine!
>> >>
>> >> Best Regards,
>> >> neo
>> >> www.pingcap.com / https://github.com/pingcap/tidb /
>> >> https://github.com/tikv
>> >>
>> >> On Mon, Jul 22, 2019 at 4:07 PM Xun Liu <li...@apache.org> wrote:
>> >>
>> >>> @adam.antal
>> >>>
>> >>> The Submarine development team has completed the following preparations:
>> >>> 1. Established a temporary test repository on GitHub.
>> >>> 2. Changed the package name of Hadoop Submarine from org.hadoop.submarine
>> >>> to org.submarine.
>> >>> 3. Combined the LinkedIn/TonY code into the Hadoop Submarine module.
>> >>> 4. Connected the GitHub repository to Travis CI, where all test cases
>> >>> have been run.
>> >>> 5. Several Hadoop Submarine users completed system tests using the code
>> >>> in this repository.
>> >>>
>> >>> On Mon, Jul 22, 2019 at 9:38 AM 赵欣 <xi...@seu.edu.cn> wrote:
>> >>>
>> >>>> Hi
>> >>>>
>> >>>> I am a teacher at Southeast University (https://www.seu.edu.cn/). Our
>> >>>> major is electrical engineering. Our teaching teams and students use
>> >>>> Hadoop Submarine for big data analysis and automation control of
>> >>>> electrical equipment.
>> >>>>
>> >>>> Many thanks to the Hadoop community for providing us with machine
>> >>>> learning tools like Submarine.
>> >>>>
>> >>>> I hope Hadoop Submarine keeps getting better and better.
>> >>>>
>> >>>>
>> >>>> ==============================
>> >>>> 赵欣
>> >>>> 东南大学电气工程学院
>> >>>>
>> >>>> -----------------------------------------------------
>> >>>>
>> >>>> Zhao XIN
>> >>>>
>> >>>> School of Electrical Engineering
>> >>>>
>> >>>> ==============================
>> >>>> 2019-07-18
>> >>>>
>> >>>>
>> >>>> *From:* Xun Liu <li...@apache.org>
>> >>>> *Date:* 2019-07-18 09:46
>> >>>> *To:* xinzhao <xi...@seu.edu.cn>
>> >>>> *Subject:* Fwd: Re: Any thoughts making Submarine a separate Apache
>> >>>> project?
>> >>>>
>> >>>>
>> >>>> ---------- Forwarded message ---------
>> >>>> From: dashuiguailuyun@gmail.com <da...@gmail.com>
>> >>>> Date: Wed, Jul 17, 2019 at 3:17 PM
>> >>>> Subject: Re: Re: Any thoughts making Submarine a separate Apache
>> >>>> project?
>> >>>> To: Szilard Nemeth <sn...@cloudera.com.invalid>, runlin zhang <
>> >>>> runlin512@gmail.com>
>> >>>> Cc: Xun Liu <li...@apache.org>, common-dev <
>> >>>> common-dev@hadoop.apache.org>,
>> >>>> yarn-dev <ya...@hadoop.apache.org>, hdfs-dev <
>> >>>> hdfs-dev@hadoop.apache.org>, mapreduce-dev <
>> >>>> mapreduce-dev@hadoop.apache.org>, submarine-dev <
>> >>>> submarine-dev@hadoop.apache.org>
>> >>>>
>> >>>>
>> >>>> +1, good idea. We are very much looking forward to it.
>> >>>>
>> >>>> ------------------------------
>> >>>> dashuiguailuyun@gmail.com
>> >>>>
>> >>>>
>> >>>> *From:* Szilard Nemeth <sn...@cloudera.com.INVALID>
>> >>>> *Date:* 2019-07-17 14:55
>> >>>> *To:* runlin zhang <ru...@gmail.com>
>> >>>> *CC:* Xun Liu <li...@apache.org>; Hadoop Common
>> >>>> <co...@hadoop.apache.org>; yarn-dev <yarn-dev@hadoop.apache.org>;
>> >>>> Hdfs-dev <hd...@hadoop.apache.org>; mapreduce-dev
>> >>>> <ma...@hadoop.apache.org>; submarine-dev
>> >>>> <su...@hadoop.apache.org>
>> >>>> *Subject:* Re: Any thoughts making Submarine a separate Apache project?
>> >>>> +1, this is a great idea.
>> >>>> As the Hadoop repository has already grown huge and contains many
>> >>>> projects, I think in general it's a good idea to separate projects in
>> >>>> the early phase.
>> >>>>
>> >>>>
>> >>>> On Wed, Jul 17, 2019, 08:50 runlin zhang <ru...@gmail.com> wrote:
>> >>>>
>> >>>>> +1, that will be great!
>> >>>>>
>> >>>>>> On Jul 10, 2019, at 3:34 PM, Xun Liu <li...@apache.org> wrote:
>> >>>>>>
>> >>>>>> Hi all,
>> >>>>>>
>> >>>>>> This is Xun Liu, a contributor to the Submarine project, which runs
>> >>>>>> deep learning workloads together with big data workloads on Hadoop
>> >>>>>> clusters.
>> >>>>>>
>> >>>>>> A number of integrations of Submarine with other projects, such as
>> >>>>>> Apache Zeppelin, TonY, and Azkaban, are finished or in progress. The
>> >>>>>> next step for Submarine is to integrate with more projects like
>> >>>>>> Apache Arrow, Redis, MLflow, etc., to handle end-to-end machine
>> >>>>>> learning use cases like model serving, notebook management, and
>> >>>>>> advanced training optimizations (like auto parameter tuning, memory
>> >>>>>> cache optimizations for large training datasets, etc.), and to run
>> >>>>>> on other platforms like Kubernetes or natively on the cloud. LinkedIn
>> >>>>>> also wants to donate the TonY project to Apache so we can put
>> >>>>>> Submarine and TonY together in the same codebase (page #30 of
>> >>>>>> https://www.slideshare.net/xkrogen/hadoop-meetup-jan-2019-tony-tensorflow-on-yarn-and-beyond#30
>> >>>>>> ).
>> >>>>>>
>> >>>>>> This expands the scope of the original Submarine project in exciting
>> >>>>>> new ways. Toward that end, would it make sense to create a separate
>> >>>>>> Submarine project at Apache? This could speed up adoption of
>> >>>>>> Submarine and allow it to grow into a full-blown machine learning
>> >>>>>> platform.
>> >>>>>>
>> >>>>>> There will be lots of technical details to work out, but any initial
>> >>>>>> thoughts on this?
>> >>>>>>
>> >>>>>> Best Regards,
>> >>>>>> Xun Liu

Re: Any thoughts making Submarine a separate Apache project?

Posted by Wangda Tan <wh...@gmail.com>.

Hi all,

We received comments and suggestions from contributors, committers and PMC
members regarding the proposal:
https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit?ts=5d529ec0

@Vinod Kumar Vavilapalli <vi...@apache.org> could you provide suggestions
regarding what we should do next? Could you help to send this to the ASF
board?

Thanks,
Wangda Tan


Re: Any thoughts making Submarine a separate Apache project?

Posted by Wangda Tan <wh...@gmail.com>.

Hi all,

We received comments and suggestions from contributors, committers and PMC
members regarding the proposal:
https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit?ts=5d529ec0

@Vinod Kumar Vavilapalli <vi...@apache.org> could you provide suggestions
regarding what we should do next? Could you help to send this to the ASF
board?

Thanks,
Wangda Tan

On Tue, Aug 13, 2019 at 4:36 PM Wangda Tan <wh...@gmail.com> wrote:

> Hi folks,
>
> I just drafted a proposal which is targetted to send to PMC list and board
> for thoughts. Thanks Xun Liu for providing thoughts about future
> directions/architecture, and reviews from Keqiu Hu.
>
> Title: "Apache Submarine for Apache Top-Level Project"
>
>
> https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit
>
> I plan to send it to PMC list/board next Monday, so any
> comments/suggestions are welcome.
>
> Thanks,
> Wangda
>
>
> On Tue, Jul 30, 2019 at 6:01 PM 俊平堵 <du...@gmail.com> wrote:
>
>> Thanks Vinod for these great suggestions. I agree most of your comments
>> above.
>>  "For the Apache Hadoop community, this will be treated simply as
>> code-change and so need a committer +1?". IIUC, this should be treated as
>> feature branch merge, so may be 3 committer +1 is needed here according to
>> https://hadoop.apache.org/bylaws.html?
>>
>> bq. Can somebody who have cycles and been on the ASF lists for a while
>> look into the process here?
>> I can check with ASF members who has experience on this if no one haven't
>> yet.
>>
>> Thanks,
>>
>> Junping
>>
>> Vinod Kumar Vavilapalli <vi...@apache.org> 于2019年7月29日周一 下午9:46写道：
>>
>>> Looks like there's a meaningful push behind this.
>>>
>>> Given the desire is to fork off Apache Hadoop, you'd want to make sure
>>> this enthusiasm turns into building a real, independent but more
>>> importantly a sustainable community.
>>>
>>> Given that there were two official releases off the Apache Hadoop
>>> project, I doubt if you'd need to go through the incubator process. Instead
>>> you can directly propose a new TLP at ASF board. The last few times this
>>> happened was with ORC, and long before that with Hive, HBase etc. Can
>>> somebody who have cycles and been on the ASF lists for a while look into
>>> the process here?
>>>
>>> For the Apache Hadoop community, this will be treated simply as
>>> code-change and so need a committer +1? You can be more gently by formally
>>> doing a vote once a process doc is written down.
>>>
>>> Back to the sustainable community point, as part of drafting this
>>> proposal, you'd definitely want to make sure all of the Apache Hadoop
>>> PMC/Committers can exercise their will to join this new project as
>>> PMC/Committers respectively without any additional constraints.
>>>
>>> Thanks
>>> +Vinod
>>>
>>> > On Jul 25, 2019, at 1:31 PM, Wangda Tan <wh...@gmail.com> wrote:
>>> >
>>> > Thanks everybody for sharing your thoughts. I saw positive feedbacks
>>> from
>>> > 20+ contributors!
>>> >
>>> > So I think we should move it forward, any suggestions about what we
>>> should
>>> > do?
>>> >
>>> > Best,
>>> > Wangda
>>> >
>>> > On Mon, Jul 22, 2019 at 5:36 PM neo <ne...@pingcap.com> wrote:
>>> >
>>> >> +1, This is neo from TiDB & TiKV community.
>>> >> Thanks Xun for bring this up.
>>> >>
>>> >> Our CNCF project's open source distributed KV storage system TiKV,
>>> >> Hadoop submarine's machine learning engine helps us to optimize data
>>> >> storage,
>>> >> helping us solve some problems in data hotspots and data shuffers.
>>> >>
>>> >> We are ready to improve the performance of TiDB in our open source
>>> >> distributed relational database TiDB and also using the hadoop
>>> submarine
>>> >> machine learning engine.
>>> >>
>>> >> I think if submarine can be independent, it will develop faster and
>>> better.
>>> >> Thanks to the hadoop community for developing submarine!
>>> >>
>>> >> Best Regards,
>>> >> neo
>>> >> www.pingcap.com / https://github.com/pingcap/tidb /
>>> >> https://github.com/tikv
>>> >>
>>> >> Xun Liu <li...@apache.org> 于2019年7月22日周一 下午4:07写道：
>>> >>
>>> >>> @adam.antal
>>> >>>
>>> >>> The submarine development team has completed the following
>>> preparations:
>>> >>> 1. Established a temporary test repository on Github.
>>> >>> 2. Change the package name of hadoop submarine from
>>> org.hadoop.submarine
>>> >> to
>>> >>> org.submarine
>>> >>> 3. Combine the Linkedin/TonY code into the Hadoop submarine module;
>>> >>> 4. On the Github docked travis-ci system, all test cases have been
>>> >> tested;
>>> >>> 5. Several Hadoop submarine users completed the system test using the
>>> >> code
>>> >>> in this repository.
>>> >>>
>>> >>> 赵欣 <xi...@seu.edu.cn> 于2019年7月22日周一 上午9:38写道：
>>> >>>
>>> >>>> Hi
>>> >>>>
>>> >>>> I am a teacher at Southeast University (https://www.seu.edu.cn/).
>>> We
>>> >> are
>>> >>>> a major in electrical engineering. Our teaching teams and students
>>> use
>>> >>>> bigoop submarine for big data analysis and automation control of
>>> >>> electrical
>>> >>>> equipment.
>>> >>>>
>>> >>>> Many thanks to the hadoop community for providing us with machine
>>> >>> learning
>>> >>>> tools like submarine.
>>> >>>>
>>> >>>> I wish hadoop submarine is getting better and better.
>>> >>>>
>>> >>>>
>>> >>>> ==============================
>>> >>>> 赵欣
>>> >>>> 东南大学电气工程学院
>>> >>>>
>>> >>>> -----------------------------------------------------
>>> >>>>
>>> >>>> Zhao XIN
>>> >>>>
>>> >>>> School of Electrical Engineering
>>> >>>>
>>> >>>> ==============================
>>> >>>> 2019-07-18
>>> >>>>
>>> >>>>
>>> >>>> *From:* Xun Liu <li...@apache.org>
>>> >>>> *Date:* 2019-07-18 09:46
>>> >>>> *To:* xinzhao <xi...@seu.edu.cn>
>>> >>>> *Subject:* Fwd: Re: Any thoughts making Submarine a separate Apache
>>> >>>> project?
>>> >>>>
>>> >>>>
>>> >>>> ---------- Forwarded message ---------
>>> >>>> 发件人： dashuiguailuyun@gmail.com <da...@gmail.com>
>>> >>>> Date: 2019年7月17日周三 下午3:17
>>> >>>> Subject: Re: Re: Any thoughts making Submarine a separate Apache
>>> >> project?
>>> >>>> To: Szilard Nemeth <sn...@cloudera.com.invalid>, runlin zhang <
>>> >>>> runlin512@gmail.com>
>>> >>>> Cc: Xun Liu <li...@apache.org>, common-dev <
>>> >>> common-dev@hadoop.apache.org>,
>>> >>>> yarn-dev <ya...@hadoop.apache.org>, hdfs-dev <
>>> >>>> hdfs-dev@hadoop.apache.org>, mapreduce-dev <
>>> >>>> mapreduce-dev@hadoop.apache.org>, submarine-dev <
>>> >>>> submarine-dev@hadoop.apache.org>
>>> >>>>
>>> >>>>
>>> >>>> +1 ，Good idea, we are very much looking forward to it.
>>> >>>>
>>> >>>> ------------------------------
>>> >>>> dashuiguailuyun@gmail.com
>>> >>>>
>>> >>>>
>>> >>>> *From:* Szilard Nemeth <sn...@cloudera.com.INVALID>
>>> >>>> *Date:* 2019-07-17 14:55
>>> >>>> *To:* runlin zhang <ru...@gmail.com>
>>> >>>> *CC:* Xun Liu <li...@apache.org>; Hadoop Common
>>> >>>> <co...@hadoop.apache.org>; yarn-dev <
>>> yarn-dev@hadoop.apache.org>;
>>> >>>> Hdfs-dev <hd...@hadoop.apache.org>; mapreduce-dev
>>> >>>> <ma...@hadoop.apache.org>; submarine-dev
>>> >>>> <su...@hadoop.apache.org>
>>> >>>> *Subject:* Re: Any thoughts making Submarine a separate Apache
>>> project?
>>> >>>> +1, this is a very great idea.
>>> >>>> As Hadoop repository has already grown huge and contains many
>>> >> projects, I
>>> >>>> think in general it's a good idea to separate projects in the early
>>> >>> phase.
>>> >>>>
>>> >>>>
>>> >>>> On Wed, Jul 17, 2019, 08:50 runlin zhang <ru...@gmail.com>
>>> wrote:
>>> >>>>
>>> >>>>> +1 ，That will be great ！
>>> >>>>>
>>> >>>>>> 在 2019年7月10日，下午3:34，Xun Liu <li...@apache.org> 写道：
>>> >>>>>>
>>> >>>>>> Hi all,
>>> >>>>>>
>>> >>>>>> This is Xun Liu contributing to the Submarine project for deep
>>> >>> learning
>>> >>>>>> workloads running with big data workloads together on Hadoop
>>> >>> clusters.
>>> >>>>>>
>>> >>>>>> There are a bunch of integrations of Submarine to other projects
>>> >> are
>>> >>>>>> finished or going on, such as Apache Zeppelin, TonY, Azkaban. The
>>> >>> next
>>> >>>>> step
>>> >>>>>> of Submarine is going to integrate with more projects like Apache
>>> >>>> Arrow,
>>> >>>>>> Redis, MLflow, etc. & be able to handle end-to-end machine
>>> learning
>>> >>> use
>>> >>>>>> cases like model serving, notebook management, advanced training
>>> >>>>>> optimizations (like auto parameter tuning, memory cache
>>> >> optimizations
>>> >>>> for
>>> >>>>>> large datasets for training, etc.), and make it run on other
>>> >>> platforms
>>> >>>>> like
>>> >>>>>> Kubernetes or natively on Cloud. LinkedIn also wants to donate
>>> TonY
>>> >>>>> project
>>> >>>>>> to Apache so we can put Submarine and TonY together to the same
>>> >>>> codebase
>>> >>>>>> (Page #30.
>>> >>>>>>
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>
>>> https://www.slideshare.net/xkrogen/hadoop-meetup-jan-2019-tony-tensorflow-on-yarn-and-beyond#30
>>> >>>>>> ).
>>> >>>>>>
>>> >>>>>> This expands the scope of the original Submarine project in
>>> >> exciting
>>> >>>> new
>>> >>>>>> ways. Toward that end, would it make sense to create a separate
>>> >>>> Submarine
>>> >>>>>> project at Apache? This can make faster adoption of Submarine, and
>>> >>>> allow
>>> >>>>>> Submarine to grow to a full-blown machine learning platform.
>>> >>>>>>
>>> >>>>>> There will be lots of technical details to work out, but any
>>> >> initial
>>> >>>>>> thoughts on this?
>>> >>>>>>
>>> >>>>>> Best Regards,
>>> >>>>>> Xun Liu
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> ---------------------------------------------------------------------
>>> >>>>> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>> >>>>> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>>>
>>>

Re: Any thoughts making Submarine a separate Apache project?

Posted by Wangda Tan <wh...@gmail.com>.

Hi all,

We received comments and suggestions from contributors, committers and PMC
members regarding the proposal:
https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit?ts=5d529ec0

@Vinod Kumar Vavilapalli <vi...@apache.org> could you provide suggestions
regarding what we should do next? Could you help to send this to the ASF
board?

Thanks,
Wangda Tan

On Tue, Aug 13, 2019 at 4:36 PM Wangda Tan <wh...@gmail.com> wrote:

> Hi folks,
>
> I just drafted a proposal which is targetted to send to PMC list and board
> for thoughts. Thanks Xun Liu for providing thoughts about future
> directions/architecture, and reviews from Keqiu Hu.
>
> Title: "Apache Submarine for Apache Top-Level Project"
>
>
> https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit
>
> I plan to send it to PMC list/board next Monday, so any
> comments/suggestions are welcome.
>
> Thanks,
> Wangda
>
>
> On Tue, Jul 30, 2019 at 6:01 PM 俊平堵 <du...@gmail.com> wrote:
>
>> Thanks Vinod for these great suggestions. I agree most of your comments
>> above.
>>  "For the Apache Hadoop community, this will be treated simply as
>> code-change and so need a committer +1?". IIUC, this should be treated as
>> feature branch merge, so may be 3 committer +1 is needed here according to
>> https://hadoop.apache.org/bylaws.html?
>>
>> bq. Can somebody who have cycles and been on the ASF lists for a while
>> look into the process here?
>> I can check with ASF members who has experience on this if no one haven't
>> yet.
>>
>> Thanks,
>>
>> Junping
>>
>> Vinod Kumar Vavilapalli <vi...@apache.org> 于2019年7月29日周一 下午9:46写道：
>>
>>> Looks like there's a meaningful push behind this.
>>>
>>> Given the desire is to fork off Apache Hadoop, you'd want to make sure
>>> this enthusiasm turns into building a real, independent but more
>>> importantly a sustainable community.
>>>
>>> Given that there were two official releases off the Apache Hadoop
>>> project, I doubt if you'd need to go through the incubator process. Instead
>>> you can directly propose a new TLP at ASF board. The last few times this
>>> happened was with ORC, and long before that with Hive, HBase etc. Can
>>> somebody who have cycles and been on the ASF lists for a while look into
>>> the process here?
>>>
>>> For the Apache Hadoop community, this will be treated simply as a
>>> code change and so needs a committer +1? You can be more gentle by formally
>>> holding a vote once a process doc is written down.
>>>
>>> Back to the sustainable community point, as part of drafting this
>>> proposal, you'd definitely want to make sure all of the Apache Hadoop
>>> PMC/Committers can exercise their will to join this new project as
>>> PMC/Committers respectively without any additional constraints.
>>>
>>> Thanks
>>> +Vinod
>>>
>>> > On Jul 25, 2019, at 1:31 PM, Wangda Tan <wh...@gmail.com> wrote:
>>> >
>>> > Thanks everybody for sharing your thoughts. I saw positive feedback from
>>> > 20+ contributors!
>>> >
>>> > So I think we should move it forward. Any suggestions about what we should
>>> > do?
>>> >
>>> > Best,
>>> > Wangda
>>> >
>>> > On Mon, Jul 22, 2019 at 5:36 PM neo <ne...@pingcap.com> wrote:
>>> >
>>> >> +1. This is neo from the TiDB & TiKV community.
>>> >> Thanks Xun for bringing this up.
>>> >>
>>> >> For TiKV, our open source distributed KV storage system and a CNCF
>>> >> project, Hadoop Submarine's machine learning engine helps us optimize data
>>> >> storage and solve some problems with data hotspots and data shuffles.
>>> >>
>>> >> We are also preparing to use the Hadoop Submarine machine learning engine
>>> >> to improve the performance of TiDB, our open source distributed
>>> >> relational database.
>>> >>
>>> >> I think if Submarine can be independent, it will develop faster and
>>> >> better.
>>> >> Thanks to the Hadoop community for developing Submarine!
>>> >>
>>> >> Best Regards,
>>> >> neo
>>> >> www.pingcap.com / https://github.com/pingcap/tidb /
>>> >> https://github.com/tikv
>>> >>
>>> >> On Mon, Jul 22, 2019 at 4:07 PM Xun Liu <li...@apache.org> wrote:
>>> >>
>>> >>> @adam.antal
>>> >>>
>>> >>> The Submarine development team has completed the following preparations:
>>> >>> 1. Established a temporary test repository on GitHub.
>>> >>> 2. Changed the package name of Hadoop Submarine from org.hadoop.submarine
>>> >>>    to org.submarine.
>>> >>> 3. Combined the LinkedIn/TonY code into the Hadoop Submarine module.
>>> >>> 4. Connected the GitHub repository to the travis-ci system; all test cases
>>> >>>    have been run.
>>> >>> 5. Several Hadoop Submarine users completed system tests using the code
>>> >>>    in this repository.
>>> >>>
>>> >>> On Mon, Jul 22, 2019 at 9:38 AM 赵欣 <xi...@seu.edu.cn> wrote:
>>> >>>
>>> >>>> Hi
>>> >>>>
>>> >>>> I am a teacher at Southeast University (https://www.seu.edu.cn/). Our
>>> >>>> major is electrical engineering. Our teaching teams and students use
>>> >>>> Hadoop Submarine for big data analysis and automation control of
>>> >>>> electrical equipment.
>>> >>>>
>>> >>>> Many thanks to the Hadoop community for providing us with machine
>>> >>>> learning tools like Submarine.
>>> >>>>
>>> >>>> I hope Hadoop Submarine keeps getting better and better.
>>> >>>>
>>> >>>>
>>> >>>> ==============================
>>> >>>> 赵欣
>>> >>>> 东南大学电气工程学院
>>> >>>>
>>> >>>> -----------------------------------------------------
>>> >>>>
>>> >>>> Zhao XIN
>>> >>>>
>>> >>>> School of Electrical Engineering
>>> >>>>
>>> >>>> ==============================
>>> >>>> 2019-07-18
>>> >>>>
>>> >>>>
>>> >>>> *From:* Xun Liu <li...@apache.org>
>>> >>>> *Date:* 2019-07-18 09:46
>>> >>>> *To:* xinzhao <xi...@seu.edu.cn>
>>> >>>> *Subject:* Fwd: Re: Any thoughts making Submarine a separate Apache
>>> >>>> project?
>>> >>>>
>>> >>>>
>>> >>>> ---------- Forwarded message ---------
>>> >>>> From: dashuiguailuyun@gmail.com <da...@gmail.com>
>>> >>>> Date: Wed, Jul 17, 2019 at 3:17 PM
>>> >>>> Subject: Re: Re: Any thoughts making Submarine a separate Apache
>>> >> project?
>>> >>>> To: Szilard Nemeth <sn...@cloudera.com.invalid>, runlin zhang <
>>> >>>> runlin512@gmail.com>
>>> >>>> Cc: Xun Liu <li...@apache.org>, common-dev <
>>> >>> common-dev@hadoop.apache.org>,
>>> >>>> yarn-dev <ya...@hadoop.apache.org>, hdfs-dev <
>>> >>>> hdfs-dev@hadoop.apache.org>, mapreduce-dev <
>>> >>>> mapreduce-dev@hadoop.apache.org>, submarine-dev <
>>> >>>> submarine-dev@hadoop.apache.org>
>>> >>>>
>>> >>>>
>>> >>>> +1, good idea, we are very much looking forward to it.
>>> >>>>
>>> >>>> ------------------------------
>>> >>>> dashuiguailuyun@gmail.com
>>> >>>>
>>> >>>>
>>> >>>> *From:* Szilard Nemeth <sn...@cloudera.com.INVALID>
>>> >>>> *Date:* 2019-07-17 14:55
>>> >>>> *To:* runlin zhang <ru...@gmail.com>
>>> >>>> *CC:* Xun Liu <li...@apache.org>; Hadoop Common
>>> >>>> <co...@hadoop.apache.org>; yarn-dev <
>>> yarn-dev@hadoop.apache.org>;
>>> >>>> Hdfs-dev <hd...@hadoop.apache.org>; mapreduce-dev
>>> >>>> <ma...@hadoop.apache.org>; submarine-dev
>>> >>>> <su...@hadoop.apache.org>
>>> >>>> *Subject:* Re: Any thoughts making Submarine a separate Apache
>>> project?
>>> >>>> +1, this is a very great idea.
>>> >>>> As the Hadoop repository has already grown huge and contains many
>>> >>>> projects, I think in general it's a good idea to separate projects in
>>> >>>> the early phase.
>>> >>>>
>>> >>>>
>>> >>>> On Wed, Jul 17, 2019, 08:50 runlin zhang <ru...@gmail.com>
>>> wrote:
>>> >>>>
>>> >>>>> +1, that will be great!
>>> >>>>>
>>> >>>>>> On Jul 10, 2019, at 3:34 PM, Xun Liu <li...@apache.org> wrote:
>>> >>>>>>
>>> >>>>>> Hi all,
>>> >>>>>>
>>> >>>>>> This is Xun Liu, contributing to the Submarine project for running deep
>>> >>>>>> learning workloads together with big data workloads on Hadoop clusters.
>>> >>>>>>
>>> >>>>>> A number of integrations of Submarine with other projects are finished
>>> >>>>>> or in progress, such as Apache Zeppelin, TonY, and Azkaban. The next
>>> >>>>>> step for Submarine is to integrate with more projects like Apache
>>> >>>>>> Arrow, Redis, MLflow, etc., to handle end-to-end machine learning use
>>> >>>>>> cases like model serving, notebook management, and advanced training
>>> >>>>>> optimizations (like auto parameter tuning, memory cache optimizations
>>> >>>>>> for large training datasets, etc.), and to run on other platforms like
>>> >>>>>> Kubernetes or natively on the cloud. LinkedIn also wants to donate the
>>> >>>>>> TonY project to Apache so we can put Submarine and TonY together in the
>>> >>>>>> same codebase (page #30:
>>> >>>>>> https://www.slideshare.net/xkrogen/hadoop-meetup-jan-2019-tony-tensorflow-on-yarn-and-beyond#30
>>> >>>>>> ).
>>> >>>>>>
>>> >>>>>> This expands the scope of the original Submarine project in exciting new
>>> >>>>>> ways. Toward that end, would it make sense to create a separate
>>> >>>>>> Submarine project at Apache? This could speed up adoption of Submarine
>>> >>>>>> and allow Submarine to grow into a full-blown machine learning platform.
>>> >>>>>>
>>> >>>>>> There will be lots of technical details to work out, but any initial
>>> >>>>>> thoughts on this?
>>> >>>>>>
>>> >>>>>> Best Regards,
>>> >>>>>> Xun Liu
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> ---------------------------------------------------------------------
>>> >>>>> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>> >>>>> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>>>
>>>

Re: Any thoughts making Submarine a separate Apache project?

Posted by Wangda Tan <wh...@gmail.com>.

Hi all,

We received comments and suggestions from contributors, committers and PMC
members regarding the proposal:
https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit?ts=5d529ec0

@Vinod Kumar Vavilapalli <vi...@apache.org> could you provide suggestions
regarding what we should do next? Could you help to send this to the ASF
board?

Thanks,
Wangda Tan

