Posted to mapreduce-dev@hadoop.apache.org by Wei-Chiu Chuang <we...@apache.org> on 2020/08/25 16:47:24 UTC

Mandarin Hadoop online sync this week

Hello,

There hasn't been a Mandarin online sync for quite some time. I'd like to
call for one this week:

Date/time:

8/27 Thursday Beijing Time 1PM
8/26 Wednesday US Pacific Time 10PM

Link:
https://cloudera.zoom.us/j/880548968

Past sync summary:
https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit

Re: Mandarin Hadoop online sync this week

Posted by Wei-Chiu Chuang <we...@apache.org>.
Brahma,

Thanks for bringing this up. I don't have a specific topic in mind.

For me, I'd like to use this as an opportunity to get a feel for which
features/problems the community is interested in. And in general, which
Hadoop versions is the community using? Can we stop 2.9.x development
and concentrate on 3.2/3.3 and trunk? I am also interested in arranging
a meetup in English for a broader audience on topics like this soon.

Another thing I'd like to explore (if there's time) is how we can better
involve the Chinese community. How can we grow more Chinese/Asian
committers?

On Tue, Aug 25, 2020 at 9:58 AM Brahma Reddy Battula <br...@apache.org>
wrote:

> Hi,
>
> What are you planning for this week?
>
> On Tue, Aug 25, 2020 at 10:18 PM Wei-Chiu Chuang <we...@apache.org>
> wrote:
>
>> Hello,
>>
>> There hasn't been a Mandarin online sync for quite some time. I'd like to
>> call for one this week:
>>
>> Date/time:
>>
>> 8/27 Thursday Beijing Time 1PM
>> 8/26 Wednesday US Pacific Time 10PM
>>
>> Link:
>> https://cloudera.zoom.us/j/880548968
>>
>> Past sync summary:
>>
>> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit
>>
>
>
> --
>
>
>
> --Brahma Reddy Battula
>

Re: Mandarin Hadoop online sync this week

Posted by Brahma Reddy Battula <br...@apache.org>.
Hi,

What are you planning for this week?

On Tue, Aug 25, 2020 at 10:18 PM Wei-Chiu Chuang <we...@apache.org> wrote:

> Hello,
>
> There hasn't been a Mandarin online sync for quite some time. I'd like to
> call for one this week:
>
> Date/time:
>
> 8/27 Thursday Beijing Time 1PM
> 8/26 Wednesday US Pacific Time 10PM
>
> Link:
> https://cloudera.zoom.us/j/880548968
>
> Past sync summary:
>
> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit
>


-- 



--Brahma Reddy Battula

Re: Mandarin Hadoop online sync this week

Posted by Wei-Chiu Chuang <we...@apache.org>.
This week's summary:

8/26 Mandarin online sync

Attendees: Weichiu, Xiaoiao, Baoloongmao, Hui, wuweiwei, Leon Gao,
Lisheng Sun, Jinglun, zhoubin86

Leon shared a DataNode improvement proposal at Uber.

Background: disks have different storage densities. The goal is to
balance disk I/O across disks of different sizes.

Problem: the archive disks’ I/O utilization is very low, and the aim is
to make more use of them.

The proposed change will be based on HSM, with quite minimal changes.

Cold data is in GCS, with a simple scheme to copy cold data to GCS. The
data in GCS is not intended to be readily accessible, so the scheme
change is not a concern.

Jinglun shared solutions to an operational problem: NameNode QPS
dropped, with waiting time above 1 second and processing time above
400 ms. Solutions: (1) migrate a directory to a new namespace; (2) use
RBF to hash a directory across multiple namespaces, reducing the
pressure on any particular NN.
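
A minimal sketch of the idea behind option (2), assuming plain modulo
hashing on the first-level child directory under a mount point. The
namespace names and function names here are hypothetical, and this is
not RBF's actual implementation; it only illustrates how children of
one directory could be spread across several namespaces while each
child stays pinned to a single one.

```python
import hashlib

# Hypothetical namespace IDs; a real deployment would define its own.
NAMESPACES = ["ns0", "ns1", "ns2"]


def first_level_child(mount_point: str, path: str) -> str:
    """Return the first path component directly below mount_point."""
    prefix = mount_point.rstrip("/") + "/"
    if not path.startswith(prefix):
        raise ValueError(f"{path!r} is not under {mount_point!r}")
    return path[len(prefix):].split("/", 1)[0]


def pick_namespace(mount_point: str, path: str, namespaces=NAMESPACES) -> str:
    """Map a path to a namespace by hashing its first-level child name.

    Every path below the same child directory hashes to the same
    namespace, so a subtree is never split across NameNodes.
    """
    child = first_level_child(mount_point, path)
    digest = hashlib.md5(child.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(namespaces)
    return namespaces[index]


if __name__ == "__main__":
    for p in ["/data/user1/a.txt", "/data/user1/b/c.txt", "/data/user2/x"]:
        print(p, "->", pick_namespace("/data", p))
```

With modulo hashing, adding or removing a namespace remaps most
children; a consistent-hashing scheme would reduce that movement.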

Baoloongmao suggested porting Ozone features into Hadoop Common. For
example, Java-based configuration is a powerful feature that could
benefit Hadoop as well.


On Tue, Aug 25, 2020 at 9:47 AM Wei-Chiu Chuang <we...@apache.org> wrote:

> Hello,
>
> There hasn't been a Mandarin online sync for quite some time. I'd like to
> call for one this week:
>
> Date/time:
>
> 8/27 Thursday Beijing Time 1PM
> 8/26 Wednesday US Pacific Time 10PM
>
> Link:
> https://cloudera.zoom.us/j/880548968
>
> Past sync summary:
>
> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit
>
>
>
