Posted to yarn-dev@hadoop.apache.org by Wei-Chiu Chuang <we...@cloudera.com.INVALID> on 2019/11/19 04:23:14 UTC

This Week's APAC Hadoop storage community sync

Hi!

I am happy to have Feilong from Intel join us to talk about the Storage
Class Memory support for HDFS (HDFS-13762
<https://issues.apache.org/jira/browse/HDFS-13762>). This is a new feature
that will land in the next Hadoop minor release (Hadoop 3.3.0).

Date/Time: Wednesday Nov 20 10pm Pacific Time / Thursday Nov 21 1pm Beijing
Time / Thursday Nov 21 9:30am Bangalore Time

Join Zoom Meeting

https://cloudera.zoom.us/j/880548968

Past sessions:
https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit

Re: This Week's APAC Hadoop storage community sync

Posted by Wei-Chiu Chuang <we...@cloudera.com.INVALID>.
Thanks again for the great presentation given by Feilong.

Feilong also graciously agreed to share the presentation slides. To avoid
spamming your inbox, I uploaded the slides to my personal Google Drive
folder here:
https://drive.google.com/file/d/1OxtviDL6N7jwhQDAe2r1oM-MbioeSgpb/view?usp=sharing


Here are the notes I took for your reference:

11/20/2019 Feilong from Intel will discuss the Storage Class Memory support
(HDFS-13762 <https://issues.apache.org/jira/browse/HDFS-13762>)

Attendees: weichiu, feilong, haiyang, Hui, rakesh, Yisheng


   - HDFS-13762 extends HDFS centralized cache management to use persistent
     memory (pmem) as the cache. Pmem offers larger capacity at a lower
     price than DRAM.

   - One cache loader (NativePmemMappableBlockLoader) bypasses kernel space
     and accesses pmem directly via the PMDK library. Another implementation
     (PmemMappableBlockLoader) uses the Java NIO/common file API to access
     pmem. The former performs about 20% better.

   - The configuration property dfs.datanode.cache.pmem.dirs determines
     whether DRAM or pmem is used for the cache.

   - More than one pmem device can be supported.

   - Will ship with Hadoop 3.3.0, 3.2.2 and 3.1.4.

   - Benchmark: DFSIO. If the input data is fully cached in pmem, pmem
     delivers about 6x the throughput of DRAM. If instead a small input data
     set is fully cached in DRAM, pmem delivers about 50% of the DRAM
     throughput.

   - HDFS-14740 (pmem read cache recovery) is pending community review. Upon
     DN restart, it scans pmem to find existing cached block files.
     Hierarchical cache storage.

   - Alibaba evaluated HBase BucketCache with Pmem. There’s a benchmark
     report somewhere that Rakesh
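
For readers who want to picture how dfs.datanode.cache.pmem.dirs steers the
choice between DRAM and pmem, here is a minimal Python sketch of the
selection logic as I understood it from the notes. It is an illustration
only, not the Hadoop code. The comma-separated value format, the
pmdk_available flag, the fallback to PmemMappableBlockLoader when PMDK is
missing, and the /mnt/pmem0 style paths are all assumptions of mine; only
the property name and the two loader names come from the notes above.

```python
from typing import List


def parse_pmem_dirs(value: str) -> List[str]:
    """Split a dfs.datanode.cache.pmem.dirs value into directories.

    Assumption: the value is a comma-separated list, one entry per pmem
    device. Blank entries are ignored.
    """
    return [d.strip() for d in value.split(",") if d.strip()]


def choose_cache_loader(pmem_dirs_value: str, pmdk_available: bool) -> str:
    """Return a label for the cache backend that would be picked.

    Assumed rule: no pmem directories means the cache stays in DRAM;
    otherwise prefer the PMDK-based loader when the native library is
    usable, and fall back to the Java NIO based loader when it is not.
    """
    dirs = parse_pmem_dirs(pmem_dirs_value)
    if not dirs:
        return "DRAM"
    if pmdk_available:
        return "NativePmemMappableBlockLoader"
    return "PmemMappableBlockLoader"


if __name__ == "__main__":
    # Hypothetical mount points, used only for illustration.
    print(choose_cache_loader("", pmdk_available=True))
    print(choose_cache_loader("/mnt/pmem0,/mnt/pmem1", pmdk_available=True))
    print(choose_cache_loader("/mnt/pmem0", pmdk_available=False))
```

Listing two directories in the second call mirrors the note that more than
one pmem device can be used.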


On Wed, Nov 20, 2019 at 9:13 PM Wei-Chiu Chuang <we...@cloudera.com>
wrote:

> Friendly reminder: this event is happening in an hour!
>
> On Mon, Nov 18, 2019 at 11:38 PM Wei-Chiu Chuang <we...@apache.org>
> wrote:
>
>> I'm really bad at converting time across different time zones.
>>
>> Here's the correction:
>> Pacific Time: Wed Nov 20 10pm
>> Beijing Time: Thu Nov 21 2pm
>> India Time: Thu Nov 21 11:30am
>>
>> https://www.timeanddate.com/worldclock/converter.html?iso=20191121T060000&p1=tz_pt&p2=33&p3=tz_ist
>> [image: Screen Shot 2019-11-18 at 11.37.02 PM.png]
>>
>> On Mon, Nov 18, 2019 at 8:23 PM Wei-Chiu Chuang <we...@cloudera.com>
>> wrote:
>>
>>> Hi!
>>>
>>> I am happy to have Feilong from Intel to join us and talk about the
>>> Storage Class Memory support for HDFS ( HDFS-13762
>>> <https://issues.apache.org/jira/browse/HDFS-13762>). This is a new
>>> feature that will land in the next Hadoop minor release (Hadoop 3.3.0)
>>>
>>> Date/Time: Wednesday Nov 20 10pm Pacific Time / Thursday Nov 21 1pm
>>> Beijing Time / Thursday Nov 21 9:30am Bangalore Time
>>>
>>> Join Zoom Meeting
>>>
>>> https://cloudera.zoom.us/j/880548968
>>>
>>> Past sessions:
>>>
>>> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit
>>>
>>>

Re: This Week's APAC Hadoop storage community sync

Posted by Wei-Chiu Chuang <we...@cloudera.com.INVALID>.
Friendly reminder: this event is happening in an hour!

On Mon, Nov 18, 2019 at 11:38 PM Wei-Chiu Chuang <we...@apache.org> wrote:

> I'm really bad at converting time across different time zones.
>
> Here's the correction:
> Pacific Time: Wed Nov 20 10pm
> Beijing Time: Thu Nov 21 2pm
> India Time: Thu Nov 21 11:30am
>
> https://www.timeanddate.com/worldclock/converter.html?iso=20191121T060000&p1=tz_pt&p2=33&p3=tz_ist
> [image: Screen Shot 2019-11-18 at 11.37.02 PM.png]
>
> On Mon, Nov 18, 2019 at 8:23 PM Wei-Chiu Chuang <we...@cloudera.com>
> wrote:
>
>> Hi!
>>
>> I am happy to have Feilong from Intel to join us and talk about the
>> Storage Class Memory support for HDFS ( HDFS-13762
>> <https://issues.apache.org/jira/browse/HDFS-13762>). This is a new
>> feature that will land in the next Hadoop minor release (Hadoop 3.3.0)
>>
>> Date/Time: Wednesday Nov 20 10pm Pacific Time / Thursday Nov 21 1pm
>> Beijing Time / Thursday Nov 21 9:30am Bangalore Time
>>
>> Join Zoom Meeting
>>
>> https://cloudera.zoom.us/j/880548968
>>
>> Past sessions:
>>
>> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit
>>
>>

Re: This Week's APAC Hadoop storage community sync

Posted by Wei-Chiu Chuang <we...@apache.org>.
I'm really bad at converting time across different time zones.

Here's the correction:
Pacific Time: Wed Nov 20 10pm
Beijing Time: Thu Nov 21 2pm
India Time: Thu Nov 21 11:30am
https://www.timeanddate.com/worldclock/converter.html?iso=20191121T060000&p1=tz_pt&p2=33&p3=tz_ist
[image: Screen Shot 2019-11-18 at 11.37.02 PM.png]
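
For anyone who wants to double-check the conversion without the web
converter, a small Python sketch along these lines reproduces the corrected
times. It uses fixed UTC offsets, which is an assumption that only holds
because no daylight saving time applies to any of the three zones on these
dates (Pacific is UTC-8 in late November). The function name and the zone
labels are mine.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets valid for 2019-11-20/21 (assumption: no DST in effect).
PACIFIC = timezone(timedelta(hours=-8), "PST")
BEIJING = timezone(timedelta(hours=8), "CST")
INDIA = timezone(timedelta(hours=5, minutes=30), "IST")


def convert_meeting_time(start, zones):
    """Map each zone label to the same instant expressed in that zone."""
    return {label: start.astimezone(tz) for label, tz in zones.items()}


# Meeting start as announced: Wednesday Nov 20, 10pm Pacific Time.
meeting_start = datetime(2019, 11, 20, 22, 0, tzinfo=PACIFIC)

local_times = convert_meeting_time(
    meeting_start,
    {"Pacific Time": PACIFIC, "Beijing Time": BEIJING, "India Time": INDIA},
)

for label, when in local_times.items():
    print(f"{label}: {when.strftime('%a %b %d %I:%M%p')}")
# Pacific Time: Wed Nov 20 10:00PM
# Beijing Time: Thu Nov 21 02:00PM
# India Time: Thu Nov 21 11:30AM
```

For dates where daylight saving time matters, named zones from the standard
zoneinfo module would be a safer choice than fixed offsets.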

On Mon, Nov 18, 2019 at 8:23 PM Wei-Chiu Chuang <we...@cloudera.com>
wrote:

> Hi!
>
> I am happy to have Feilong from Intel to join us and talk about the
> Storage Class Memory support for HDFS ( HDFS-13762
> <https://issues.apache.org/jira/browse/HDFS-13762>). This is a new
> feature that will land in the next Hadoop minor release (Hadoop 3.3.0)
>
> Date/Time: Wednesday Nov 20 10pm Pacific Time / Thursday Nov 21 1pm
> Beijing Time / Thursday Nov 21 9:30am Bangalore Time
>
> Join Zoom Meeting
>
> https://cloudera.zoom.us/j/880548968
>
> Past sessions:
>
> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit
>
>
