Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/16 11:04:57 UTC

[GitHub] [arrow] paddyhoran commented on a change in pull request #6898: ARROW-8399: [Rust] Extend memory alignments to include other architectures

paddyhoran commented on a change in pull request #6898:
URL: https://github.com/apache/arrow/pull/6898#discussion_r426143335



##########
File path: rust/arrow/src/memory.rs
##########
@@ -16,12 +16,125 @@
 // under the License.
 
 //! Defines memory-related functions, such as allocate/deallocate/reallocate memory
-//! regions.
+//! regions, cache and allocation alignments.
 
 use std::alloc::Layout;
 use std::mem::align_of;
 
-pub const ALIGNMENT: usize = 64;
+// NOTE: Below code is written for spatial/temporal prefetcher optimizations. Memory allocation
+// should align well with usage pattern of cache access and block sizes on layers of storage levels from
+// registers to non-volatile memory. This alignments are all cache aware alignments incorporated
+// from [cuneiform](https://crates.io/crates/cuneiform) crate. Approach mimicks Intel TBB's
+// cache_aligned_allocator which to exploit cache locality and minimize prefetch signals with

Review comment:
       This sentence does not read right to me.  Maybe "This approach mimics Intel TBB's cache_aligned_allocator, which exploits cache locality and minimizes prefetch signals, resulting in less ..."
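
[Editor's note: a minimal sketch of what a cache-aligned allocation built on the `std::alloc::Layout` import shown in this diff could look like. The helper names and the 64-byte value are hypothetical, not taken from the PR.]

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

// Hypothetical alignment; the PR selects a per-target_arch value instead.
const ALIGNMENT: usize = 64;

/// Allocate `size` bytes aligned to `ALIGNMENT`.
/// The caller must free the pointer with `deallocate_aligned`, passing the same size.
fn allocate_aligned(size: usize) -> *mut u8 {
    let layout = Layout::from_size_align(size, ALIGNMENT).unwrap();
    let ptr = unsafe { alloc(layout) };
    if ptr.is_null() {
        handle_alloc_error(layout);
    }
    ptr
}

fn deallocate_aligned(ptr: *mut u8, size: usize) {
    // The layout passed to `dealloc` must match the one used for `alloc`.
    let layout = Layout::from_size_align(size, ALIGNMENT).unwrap();
    unsafe { dealloc(ptr, layout) }
}

fn main() {
    let ptr = allocate_aligned(1024);
    // The returned address is a multiple of the cache-aware alignment.
    assert_eq!(ptr as usize % ALIGNMENT, 0);
    deallocate_aligned(ptr, 1024);
}
```

`Layout::from_size_align` rejects alignments that are not powers of two, which is one reason the constants in this PR are written as shifts like `1 << 6`.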

##########
File path: rust/arrow/src/memory.rs
##########
@@ -16,12 +16,125 @@
 // under the License.
 
 //! Defines memory-related functions, such as allocate/deallocate/reallocate memory
-//! regions.
+//! regions, cache and allocation alignments.
 
 use std::alloc::Layout;
 use std::mem::align_of;
 
-pub const ALIGNMENT: usize = 64;
+// NOTE: Below code is written for spatial/temporal prefetcher optimizations. Memory allocation
+// should align well with usage pattern of cache access and block sizes on layers of storage levels from
+// registers to non-volatile memory. This alignments are all cache aware alignments incorporated
+// from [cuneiform](https://crates.io/crates/cuneiform) crate. Approach mimicks Intel TBB's
+// cache_aligned_allocator which to exploit cache locality and minimize prefetch signals with
+// less round trip time between the layers of storage. For further info: https://software.intel.com/en-us/node/506094
+
+// 32-bit architecture and things other than netburst microarchitecture are using 64 bytes.
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "x86")]
+pub const ALIGNMENT: usize = (1 << 6);
+
+// Intel x86_64:
+// L2D streamer from L1:
+// Loads data or instructions from memory to the second-level cache. To use the streamer,
+// organize the data or instructions in blocks of 128 bytes, aligned on 128 bytes.
+// - https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "x86_64")]
+pub const ALIGNMENT: usize = (1 << 7);
+
+//

Review comment:
       Empty line.
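
[Editor's note: the x86_64 comment above says the L2 streamer wants blocks of 128 bytes aligned on 128 bytes. A cache-aligned allocator typically also rounds the requested size up to a multiple of that alignment; a sketch with a hypothetical helper name follows.]

```rust
// x86_64 value from the diff: 128 bytes for the L2 streamer.
const ALIGNMENT: usize = 1 << 7;

/// Round `size` up to the next multiple of `ALIGNMENT`.
/// Valid because `ALIGNMENT` is a power of two, so `ALIGNMENT - 1`
/// is a mask of the low bits to clear.
fn round_up_to_alignment(size: usize) -> usize {
    (size + ALIGNMENT - 1) & !(ALIGNMENT - 1)
}

fn main() {
    assert_eq!(round_up_to_alignment(1), 128);
    assert_eq!(round_up_to_alignment(128), 128);
    assert_eq!(round_up_to_alignment(129), 256);
}
```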

##########
File path: rust/arrow/src/memory.rs
##########
@@ -16,12 +16,125 @@
 // under the License.
 
 //! Defines memory-related functions, such as allocate/deallocate/reallocate memory
-//! regions.
+//! regions, cache and allocation alignments.
 
 use std::alloc::Layout;
 use std::mem::align_of;
 
-pub const ALIGNMENT: usize = 64;
+// NOTE: Below code is written for spatial/temporal prefetcher optimizations. Memory allocation
+// should align well with usage pattern of cache access and block sizes on layers of storage levels from
+// registers to non-volatile memory. This alignments are all cache aware alignments incorporated

Review comment:
       This -> These

##########
File path: rust/arrow/src/memory.rs
##########
@@ -16,12 +16,125 @@
 // under the License.
 
 //! Defines memory-related functions, such as allocate/deallocate/reallocate memory
-//! regions.
+//! regions, cache and allocation alignments.
 
 use std::alloc::Layout;
 use std::mem::align_of;
 
-pub const ALIGNMENT: usize = 64;
+// NOTE: Below code is written for spatial/temporal prefetcher optimizations. Memory allocation
+// should align well with usage pattern of cache access and block sizes on layers of storage levels from
+// registers to non-volatile memory. This alignments are all cache aware alignments incorporated
+// from [cuneiform](https://crates.io/crates/cuneiform) crate. Approach mimicks Intel TBB's
+// cache_aligned_allocator which to exploit cache locality and minimize prefetch signals with
+// less round trip time between the layers of storage. For further info: https://software.intel.com/en-us/node/506094
+
+// 32-bit architecture and things other than netburst microarchitecture are using 64 bytes.
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "x86")]
+pub const ALIGNMENT: usize = (1 << 6);
+
+// Intel x86_64:
+// L2D streamer from L1:
+// Loads data or instructions from memory to the second-level cache. To use the streamer,
+// organize the data or instructions in blocks of 128 bytes, aligned on 128 bytes.
+// - https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "x86_64")]
+pub const ALIGNMENT: usize = (1 << 7);
+
+//
+// 24Kc:
+// Data Line Size
+// - https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00346-2B-24K-DTS-04.00.pdf
+// - https://gitlab.e.foundation/e/devices/samsung/n7100/stable_android_kernel_samsung_smdk4412/commit/2dbac10263b2f3c561de68b4c369bc679352ccee
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "mips")]
+pub const ALIGNMENT: usize = (1 << 5);
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "mips64")]
+pub const ALIGNMENT: usize = (1 << 5);
+
+// Defaults for powerpc
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "powerpc")]
+pub const ALIGNMENT: usize = (1 << 5);
+
+// Defaults for the ppc 64
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "powerpc64")]
+pub const ALIGNMENT: usize = (1 << 6);
+
+//
+// e.g.: sifive
+// - https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt#L41
+// in general all of them are the same.
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "riscv")]
+pub const ALIGNMENT: usize = (1 << 6);
+
+//
+// This is fixed.
+// - https://docs.huihoo.com/doxygen/linux/kernel/3.7/arch_2s390_2include_2asm_2cache_8h.html
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "s390x")]
+pub const ALIGNMENT: usize = (1 << 8);
+
+//
+// This is also fixed.

Review comment:
       Same.

##########
File path: rust/arrow/src/memory.rs
##########
@@ -16,12 +16,125 @@
 // under the License.
 
 //! Defines memory-related functions, such as allocate/deallocate/reallocate memory
-//! regions.
+//! regions, cache and allocation alignments.
 
 use std::alloc::Layout;
 use std::mem::align_of;
 
-pub const ALIGNMENT: usize = 64;
+// NOTE: Below code is written for spatial/temporal prefetcher optimizations. Memory allocation
+// should align well with usage pattern of cache access and block sizes on layers of storage levels from
+// registers to non-volatile memory. This alignments are all cache aware alignments incorporated
+// from [cuneiform](https://crates.io/crates/cuneiform) crate. Approach mimicks Intel TBB's
+// cache_aligned_allocator which to exploit cache locality and minimize prefetch signals with
+// less round trip time between the layers of storage. For further info: https://software.intel.com/en-us/node/506094
+
+// 32-bit architecture and things other than netburst microarchitecture are using 64 bytes.
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "x86")]
+pub const ALIGNMENT: usize = (1 << 6);
+
+// Intel x86_64:
+// L2D streamer from L1:
+// Loads data or instructions from memory to the second-level cache. To use the streamer,
+// organize the data or instructions in blocks of 128 bytes, aligned on 128 bytes.
+// - https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "x86_64")]
+pub const ALIGNMENT: usize = (1 << 7);
+
+//
+// 24Kc:
+// Data Line Size
+// - https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00346-2B-24K-DTS-04.00.pdf
+// - https://gitlab.e.foundation/e/devices/samsung/n7100/stable_android_kernel_samsung_smdk4412/commit/2dbac10263b2f3c561de68b4c369bc679352ccee
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "mips")]
+pub const ALIGNMENT: usize = (1 << 5);
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "mips64")]
+pub const ALIGNMENT: usize = (1 << 5);
+
+// Defaults for powerpc
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "powerpc")]
+pub const ALIGNMENT: usize = (1 << 5);
+
+// Defaults for the ppc 64
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "powerpc64")]
+pub const ALIGNMENT: usize = (1 << 6);
+
+//
+// e.g.: sifive
+// - https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt#L41
+// in general all of them are the same.
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "riscv")]
+pub const ALIGNMENT: usize = (1 << 6);
+
+//
+// This is fixed.

Review comment:
       Empty line and "This is fixed."?

##########
File path: rust/arrow/src/memory.rs
##########
@@ -16,12 +16,125 @@
 // under the License.
 
 //! Defines memory-related functions, such as allocate/deallocate/reallocate memory
-//! regions.
+//! regions, cache and allocation alignments.
 
 use std::alloc::Layout;
 use std::mem::align_of;
 
-pub const ALIGNMENT: usize = 64;
+// NOTE: Below code is written for spatial/temporal prefetcher optimizations. Memory allocation
+// should align well with usage pattern of cache access and block sizes on layers of storage levels from
+// registers to non-volatile memory. This alignments are all cache aware alignments incorporated
+// from [cuneiform](https://crates.io/crates/cuneiform) crate. Approach mimicks Intel TBB's
+// cache_aligned_allocator which to exploit cache locality and minimize prefetch signals with
+// less round trip time between the layers of storage. For further info: https://software.intel.com/en-us/node/506094
+
+// 32-bit architecture and things other than netburst microarchitecture are using 64 bytes.
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "x86")]
+pub const ALIGNMENT: usize = (1 << 6);
+
+// Intel x86_64:
+// L2D streamer from L1:
+// Loads data or instructions from memory to the second-level cache. To use the streamer,
+// organize the data or instructions in blocks of 128 bytes, aligned on 128 bytes.
+// - https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "x86_64")]
+pub const ALIGNMENT: usize = (1 << 7);
+
+//
+// 24Kc:
+// Data Line Size
+// - https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00346-2B-24K-DTS-04.00.pdf
+// - https://gitlab.e.foundation/e/devices/samsung/n7100/stable_android_kernel_samsung_smdk4412/commit/2dbac10263b2f3c561de68b4c369bc679352ccee
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "mips")]
+pub const ALIGNMENT: usize = (1 << 5);
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "mips64")]
+pub const ALIGNMENT: usize = (1 << 5);
+
+// Defaults for powerpc
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "powerpc")]
+pub const ALIGNMENT: usize = (1 << 5);
+
+// Defaults for the ppc 64
+/// Cache and allocation multiple alignment size
+#[cfg(target_arch = "powerpc64")]
+pub const ALIGNMENT: usize = (1 << 6);
+
+//
+// e.g.: sifive

Review comment:
       Empty line




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org