You are viewing a plain text version of this content. The canonical link for it is here.

Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/04/29 18:03:12 UTC

[GitHub] [arrow] vertexclique opened a new pull request #7061: ARROW-8629: [Rust] – Eliminate indirection of zero sized allocations

vertexclique opened a new pull request #7061:
URL: https://github.com/apache/arrow/pull/7061


   1. Solves UB passing through OS.
   2. Improves performance by removing indirections at intermediate array builds. (Increases throughput and decreases latency)
   
   <details>
     <summary>Click to see Benchmark Output!</summary>
     
     ```
   add 512                 time:   [18.299 us 18.416 us 18.542 us]
                           change: [+0.7553% +1.9764% +3.0260%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) high mild
     1 (1.00%) high severe
   
   add 512 simd            time:   [15.412 us 15.521 us 15.642 us]
                           change: [-2.2152% -0.7069% +0.7204%] (p = 0.36 > 0.05)
                           No change in performance detected.
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) high mild
     1 (1.00%) high severe
   
   subtract 512            time:   [17.845 us 17.941 us 18.033 us]
                           change: [-3.4447% -2.3703% -1.3902%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 4 outliers among 100 measurements (4.00%)
     2 (2.00%) high mild
     2 (2.00%) high severe
   
   subtract 512 simd       time:   [15.237 us 15.359 us 15.499 us]
                           change: [-3.2620% -0.8411% +1.0491%] (p = 0.49 > 0.05)
                           No change in performance detected.
   Found 7 outliers among 100 measurements (7.00%)
     5 (5.00%) high mild
     2 (2.00%) high severe
   
   multiply 512            time:   [17.737 us 17.854 us 17.991 us]
                           change: [-3.8368% -2.6758% -1.4903%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     1 (1.00%) high mild
     5 (5.00%) high severe
   
   multiply 512 simd       time:   [15.158 us 15.331 us 15.536 us]
                           change: [-3.5028% -1.6229% +0.4772%] (p = 0.15 > 0.05)
                           No change in performance detected.
   Found 9 outliers among 100 measurements (9.00%)
     1 (1.00%) high mild
     8 (8.00%) high severe
   
   divide 512              time:   [18.429 us 18.563 us 18.697 us]
                           change: [-1.6791% -0.2733% +1.1091%] (p = 0.70 > 0.05)
                           No change in performance detected.
   Found 23 outliers among 100 measurements (23.00%)
     8 (8.00%) low mild
     8 (8.00%) high mild
     7 (7.00%) high severe
   
   divide 512 simd         time:   [16.719 us 16.799 us 16.884 us]
                           change: [-2.6393% -1.7489% -0.9737%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high severe
   
   sum 512 no simd         time:   [8.1375 us 8.1648 us 8.1968 us]
                           change: [-4.7533% -3.5008% -2.0949%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     3 (3.00%) high mild
     3 (3.00%) high severe
   
   limit 512, 256 no simd  time:   [7.6325 us 7.6738 us 7.7205 us]
                           change: [-1.1158% +0.6820% +2.4545%] (p = 0.46 > 0.05)
                           No change in performance detected.
   Found 4 outliers among 100 measurements (4.00%)
     2 (2.00%) high mild
     2 (2.00%) high severe
   
   limit 512, 512 no simd  time:   [7.6268 us 7.6679 us 7.7144 us]
                           change: [-1.9252% +0.2950% +2.5081%] (p = 0.81 > 0.05)
                           No change in performance detected.
   Found 3 outliers among 100 measurements (3.00%)
     1 (1.00%) high mild
     2 (2.00%) high severe
   
        Running /Users/mahmut/projects/arrow/rust/target/release/deps/array_from_vec-cc4617b21b5f717d
   array_from_vec 128      time:   [867.08 ns 878.05 ns 890.59 ns]
                           change: [-4.2805% -1.7058% +0.6140%] (p = 0.21 > 0.05)
                           No change in performance detected.
   Found 5 outliers among 100 measurements (5.00%)
     2 (2.00%) high mild
     3 (3.00%) high severe
   
   array_from_vec 256      time:   [972.12 ns 979.21 ns 986.07 ns]
                           change: [-4.5408% -2.1099% +0.5039%] (p = 0.11 > 0.05)
                           No change in performance detected.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high severe
   
   array_from_vec 512      time:   [1.2980 us 1.3059 us 1.3137 us]
                           change: [-5.8054% -2.8723% +0.0052%] (p = 0.05 > 0.05)
                           No change in performance detected.
   Found 7 outliers among 100 measurements (7.00%)
     1 (1.00%) low mild
     2 (2.00%) high mild
     4 (4.00%) high severe
   
        Running /Users/mahmut/projects/arrow/rust/target/release/deps/boolean_kernels-4f1add90b9dd7b4c
   and                     time:   [46.317 us 46.515 us 46.730 us]
                           change: [-3.8904% -1.6471% +0.6492%] (p = 0.21 > 0.05)
                           No change in performance detected.
   Found 5 outliers among 100 measurements (5.00%)
     1 (1.00%) high mild
     4 (4.00%) high severe
   
   and simd                time:   [10.781 us 10.852 us 10.938 us]
                           change: [-11.091% -9.3126% -7.3149%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     2 (2.00%) high mild
     3 (3.00%) high severe
   
   or                      time:   [46.131 us 46.346 us 46.582 us]
                           change: [-10.908% -8.7461% -6.1048%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     3 (3.00%) high mild
     5 (5.00%) high severe
   
   or simd                 time:   [10.728 us 10.783 us 10.844 us]
                           change: [-12.225% -10.449% -8.7204%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     5 (5.00%) high mild
     1 (1.00%) high severe
   
   not                     time:   [25.311 us 25.406 us 25.512 us]
                           change: [-14.527% -12.309% -9.2841%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     2 (2.00%) high mild
     3 (3.00%) high severe
   
   not simd                time:   [5.6415 us 5.6910 us 5.7399 us]
                           change: [-8.8171% -6.8428% -4.7340%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 11 outliers among 100 measurements (11.00%)
     6 (6.00%) high mild
     5 (5.00%) high severe
   
        Running /Users/mahmut/projects/arrow/rust/target/release/deps/builder-6a49123b1fedb178
   bench_primitive         time:   [334.25 us 337.08 us 340.41 us]
                           thrpt:  [11.475 GiB/s 11.588 GiB/s 11.687 GiB/s]
                    change:
                           time:   [-11.214% -6.6152% -2.5522%] (p = 0.00 < 0.05)
                           thrpt:  [+2.6191% +7.0838% +12.630%]
                           Performance has improved.
   Found 12 outliers among 100 measurements (12.00%)
     4 (4.00%) high mild
     8 (8.00%) high severe
   
        Running /Users/mahmut/projects/arrow/rust/target/release/deps/cast_kernels-fe4a160bda20f75b
   cast int32 to int32 512 time:   [674.87 ns 677.20 ns 679.66 ns]
                           change: [-4.9499% -2.4753% +1.3820%] (p = 0.07 > 0.05)
                           No change in performance detected.
   Found 7 outliers among 100 measurements (7.00%)
     3 (3.00%) high mild
     4 (4.00%) high severe
   
   cast int32 to uint32 512
                           time:   [13.541 us 13.629 us 13.734 us]
                           change: [-10.276% -7.0629% -3.5614%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     4 (4.00%) high mild
     4 (4.00%) high severe
   
   cast int32 to float32 512
                           time:   [13.871 us 13.967 us 14.085 us]
                           change: [-6.5555% -3.4718% -0.4564%] (p = 0.02 < 0.05)
                           Change within noise threshold.
   Found 6 outliers among 100 measurements (6.00%)
     3 (3.00%) high mild
     3 (3.00%) high severe
   
   cast int32 to float64 512
                           time:   [13.864 us 13.918 us 13.978 us]
                           change: [-9.2226% -7.3604% -5.0705%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     1 (1.00%) high mild
     7 (7.00%) high severe
   
   cast int32 to int64 512 time:   [13.957 us 14.068 us 14.199 us]
                           change: [-12.117% -10.875% -9.5831%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     4 (4.00%) high mild
     4 (4.00%) high severe
   
   cast float32 to int32 512
                           time:   [14.655 us 14.715 us 14.783 us]
                           change: [-17.597% -15.300% -12.720%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     2 (2.00%) high mild
     4 (4.00%) high severe
   
   cast float64 to float32 512
                           time:   [15.287 us 15.536 us 15.815 us]
                           change: [-6.1582% -4.8992% -3.6371%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 21 outliers among 100 measurements (21.00%)
     1 (1.00%) low severe
     5 (5.00%) low mild
     5 (5.00%) high mild
     10 (10.00%) high severe
   
   cast float64 to uint64 512
                           time:   [14.915 us 15.027 us 15.145 us]
                           change: [-12.399% -10.036% -8.1195%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   cast int64 to int32 512 time:   [13.810 us 13.924 us 14.058 us]
                           change: [-7.0986% -4.1922% -1.6775%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     2 (2.00%) high mild
     4 (4.00%) high severe
   
   cast date64 to date32 512
                           time:   [16.246 us 16.344 us 16.439 us]
                           change: [-5.0859% -2.4719% -0.2628%] (p = 0.05 < 0.05)
                           Change within noise threshold.
   Found 12 outliers among 100 measurements (12.00%)
     2 (2.00%) low mild
     3 (3.00%) high mild
     7 (7.00%) high severe
   
   cast date32 to date64 512
                           time:   [15.791 us 15.882 us 15.976 us]
                           change: [-5.0716% -2.9360% -1.2361%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) low mild
     1 (1.00%) high severe
   
   cast time32s to time32ms 512
                           time:   [2.0995 us 2.1090 us 2.1177 us]
                           change: [-0.8536% +0.4528% +1.6874%] (p = 0.50 > 0.05)
                           No change in performance detected.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) low mild
     1 (1.00%) high severe
   
   cast time32s to time64us 512
                           time:   [16.979 us 17.082 us 17.196 us]
                           change: [-2.1463% -0.3436% +1.1561%] (p = 0.71 > 0.05)
                           No change in performance detected.
   Found 15 outliers among 100 measurements (15.00%)
     4 (4.00%) low mild
     7 (7.00%) high mild
     4 (4.00%) high severe
   
   cast time64ns to time32s 512
                           time:   [18.235 us 18.339 us 18.448 us]
                           change: [-5.2148% -3.9091% -2.6382%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   Benchmarking cast timestamp_ns to timestamp_s 512: Collecting 100 samples in estimated 5.0018 s (6.1M iterati                                                                                                             cast timestamp_ns to timestamp_s 512
                           time:   [818.85 ns 824.05 ns 829.93 ns]
                           change: [+6.0716% +8.1383% +10.818%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 17 outliers among 100 measurements (17.00%)
     3 (3.00%) high mild
     14 (14.00%) high severe
   
   Benchmarking cast timestamp_ms to timestamp_ns 512: Collecting 100 samples in estimated 5.0134 s (1.8M iterat                                                                                                             cast timestamp_ms to timestamp_ns 512
                           time:   [2.8115 us 2.8219 us 2.8330 us]
                           change: [+0.3325% +2.0477% +3.6332%] (p = 0.01 < 0.05)
                           Change within noise threshold.
   Found 7 outliers among 100 measurements (7.00%)
     3 (3.00%) high mild
     4 (4.00%) high severe
   
   cast timestamp_ms to i64 512
                           time:   [1.2090 us 1.2182 us 1.2280 us]
                           change: [-2.0055% -0.0832% +1.4288%] (p = 0.93 > 0.05)
                           No change in performance detected.
   Found 3 outliers among 100 measurements (3.00%)
     1 (1.00%) high mild
     2 (2.00%) high severe
   
        Running /Users/mahmut/projects/arrow/rust/target/release/deps/comparison_kernels-37b960b99ca4c83f
   eq 512                  time:   [20.573 us 20.680 us 20.786 us]
                           change: [-5.0693% -2.6875% -0.9950%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 5 outliers among 100 measurements (5.00%)
     5 (5.00%) high mild
   
   eq 512 simd             time:   [20.218 us 20.344 us 20.487 us]
                           change: [-2.8292% -1.7114% -0.7110%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 7 outliers among 100 measurements (7.00%)
     2 (2.00%) high mild
     5 (5.00%) high severe
   
   neq 512                 time:   [20.521 us 20.695 us 20.898 us]
                           change: [+2.7209% +4.1109% +5.4791%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 13 outliers among 100 measurements (13.00%)
     6 (6.00%) high mild
     7 (7.00%) high severe
   
   neq 512 simd            time:   [19.760 us 19.897 us 20.065 us]
                           change: [-1.4355% +0.4660% +2.3438%] (p = 0.64 > 0.05)
                           No change in performance detected.
   Found 12 outliers among 100 measurements (12.00%)
     7 (7.00%) high mild
     5 (5.00%) high severe
   
   lt 512                  time:   [19.709 us 19.779 us 19.854 us]
                           change: [-4.9646% -2.4118% -0.0782%] (p = 0.05 < 0.05)
                           Change within noise threshold.
   Found 8 outliers among 100 measurements (8.00%)
     6 (6.00%) high mild
     2 (2.00%) high severe
   
   lt 512 simd             time:   [19.849 us 20.042 us 20.250 us]
                           change: [-2.6799% -0.3414% +2.3032%] (p = 0.81 > 0.05)
                           No change in performance detected.
   Found 6 outliers among 100 measurements (6.00%)
     2 (2.00%) high mild
     4 (4.00%) high severe
   
   lt_eq 512               time:   [21.172 us 21.398 us 21.670 us]
                           change: [-1.5189% +0.7017% +2.6062%] (p = 0.53 > 0.05)
                           No change in performance detected.
   Found 8 outliers among 100 measurements (8.00%)
     7 (7.00%) high mild
     1 (1.00%) high severe
   
   lt_eq 512 simd          time:   [20.667 us 20.881 us 21.119 us]
                           change: [-1.2254% +1.6209% +4.2936%] (p = 0.26 > 0.05)
                           No change in performance detected.
   Found 12 outliers among 100 measurements (12.00%)
     3 (3.00%) high mild
     9 (9.00%) high severe
   
   gt 512                  time:   [20.677 us 20.865 us 21.080 us]
                           change: [+0.4445% +2.8196% +4.6280%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 13 outliers among 100 measurements (13.00%)
     2 (2.00%) low mild
     3 (3.00%) high mild
     8 (8.00%) high severe
   
   gt 512 simd             time:   [20.205 us 20.376 us 20.572 us]
                           change: [+1.1167% +3.5786% +6.4322%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 7 outliers among 100 measurements (7.00%)
     1 (1.00%) high mild
     6 (6.00%) high severe
   
   gt_eq 512               time:   [20.388 us 20.457 us 20.530 us]
                           change: [-2.0628% -0.1724% +1.9895%] (p = 0.89 > 0.05)
                           No change in performance detected.
   Found 7 outliers among 100 measurements (7.00%)
     2 (2.00%) low mild
     2 (2.00%) high mild
     3 (3.00%) high severe
   
   gt_eq 512 simd          time:   [20.050 us 20.158 us 20.266 us]
                           change: [-2.9668% -0.9237% +1.2537%] (p = 0.41 > 0.05)
                           No change in performance detected.
   Found 4 outliers among 100 measurements (4.00%)
     1 (1.00%) high mild
     3 (3.00%) high severe
   
        Running /Users/mahmut/projects/arrow/rust/target/release/deps/csv_writer-8ba12ecd5b1b7b63
   record_batches_to_csv   time:   [219.15 us 223.37 us 228.97 us]
                           change: [-8.1368% -4.7450% -1.1190%] (p = 0.01 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     5 (5.00%) high mild
     3 (3.00%) high severe
   
        Running /Users/mahmut/projects/arrow/rust/target/release/deps/take_kernels-25a2e28970e1415d
   take u8 256             time:   [19.674 us 19.728 us 19.787 us]
                           change: [-11.093% -8.4831% -5.6285%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     4 (4.00%) high mild
     3 (3.00%) high severe
   
   take u8 512             time:   [37.091 us 37.422 us 37.802 us]
                           change: [-6.8020% -4.5601% -1.5989%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     4 (4.00%) high mild
     4 (4.00%) high severe
   
   take u8 1024            time:   [70.986 us 71.458 us 71.997 us]
                           change: [-3.7991% -1.7970% +1.2059%] (p = 0.14 > 0.05)
                           No change in performance detected.
   Found 5 outliers among 100 measurements (5.00%)
     2 (2.00%) high mild
     3 (3.00%) high severe
   
   take i32 256            time:   [19.762 us 19.829 us 19.901 us]
                           change: [-6.3630% -4.4731% -2.2024%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 9 outliers among 100 measurements (9.00%)
     2 (2.00%) high mild
     7 (7.00%) high severe
   
   take i32 512            time:   [37.087 us 37.233 us 37.401 us]
                           change: [-2.8627% -1.0442% +1.0592%] (p = 0.33 > 0.05)
                           No change in performance detected.
   Found 6 outliers among 100 measurements (6.00%)
     1 (1.00%) high mild
     5 (5.00%) high severe
   
   take i32 1024           time:   [72.893 us 73.715 us 74.699 us]
                           change: [-1.8626% +0.0693% +2.4955%] (p = 0.95 > 0.05)
                           No change in performance detected.
   Found 11 outliers among 100 measurements (11.00%)
     7 (7.00%) high mild
     4 (4.00%) high severe
   
   take bool 256           time:   [22.519 us 22.696 us 22.885 us]
                           change: [-0.4679% +1.0003% +3.4266%] (p = 0.39 > 0.05)
                           No change in performance detected.
   Found 22 outliers among 100 measurements (22.00%)
     8 (8.00%) low severe
     6 (6.00%) low mild
     5 (5.00%) high mild
     3 (3.00%) high severe
   
   take bool 512           time:   [41.304 us 41.471 us 41.654 us]
                           change: [-3.0177% -1.0061% +1.3302%] (p = 0.44 > 0.05)
                           No change in performance detected.
   Found 5 outliers among 100 measurements (5.00%)
     2 (2.00%) high mild
     3 (3.00%) high severe
   
   take bool 1024          time:   [80.453 us 80.677 us 80.919 us]
                           change: [-2.8769% -0.7005% +2.0128%] (p = 0.63 > 0.05)
                           No change in performance detected.
   Found 9 outliers among 100 measurements (9.00%)
     2 (2.00%) high mild
     7 (7.00%) high severe
     ```
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] andygrove commented on a change in pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

andygrove commented on a change in pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#discussion_r417658203



##########
File path: rust/arrow/src/memory.rs
##########
@@ -20,34 +20,56 @@
 
 use std::alloc::Layout;
 use std::mem::align_of;
+use std::ptr::NonNull;
 
 pub const ALIGNMENT: usize = 64;
 
+///
+/// As you can see this is global and lives as long as the program lives.
+/// Be careful to not write anything to this pointer in any scenario.
+/// If you use allocation methods shown here you won't have any problems.
+const BYPASS_PTR: NonNull<u8> = unsafe { NonNull::new_unchecked(0xDEADBEEF as *mut u8) };
+
 pub fn allocate_aligned(size: usize) -> *mut u8 {
     unsafe {

Review comment:
       I'm not familiar with the use case of calling this with zero. Should that even be allowed?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] andygrove closed pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

andygrove closed pull request #7061:
URL: https://github.com/apache/arrow/pull/7061


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] vertexclique commented on pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

vertexclique commented on pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#issuecomment-631132553


   I have addressed the comments and formatting problems.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] sunchao commented on pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

sunchao commented on pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#issuecomment-629593043


   @vertexclique hmm I'm seeing multiple tests failing:
   
   ```
   ---- ipc::reader::tests::read_generated_streams stdout ----
   thread 'ipc::reader::tests::read_generated_streams' panicked at 'memory not aligned', arrow/src/buffer.rs:165:9
   
   ---- ipc::reader::tests::read_generated_files stdout ----
   thread 'ipc::reader::tests::read_generated_files' panicked at 'memory not aligned', arrow/src/buffer.rs:165:9
   
   ---- ipc::writer::tests::read_and_rewrite_generated_streams stdout ----
   thread 'ipc::writer::tests::read_and_rewrite_generated_streams' panicked at 'memory not aligned', arrow/src/buffer.rs:165:9
   
   ---- ipc::writer::tests::read_and_rewrite_generated_files stdout ----
   thread 'ipc::writer::tests::read_and_rewrite_generated_files' panicked at 'memory not aligned', arrow/src/buffer.rs:165:9
   ```
   
   these seem related. Could you take a look? Thanks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] vertexclique commented on a change in pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

vertexclique commented on a change in pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#discussion_r419000857



##########
File path: rust/arrow/src/memory.rs
##########
@@ -20,34 +20,56 @@
 
 use std::alloc::Layout;
 use std::mem::align_of;
+use std::ptr::NonNull;
 
 pub const ALIGNMENT: usize = 64;
 
+///
+/// As you can see this is global and lives as long as the program lives.
+/// Be careful to not write anything to this pointer in any scenario.
+/// If you use allocation methods shown here you won't have any problems.
+const BYPASS_PTR: NonNull<u8> = unsafe { NonNull::new_unchecked(0xDEADBEEF as *mut u8) };
+
 pub fn allocate_aligned(size: usize) -> *mut u8 {
     unsafe {
-        let layout = Layout::from_size_align_unchecked(size, ALIGNMENT);
-        std::alloc::alloc_zeroed(layout)
+        if size == 0 {
+            // In a perfect world, there is no need to request zero size allocation.
+            // Currently, passing zero sized layout to alloc is UB.
+            // This will dodge allocator api for any type.
+            BYPASS_PTR.as_mut()
+        } else {
+            let layout = Layout::from_size_align_unchecked(size, ALIGNMENT);
+            std::alloc::alloc_zeroed(layout)
+        }
     }
 }
 
-pub unsafe fn free_aligned(p: *mut u8, size: usize) {
-    std::alloc::dealloc(p, Layout::from_size_align_unchecked(size, ALIGNMENT));
+pub unsafe fn free_aligned(ptr: *mut u8, size: usize) {
+    if size != 0x00 && ptr != BYPASS_PTR.as_mut() {
+        std::alloc::dealloc(ptr, Layout::from_size_align_unchecked(size, ALIGNMENT));
+    }
 }
 
 pub unsafe fn reallocate(ptr: *mut u8, old_size: usize, new_size: usize) -> *mut u8 {
-    let new_ptr = std::alloc::realloc(
-        ptr,
-        Layout::from_size_align_unchecked(old_size, ALIGNMENT),
-        new_size,
-    );
-    if !new_ptr.is_null() && new_size > old_size {
-        new_ptr.add(old_size).write_bytes(0, new_size - old_size);
+    if old_size == 0x00 && ptr == BYPASS_PTR.as_mut() {

Review comment:
       Addressed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] andygrove commented on pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

andygrove commented on pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#issuecomment-630489059


   @vertexclique Looks like a formatting issue needs to be resolved:
   
   ```
   Diff in /arrow/rust/arrow/src/memory.rs at line 28:
    /// As you can see this is global and lives as long as the program lives.
    /// Be careful to not write anything to this pointer in any scenario.
    /// If you use allocation methods shown here you won't have any problems.
   -const BYPASS_PTR: NonNull<u8> =
   -    unsafe { NonNull::new_unchecked(ALIGNMENT as *mut u8) };
   +const BYPASS_PTR: NonNull<u8> = unsafe { NonNull::new_unchecked(ALIGNMENT as *mut u8) };
    
    pub fn allocate_aligned(size: usize) -> *mut u8 {
        unsafe {
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] vertexclique commented on a change in pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

vertexclique commented on a change in pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#discussion_r418999354



##########
File path: rust/arrow/src/util/bit_util.rs
##########
@@ -148,11 +148,17 @@ pub fn count_set_bits_offset(data: &[u8], offset: usize, length: usize) -> usize
 /// Returns the ceil of `value`/`divisor`
 #[inline]
 pub fn ceil(value: usize, divisor: usize) -> usize {
-    let mut result = value / divisor;
-    if value % divisor != 0 {
-        result += 1
-    };
-    result
+    if value == 0_usize {

Review comment:
       Oh it is, meanwhile looking for zero sized allocations I came across with this, from this chunk of code:
   ```
   impl BufferBuilderTrait<BooleanType> for BufferBuilder<BooleanType> {
       fn new(capacity: usize) -> Self {
           let byte_capacity = bit_util::ceil(capacity, 8);
           let actual_capacity = bit_util::round_upto_multiple_of_64(byte_capacity);
           let mut buffer = MutableBuffer::new(actual_capacity);
           buffer.set_null_bits(0, actual_capacity);
           Self {
               buffer,
               len: 0,
               _marker: PhantomData,
           }
       }
   ```
   
   BufferBuilderTrait is using this code for every reallocation. Ceil is not euclidean C- division. According to this paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/divmodnote-letter.pdf So I thought better to use established C-division in this case, where things got improved from that side too.
   
   Separate PR should be open to fixing this in Parquet too.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] sunchao commented on a change in pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

sunchao commented on a change in pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#discussion_r418434103



##########
File path: rust/arrow/src/util/bit_util.rs
##########
@@ -148,11 +148,17 @@ pub fn count_set_bits_offset(data: &[u8], offset: usize, length: usize) -> usize
 /// Returns the ceil of `value`/`divisor`
 #[inline]
 pub fn ceil(value: usize, divisor: usize) -> usize {
-    let mut result = value / divisor;
-    if value % divisor != 0 {
-        result += 1
-    };
-    result
+    if value == 0_usize {

Review comment:
       what is the motivation for this change? I guess it is not related to zero sized allocation?

##########
File path: rust/arrow/src/memory.rs
##########
@@ -20,34 +20,56 @@
 
 use std::alloc::Layout;
 use std::mem::align_of;
+use std::ptr::NonNull;
 
 pub const ALIGNMENT: usize = 64;
 
+///
+/// As you can see this is global and lives as long as the program lives.
+/// Be careful to not write anything to this pointer in any scenario.
+/// If you use allocation methods shown here you won't have any problems.
+const BYPASS_PTR: NonNull<u8> = unsafe { NonNull::new_unchecked(0xDEADBEEF as *mut u8) };
+
 pub fn allocate_aligned(size: usize) -> *mut u8 {
     unsafe {
-        let layout = Layout::from_size_align_unchecked(size, ALIGNMENT);
-        std::alloc::alloc_zeroed(layout)
+        if size == 0 {
+            // In a perfect world, there is no need to request zero size allocation.
+            // Currently, passing zero sized layout to alloc is UB.
+            // This will dodge allocator api for any type.
+            BYPASS_PTR.as_mut()
+        } else {
+            let layout = Layout::from_size_align_unchecked(size, ALIGNMENT);
+            std::alloc::alloc_zeroed(layout)
+        }
     }
 }
 
-pub unsafe fn free_aligned(p: *mut u8, size: usize) {
-    std::alloc::dealloc(p, Layout::from_size_align_unchecked(size, ALIGNMENT));
+pub unsafe fn free_aligned(ptr: *mut u8, size: usize) {
+    if size != 0x00 && ptr != BYPASS_PTR.as_mut() {
+        std::alloc::dealloc(ptr, Layout::from_size_align_unchecked(size, ALIGNMENT));
+    }
 }
 
 pub unsafe fn reallocate(ptr: *mut u8, old_size: usize, new_size: usize) -> *mut u8 {
-    let new_ptr = std::alloc::realloc(
-        ptr,
-        Layout::from_size_align_unchecked(old_size, ALIGNMENT),
-        new_size,
-    );
-    if !new_ptr.is_null() && new_size > old_size {
-        new_ptr.add(old_size).write_bytes(0, new_size - old_size);
+    if old_size == 0x00 && ptr == BYPASS_PTR.as_mut() {

Review comment:
       is the first condition necessary? what if caller (accidentally) passed a positive `old_size` with the `BYPASS_PTR`? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] vertexclique commented on pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

vertexclique commented on pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#issuecomment-629201947


   @sunchao rebased.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] vertexclique commented on a change in pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

vertexclique commented on a change in pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#discussion_r419000857



##########
File path: rust/arrow/src/memory.rs
##########
@@ -20,34 +20,56 @@
 
 use std::alloc::Layout;
 use std::mem::align_of;
+use std::ptr::NonNull;
 
 pub const ALIGNMENT: usize = 64;
 
+///
+/// As you can see this is global and lives as long as the program lives.
+/// Be careful to not write anything to this pointer in any scenario.
+/// If you use allocation methods shown here you won't have any problems.
+const BYPASS_PTR: NonNull<u8> = unsafe { NonNull::new_unchecked(0xDEADBEEF as *mut u8) };
+
 pub fn allocate_aligned(size: usize) -> *mut u8 {
     unsafe {
-        let layout = Layout::from_size_align_unchecked(size, ALIGNMENT);
-        std::alloc::alloc_zeroed(layout)
+        if size == 0 {
+            // In a perfect world, there is no need to request zero size allocation.
+            // Currently, passing zero sized layout to alloc is UB.
+            // This will dodge allocator api for any type.
+            BYPASS_PTR.as_mut()
+        } else {
+            let layout = Layout::from_size_align_unchecked(size, ALIGNMENT);
+            std::alloc::alloc_zeroed(layout)
+        }
     }
 }
 
-pub unsafe fn free_aligned(p: *mut u8, size: usize) {
-    std::alloc::dealloc(p, Layout::from_size_align_unchecked(size, ALIGNMENT));
+pub unsafe fn free_aligned(ptr: *mut u8, size: usize) {
+    if size != 0x00 && ptr != BYPASS_PTR.as_mut() {
+        std::alloc::dealloc(ptr, Layout::from_size_align_unchecked(size, ALIGNMENT));
+    }
 }
 
 pub unsafe fn reallocate(ptr: *mut u8, old_size: usize, new_size: usize) -> *mut u8 {
-    let new_ptr = std::alloc::realloc(
-        ptr,
-        Layout::from_size_align_unchecked(old_size, ALIGNMENT),
-        new_size,
-    );
-    if !new_ptr.is_null() && new_size > old_size {
-        new_ptr.add(old_size).write_bytes(0, new_size - old_size);
+    if old_size == 0x00 && ptr == BYPASS_PTR.as_mut() {

Review comment:
       Addressed. True, that can be a case.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] vertexclique commented on a change in pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

Posted by GitBox <gi...@apache.org>.

vertexclique commented on a change in pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#discussion_r417823393



##########
File path: rust/arrow/src/memory.rs
##########
@@ -20,34 +20,56 @@
 
 use std::alloc::Layout;
 use std::mem::align_of;
+use std::ptr::NonNull;
 
 pub const ALIGNMENT: usize = 64;
 
+///
+/// As you can see this is global and lives as long as the program lives.
+/// Be careful to not write anything to this pointer in any scenario.
+/// If you use allocation methods shown here you won't have any problems.
+const BYPASS_PTR: NonNull<u8> = unsafe { NonNull::new_unchecked(0xDEADBEEF as *mut u8) };
+
 pub fn allocate_aligned(size: usize) -> *mut u8 {
     unsafe {

Review comment:
       It is allowed in the language itself, though my personal opinion that this should be allowed neither in this codebase nor in general at the language level. By definition, it still goes to the OS and does the necessary kernel procedure.
   
   I didn't want to change this code piece at first, wanted to fix it at the caller site. But I've discovered that the forgetting this at the caller site or checking it creates yet another responsibility to a newcomer to code. So I've explicitly put it in the place of memory operations.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org