Posted to reviews@spark.apache.org by "frankliee (via GitHub)" <gi...@apache.org> on 2023/04/06 10:00:42 UTC

[GitHub] [spark] frankliee commented on a diff in pull request #40646: [WIP][SPARK-42696] Speed up parquet reading with Java Vector API

frankliee commented on code in PR #40646:
URL: https://github.com/apache/spark/pull/40646#discussion_r1159572505


##########
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java:
##########
@@ -60,6 +71,22 @@ private enum MODE {
   private int bitWidth;
   private int bytesWidth;
   private BytePacker packer;
+  private BytePacker packerVector512;
+  private static final int BITS_PER_BYTE = 8;
+  // AVX-512 registers are 512 bits wide, so they can load up to 64 bytes
+  private static final int BYTES_PER_VECTOR_512 = 64;
+
+  // values are bit packed 8 at a time, so reading bitWidth will always work
+  private static final int NUM_VALUES_TO_PACK = 8;
+  private static final Boolean vector512Support;
+  static {
+    if (supportVector512FromCPUFlags()

Review Comment:
   Do we need to create a SparkContext in static code?
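
   One common way to address this concern is to defer the capability check out of the class's static initializer, e.g. with the initialization-on-demand holder idiom, so nothing configuration-dependent runs at class-load time. The sketch below is illustrative only: the class and method names are hypothetical, and the `detect()` body is a placeholder for whatever CPU-flag or configuration check the PR actually performs.

   ```java
   // Hypothetical sketch: lazy detection of 512-bit vector support.
   // Names (VectorSupport, detect, isVector512Supported) are not from
   // the PR; the point is the initialization-on-demand holder pattern.
   public class VectorSupport {
     private VectorSupport() {}

     // Holder is not loaded until isVector512Supported() is first
     // called, so detect() cannot run merely because the enclosing
     // class was loaded (e.g. before any context/config exists).
     private static final class Holder {
       static final boolean SUPPORTED = detect();
     }

     private static boolean detect() {
       // Placeholder for a real CPU-flag check; here a system
       // property override plus a coarse architecture test stand in.
       String override = System.getProperty("vector512.enabled");
       if (override != null) {
         return Boolean.parseBoolean(override);
       }
       return "amd64".equals(System.getProperty("os.arch"));
     }

     public static boolean isVector512Supported() {
       return Holder.SUPPORTED;
     }

     public static void main(String[] args) {
       System.out.println("vector512 supported: " + isVector512Supported());
     }
   }
   ```

   The JVM guarantees that `Holder`'s static initializer runs exactly once, under a class-init lock, at first access, which also makes the check thread-safe without explicit synchronization.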



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

