Posted to dev@spark.apache.org by beliefer <be...@163.com> on 2022/08/24 04:09:03 UTC

[DISCUSS] Improve the performance of Spark Decimal.

Hi all,




Recently, we found that many SQL queries could improve performance by replacing Spark Decimal with Double. This has been confirmed by many of our investigations and tests. We have also investigated other databases and big data systems, and found that several projects have issues or PRs aimed at improving the performance of Java BigDecimal.




Please refer to:




Hive: https://github.com/apache/hive/commit/98893823dc57a9c187ad5c97c9c8ce03af1fa734#diff-b4a1ba785de48d9b18f680e38c39c76e86d939c7c7a2dc75be776ff6b11999df




Hive uses an int array with a length of 4 to implement the new Decimal128 type.
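
For illustration, a rough Scala sketch of that shape (the field names here are mine, not Hive's actual code):

    // Sketch only: a 128-bit unscaled magnitude split across four ints,
    // with the sign and scale carried in separate fields.
    final class Decimal128Sketch(
        val v0: Int,       // bits 0..31 of the unscaled value
        val v1: Int,       // bits 32..63
        val v2: Int,       // bits 64..95
        val v3: Int,       // bits 96..127
        val signum: Byte,  // -1, 0, or 1
        val scale: Short)  // digits after the decimal point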




Trino: https://github.com/trinodb/trino/pull/10051




Trino references the GitHub project https://github.com/martint/int128. The int128 project uses a long array with a length of 2 to implement the Int128 type. Trino further encapsulates Int128 and implements its new decimal type on top of it.
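
Roughly, in Scala (names are illustrative, not the actual int128/Trino API):

    // Sketch only: a signed 128-bit integer held as two longs
    // (two's complement representation).
    final class Int128Sketch(val high: Long, val low: Long)

    // Trino-style decimal: an Int128 unscaled value plus a scale.
    final case class DecimalSketch(unscaled: Int128Sketch, scale: Int)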




Mapped onto the memory format of Spark SQL, the Hive approach not only stores 4 ints in UnsafeRow, but also stores one byte for the sign and one short for the scale. The Trino approach only needs to store 2 longs in UnsafeRow consecutively, plus one int for the scale. Because UnsafeRow stores one 8-byte word per field, the Hive approach needs to store 6 * 8 = 48 bytes, while the Trino approach only needs to store 3 * 8 = 24 bytes.




Considering that the sign and scale information required by the Hive approach can be expressed through the type metadata, it would still occupy 4 * 8 = 32 bytes. Likewise, the scale information required by the Trino approach can be stored in the type metadata, so it would only occupy 2 * 8 = 16 bytes.
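
To summarize the arithmetic above (one 8-byte word per UnsafeRow field):

    // Per-value footprint in UnsafeRow:
    //   Hive-style,  sign + scale stored inline: 4 ints + 1 byte + 1 short = 6 fields * 8 = 48 bytes
    //   Hive-style,  sign + scale in metadata:   4 ints                    = 4 fields * 8 = 32 bytes
    //   Trino-style, scale stored inline:        2 longs + 1 int           = 3 fields * 8 = 24 bytes
    //   Trino-style, scale in metadata:          2 longs                   = 2 fields * 8 = 16 bytes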




We have already implemented a new Decimal128 type backed by Int128, together with the + operator of Decimal128, in my forked Spark. A performance comparison between Decimal128 (Int128) and Spark Decimal shows that Decimal128 has a 1.5x ~ 3x performance improvement over Spark Decimal.
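
As a rough illustration (not the code in my fork), adding two 128-bit values held as long pairs only needs an add with carry when both operands share the same scale:

    // Sketch only: add two 128-bit values represented as (high, low) long pairs.
    def addInt128(aHigh: Long, aLow: Long, bHigh: Long, bLow: Long): (Long, Long) = {
      val low = aLow + bLow
      // A carry out of the low word happened iff the unsigned result wrapped.
      val carry = if (java.lang.Long.compareUnsigned(low, aLow) < 0) 1L else 0L
      (aHigh + bHigh + carry, low)  // overflow checking of the high word omitted
    }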




The attachment of this email contains the relevant design documents and benchmarks.




What do you think of this idea? 

If you support this idea, I could create a first PR for further and deeper discussion.

Re:[DISCUSS] Improve the performance of Spark Decimal.

Posted by beliefer <be...@163.com>.
There is a WIP PR https://github.com/apache/spark/pull/37536, which only implements the add operator of Decimal128.






