You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/06/01 09:47:06 UTC

[GitHub] [iceberg] chulucninh09 opened a new pull request, #4929: Improve binary pack/unpack performance using cached struct

chulucninh09 opened a new pull request, #4929:
URL: https://github.com/apache/iceberg/pull/4929

   Instead of construct `Struct` each time in conversion functions using `struct(c_type).unpack`, we can create struct one in module and use the cached structs to pack and unpack, getting performance gain.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #4929: Improve binary pack/unpack performance using cached struct

Posted by GitBox <gi...@apache.org>.
rdblue merged PR #4929:
URL: https://github.com/apache/iceberg/pull/4929


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chulucninh09 commented on a diff in pull request #4929: Improve binary pack/unpack performance using cached struct

Posted by GitBox <gi...@apache.org>.
chulucninh09 commented on code in PR #4929:
URL: https://github.com/apache/iceberg/pull/4929#discussion_r888590997


##########
python/src/iceberg/conversions.py:
##########
@@ -53,6 +53,13 @@
 from iceberg.utils.decimal import decimal_to_bytes, unscaled_to_decimal
 
 
+_bool_struct = Struct("<?")

Review Comment:
   I changed it to all caps with leading underscore to follow the convention @Fokko @kbendick @samredai 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on a diff in pull request #4929: Improve binary pack/unpack performance using cached struct

Posted by GitBox <gi...@apache.org>.
kbendick commented on code in PR #4929:
URL: https://github.com/apache/iceberg/pull/4929#discussion_r887069062


##########
python/src/iceberg/conversions.py:
##########
@@ -53,6 +53,13 @@
 from iceberg.utils.decimal import decimal_to_bytes, unscaled_to_decimal
 
 
+_bool_struct = Struct("<?")

Review Comment:
   I would personally still think of it as a constant regardless of complexity, but I agree so long as they're not used outside of this file that's the more important thing to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #4929: Improve binary pack/unpack performance using cached struct

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #4929:
URL: https://github.com/apache/iceberg/pull/4929#issuecomment-1146859786

   Thanks, @chulucninh09!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on a diff in pull request #4929: Improve binary pack/unpack performance using cached struct

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #4929:
URL: https://github.com/apache/iceberg/pull/4929#discussion_r886814317


##########
python/src/iceberg/conversions.py:
##########
@@ -53,6 +53,13 @@
 from iceberg.utils.decimal import decimal_to_bytes, unscaled_to_decimal
 
 
+_bool_struct = Struct("<?")

Review Comment:
   Since they are private (starting with an underscore), they shouldn't be used outside of the file. We could also consolidate this later on with my open PR: https://github.com/apache/iceberg/pull/4920/files#diff-9747a5712c71edff07e52c56ae999d30fa9dde758ba94e376acd16819e055731R23-R27



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] samredai commented on a diff in pull request #4929: Improve binary pack/unpack performance using cached struct

Posted by GitBox <gi...@apache.org>.
samredai commented on code in PR #4929:
URL: https://github.com/apache/iceberg/pull/4929#discussion_r886809691


##########
python/src/iceberg/conversions.py:
##########
@@ -53,6 +53,13 @@
 from iceberg.utils.decimal import decimal_to_bytes, unscaled_to_decimal
 
 
+_bool_struct = Struct("<?")

Review Comment:
   Do these count as constants? If so the python style guide recommends all caps for the name.
   
   *PEP8 - Constants*
   > Constants are usually defined on a module level and written in all capital letters with underscores separating words. Examples include MAX_OVERFLOW and TOTAL.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chulucninh09 commented on pull request #4929: Improve binary pack/unpack performance using cached struct

Posted by GitBox <gi...@apache.org>.
chulucninh09 commented on PR #4929:
URL: https://github.com/apache/iceberg/pull/4929#issuecomment-1145562231

   > @chulucninh09 Can you run the linter? You can easily do this by running:
   > 
   > ```shell
   > make install
   > make lint
   > ```
   > 
   > This will make the CI pass 👍🏻
   
   I linted the code. And also I change the naming to all cap with `_` to follow the convention


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on pull request #4929: Improve binary pack/unpack performance using cached struct

Posted by GitBox <gi...@apache.org>.
Fokko commented on PR #4929:
URL: https://github.com/apache/iceberg/pull/4929#issuecomment-1145155081

   @chulucninh09 Can you run the linter? You can easily do this by running:
   ```python
   make install
   make lint
   ```
   This will make the CI pass 👍🏻 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chulucninh09 commented on a diff in pull request #4929: Improve binary pack/unpack performance using cached struct

Posted by GitBox <gi...@apache.org>.
chulucninh09 commented on code in PR #4929:
URL: https://github.com/apache/iceberg/pull/4929#discussion_r887008616


##########
python/src/iceberg/conversions.py:
##########
@@ -53,6 +53,13 @@
 from iceberg.utils.decimal import decimal_to_bytes, unscaled_to_decimal
 
 
+_bool_struct = Struct("<?")

Review Comment:
   @samredai @Fokko correct me if I'm wrong. I consider this cached instance instead of constant because most of the time const is scalar value to me. These are class instance, kind of too complex object to consider it const by my understanding



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org