Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2024/03/11 19:27:44 UTC
[I] Add a section to the documentation explaining that PGO can help up substantially (25%) and maybe offer some tips for users to use it? [arrow-datafusion]

alamb opened a new issue, #9561:
URL: https://github.com/apache/arrow-datafusion/issues/9561

   > Add a section to the documentation explaining that PGO can help up substantially (25%) and maybe offer some tips for users to use it?
   
   Yes, it would be a great option. It requires almost no resources to maintain (write once and link to this discussion for the results). Users who are interested in optimizing `arrow-datafusion` further will be able to use this information as an additional optimization opportunity. Here are several examples of how such documentation can be written (they are for applications, but documentation for a library should look similar):
   
   * ClickHouse: https://clickhouse.com/docs/en/operations/optimizing-performance/profile-guided-optimization
   * Databend: https://databend.rs/doc/contributing/pgo
   * Vector: https://vector.dev/docs/administration/tuning/pgo/
   * Nebula: https://docs.nebula-graph.io/3.5.0/8.service-tuning/enable_autofdo_for_nebulagraph/
   * GCC: Official [docs](https://gcc.gnu.org/install/build.html), section "Building with profile feedback" (even AutoFDO build is supported)
   * Clang:
     - https://llvm.org/docs/HowToBuildWithPGO.html
     - https://llvm.org/docs/AdvancedBuilds.html
   * Rustc: https://rustc-dev-guide.rust-lang.org/building/optimized-build.html#profile-guided-optimization
   * tsv-utils: https://github.com/eBay/tsv-utils/blob/master/docs/BuildingWithLTO.md
   
   > Provide pre-gathered PGO data somehow, so users could build DataFusion with profiles guided from TPCH (or clickbench).
   
   Unfortunately, this approach is a bit trickier in practice. Pre-gathered PGO profiles have multiple issues, e.g. incompatibilities between different compiler versions and profile skew (when a PGO profile was gathered for an older version of the code). As the code evolves, pre-gathered PGO profiles become less and less effective, so some kind of regular PGO profile regeneration is required.
   
   I could suggest a similar alternative: integrate a PGO build mode into the build scripts (based on some workload like TPCH, Clickbench, any other target workload, or any combination of them; that's up for discussion). On the one hand, users will be able to build a PGO-optimized version of the library. On the other hand, you won't spend maintenance resources on keeping pre-gathered PGO profiles up to date (although that process can be simplified with CI).
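
As a rough illustration of what such build-script integration could look like for a Rust crate, the sketch below composes the usual four PGO steps (instrumented build, training run, profile merge, optimized rebuild) as command lines without executing them. The `-Cprofile-generate` / `-Cprofile-use` rustc flags and `llvm-profdata merge` are standard rustc/LLVM tooling; the `pgo_build_plan` and `render` helpers, the profile directory, and the `./target/release/tpch_bench` workload binary are names made up for this example, not anything DataFusion ships.

```python
from __future__ import annotations

import shlex
from pathlib import PurePosixPath


def pgo_build_plan(profile_dir: str, workload_cmd: list[str]) -> list[dict]:
    """Return the PGO workflow as a list of {"env", "argv"} steps (nothing is run)."""
    raw = PurePosixPath(profile_dir)      # directory receiving *.profraw files
    merged = raw / "merged.profdata"      # merged profile consumed by the final build
    return [
        # 1. Build an instrumented binary that records execution counts.
        {"env": {"RUSTFLAGS": f"-Cprofile-generate={raw}"},
         "argv": ["cargo", "build", "--release"]},
        # 2. Run the training workload (hypothetical command supplied by the caller).
        {"env": {}, "argv": list(workload_cmd)},
        # 3. Merge the raw profiles into a single .profdata file.
        {"env": {}, "argv": ["llvm-profdata", "merge", "-o", str(merged), str(raw)]},
        # 4. Rebuild, letting the compiler use the merged profile.
        {"env": {"RUSTFLAGS": f"-Cprofile-use={merged}"},
         "argv": ["cargo", "build", "--release"]},
    ]


def render(plan: list[dict]) -> list[str]:
    """Format each step as a shell-style line, e.g. for a Makefile or CI job."""
    lines = []
    for step in plan:
        env = " ".join(f"{k}={shlex.quote(v)}" for k, v in step["env"].items())
        lines.append(f"{env} {shlex.join(step['argv'])}".strip())
    return lines


if __name__ == "__main__":
    # Assumed paths: adjust the profile directory and workload to your project.
    for line in render(pgo_build_plan("/tmp/pgo-data", ["./target/release/tpch_bench"])):
        print(line)
```

Keeping the plan as data rather than running it directly makes it easy to swap the training workload (TPCH, Clickbench, or a mix) and to reuse the same steps in CI. The `llvm-profdata` used in step 3 should match the LLVM version of the Rust toolchain; one way to get a matching binary is the `llvm-tools-preview` rustup component.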
   
   Some examples of PGO build integration into the build scripts:
   
   * Rustc: a CI [tool](https://github.com/rust-lang/rust/tree/master/src/tools/opt-dist) for the multi-stage build
   * GCC:
     - Official [docs](https://gcc.gnu.org/install/build.html), section "Building with profile feedback" (even AutoFDO build is supported)
     - A [part](https://github.com/gcc-mirror/gcc/blob/master/configure#L7896) in a "wonderful" `configure` script.
   * Clang:
     - [Docs](https://llvm.org/docs/HowToBuildWithPGO.html)
     - [MinGW build script](https://github.com/msys2/MINGW-packages/commit/4dd91d1d4dfef17f1f451c3a8f59303be855e4b5)
   * Python:
     - CPython: [README](https://github.com/python/cpython#profile-guided-optimization)
     - Pyston: [README](https://github.com/pyston/pyston#building)
   * Go: [Bash script](https://github.com/golang/go/blob/master/src/cmd/compile/profile.sh)
   * Swift: [CMake script](https://github.com/apple/swift/blob/main/CMakeLists.txt#L364)
   * V8: [GN flag](https://github.com/v8/v8/blob/main/BUILD.gn#L184)
   * ChakraCore: [Scripts](https://github.com/chakra-core/ChakraCore/tree/master/Build/scripts/pgo)
   * Chromium: [Script](https://chromium.googlesource.com/chromium/src/build/config/+/refs/heads/main/compiler/pgo/BUILD.gn)
   * Firefox: [Docs](https://firefox-source-docs.mozilla.org/build/buildsystem/pgo.html)
      - Thunderbird has PGO support too
   * PHP - [Makefile command](https://github.com/php/php-src/blob/master/build/Makefile.global#L138) and old Centminmod [scripts](https://github.com/centminmod/php_pgo_training_scripts)
   * MySQL: [CMake script](https://github.com/mysql/mysql-server/blob/8.0/cmake/fprofile.cmake)
   * YugabyteDB: [GitHub commit](https://github.com/yugabyte/yugabyte-db/commit/34cb791ed9d3d5f8ae9a9b9e9181a46485e1981d)
   * FoundationDB: [Script](https://github.com/apple/foundationdb/blob/1a6114a66f3de508c0cf0a45f72f3687ba05750c/contrib/generate_profile.sh)
   * Zstd: [Makefile](https://github.com/facebook/zstd/blob/dev/programs/Makefile#L232)
   * [Foot](https://codeberg.org/dnkl/foot): [Scripts](https://codeberg.org/dnkl/foot/src/branch/master/pgo)
   * Windows Terminal: [GitHub PR](https://github.com/microsoft/terminal/pull/10071)
   * Pydantic-core: [GitHub PR](https://github.com/pydantic/pydantic-core/pull/741)
   * file.d: [GitHub PR](https://github.com/ozontech/file.d/pull/469)
   * OceanBase: [CMake flag](https://github.com/oceanbase/oceanbase/blob/master/cmake/Env.cmake#L55)
   * ISPC: [CMake scripts](https://github.com/ispc/ispc/tree/main/superbuild)
   * NodeJS: [Configure script](https://github.com/nodejs/node/commit/9be15559cc0bfe506d9cdfba4ad0f4beacf5ce17)
   * Android Open Source Project (AOSP):
     - [Official documentation](https://source.android.com/docs/core/perf/pgo)
     - Committed PGO profiles: [repository](https://android.googlesource.com/toolchain/pgo-profiles/+/refs/heads/main)
   * DMD: [Custom build rule](https://github.com/dlang/dmd/blob/master/compiler/src/build.d#L553)
   * LDC: [GitHub action](https://github.com/ldc-developers/ldc/blob/master/.github/actions/2a-build-pgo/action.yml)
   * tsv-utils: [Makefile](https://github.com/eBay/tsv-utils/blob/master/makefile#L56)
   * Erlang OTP: [Makefile](https://github.com/erlang/otp/blob/master/Makefile.in#L484)
   * Clingo (PGO enabled only in Spack): [Package recipe](https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/clingo-bootstrap/package.py#L96)
   * SWI-Prolog:
     - [Script](https://github.com/SWI-Prolog/swipl-devel/blob/master/scripts/pgo-compile.sh)
     - [CMake module](https://github.com/SWI-Prolog/swipl-devel/blob/master/cmake/PGO.cmake)
   * hck: [Justfile](https://github.com/sstadick/hck/blob/master/justfile#L27)
   
   If you ship prebuilt versions of the library (e.g. a Python wheel), you could consider pre-optimizing these prebuilt binaries with PGO (based on TPCH, Clickbench, etc.). As an example, see Pydantic-core: [GitHub PR](https://github.com/pydantic/pydantic-core/pull/741).
   
   > In general I don't think many organizations will set up PGO with their workload (as they often don't have an easy-to-run / representative benchmark available during builds or perhaps don't want to slow down their build process by running benchmarks at the same time)
   
   It's a pity, but I agree with you. Current PGO adoption across the industry is low (except at companies like Google and Facebook, which use PGO and similar optimization technologies like LLVM BOLT). This is the situation that I am trying to change by showing the positive effects of PGO on different applications.
   
   _Originally posted by @zamazan4ik in https://github.com/apache/arrow-datafusion/discussions/9507#discussioncomment-8730858_

