Posted to commits@impala.apache.org by st...@apache.org on 2021/11/29 01:24:15 UTC

[impala] 04/04: IMPALA-10994: Normalize the pip package name part of download URL.

This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit f566e7dee7b64dcf309bfd497b0296f8e547dfb7
Author: yx91490 <yx...@126.com>
AuthorDate: Mon Nov 1 12:33:50 2021 +0800

    IMPALA-10994: Normalize the pip package name part of download URL.
    
    According to PEP 503, a pip repository server is not required to
    support unnormalized URL access. Some package names in
    'infra/python/deps/*requirements.txt' are unnormalized, e.g. 'Cython',
    and pip_download.py concatenates $PYPI_MIRROR and the package name
    directly to build the download URL, which may therefore be
    unnormalized.
    
    Fix this by normalizing the package name in the download URL using the
    method recommended in PEP 503.
    
    Change-Id: I479df0ad7acf3c650b8f5317372261d5e2840864
    Reviewed-on: http://gerrit.cloudera.org:8080/17987
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 infra/python/deps/pip_download.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/infra/python/deps/pip_download.py b/infra/python/deps/pip_download.py
index b289ce1..d56e028 100755
--- a/infra/python/deps/pip_download.py
+++ b/infra/python/deps/pip_download.py
@@ -82,7 +82,8 @@ def get_package_info(pkg_name, pkg_version):
   # to sort them and return the first value in alphabetical order. This ensures that the
   # same result is always returned even if the ordering changed on the server.
   candidates = []
-  url = '{0}/simple/{1}/'.format(PYPI_MIRROR, pkg_name)
+  normalized_name = re.sub(r"[-_.]+", "-", pkg_name).lower()
+  url = '{0}/simple/{1}/'.format(PYPI_MIRROR, normalized_name)
   print('Getting package info from {0}'.format(url))
   # The web page should be in PEP 503 format (https://www.python.org/dev/peps/pep-0503/).
   # We parse the page with regex instead of an html parser because that requires
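
The normalization applied in the hunk above can be tried in isolation. The following is a minimal sketch, not the code from pip_download.py: the helper names `normalize_name` and `simple_index_url` and the mirror URL are hypothetical and used only for illustration. Only the regular expression and the URL template are taken from the patch.

```python
import re

# Placeholder mirror for illustration only; this value is an assumption
# and does not come from the patch.
PYPI_MIRROR = "https://pypi.example.org"


def normalize_name(pkg_name):
    """Normalize a project name the way PEP 503 recommends.

    Runs of '-', '_' and '.' collapse to a single '-', and the result is
    lowercased. This is the same expression the patch adds.
    """
    return re.sub(r"[-_.]+", "-", pkg_name).lower()


def simple_index_url(pkg_name, mirror=PYPI_MIRROR):
    """Build the simple-index URL for a package (hypothetical helper)."""
    return "{0}/simple/{1}/".format(mirror, normalize_name(pkg_name))


if __name__ == "__main__":
    # Names as they might appear in a requirements file.
    for name in ("Cython", "Flask_SQLAlchemy", "zope.interface"):
        print("{0} -> {1}".format(name, simple_index_url(name)))
```

With this sketch, 'Cython' maps to a URL ending in /simple/cython/, which is the form a PEP 503 client is expected to request.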