You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/06 11:50:05 UTC

[GitHub] [airflow] potiuk opened a new pull request, #24871: Run "fix-ownership" with sudo rather than docker image if specified

potiuk opened a new pull request, #24871:
URL: https://github.com/apache/airflow/pull/24871

   The "fix-ownership" command uses our breeze docker image to fix
   ownership of files generated by breeze. This is nice from the user
   perspective because there is no need to authenticate with password
   and - since we already have breeze image handy - it is fast and
   painless. However in CI, there are often cases where "fix-ownership"
   does not really have an image available and needs to pull one, which
   takes 1 minute. However the user in CI has sudoer capability without
   password, so we can use it to run fix-ownership with just plain
   sudo/chown.
   
   We switch all the CI jobs to "USE_SUDO" and also avoid running
   fix-ownership on other systems that Linux, as the ownership of
   files creaed from docker as root is only problem when the filesystem
   is directly mounted and used in the container (which happens only on
   Linux). MacOS and Windows have "user-space" filesystem that are
   much slower, but then they perform user-remapping on their own
   and files created in container have automatically ownership of
   the user who runs the docker container command.
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of an existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #24871: Run "fix-ownership" with sudo rather than docker image if specified

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #24871:
URL: https://github.com/apache/airflow/pull/24871#issuecomment-1176185496

   Seems it works. For most cases it will be under 1s as it used to be, but it will be way faster for the edge-cases.
   
   <img width="1656" alt="Screenshot 2022-07-06 at 14 51 30" src="https://user-images.githubusercontent.com/595491/177554347-426101ba-9fc2-4343-8157-af12a0f24f7f.png">
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jedcunningham commented on a diff in pull request #24871: Run "fix-ownership" with sudo rather than docker image if specified

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on code in PR #24871:
URL: https://github.com/apache/airflow/pull/24871#discussion_r916350016


##########
dev/breeze/src/airflow_breeze/commands/ci_commands.py:
##########
@@ -114,11 +127,68 @@ def resource_check(verbose: bool, dry_run: bool):
     check_docker_resources(shell_params.airflow_image_name, verbose=verbose, dry_run=dry_run)
 
 
+HOME_DIR = Path(os.path.expanduser('~')).resolve()
+
+DIRECTORIES_TO_FIX = [
+    AIRFLOW_SOURCES_ROOT,
+    HOME_DIR / ".aws",
+    HOME_DIR / ".azure",
+    HOME_DIR / ".config/gcloud",
+    HOME_DIR / ".docker",
+    AIRFLOW_SOURCES_ROOT,
+]
+
+
+def fix_ownership_for_file(file: Path, dry_run: bool, verbose: bool):
+    get_console().print(f"[info]Fixing ownership of {file}")
+    result = run_command(
+        ["sudo", "chown", f"{os.getuid}:{os.getgid()}", str(file.resolve())],
+        check=False,
+        stderr=subprocess.STDOUT,
+        dry_run=dry_run,
+        verbose=verbose,
+    )
+    if result.returncode != 0:
+        get_console().print(f"[warning]Could not fix ownership for {file}: {result.stdout}")
+
+
+def fix_ownership_for_path(path: Path, dry_run: bool, verbose: bool):
+    if path.is_dir():
+        for p in Path(path).rglob('*'):
+            if p.owner == 'root':
+                fix_ownership_for_file(p, dry_run=dry_run, verbose=verbose)
+    else:
+        if path.owner == 'root':
+            fix_ownership_for_file(path, dry_run=dry_run, verbose=verbose)
+
+
+def fix_ownership_without_docker(dry_run: bool, verbose: bool):
+    for directory_to_fix in DIRECTORIES_TO_FIX:
+        fix_ownership_for_path(directory_to_fix, dry_run=dry_run, verbose=verbose)
+
+
 @main.command(name="fix-ownership", help="Fix ownership of source files to be same as host user.")
+@click.option(
+    '--use-sudo',
+    is_flag=True,
+    help="Use sudo instead of docker image to fix the ownership. You need to be a `sudoer` to run it",
+    envvar='USE_SUDO',
+)
 @option_github_repository
 @option_verbose
 @option_dry_run
-def fix_ownership(github_repository: str, verbose: bool, dry_run: bool):
+def fix_ownership(github_repository: str, use_sudo: bool, verbose: bool, dry_run: bool):
+    system = platform.system().lower()
+    if system != 'linux':
+        get_console().print(
+            f"[warning]You should only need to run fix-ownership on Linux and your system is {system}"

Review Comment:
   Thanks for the context 🎉



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #24871: Run "fix-ownership" with sudo rather than docker image if specified

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #24871:
URL: https://github.com/apache/airflow/pull/24871#discussion_r916336469


##########
dev/breeze/src/airflow_breeze/commands/ci_commands.py:
##########
@@ -114,11 +127,68 @@ def resource_check(verbose: bool, dry_run: bool):
     check_docker_resources(shell_params.airflow_image_name, verbose=verbose, dry_run=dry_run)
 
 
+HOME_DIR = Path(os.path.expanduser('~')).resolve()
+
+DIRECTORIES_TO_FIX = [
+    AIRFLOW_SOURCES_ROOT,
+    HOME_DIR / ".aws",
+    HOME_DIR / ".azure",
+    HOME_DIR / ".config/gcloud",
+    HOME_DIR / ".docker",
+    AIRFLOW_SOURCES_ROOT,
+]
+
+
+def fix_ownership_for_file(file: Path, dry_run: bool, verbose: bool):
+    get_console().print(f"[info]Fixing ownership of {file}")
+    result = run_command(
+        ["sudo", "chown", f"{os.getuid}:{os.getgid()}", str(file.resolve())],
+        check=False,
+        stderr=subprocess.STDOUT,
+        dry_run=dry_run,
+        verbose=verbose,
+    )
+    if result.returncode != 0:
+        get_console().print(f"[warning]Could not fix ownership for {file}: {result.stdout}")
+
+
+def fix_ownership_for_path(path: Path, dry_run: bool, verbose: bool):
+    if path.is_dir():
+        for p in Path(path).rglob('*'):
+            if p.owner == 'root':
+                fix_ownership_for_file(p, dry_run=dry_run, verbose=verbose)
+    else:
+        if path.owner == 'root':
+            fix_ownership_for_file(path, dry_run=dry_run, verbose=verbose)
+
+
+def fix_ownership_without_docker(dry_run: bool, verbose: bool):
+    for directory_to_fix in DIRECTORIES_TO_FIX:
+        fix_ownership_for_path(directory_to_fix, dry_run=dry_run, verbose=verbose)
+
+
 @main.command(name="fix-ownership", help="Fix ownership of source files to be same as host user.")
+@click.option(
+    '--use-sudo',
+    is_flag=True,
+    help="Use sudo instead of docker image to fix the ownership. You need to be a `sudoer` to run it",
+    envvar='USE_SUDO',
+)
 @option_github_repository
 @option_verbose
 @option_dry_run
-def fix_ownership(github_repository: str, verbose: bool, dry_run: bool):
+def fix_ownership(github_repository: str, use_sudo: bool, verbose: bool, dry_run: bool):
+    system = platform.system().lower()
+    if system != 'linux':
+        get_console().print(
+            f"[warning]You should only need to run fix-ownership on Linux and your system is {system}"

Review Comment:
   See above - I think it's not an error. 
   
   One more comment - why: mainly because "fix-ownership" should be used in CI scripts. Which usually run a sequence  of commands. And "fix-ownership" main goal is to make sure that the checked out directory of Airlfow is in "pristine" state"  - i.e. no files have "root" ownership (because if they are then it makes further check-outs potentially failing when run as non-root user).
   
   So any kind of sequence of operations that we run in CI is:
   
   1) checkout 
   2) do some stuff (potentially in docker producing root-owned files on linux)
   3) fix-ownership 
   
   Result: -> "pristine" workspace, no "root owned" files
   
   if we start to run our CI on Mac or Windows (we will likely eventually run some CI jobs on Windows at the very least) , if you error out in step 3), you have to do script like this:
   
   1) checkout
   2) do some stuff (potentially in docker producing root-owned files on linux)
   3) if Linux: fix-ownership
   
   I prefer to run the conditional inside fix-ownership command and NOT error-out on windows. because otherwise i have to repeat "If linux" in all places.
   
   We already run fix-ownershio in 25 places in our CI
   
   ```
   [jarek:~/code/airflow-copy] [airflow-3.9] v2-3-test+ ± grep fix-ownership .github/workflows/*.yml | wc
         25     100    1526
   ```
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #24871: Run "fix-ownership" with sudo rather than docker image if specified

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #24871:
URL: https://github.com/apache/airflow/pull/24871


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #24871: Run "fix-ownership" with sudo rather than docker image if specified

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #24871:
URL: https://github.com/apache/airflow/pull/24871#issuecomment-1176127473

   That should speed up a little some of the jobs, it should also significantly speed up the speed of cancelling a job. Currently there are some cases when image is not yet build and when we run `fix-ownership` it will pull the image before running and it adds about a minute to cancelling a job. After this one is merged, cancelling should be much faster in those cases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jedcunningham commented on a diff in pull request #24871: Run "fix-ownership" with sudo rather than docker image if specified

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on code in PR #24871:
URL: https://github.com/apache/airflow/pull/24871#discussion_r916321910


##########
dev/breeze/src/airflow_breeze/commands/ci_commands.py:
##########
@@ -114,11 +127,68 @@ def resource_check(verbose: bool, dry_run: bool):
     check_docker_resources(shell_params.airflow_image_name, verbose=verbose, dry_run=dry_run)
 
 
+HOME_DIR = Path(os.path.expanduser('~')).resolve()
+
+DIRECTORIES_TO_FIX = [
+    AIRFLOW_SOURCES_ROOT,
+    HOME_DIR / ".aws",
+    HOME_DIR / ".azure",
+    HOME_DIR / ".config/gcloud",
+    HOME_DIR / ".docker",
+    AIRFLOW_SOURCES_ROOT,
+]
+
+
+def fix_ownership_for_file(file: Path, dry_run: bool, verbose: bool):
+    get_console().print(f"[info]Fixing ownership of {file}")
+    result = run_command(
+        ["sudo", "chown", f"{os.getuid}:{os.getgid()}", str(file.resolve())],
+        check=False,
+        stderr=subprocess.STDOUT,
+        dry_run=dry_run,
+        verbose=verbose,
+    )
+    if result.returncode != 0:
+        get_console().print(f"[warning]Could not fix ownership for {file}: {result.stdout}")
+
+
+def fix_ownership_for_path(path: Path, dry_run: bool, verbose: bool):
+    if path.is_dir():
+        for p in Path(path).rglob('*'):
+            if p.owner == 'root':
+                fix_ownership_for_file(p, dry_run=dry_run, verbose=verbose)
+    else:
+        if path.owner == 'root':
+            fix_ownership_for_file(path, dry_run=dry_run, verbose=verbose)
+
+
+def fix_ownership_without_docker(dry_run: bool, verbose: bool):
+    for directory_to_fix in DIRECTORIES_TO_FIX:
+        fix_ownership_for_path(directory_to_fix, dry_run=dry_run, verbose=verbose)
+
+
 @main.command(name="fix-ownership", help="Fix ownership of source files to be same as host user.")
+@click.option(
+    '--use-sudo',
+    is_flag=True,
+    help="Use sudo instead of docker image to fix the ownership. You need to be a `sudoer` to run it",
+    envvar='USE_SUDO',
+)
 @option_github_repository
 @option_verbose
 @option_dry_run
-def fix_ownership(github_repository: str, verbose: bool, dry_run: bool):
+def fix_ownership(github_repository: str, use_sudo: bool, verbose: bool, dry_run: bool):
+    system = platform.system().lower()
+    if system != 'linux':
+        get_console().print(
+            f"[warning]You should only need to run fix-ownership on Linux and your system is {system}"
+        )
+        sys.exit(0)

Review Comment:
   Should this be non-zero?



##########
dev/breeze/src/airflow_breeze/commands/ci_commands.py:
##########
@@ -114,11 +127,68 @@ def resource_check(verbose: bool, dry_run: bool):
     check_docker_resources(shell_params.airflow_image_name, verbose=verbose, dry_run=dry_run)
 
 
+HOME_DIR = Path(os.path.expanduser('~')).resolve()
+
+DIRECTORIES_TO_FIX = [
+    AIRFLOW_SOURCES_ROOT,
+    HOME_DIR / ".aws",
+    HOME_DIR / ".azure",
+    HOME_DIR / ".config/gcloud",
+    HOME_DIR / ".docker",
+    AIRFLOW_SOURCES_ROOT,
+]
+
+
+def fix_ownership_for_file(file: Path, dry_run: bool, verbose: bool):
+    get_console().print(f"[info]Fixing ownership of {file}")
+    result = run_command(
+        ["sudo", "chown", f"{os.getuid}:{os.getgid()}", str(file.resolve())],
+        check=False,
+        stderr=subprocess.STDOUT,
+        dry_run=dry_run,
+        verbose=verbose,
+    )
+    if result.returncode != 0:
+        get_console().print(f"[warning]Could not fix ownership for {file}: {result.stdout}")
+
+
+def fix_ownership_for_path(path: Path, dry_run: bool, verbose: bool):
+    if path.is_dir():
+        for p in Path(path).rglob('*'):
+            if p.owner == 'root':
+                fix_ownership_for_file(p, dry_run=dry_run, verbose=verbose)
+    else:
+        if path.owner == 'root':
+            fix_ownership_for_file(path, dry_run=dry_run, verbose=verbose)
+
+
+def fix_ownership_without_docker(dry_run: bool, verbose: bool):
+    for directory_to_fix in DIRECTORIES_TO_FIX:
+        fix_ownership_for_path(directory_to_fix, dry_run=dry_run, verbose=verbose)
+
+
 @main.command(name="fix-ownership", help="Fix ownership of source files to be same as host user.")
+@click.option(
+    '--use-sudo',
+    is_flag=True,
+    help="Use sudo instead of docker image to fix the ownership. You need to be a `sudoer` to run it",
+    envvar='USE_SUDO',
+)
 @option_github_repository
 @option_verbose
 @option_dry_run
-def fix_ownership(github_repository: str, verbose: bool, dry_run: bool):
+def fix_ownership(github_repository: str, use_sudo: bool, verbose: bool, dry_run: bool):
+    system = platform.system().lower()
+    if system != 'linux':
+        get_console().print(
+            f"[warning]You should only need to run fix-ownership on Linux and your system is {system}"

Review Comment:
   Not sure this is a warning either. Error maybe?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #24871: Run "fix-ownership" with sudo rather than docker image if specified

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #24871:
URL: https://github.com/apache/airflow/pull/24871#issuecomment-1178178129

   Small thing but should speed up certain buids


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #24871: Run "fix-ownership" with sudo rather than docker image if specified

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #24871:
URL: https://github.com/apache/airflow/pull/24871#discussion_r916336469


##########
dev/breeze/src/airflow_breeze/commands/ci_commands.py:
##########
@@ -114,11 +127,68 @@ def resource_check(verbose: bool, dry_run: bool):
     check_docker_resources(shell_params.airflow_image_name, verbose=verbose, dry_run=dry_run)
 
 
+HOME_DIR = Path(os.path.expanduser('~')).resolve()
+
+DIRECTORIES_TO_FIX = [
+    AIRFLOW_SOURCES_ROOT,
+    HOME_DIR / ".aws",
+    HOME_DIR / ".azure",
+    HOME_DIR / ".config/gcloud",
+    HOME_DIR / ".docker",
+    AIRFLOW_SOURCES_ROOT,
+]
+
+
+def fix_ownership_for_file(file: Path, dry_run: bool, verbose: bool):
+    get_console().print(f"[info]Fixing ownership of {file}")
+    result = run_command(
+        ["sudo", "chown", f"{os.getuid}:{os.getgid()}", str(file.resolve())],
+        check=False,
+        stderr=subprocess.STDOUT,
+        dry_run=dry_run,
+        verbose=verbose,
+    )
+    if result.returncode != 0:
+        get_console().print(f"[warning]Could not fix ownership for {file}: {result.stdout}")
+
+
+def fix_ownership_for_path(path: Path, dry_run: bool, verbose: bool):
+    if path.is_dir():
+        for p in Path(path).rglob('*'):
+            if p.owner == 'root':
+                fix_ownership_for_file(p, dry_run=dry_run, verbose=verbose)
+    else:
+        if path.owner == 'root':
+            fix_ownership_for_file(path, dry_run=dry_run, verbose=verbose)
+
+
+def fix_ownership_without_docker(dry_run: bool, verbose: bool):
+    for directory_to_fix in DIRECTORIES_TO_FIX:
+        fix_ownership_for_path(directory_to_fix, dry_run=dry_run, verbose=verbose)
+
+
 @main.command(name="fix-ownership", help="Fix ownership of source files to be same as host user.")
+@click.option(
+    '--use-sudo',
+    is_flag=True,
+    help="Use sudo instead of docker image to fix the ownership. You need to be a `sudoer` to run it",
+    envvar='USE_SUDO',
+)
 @option_github_repository
 @option_verbose
 @option_dry_run
-def fix_ownership(github_repository: str, verbose: bool, dry_run: bool):
+def fix_ownership(github_repository: str, use_sudo: bool, verbose: bool, dry_run: bool):
+    system = platform.system().lower()
+    if system != 'linux':
+        get_console().print(
+            f"[warning]You should only need to run fix-ownership on Linux and your system is {system}"

Review Comment:
   See above - I think it's not an error. 
   
   One more comment - why: mainly because "fix-ownership" shoudl be used in CI scripts. Which usually run a sequence  of commands. And "fix-ownership" main goal is to make sure that the checked out directory of Airlfow is in "pristine" state"  - i.e. no files have "root" ownership (because if they are then it makes further check-outs potentially failing when run as non-root user).
   
   So any kind of sequence of operations that we run in CI is:
   
   1) checkout 
   2) do some stuff (potentially in docker producing root-owned files on linux)
   3) fix-ownership 
   
   Result: -> "pristine" workspace, no "root owned" files
   
   if we start to run our CI on Mac or Windows (we will likely eventually run some CI jobs on Windows at the very least) , if you error out in step 3), you have to do script like this:
   
   1) checkout
   2) do some stuff (potentially in docker producing root-owned files on linux)
   3) if Linux: fix-ownership
   
   I prefer to run the conditional inside fix-ownership command and NOT error-out on windows. because otherwise i have to repeat "If linux" in all places.
   
   We already run fix-ownershio in 25 places in our CI
   
   ```
   [jarek:~/code/airflow-copy] [airflow-3.9] v2-3-test+ ± grep fix-ownership .github/workflows/*.yml | wc
         25     100    1526
   ```
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #24871: Run "fix-ownership" with sudo rather than docker image if specified

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #24871:
URL: https://github.com/apache/airflow/pull/24871#discussion_r916339053


##########
dev/breeze/src/airflow_breeze/commands/ci_commands.py:
##########
@@ -114,11 +127,68 @@ def resource_check(verbose: bool, dry_run: bool):
     check_docker_resources(shell_params.airflow_image_name, verbose=verbose, dry_run=dry_run)
 
 
+HOME_DIR = Path(os.path.expanduser('~')).resolve()
+
+DIRECTORIES_TO_FIX = [
+    AIRFLOW_SOURCES_ROOT,
+    HOME_DIR / ".aws",
+    HOME_DIR / ".azure",
+    HOME_DIR / ".config/gcloud",
+    HOME_DIR / ".docker",
+    AIRFLOW_SOURCES_ROOT,
+]
+
+
+def fix_ownership_for_file(file: Path, dry_run: bool, verbose: bool):
+    get_console().print(f"[info]Fixing ownership of {file}")
+    result = run_command(
+        ["sudo", "chown", f"{os.getuid}:{os.getgid()}", str(file.resolve())],
+        check=False,
+        stderr=subprocess.STDOUT,
+        dry_run=dry_run,
+        verbose=verbose,
+    )
+    if result.returncode != 0:
+        get_console().print(f"[warning]Could not fix ownership for {file}: {result.stdout}")
+
+
+def fix_ownership_for_path(path: Path, dry_run: bool, verbose: bool):
+    if path.is_dir():
+        for p in Path(path).rglob('*'):
+            if p.owner == 'root':
+                fix_ownership_for_file(p, dry_run=dry_run, verbose=verbose)
+    else:
+        if path.owner == 'root':
+            fix_ownership_for_file(path, dry_run=dry_run, verbose=verbose)
+
+
+def fix_ownership_without_docker(dry_run: bool, verbose: bool):
+    for directory_to_fix in DIRECTORIES_TO_FIX:
+        fix_ownership_for_path(directory_to_fix, dry_run=dry_run, verbose=verbose)
+
+
 @main.command(name="fix-ownership", help="Fix ownership of source files to be same as host user.")
+@click.option(
+    '--use-sudo',
+    is_flag=True,
+    help="Use sudo instead of docker image to fix the ownership. You need to be a `sudoer` to run it",
+    envvar='USE_SUDO',
+)
 @option_github_repository
 @option_verbose
 @option_dry_run
-def fix_ownership(github_repository: str, verbose: bool, dry_run: bool):
+def fix_ownership(github_repository: str, use_sudo: bool, verbose: bool, dry_run: bool):
+    system = platform.system().lower()
+    if system != 'linux':
+        get_console().print(
+            f"[warning]You should only need to run fix-ownership on Linux and your system is {system}"

Review Comment:
   BTW. Just to give a little context on why we need the `fix-ownership` in CI at all.
   
   The "root" (pun intended!) problenm here is that we are re-using workers sometimes. This means that when we "leave" the job, we should do some cleanup. I would prefer (and this is what I normally do) to  always start from scratch on CI. I call it "TRUE CLEAN STATE" of the CI worker. This is what GitHub public runners do - you always get empty, pristine, clean workspace when you start your job. I would very, very much prefer it - we already had some cases where left-overs from previous jobs (like those root-owned fles) created problem for the next worker (when they tried to checkout latest branch, they failed as they could not delete the root-owned files).
   
   But in our case currently we use Self-hosted runners which are often-reused-AWS instances. until we move to K8S, we cannot really do much about it (unless we want to loose extra time on spinning the machines) and we do everything possible to clean-up the machines after the job is complete, so that the next job gets a "clean state" - this "fix-ownership" is part of the cleanup. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #24871: Run "fix-ownership" with sudo rather than docker image if specified

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #24871:
URL: https://github.com/apache/airflow/pull/24871#discussion_r916333504


##########
dev/breeze/src/airflow_breeze/commands/ci_commands.py:
##########
@@ -114,11 +127,68 @@ def resource_check(verbose: bool, dry_run: bool):
     check_docker_resources(shell_params.airflow_image_name, verbose=verbose, dry_run=dry_run)
 
 
+HOME_DIR = Path(os.path.expanduser('~')).resolve()
+
+DIRECTORIES_TO_FIX = [
+    AIRFLOW_SOURCES_ROOT,
+    HOME_DIR / ".aws",
+    HOME_DIR / ".azure",
+    HOME_DIR / ".config/gcloud",
+    HOME_DIR / ".docker",
+    AIRFLOW_SOURCES_ROOT,
+]
+
+
+def fix_ownership_for_file(file: Path, dry_run: bool, verbose: bool):
+    get_console().print(f"[info]Fixing ownership of {file}")
+    result = run_command(
+        ["sudo", "chown", f"{os.getuid}:{os.getgid()}", str(file.resolve())],
+        check=False,
+        stderr=subprocess.STDOUT,
+        dry_run=dry_run,
+        verbose=verbose,
+    )
+    if result.returncode != 0:
+        get_console().print(f"[warning]Could not fix ownership for {file}: {result.stdout}")
+
+
+def fix_ownership_for_path(path: Path, dry_run: bool, verbose: bool):
+    if path.is_dir():
+        for p in Path(path).rglob('*'):
+            if p.owner == 'root':
+                fix_ownership_for_file(p, dry_run=dry_run, verbose=verbose)
+    else:
+        if path.owner == 'root':
+            fix_ownership_for_file(path, dry_run=dry_run, verbose=verbose)
+
+
+def fix_ownership_without_docker(dry_run: bool, verbose: bool):
+    for directory_to_fix in DIRECTORIES_TO_FIX:
+        fix_ownership_for_path(directory_to_fix, dry_run=dry_run, verbose=verbose)
+
+
 @main.command(name="fix-ownership", help="Fix ownership of source files to be same as host user.")
+@click.option(
+    '--use-sudo',
+    is_flag=True,
+    help="Use sudo instead of docker image to fix the ownership. You need to be a `sudoer` to run it",
+    envvar='USE_SUDO',
+)
 @option_github_repository
 @option_verbose
 @option_dry_run
-def fix_ownership(github_repository: str, verbose: bool, dry_run: bool):
+def fix_ownership(github_repository: str, use_sudo: bool, verbose: bool, dry_run: bool):
+    system = platform.system().lower()
+    if system != 'linux':
+        get_console().print(
+            f"[warning]You should only need to run fix-ownership on Linux and your system is {system}"
+        )
+        sys.exit(0)

Review Comment:
   Yeah I asked myself the same question (and even changed it as originally I had error there). I think warning and exit(0) is better.  
   
   The thing is that when you run "fix-ownership" you expect the ownership to be fixed at the end. And in a weird way, on Mac an Windows you achieve your goal - because the files did not have wrong ownership in the first place. So the goal of the command is achieved.
   
   That's why I think warning and "exit(0)" is fine :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org