You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/04 09:01:52 UTC

[GitHub] [airflow-site] Bowrna opened a new pull request, #623: replacing site.sh with python

Bowrna opened a new pull request, #623:
URL: https://github.com/apache/airflow-site/pull/623

   closes: #338
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1384518413

   There are a number of problems with dependencies the current approach has (and it caused us some problems in the recent past) so I think this is somethign we have to do anyway soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1196301009

   > Have you run `git submodules update --init`?
   
   Yes after running this the `build-site` command works. Thanks, @rossturk. 
   I have error occurring in `check-site-links` command and it fails with following error message.
   
   ```
   sathishkannan@Sathishs-MacBook-Air airflow-site % python site_tool.py check-site-links
   Checking if node module exists
   Check if landing-pages/dist/index.html file exists
   /opt/homebrew/bin/lynx
   Working directory: /Users/sathishkannan/code/airflow-site/landing-pages
   Dist directory: ./dist/
   
   ./check-links.sh: line 61: readarray: command not found
   sathishkannan@Sathishs-MacBook-Air airflow-site % ./site.sh check-site-links
   2022-07-27 11:32:33:INFO: Checking if node module exists
   2022-07-27 11:32:33:INFO: Check if /Users/sathishkannan/code/airflow-site/landing-pages/dist/index.html file exists
   2022-07-27 11:32:33:INFO: Changing to /Users/sathishkannan/code/airflow-site/landing-pages/ and running: ./check-links.sh@
   /opt/homebrew/bin/lynx
   Working directory: /Users/sathishkannan/code/airflow-site/landing-pages
   Dist directory: ./dist/
   
   ./check-links.sh: line 61: readarray: command not found
   sathishkannan@Sathishs-MacBook-Air airflow-site %
   ```
   Am I missing downloading anything here? I checked in CONTRIBUTORS.md guide but didn't find anything relevant to this. 
   I am working on Mac M1 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1199182911

   > > When executing in terminal I see `index.html` from all providers and its different version say example `/dist//docs/apache-airflow-providers-imap/1.0.0/_modules/index.html`. But i don't see them when executing it via `check-links.sh`
   > 
   > Could you share your view for this point? This is also another difference in the output when executing it in different ways @potiuk .
   
   I think it's just the matter of when it is called and whether the files are there when it is executed. Some earlier steps likely prepare the files in dist folder and maybe that step is missing when you run it manually ? Can't think of any other reason. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1196761388

   I think it has never been supposed to be run on Mac :). 
   
   And one of the reasons we want to rewrite everything in Python, is that bash is no longer "common" accross multiple platform. The MacOS Bash is Bash 3 (for licencing reason - Apple refused to use Bash4 due to GPLv3 (https://news.ycombinator.com/item?id=20102597)  and a lot of tools there are not really POSIX compliant, so there are plenty of hacks around that (readarray is one of the features of Bash4).
   
   The ultiimate goal we have is that Breeze and all our tools can be used WITHOUT having or using bash in the Host. Fully, Once we get there, the next step will be to make sure that it works on Windows.
   
   And that should be our goal here as well (even though site.sh is not really used by many people). So - let's just not mention in CONTRIBUTING.md that you need to install bash from brew. I deliberately avoided that necessity, because that it yet another prerequisite that adds friction to the first-time experience. 
   
   Ideally just `Python3 + Docker + Docker-Compose` should be enough to do everything developer needs to do in Airlfow.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] rossturk commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
rossturk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1195629717

   
   > Error: Error building site: failed to render pages: render of "page" failed: "/Users/sathishkannan/code/airflow-site/landing-pages/site/layouts/blog/baseof.html:23:7": execute of template failed: template: blog/single.html:23:7: executing "blog/single.html" at <partial "head.html" .>: error calling partial: partial "head.html" not found
   
   Hmmmm I may be wrong here but if I remember correctly this is one of the errors that occurs when the submodules aren't checked out.
   
   Have you run `git submodules update --init`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1374354911

   @potiuk @mik-laj 
   https://github.com/apache/airflow-site/blob/main/landing-pages/check-links.sh#L65
   
   `mapfile -t links < <(printf '%s\n' "${pages[@]}" | xargs -n 1 lynx -listonly -nonumbers -dump | grep -v " ")
   `
   Here to check the site links, we rely on the lynx command to get the clean url from the html file. Could i use the same command in python or is there any other way that helps to get the url in clean way?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] eladkal commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
eladkal commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1359363323

   @Bowrna are you still working on this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1192201127

   > @Bowrna see: https://stackoverflow.com/a/12498485 This generates a relative path for use in a container.
   
   Could you tell me why we need to convert the absolute path into relative path to run the script?  Wouldn't it run fine with the absolute path? @mik-laj 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] rossturk commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
rossturk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1196829117

   > > I think perhaps CONTRIBUTING.md should instruct folks to install bash from Homebrew. I haven't tested it myself, but this looks promising:
   > 
   > 
   > 
   > This has profound consequences. Many tools which "assume" bash is v3 on Mac will stop working if you make Bash 4 default. Even if you try to install it via `brew` it will tell you that it is `cask-only`, which means that it is not becoming a default `bash` - you need to perform an extra step to make it default precisely because it might break a lot of things on Mac.  
   
   I'm sorry, what are you suggesting as an alternative?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1194209955

   > Is this redirect still valid even after removing the docker part @mik-laj @potiuk @turbaszek @rossturk
   
   Nope. Not needed. I think it was not even needed before for a long time already. This is likely a remnant of a very old approach - long time defunct.
   
   I think what is important is that after buildng the site locallly, you can start the server and locally browse the site, also a check that after deploying to airflow site it still works should happen, but this is something we can likely do only after we merge it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1196842738

   > I'm sorry, what are you suggesting as an alternative?
   
   Remove Bash. Completely. I think this whole task is about removing bash and replacing it with Python :).
   I thin the problem @Bowrna has (correct me @Bowrna if I am wrong) that you cannot really run the OLD script corrrectly. This can be fixed locally by updating bash with brew and setting temporary PATH to get it first on path. Once the script is replaced with Bash, there is no need to mention it in  CONTRIBUTING - because ..... there will be no bash to run any more.
   
   Or am I missing something :)? 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1192624020

   Because in container it will be in a different location ("/opt/airlfow/.....") rather than `<YOUR LOCAL AIRLFOW SOURCES>`/....


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1385390566

   @potiuk Thanks for the update. I am happy to help in switching to Pelikan part :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] mik-laj commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
mik-laj commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1374371461

   We don't use this check, so I think we can drop it for now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1198518078

   It fills the array 'pages' with the list of HTML files found. The problem you see is likely the `-print 0` flag- rather than producing files separated by "spaces" it separates them with \0 (null character) so that there is no confusion between space being separator between files and one that is part of the file name.
   
   so if you have 
   
   * `a file.html`
   * `another file.html`
   
   This wil produce
   
   "a file.html\0another file.html"
   
   Then when you read it with -d '' you will get array:
   
   ['a file.html', 'another file.html'] 
   
   Otherwise you'd get:
   
   ['a', 'file.html', 'another', 'file.html']
   
   The problem with differences you see is that the "\0" separated files are not broken over multiple lines because they are really a very long line with no spaces to allow the break happen.
   
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1195507360

   @potiuk I get the following error when building the site and I am not sure how I can solve this. This occurs when `hugo` command runs under the hood.  This happens when running the existing `site.sh` script too
   
   
   ```
   Start building sites …
   hugo v0.101.0+extended darwin/arm64 BuildDate=unknown
   INFO 2022/07/26 19:15:23 syncing static files to /
   WARN 2022/07/26 19:15:23 found no layout file for "HTML" for layout "search" for kind "page": You should create a template file which matches Hugo Layouts Lookup Rules for this combination.
   WARN 2022/07/26 19:15:23 found no layout file for "HTML" for kind "section": You should create a template file which matches Hugo Layouts Lookup Rules for this combination.
   WARN 2022/07/26 19:15:23 found no layout file for "HTML" for kind "section": You should create a template file which matches Hugo Layouts Lookup Rules for this combination.
   WARN 2022/07/26 19:15:23 found no layout file for "HTML" for kind "taxonomy": You should create a template file which matches Hugo Layouts Lookup Rules for this combination.
   WARN 2022/07/26 19:15:23 found no layout file for "HTML" for kind "taxonomy": You should create a template file which matches Hugo Layouts Lookup Rules for this combination.
   ERROR 2022/07/26 19:15:23 render of "page" failed: "/Users/sathishkannan/code/airflow-site/landing-pages/site/layouts/blog/baseof.html:23:7": execute of template failed: template: blog/single.html:23:7: executing "blog/single.html" at <partial "head.html" .>: error calling partial: partial "head.html" not found
   ERROR 2022/07/26 19:15:23 render of "page" failed: "/Users/sathishkannan/code/airflow-site/landing-pages/site/layouts/blog/baseof.html:23:7": execute of template failed: template: blog/single.html:23:7: executing "blog/single.html" at <partial "head.html" .>: error calling partial: partial "head.html" not found
   ERROR 2022/07/26 19:15:23 render of "page" failed: "/Users/sathishkannan/code/airflow-site/landing-pages/site/layouts/blog/baseof.html:23:7": execute of template failed: template: blog/single.html:23:7: executing "blog/single.html" at <partial "head.html" .>: error calling partial: partial "head.html" not found
   ERROR 2022/07/26 19:15:23 render of "page" failed: "/Users/sathishkannan/code/airflow-site/landing-pages/site/layouts/blog/baseof.html:23:7": execute of template failed: template: blog/single.html:23:7: executing "blog/single.html" at <partial "head.html" .>: error calling partial: partial "head.html" not found
   Error: Error building site: failed to render pages: render of "page" failed: "/Users/sathishkannan/code/airflow-site/landing-pages/site/layouts/blog/baseof.html:23:7": execute of template failed: template: blog/single.html:23:7: executing "blog/single.html" at <partial "head.html" .>: error calling partial: partial "head.html" not found
   Total in 825 ms
   error Command failed with exit code 255.
   info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
   error Command failed with exit code 255.
   info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1192715026

   Ah I see @potiuk. As the sh file is replaced to run without docker dependency, i forgot about this point. Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1364671183

   As I am currently working on this PR, I will post the parts that I have completed and parts that are pending to keep on track.
   Following are the commands supported in the `site.sh` bash script that's being moved to python:
   1. install-node-deps (successfully moved to python as of now)
   2. preview-landing-pages (successfully moved to python as of now)
   3. build-site (successfully moved to python as of now)
   4. build-landing-pages (successfully moved to python as of now)
   5. check-site-links (only pending part)
   6. prepare-theme (successfully moved to python as of now)
   7. lint-js (successfully moved to python as of now)
   8. lint-css (successfully moved to python as of now)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by "Bowrna (via GitHub)" <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1514300961

   could i close this PR @potiuk as this is not relevant if switching to Pelican?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] turbaszek commented on a diff in pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
turbaszek commented on code in PR #623:
URL: https://github.com/apache/airflow-site/pull/623#discussion_r917361209


##########
site.py:
##########
@@ -0,0 +1,233 @@
+from pathlib import Path
+import os
+import sys
+import argparse
+import subprocess
+from typing import Union, List, Optional, Mapping
+
+IMAGE_NAME="airflow-site"
+CONTAINER_NAME="airflow-site-c"
+SITE_DIST="landing-pages/dist"
+THEME_GEN="sphinx_airflow_theme/sphinx_airflow_theme/static/_gen"
+RunCommandResult = Union[subprocess.CompletedProcess, subprocess.CalledProcessError]
+
+
+def run_command(cmd: List[str],
+    env: Optional[Mapping[str, str]] = None,
+    cwd: Optional[Path] = None,
+    input: Optional[str] = None,
+    check: bool = True,
+     **kwargs,
+)-> RunCommandResult:
+    workdir: str = str(cwd) if cwd else os.getcwd()
+    try:
+        cmd_env = os.environ.copy()
+        cmd_env.setdefault("HOME", str(Path.home()))
+        if env:
+            cmd_env.update(env)
+        return subprocess.run(cmd, input=input, check=check, env=cmd_env, cwd=workdir, **kwargs)
+    except subprocess.CalledProcessError as ex:
+        if ex.stdout:
+            print(ex.stdout)
+        if ex.stderr:
+            print(ex.stderr)
+        return ex
+
+
+def check_docker_environment():
+    is_docker_env = False
+    docker_env_file = Path(".dockerenv")
+    if docker_env_file.exists():
+        is_docker_env = True
+    return is_docker_env

Review Comment:
   ```suggestion
   
       return docker_env_file.exists()
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] turbaszek commented on a diff in pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
turbaszek commented on code in PR #623:
URL: https://github.com/apache/airflow-site/pull/623#discussion_r917361174


##########
site.py:
##########
@@ -0,0 +1,233 @@
+from pathlib import Path
+import os
+import sys
+import argparse
+import subprocess
+from typing import Union, List, Optional, Mapping
+
+IMAGE_NAME="airflow-site"
+CONTAINER_NAME="airflow-site-c"
+SITE_DIST="landing-pages/dist"
+THEME_GEN="sphinx_airflow_theme/sphinx_airflow_theme/static/_gen"
+RunCommandResult = Union[subprocess.CompletedProcess, subprocess.CalledProcessError]
+
+
+def run_command(cmd: List[str],
+    env: Optional[Mapping[str, str]] = None,
+    cwd: Optional[Path] = None,
+    input: Optional[str] = None,
+    check: bool = True,
+     **kwargs,
+)-> RunCommandResult:
+    workdir: str = str(cwd) if cwd else os.getcwd()
+    try:
+        cmd_env = os.environ.copy()
+        cmd_env.setdefault("HOME", str(Path.home()))
+        if env:
+            cmd_env.update(env)
+        return subprocess.run(cmd, input=input, check=check, env=cmd_env, cwd=workdir, **kwargs)
+    except subprocess.CalledProcessError as ex:
+        if ex.stdout:
+            print(ex.stdout)
+        if ex.stderr:
+            print(ex.stderr)
+        return ex

Review Comment:
   Do we need to return an error here? Would it be better to raise it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1196959230

   Ah, I see the whole point now. Currently to see how this `check-site-links` command, I have to install bash via brew and see what is happening under the hood. I will try converting that part to python after understanding what is happening in that part of code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1361295855

   @eladkal I didn't work on this for a long time as I was working on other PR. I will start working on this PR and one another in this weekend as I have some time due to the upcoming holiday week.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1374436647

   @mik-laj Could you tell me if the check-site-links command in site.sh is used or not? Because that command uses this script file and here they have used lynx to extract URL? 
   
   Could you give me more explanation on that part? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1327905926

   Approved to see.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] rossturk commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
rossturk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1196844281

   > > I'm sorry, what are you suggesting as an alternative?
   > 
   > Remove Bash. Completely. I think this whole task is about removing bash and replacing it with Python :). I think the problem @Bowrna has (correct me @Bowrna if I am wrong) that you cannot really run the OLD script corrrectly. This can be fixed locally by updating bash with brew and setting temporary PATH to get it first on path. Once the script is replaced with Bash, there is no need to mention it in CONTRIBUTING - because ..... there will be no bash to run any more.
   > 
   > Or am I missing something :)?
   
   Yes, you are missing that this PR title is "replace site.sh" and the script we are talking about is `landing-pages/check-links.sh`. Are you suggesting that the scope of this PR is expanded to include *all* shell scripts?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1196848597

   > Yes, you are missing that this PR title is "replace site.sh" and the script we are talking about is landing-pages/check-links.sh. Are you suggesting that the scope of this PR is expanded to include all shell scripts?
   
   Ahhhh. I see now :facepalm: . 
   
   Precisely. This was the whole purpose of the task - to replace all the bash usage. It makes no sense only to replace site.sh. The reason why we are doing it is not because "we do not like bash", but because "we want to get rid of bash from the host entirely - to be able to run it all on MacOS, Linux, Windows or anything else that has "native" Python (and potentially has bash).
   
   So the TITLE of the task is wrong. Should be "Remove Bash from the process of building the site".
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1307093209

   Hey @Bowrna are you going to continue with that one :) ? Not the highest prirority, so if we do not plan it, maybe better to close that one?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1384517156

   H,,, I honestly think we should not invest too much in it. We are discussing about switching to Pelikan which is going to change the process completely anyway.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1368175440

   @potiuk @mik-laj I am considering to use beautiful soup library, to extract the links from html files for check-site-links command. Is it fine or is there any better way to handle it? If there is please let me know. 
   
   Thanks. :) 
   I hope to finish this before this year end 🤞 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1193242269

   https://github.com/apache/airflow-site/blob/fd7af05318e34f4efd6b8446ba8729112f30ee08/site.sh#L199
   
   Is this redirect still valid even after removing the docker part @mik-laj @potiuk @turbaszek @rossturk 
   Could anyone help me to understand?
   
   I have this doubt because we are redirecting to the path "/docs/${package_name}/stable/index.html". So I wanted to ensure if this is still valid even after removing the docker part for building site. Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] rossturk commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
rossturk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1196330859

   > ./check-links.sh: line 61: readarray: command not found
   
   @Bowrna I think this has to do with the version of bash. If I try this on my M1 Mac, I get an error:
   
   ```bash
   bash-3.2$ bash -version
   bash -version
   GNU bash, version 3.2.57(1)-release (arm64-apple-darwin21)
   Copyright (C) 2007 Free Software Foundation, Inc.
   
   bash-3.2$ readarray foo < <(ls -al)
   readarray foo < <(ls -al)
   ls -al
   bash: readarray: command not found
   ```
   
   But if I try it on my Linux machine, I get this:
   
   ```bash
   rturk@caymus:~$ bash --version
   GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
   Copyright (C) 2019 Free Software Foundation, Inc.
   License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
   
   This is free software; you are free to change and redistribute it.
   There is NO WARRANTY, to the extent permitted by law.
   
   rturk@caymus:~$ readarray foo < <(ls -al)
   
   rturk@caymus:~$ echo ${foo[8]}
   drwxr-xr-x 2 rturk caymus 4096 Oct 6 2021 .dbt
   ```
   
   I think perhaps the `CONTRIBUTING.md` should instruct folks to install bash from Homebrew. I haven't tested it myself, but this looks promising:
   
   ```
   % brew info bash
   bash: stable 5.1.16 (bottled), HEAD
   ...
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1196766358

   > I think perhaps CONTRIBUTING.md should instruct folks to install bash from Homebrew. I haven't tested it myself, but this looks promising:
   
   This has profound consequences. Many tools which "assume" bash is v3 on Mac will stop working if you make Bash 4 default. Even if you try to install it via `brew` it will tell you that it is 'cask-only`, which means that it is not becoming a default `bash` - you need to perform an extra step to make it default precisely because it might break a lot of things on Mac.  
   
   Going bash-less is the only good route we can take - now when we have Python3.7+ everywhere (previously various options of having Python2 + Python 3 (3.5 as well)  created similar set of problems, but since everyone is now on Python 3.7+ (3.6 EOL reached) basing "lowest common denominator" on Python 3.7 is great for such an environment.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] mik-laj commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
mik-laj commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1311674507

   > https://github.com/apache/airflow-site/blob/fd7af05318e34f4efd6b8446ba8729112f30ee08/site.sh#L199
   > 
   > Is this redirect still valid even after removing the docker part @mik-laj @potiuk @turbaszek @rossturk Could anyone help me to understand?
   
   It is common to delete part of a URL to go to the parent location.
   For example, when a user visits a URL:
   ```
   https://airflow.apache.org/docs/apache-airflow-providers-alibaba/2.1.0/index.html
   ```
   then the user can delete part of the URL and go to:
   ```
   https://airflow.apache.org/docs/apache-airflow-providers-alibaba/
   ```
   When we do not have this redirect, the user will see a 404 error.
   
   In other words, all directories the user has access to should have an index.html file to improve UX.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1309719028

   I will work on this part this weekend @potiuk. As I got stuck on resolving the issue, I started fixing other issues and slowly I became dormant in working on this issue. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: Remove Bash from the process of building the site

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1514416642

   yes
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on a diff in pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
Bowrna commented on code in PR #623:
URL: https://github.com/apache/airflow-site/pull/623#discussion_r924405759


##########
site.py:
##########
@@ -0,0 +1,233 @@
+from pathlib import Path
+import os
+import sys
+import argparse
+import subprocess
+from typing import Union, List, Optional, Mapping
+
+IMAGE_NAME="airflow-site"
+CONTAINER_NAME="airflow-site-c"
+SITE_DIST="landing-pages/dist"
+THEME_GEN="sphinx_airflow_theme/sphinx_airflow_theme/static/_gen"
+RunCommandResult = Union[subprocess.CompletedProcess, subprocess.CalledProcessError]
+
+
+def run_command(cmd: List[str],
+    env: Optional[Mapping[str, str]] = None,
+    cwd: Optional[Path] = None,
+    input: Optional[str] = None,
+    check: bool = True,
+     **kwargs,
+)-> RunCommandResult:
+    workdir: str = str(cwd) if cwd else os.getcwd()
+    try:
+        cmd_env = os.environ.copy()
+        cmd_env.setdefault("HOME", str(Path.home()))
+        if env:
+            cmd_env.update(env)
+        return subprocess.run(cmd, input=input, check=check, env=cmd_env, cwd=workdir, **kwargs)
+    except subprocess.CalledProcessError as ex:
+        if ex.stdout:
+            print(ex.stdout)
+        if ex.stderr:
+            print(ex.stderr)
+        return ex

Review Comment:
   This is a common method to execute the script via Python subprocess. If I could return the error, then I can log the error message according to the command executed. If you have any other options please let me know @turbaszek 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1191187869

   https://github.com/apache/airflow-site/blob/988ae480223f723efb0b2f97992d30f42128a0ef/site.sh#L77-L114
   
   Can someone explain to me what's happening in this code part? i find it difficult to understand what's happening here.
   
   @mik-laj could you help me? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] mik-laj commented on pull request #623: replacing site.sh with python

Posted by GitBox <gi...@apache.org>.
mik-laj commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1191193166

   @Bowrna see: https://stackoverflow.com/a/12498485   This generates a relative path for use in a container.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1196858392

   And the root cause of the confusion was that when the original issue was created, there was indeed only one `site.sh` :
   
   See the description of the issue by @mik-laj - so that's why the title was wrong 
   
   https://github.com/apache/airflow-site/issues/338
   
   > In this project, we have one bash script - [./site.sh](https://github.com/apache/airflow-site/blob/master/site.sh) that starts to grow more and more, so its maintenance becomes more problematic. I think it's a good idea to rewrite it into Python.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1199166426

   > When executing in terminal I see `index.html` from all providers and its different version say example `/dist//docs/apache-airflow-providers-imap/1.0.0/_modules/index.html`. But i don't see them when executing it via `check-links.sh`
   
   Could you share your view for this point? This is also another difference in the output when executing it in different ways @potiuk .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1198501378

   Trying to understand the below bash command. Could anyone tell me what's happening in this command
   https://github.com/apache/airflow-site/blob/445586c1552918e9531cd13d9cd64c0e63f5da03/landing-pages/check-links.sh#L59-L63
   When trying to execute the above script I just enabled print and saw it's logging a few html files.
   <img width="1432" alt="Screenshot 2022-07-28 at 11 48 17 PM" src="https://user-images.githubusercontent.com/10162465/181609178-a9f4de9a-564a-4a0a-b9ef-1bb9b3de236d.png">
   
   But when I took out that command and executed it via terminal it shows a big list of HTML files. I don't understand how this is different from executing in a script. When executing in terminal I see `index.html` from all providers and its different version say example `/dist//docs/apache-airflow-providers-imap/1.0.0/_modules/index.html`. But i don't see them when executing it via `check-links.sh`
   <img width="1440" alt="Screenshot 2022-07-29 at 12 04 18 AM" src="https://user-images.githubusercontent.com/10162465/181612520-f7e68860-35b0-4b4c-a779-d58ec85a3a26.png">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1199220881

   I have tried executing both of ways consecutively without running any other step in between. Let me debug this again and come with more information @potiuk 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1374438304

   Also, can this code be reviewed? I will try fixing this PR and getting it merged to main.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] Bowrna commented on pull request #623: Remove Bash from the process of building the site

Posted by GitBox <gi...@apache.org>.
Bowrna commented on PR #623:
URL: https://github.com/apache/airflow-site/pull/623#issuecomment-1382626850

   Hello all :) do I have any update on this PR? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow-site] potiuk closed pull request #623: Remove Bash from the process of building the site

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk closed pull request #623: Remove Bash from the process of building the site
URL: https://github.com/apache/airflow-site/pull/623


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org