Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/14 22:12:32 UTC

[GitHub] [airflow] ecerulm opened a new pull request #13073: Script to generate integrations.json

ecerulm opened a new pull request #13073:
URL: https://github.com/apache/airflow/pull/13073


   Closes #12613


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ecerulm commented on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
ecerulm commented on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-745115624


   @mik-laj just mentioning you since I can't "request a review" from you.
   
   You can invoke the script as:
   ```
   python scripts/tools/generate-integrations-json.py >../airflow-site/landing-pages/site/data/integrations.json
   ```
   
   and the output looks like:
   
   ```
   [
       {
           "name": "Discord",
           "url": "/docs/apache-airflow-providers-discord/stable/index.html"
       },
       {
           "name": "Amazon Athena",
           "url": "/docs/apache-airflow-providers-amazon/stable/index.html",
           "logo": "/integration-logos/aws/Amazon-Athena_light-bg@4x.png"
       },
       {
           "name": "Amazon CloudFormation",
           "url": "/docs/apache-airflow-providers-amazon/stable/index.html"
       },
       {
           "name": "Amazon CloudWatch Logs",
           "url": "/docs/apache-airflow-providers-amazon/stable/index.html",
           "logo": "/integration-logos/aws/Amazon-CloudWatch_light-bg@4x.png"
       },
   ...
   ]
   ```
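A minimal sketch of how each entry in the output above could be derived from a provider's `provider.yaml` data, following the logic in this PR's diff. The function name `build_entry` and the how-to-guide path mentioned in the usage note are illustrative assumptions, not part of the PR.

```python
import re


def build_entry(provider_info, integration):
    """Build one integrations.json entry from parsed provider.yaml data."""
    package_name = provider_info["package-name"]
    guides = integration.get("how-to-guide")
    if guides:
        doc_url = guides[0].strip()
        # Insert "stable/" right after the package-name path segment.
        doc_url = re.sub(f"/{re.escape(package_name)}/", r"\g<0>stable/", doc_url)
        # Point at the rendered HTML page instead of the .rst source
        # (anchored at the end of the string in this sketch).
        doc_url = re.sub(r"\.rst$", ".html", doc_url)
    else:
        # No how-to guide: fall back to the provider's documentation index.
        doc_url = f"/docs/{package_name.lower()}/stable/index.html"

    entry = {"name": integration["integration-name"], "url": doc_url}
    logo = integration.get("logo")
    if logo:
        # The "logo" key is only emitted when the provider declares one.
        entry["logo"] = logo
    return entry


entry = build_entry(
    {"package-name": "apache-airflow-providers-discord"},
    {"integration-name": "Discord"},
)
print(entry)
```

An integration that lists a how-to guide (for example a hypothetical `/docs/apache-airflow-providers-amazon/operators/athena.rst`) would get a URL with `stable/` inserted after the package name and the `.rst` suffix replaced by `.html`.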





[GitHub] [airflow] mik-laj commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543155399



##########
File path: scripts/tools/generate-integrations-json.py
##########
@@ -0,0 +1,62 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import json
+import os
+import re
+from glob import glob
+
+import yaml
+
+import airflow
+
+if __name__ != "__main__":
+    raise Exception(
+        "This file is intended to be executed as an executable program. You cannot use it as a module."
+        "To run this script, run the './generate-integrations-json.py >integrations.json' command"

Review comment:
       The output filename should not be customized as this is a source of potential typos.







[GitHub] [airflow] ecerulm commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
ecerulm commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543211206



##########
File path: scripts/tools/generate-integrations-json.py
##########
@@ -0,0 +1,62 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import json
+import os
+import re
+from glob import glob
+
+import yaml
+
+import airflow
+
+if __name__ != "__main__":
+    raise Exception(
+        "This file is intended to be executed as an executable program. You cannot use it as a module."
+        "To run this script, run the './generate-integrations-json.py >integrations.json' command"
+    )
+
+AIRFLOW_ROOT = os.path.abspath(os.path.join(os.path.dirname(airflow.__file__), os.pardir))
+
+result_integrations = []
+for provider_file in glob(f"{AIRFLOW_ROOT}/airflow/providers/**/provider.yaml", recursive=True):
+    with open(provider_file) as f:
+        provider_info = yaml.load(f, Loader=yaml.CLoader)

Review comment:
       done







[GitHub] [airflow] ecerulm commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
ecerulm commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543605676



##########
File path: scripts/tools/generate-integrations-json.py
##########
@@ -0,0 +1,83 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import json
+import os
+import re
+import shutil
+
+# pylint: disable=no-name-in-module
+from docs.exts.provider_yaml_utils import load_package_data
+
+# pylint: enable=no-name-in-module
+
+AIRFLOW_SITE_DIR = os.environ.get('AIRFLOW_SITE_DIRECTORY')
+ROOT_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), os.pardir, os.pardir))
+DOCS_DIR = os.path.join(ROOT_DIR, 'docs')
+
+if __name__ != "__main__":
+    raise SystemExit(
+        "This file is intended to be executed as an executable program. You cannot use it as a module."
+        "To run this script, run the ./generate-integrations-json.py command"
+    )
+
+if not (
+    AIRFLOW_SITE_DIR
+    and os.path.isdir(AIRFLOW_SITE_DIR)
+    and os.path.isdir(os.path.join(AIRFLOW_SITE_DIR, 'docs-archive'))
+):
+    raise SystemExit(
+        'Before using this script, set the environment variable AIRFLOW_SITE_DIRECTORY. This variable '
+        'should contain the path to the airflow-site repository directory. '
+        '${AIRFLOW_SITE_DIRECTORY}/docs-archive must exists.'
+    )
+
+ALL_PROVIDER_YAMLS = load_package_data()
+
+result_integrations = []

Review comment:
       Sure, `sorted(..., key=lambda x: x['name'])` added.
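As a small illustration of the sorting mentioned in this comment, the sketch below orders a few entries by their `name` key. The sample entries are taken from the output shown earlier in this thread; the variable name `sorted_integrations` is an assumption for illustration.

```python
import json

# Sample entries, deliberately out of order.
result_integrations = [
    {"name": "Discord", "url": "/docs/apache-airflow-providers-discord/stable/index.html"},
    {"name": "Amazon CloudFormation", "url": "/docs/apache-airflow-providers-amazon/stable/index.html"},
    {"name": "Amazon Athena", "url": "/docs/apache-airflow-providers-amazon/stable/index.html"},
]

# Sort by display name, which keeps the output order deterministic.
sorted_integrations = sorted(result_integrations, key=lambda x: x["name"])
print(json.dumps(sorted_integrations, indent=4))
```

Note that this comparison is case-sensitive (plain string ordering); a case-insensitive order would need something like `key=lambda x: x["name"].lower()`.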







[GitHub] [airflow] github-actions[bot] commented on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-745134187


   [The Workflow run](https://github.com/apache/airflow/actions/runs/422700773) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] mik-laj commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543152201



##########
File path: airflow/providers/microsoft/azure/provider.yaml
##########
@@ -30,21 +30,26 @@ integrations:
     tags: [azure]
   - integration-name: Microsoft Azure Blob Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/blobs/
+    logo: /integration-logos/azure/Blob Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Container Instances
     external-doc-url: https://azure.microsoft.com/en-us/services/container-instances/
+    logo: /integration-logos/azure/Container Instances.svg
     tags: [azure]
   - integration-name: Microsoft Azure Cosmos DB
     external-doc-url: https://azure.microsoft.com/en-us/services/cosmos-db/
+    logo: /integration-logos/azure/Azure Cosmos DB.svg
     tags: [azure]
   - integration-name: Microsoft Azure Data Explorer
     external-doc-url: https://azure.microsoft.com/en-us/services/data-explorer/
     tags: [azure]
   - integration-name: Microsoft Azure Data Lake Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/data-lake-storage/
+    logo: /integration-logos/azure/Data Lake Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Files
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/files/
+    logo: /integration-logos/azure/Azure Files.svg

Review comment:
       Where will these files be saved? Can you automatically check if these files exist? 
   
   Related file: https://github.com/apache/airflow/blob/master/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
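One way the existence check asked for here might look, as a rough sketch. The function name `find_missing_logos` and the `logos_parent_dir` parameter are hypothetical; the sketch assumes the `logo` values are paths relative to a directory that contains the `integration-logos` folder, and it is not the check implemented in `pre_commit_check_provider_yaml_files.py`.

```python
import os


def find_missing_logos(provider_yamls, logos_parent_dir):
    """Return (integration name, logo value) pairs whose logo file is absent.

    `logos_parent_dir` is assumed to be the directory that contains the
    `integration-logos` folder referenced by the `logo` values.
    """
    missing = []
    for provider_info in provider_yamls:
        for integration in provider_info.get("integrations", []):
            logo = integration.get("logo")
            if not logo:
                continue
            # `logo` values start with "/", so strip it before joining.
            logo_path = os.path.join(logos_parent_dir, logo.lstrip("/"))
            if not os.path.isfile(logo_path):
                missing.append((integration["integration-name"], logo))
    return missing
```

A pre-commit style check could call this with the parsed provider data and fail when the returned list is non-empty.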







[GitHub] [airflow] ecerulm commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
ecerulm commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543319400



##########
File path: airflow/providers/microsoft/azure/provider.yaml
##########
@@ -30,21 +30,26 @@ integrations:
     tags: [azure]
   - integration-name: Microsoft Azure Blob Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/blobs/
+    logo: /integration-logos/azure/Blob Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Container Instances
     external-doc-url: https://azure.microsoft.com/en-us/services/container-instances/
+    logo: /integration-logos/azure/Container Instances.svg
     tags: [azure]
   - integration-name: Microsoft Azure Cosmos DB
     external-doc-url: https://azure.microsoft.com/en-us/services/cosmos-db/
+    logo: /integration-logos/azure/Azure Cosmos DB.svg
     tags: [azure]
   - integration-name: Microsoft Azure Data Explorer
     external-doc-url: https://azure.microsoft.com/en-us/services/data-explorer/
     tags: [azure]
   - integration-name: Microsoft Azure Data Lake Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/data-lake-storage/
+    logo: /integration-logos/azure/Data Lake Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Files
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/files/
+    logo: /integration-logos/azure/Azure Files.svg

Review comment:
       done







[GitHub] [airflow] github-actions[bot] commented on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-745223108


   [The Workflow run](https://github.com/apache/airflow/actions/runs/423006945) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] ecerulm commented on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
ecerulm commented on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-745818779


   Sure, you mean just updating `integrations.json` and the `integration-logos` right? 





[GitHub] [airflow] mik-laj commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543150702



##########
File path: scripts/tools/generate-integrations-json.py
##########
@@ -0,0 +1,62 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import json
+import os
+import re
+from glob import glob
+
+import yaml
+
+import airflow
+
+if __name__ != "__main__":
+    raise Exception(
+        "This file is intended to be executed as an executable program. You cannot use it as a module."
+        "To run this script, run the './generate-integrations-json.py >integrations.json' command"
+    )
+
+AIRFLOW_ROOT = os.path.abspath(os.path.join(os.path.dirname(airflow.__file__), os.pardir))
+
+result_integrations = []
+for provider_file in glob(f"{AIRFLOW_ROOT}/airflow/providers/**/provider.yaml", recursive=True):
+    with open(provider_file) as f:
+        provider_info = yaml.load(f, Loader=yaml.CLoader)

Review comment:
       ```suggestion
           provider_info = yaml.safe_load(f)
   ```







[GitHub] [airflow] mik-laj commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543154660



##########
File path: scripts/tools/generate-integrations-json.py
##########
@@ -0,0 +1,62 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import json
+import os
+import re
+from glob import glob
+
+import yaml
+
+import airflow
+
+if __name__ != "__main__":
+    raise Exception(
+        "This file is intended to be executed as an executable program. You cannot use it as a module."
+        "To run this script, run the './generate-integrations-json.py >integrations.json' command"
+    )
+
+AIRFLOW_ROOT = os.path.abspath(os.path.join(os.path.dirname(airflow.__file__), os.pardir))
+
+result_integrations = []
+for provider_file in glob(f"{AIRFLOW_ROOT}/airflow/providers/**/provider.yaml", recursive=True):
+    with open(provider_file) as f:
+        provider_info = yaml.load(f, Loader=yaml.CLoader)
+        for integration in provider_info.get('integrations', []):
+            doc_url = integration.get("how-to-guide")
+            if doc_url:
+                doc_url = doc_url[0].strip()
+                doc_url = re.sub(f'/{provider_info["package-name"]}/', r"\g<0>stable/", doc_url)
+                doc_url = re.sub(r'\.rst', '.html', doc_url)
+            else:
+                doc_url = f"/docs/{provider_info['package-name'].lower()}/stable/index.html"
+            logo = integration.get("logo")
+
+            result = {
+                'name': integration['integration-name'],
+                'url': doc_url,
+            }
+            if logo:
+                result['logo'] = logo
+            result_integrations.append(result)
+
+print(

Review comment:
       Ideally, this script should update the necessary files in our target repository - `airflow-site`. In the next step, we can create a PR that will contain all the necessary changes, or add it to our CI pipeline.
   See example: 
   https://github.com/apache/airflow/blob/master/docs/publish_docs.py
   Doc: https://github.com/apache/airflow/blob/master/dev/README_RELEASE_PROVIDER_PACKAGES.md#publish-documentation
   https://github.com/apache/airflow/blob/master/dev/README_RELEASE_AIRFLOW.md#publish-documentation







[GitHub] [airflow] ecerulm commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
ecerulm commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543341167



##########
File path: scripts/tools/generate-integrations-json.py
##########
@@ -0,0 +1,81 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import json
+import os
+import re
+import shutil
+
+# pylint: disable=no-name-in-module
+from docs.exts.provider_yaml_utils import load_package_data
+
+# pylint: enable=no-name-in-module
+
+AIRFLOW_SITE_DIR = os.environ.get('AIRFLOW_SITE_DIRECTORY')
+ROOT_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), os.pardir, os.pardir))
+DOCS_DIR = os.path.join(ROOT_DIR, 'docs')
+
+if __name__ != "__main__":
+    raise SystemExit(
+        "This file is intended to be executed as an executable program. You cannot use it as a module."
+        "To run this script, run the ./generate-integrations-json.py command"
+    )
+
+if not (
+    AIRFLOW_SITE_DIR
+    and os.path.isdir(AIRFLOW_SITE_DIR)
+    and os.path.isdir(os.path.join(AIRFLOW_SITE_DIR, 'docs-archive'))
+):
+    raise SystemExit(
+        'Before using this script, set the environment variable AIRFLOW_SITE_DIRECTORY. This variable '
+        'should contain the path to the airflow-site repository directory. '
+        '${AIRFLOW_SITE_DIRECTORY}/docs-archive must exists.'
+    )
+
+ALL_PROVIDER_YAMLS = load_package_data()
+
+result_integrations = []
+for provider_info in ALL_PROVIDER_YAMLS:
+    for integration in provider_info.get('integrations', []):
+        doc_url = integration.get("how-to-guide")
+        if doc_url:
+            doc_url = doc_url[0].strip()
+            doc_url = re.sub(f'/{provider_info["package-name"]}/', r"\g<0>stable/", doc_url)
+            doc_url = re.sub(r'\.rst', '.html', doc_url)
+        else:
+            doc_url = f"/docs/{provider_info['package-name'].lower()}/stable/index.html"
+        logo = integration.get("logo")
+
+        result = {
+            'name': integration['integration-name'],
+            'url': doc_url,
+        }
+        if logo:
+            result['logo'] = logo
+        result_integrations.append(result)
+
+with open(os.path.join(AIRFLOW_SITE_DIR, 'landing-pages/site/static/integrations.json'), 'w') as f:

Review comment:
       Here we write directly to the `airflow-site` location. `landing-pages/site/data/integrations.json` is a symbolic link to this one.
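A minimal sketch of writing the result to a fixed location inside the `airflow-site` checkout, so that the caller cannot mistype the output filename. The relative path comes from the diff above; the helper name `write_integrations_json` and the 4-space indent are assumptions for illustration.

```python
import json
import os

# Path relative to the airflow-site checkout, as used in the diff above.
INTEGRATIONS_JSON = os.path.join("landing-pages", "site", "static", "integrations.json")


def write_integrations_json(site_dir, integrations):
    """Write the integrations list to its fixed location and return the path."""
    target = os.path.join(site_dir, INTEGRATIONS_JSON)
    with open(target, "w") as f:
        json.dump(integrations, f, indent=4)
        f.write("\n")  # end the file with a trailing newline
    return target
```

In the script itself, `site_dir` would be the value of the `AIRFLOW_SITE_DIRECTORY` environment variable, which the diff validates before this point.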







[GitHub] [airflow] github-actions[bot] commented on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-745134562


   [The Workflow run](https://github.com/apache/airflow/actions/runs/422707251) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] github-actions[bot] commented on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-745508118


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] github-actions[bot] commented on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-744832136


   [The Workflow run](https://github.com/apache/airflow/actions/runs/422002272) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] ecerulm commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
ecerulm commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543319078



##########
File path: scripts/tools/generate-integrations-json.py
##########
@@ -0,0 +1,62 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import json
+import os
+import re
+from glob import glob
+
+import yaml
+
+import airflow
+
+if __name__ != "__main__":
+    raise Exception(
+        "This file is intended to be executed as an executable program. You cannot use it as a module."
+        "To run this script, run the './generate-integrations-json.py >integrations.json' command"

Review comment:
       done







[GitHub] [airflow] mik-laj commented on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-746113780


   @ecerulm Yes. We added a lot of new integrations, so we need to regenerate `integrations.json`.





[GitHub] [airflow] mik-laj commented on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-745509725


   Fantastic! Can you now prepare a new PR with an update to airflow-site?





[GitHub] [airflow] mik-laj commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543177060



##########
File path: airflow/providers/microsoft/azure/provider.yaml
##########
@@ -30,21 +30,26 @@ integrations:
     tags: [azure]
   - integration-name: Microsoft Azure Blob Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/blobs/
+    logo: /integration-logos/azure/Blob Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Container Instances
     external-doc-url: https://azure.microsoft.com/en-us/services/container-instances/
+    logo: /integration-logos/azure/Container Instances.svg
     tags: [azure]
   - integration-name: Microsoft Azure Cosmos DB
     external-doc-url: https://azure.microsoft.com/en-us/services/cosmos-db/
+    logo: /integration-logos/azure/Azure Cosmos DB.svg
     tags: [azure]
   - integration-name: Microsoft Azure Data Explorer
     external-doc-url: https://azure.microsoft.com/en-us/services/data-explorer/
     tags: [azure]
   - integration-name: Microsoft Azure Data Lake Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/data-lake-storage/
+    logo: /integration-logos/azure/Data Lake Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Files
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/files/
+    logo: /integration-logos/azure/Azure Files.svg

Review comment:
       I guess the best idea would be to copy these files into the `/docs/integration-logos/` directory of this repository.







[GitHub] [airflow] ecerulm commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
ecerulm commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543175130



##########
File path: airflow/providers/microsoft/azure/provider.yaml
##########
@@ -30,21 +30,26 @@ integrations:
     tags: [azure]
   - integration-name: Microsoft Azure Blob Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/blobs/
+    logo: /integration-logos/azure/Blob Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Container Instances
     external-doc-url: https://azure.microsoft.com/en-us/services/container-instances/
+    logo: /integration-logos/azure/Container Instances.svg
     tags: [azure]
   - integration-name: Microsoft Azure Cosmos DB
     external-doc-url: https://azure.microsoft.com/en-us/services/cosmos-db/
+    logo: /integration-logos/azure/Azure Cosmos DB.svg
     tags: [azure]
   - integration-name: Microsoft Azure Data Explorer
     external-doc-url: https://azure.microsoft.com/en-us/services/data-explorer/
     tags: [azure]
   - integration-name: Microsoft Azure Data Lake Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/data-lake-storage/
+    logo: /integration-logos/azure/Data Lake Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Files
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/files/
+    logo: /integration-logos/azure/Azure Files.svg

Review comment:
       Today those files live in the [`airflow-site` repo at landing-pages/site/static/integration-logos](https://github.com/apache/airflow-site/tree/master/landing-pages/site/static/integration-logos).
   
   I don't know whether a pre-commit hook can check that the files exist in **another repo**. I can write the check, but I guess it will fail in CI because `airflow-site` won't be available at that time.
   
   So do you want me to copy the logos into this repo instead, and then let this script generate the following?
   
       ../airflow-site/landing-pages/site/data/integrations.json
       ../airflow-site/landing-pages/site/static/integrations.json
       ../airflow-site/landing-pages/site/static/integration-logos/**
    







[GitHub] [airflow] mik-laj commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543233995



##########
File path: airflow/providers/microsoft/azure/provider.yaml
##########
@@ -30,21 +30,26 @@ integrations:
     tags: [azure]
   - integration-name: Microsoft Azure Blob Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/blobs/
+    logo: /integration-logos/azure/Blob Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Container Instances
     external-doc-url: https://azure.microsoft.com/en-us/services/container-instances/
+    logo: /integration-logos/azure/Container Instances.svg
     tags: [azure]
   - integration-name: Microsoft Azure Cosmos DB
     external-doc-url: https://azure.microsoft.com/en-us/services/cosmos-db/
+    logo: /integration-logos/azure/Azure Cosmos DB.svg
     tags: [azure]
   - integration-name: Microsoft Azure Data Explorer
     external-doc-url: https://azure.microsoft.com/en-us/services/data-explorer/
     tags: [azure]
   - integration-name: Microsoft Azure Data Lake Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/data-lake-storage/
+    logo: /integration-logos/azure/Data Lake Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Files
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/files/
+    logo: /integration-logos/azure/Azure Files.svg

Review comment:
       SGTM







[GitHub] [airflow] mik-laj merged pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj merged pull request #13073:
URL: https://github.com/apache/airflow/pull/13073


   





[GitHub] [airflow] ecerulm commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
ecerulm commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543183940



##########
File path: scripts/tools/generate-integrations-json.py
##########
@@ -0,0 +1,62 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import json
+import os
+import re
+from glob import glob
+
+import yaml
+
+import airflow
+
+if __name__ != "__main__":
+    raise Exception(
+        "This file is intended to be executed as an executable program. You cannot use it as a module. "
+        "To run this script, run the './generate-integrations-json.py >integrations.json' command"

Review comment:
       OK, I'll follow the approach in https://github.com/apache/airflow/blob/master/docs/publish_docs.py: read `AIRFLOW_SITE_DIR = os.environ.get('AIRFLOW_SITE_DIRECTORY')` and write the output directly to a file in that location.
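
   A rough sketch of what that could look like. The function name `write_integrations_json` and its `site_dir` parameter are made up for illustration, and the target path is the `landing-pages/site/data/integrations.json` location mentioned earlier in this thread; the actual script may be organised differently.

```python
import json
import os


def write_integrations_json(integrations, site_dir=None):
    """Write the integrations list into the airflow-site checkout (hypothetical helper)."""
    # Fall back to the environment variable used by docs/publish_docs.py.
    site_dir = site_dir or os.environ.get("AIRFLOW_SITE_DIRECTORY")
    if not site_dir or not os.path.isdir(site_dir):
        raise SystemExit("Set AIRFLOW_SITE_DIRECTORY to the airflow-site repository directory.")

    target = os.path.join(site_dir, "landing-pages", "site", "data", "integrations.json")
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w") as output_file:
        json.dump(integrations, output_file, indent=2)
        output_file.write("\n")  # end the file with a newline
    return target
```

   With this, there is no need to redirect stdout by hand; the caller only has to export `AIRFLOW_SITE_DIRECTORY` before running the script.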







[GitHub] [airflow] mik-laj edited a comment on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-745509725


   Fantastic! Can you now prepare a new PR with an update to airflow-site (target branch: airflow-2-0-docs)? 





[GitHub] [airflow] github-actions[bot] commented on pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#issuecomment-745223276


   [The Workflow run](https://github.com/apache/airflow/actions/runs/423012753) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] mik-laj commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543460839



##########
File path: scripts/tools/generate-integrations-json.py
##########
@@ -0,0 +1,83 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import json
+import os
+import re
+import shutil
+
+# pylint: disable=no-name-in-module
+from docs.exts.provider_yaml_utils import load_package_data
+
+# pylint: enable=no-name-in-module
+
+AIRFLOW_SITE_DIR = os.environ.get('AIRFLOW_SITE_DIRECTORY')
+ROOT_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), os.pardir, os.pardir))
+DOCS_DIR = os.path.join(ROOT_DIR, 'docs')
+
+if __name__ != "__main__":
+    raise SystemExit(
+        "This file is intended to be executed as an executable program. You cannot use it as a module. "
+        "To run this script, run the ./generate-integrations-json.py command"
+    )
+
+if not (
+    AIRFLOW_SITE_DIR
+    and os.path.isdir(AIRFLOW_SITE_DIR)
+    and os.path.isdir(os.path.join(AIRFLOW_SITE_DIR, 'docs-archive'))
+):
+    raise SystemExit(
+        'Before using this script, set the environment variable AIRFLOW_SITE_DIRECTORY. This variable '
+        'should contain the path to the airflow-site repository directory. '
+        '${AIRFLOW_SITE_DIRECTORY}/docs-archive must exist.'
+    )
+
+ALL_PROVIDER_YAMLS = load_package_data()
+
+result_integrations = []

Review comment:
       Can the items in this array be sorted? I am concerned about noise in the git history.
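
   For illustration, one way to get stable output is to sort the entries before serialising them. This is only a sketch: `dump_sorted` is a hypothetical helper, and it assumes each entry has the `name` and `url` keys shown in the sample output earlier in this thread.

```python
import json


def dump_sorted(integrations):
    """Serialise integrations in a deterministic order (hypothetical helper)."""
    # Sort case-insensitively by name, with the URL as a tie-breaker, so that
    # the result does not depend on the order in which providers were loaded.
    ordered = sorted(integrations, key=lambda item: (item["name"].lower(), item["url"]))
    # sort_keys keeps the key order inside each object stable as well.
    return json.dumps(ordered, indent=2, sort_keys=True)
```

   Regenerating the file from unchanged `provider.yaml` data should then produce no diff at all.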







[GitHub] [airflow] ecerulm commented on a change in pull request #13073: Script to generate integrations.json

Posted by GitBox <gi...@apache.org>.
ecerulm commented on a change in pull request #13073:
URL: https://github.com/apache/airflow/pull/13073#discussion_r543188956



##########
File path: airflow/providers/microsoft/azure/provider.yaml
##########
@@ -30,21 +30,26 @@ integrations:
     tags: [azure]
   - integration-name: Microsoft Azure Blob Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/blobs/
+    logo: /integration-logos/azure/Blob Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Container Instances
     external-doc-url: https://azure.microsoft.com/en-us/services/container-instances/
+    logo: /integration-logos/azure/Container Instances.svg
     tags: [azure]
   - integration-name: Microsoft Azure Cosmos DB
     external-doc-url: https://azure.microsoft.com/en-us/services/cosmos-db/
+    logo: /integration-logos/azure/Azure Cosmos DB.svg
     tags: [azure]
   - integration-name: Microsoft Azure Data Explorer
     external-doc-url: https://azure.microsoft.com/en-us/services/data-explorer/
     tags: [azure]
   - integration-name: Microsoft Azure Data Lake Storage
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/data-lake-storage/
+    logo: /integration-logos/azure/Data Lake Storage.svg
     tags: [azure]
   - integration-name: Microsoft Azure Files
     external-doc-url: https://azure.microsoft.com/en-us/services/storage/files/
+    logo: /integration-logos/azure/Azure Files.svg

Review comment:
       OK, I'll do that:
   * copy the logos into `/docs/integration-logos`
   * check in the pre-commit hook that the logo files referenced in each `provider.yaml` actually exist
   * generate the `integrations.json` file directly into `$AIRFLOW_SITE_DIRECTORY/landing-pages/site/{data,static}/integrations.json`
   * copy the files from `/docs/integration-logos` into `$AIRFLOW_SITE_DIRECTORY/landing-pages/site/static/integration-logos`
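
   The logo check and the copy step from the list above could be sketched roughly as follows. Both function names are hypothetical, and the sketch assumes the integrations are plain dicts with an optional `logo` key holding a path such as `/integration-logos/azure/Azure Files.svg`, as in the `provider.yaml` diff; it is not the code in the PR.

```python
import os
import shutil


def find_missing_logos(integrations, docs_dir):
    """Return the logo paths that are referenced but not present under docs_dir."""
    missing = []
    for item in integrations:
        logo = item.get("logo")
        # Logo paths start with "/", so strip it before joining with docs_dir.
        if logo and not os.path.isfile(os.path.join(docs_dir, logo.lstrip("/"))):
            missing.append(logo)
    return missing


def copy_logos(docs_dir, site_dir):
    """Copy docs_dir/integration-logos into the site's static directory."""
    src = os.path.join(docs_dir, "integration-logos")
    dst = os.path.join(site_dir, "landing-pages", "site", "static", "integration-logos")
    # dirs_exist_ok requires Python 3.8+; it lets the copy update an existing tree.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

   A pre-commit hook could fail when `find_missing_logos` returns a non-empty list, which only needs this repository and so avoids the CI problem of `airflow-site` not being available.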



