Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/10/11 17:50:27 UTC

[GitHub] [airflow] potiuk opened a new issue #11423: Separate out documentation building and publishing per provider

potiuk opened a new issue #11423:
URL: https://github.com/apache/airflow/issues/11423


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-724280904


   I think eventually we might need a doc per provider version. We can fully automate it - once we automate it for "latest" it will be almost no effort to automate it for "per-provider-version". And it would be rather confusing for people to look at the provider's docs for the latest version while they are using another one.
   
   Just the fact that we agreed to SemVer and agreed that we might have breaking changes pretty much implies that we need to have "per-version" documentation. Imagine we have version 1.0.0 of the Google provider and then we introduce 2.0.0, which brings breaking changes (for example after we migrate to the Google 2.0 Python APIs). We need to provide docs for both versions for quite some time. And it's even likely that we will release a 1.0.1 Google provider with bugfixes for 1.0.0.
   
   I think we have no choice but to implement all of it, including the possibility of choosing a version per provider - this IMHO was pretty much sealed when we agreed to allow breaking changes for each provider. And it's not even difficult - we can (and should) fully automate it.
   
   It does not have to be there for "Day 0" - like when we release 2.0.0 and a set of 1.0.0 providers, it can be "no version", but very soon after we have to support versions. And our tooling has to be prepared for that (and have it automated), because keeping it manually updated will be impossible.
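
The per-version requirement described above can be sketched roughly as follows. This is only an illustration, assuming plain `MAJOR.MINOR.PATCH` version strings; the function names and the `docs/<package>/<version>/` layout are hypothetical and not taken from the project's tooling.

```python
from collections import defaultdict


def parse_version(version: str) -> tuple:
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_per_major(versions):
    """Group released versions by major version and keep the newest of each group.

    Each major line needs its own docs, because a new major version
    may contain breaking changes.
    """
    by_major = defaultdict(list)
    for version in versions:
        by_major[parse_version(version)[0]].append(version)
    return {
        major: max(group, key=parse_version)
        for major, group in sorted(by_major.items())
    }


def docs_path(package: str, version: str) -> str:
    """Hypothetical layout: one docs directory per package and version."""
    return f"docs/{package}/{version}/"


# The release sequence used as an example in the comment above.
released = ["1.0.0", "1.0.1", "2.0.0"]
newest = latest_per_major(released)
paths = [docs_path("apache-airflow-providers-google", v) for v in newest.values()]
print(paths)
```

Comparing integer tuples rather than raw strings keeps a version such as 10.0.0 sorted after 9.0.0.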





[GitHub] [airflow] potiuk commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-739984962


   I guess this is done, @mik-laj? Can we close it?





[GitHub] [airflow] mik-laj edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-722226813


   I worked on this ticket yesterday / today and managed to build documentation for the provider packages.
   https://wicked-army.surge.sh/
   I haven't migrated all the content yet, but the most difficult case - the Google package - has been fully migrated, along with the reference documentation for the Python API and configuration.
   https://wicked-army.surge.sh/google/html/index.html
   
   There are two more serious issues that need to be discussed.
   1. **ReadTheDocs**: Unfortunately, we will have to stop using ReadTheDocs to build the documentation. It doesn't allow you to run your own build scripts, and we can only have one set of documentation per repository. Besides, it causes various problems over which we have little control. For example, the Python API reference documentation currently does not build properly - https://airflow.readthedocs.io/en/latest/_api/index.html
   We will probably be able to solve this problem quickly if we receive financial support for CI. Then we will be able to build the documentation ourselves and publish it on S3/GCS or similar.
   
   2. **Operators and hooks**: This page has information from all providers, so it is not possible to divide it.
   https://airflow.readthedocs.io/en/latest/operators-and-hooks-ref.html
   It cannot remain in its present form if we want to have separate docs per provider. I propose that we maintain the same information in a YAML file and then reuse it as needed.
   For development purposes, we can generate a Markdown file which we will store in the repository.
   For production/website, we can also display this data as Markdown on the website, or ... build a more complex interface similar to the [Terraform Registry](https://registry.terraform.io/browse/providers?tier=official%2Cpartner). It could be fairly simple if we have all the data in YAML and we had a contributor with React experience.
   
   CC: @ryw @potiuk @iadi7ya @francescomucio @jward-bw @jhtimmins @kaxil @paolaperaza @pcandoalmeida @xinbinhuang
   
   Related issue: https://github.com/apache/airflow-site/issues/301
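
The "keep the data in YAML, generate Markdown from it" idea can be sketched as below. This is a minimal illustration, not the project's implementation: it assumes the YAML file has already been parsed into plain Python dictionaries, and the field names (`package`, `integrations`, `operators`, `hooks`) are hypothetical.

```python
# Hypothetical metadata, shaped the way it might look after parsing the proposed YAML file.
PROVIDERS = [
    {
        "package": "apache-airflow-providers-google",
        "integrations": [
            {
                "name": "Google Cloud Storage",
                "operators": ["airflow.providers.google.cloud.operators.gcs"],
                "hooks": ["airflow.providers.google.cloud.hooks.gcs"],
            },
            {
                "name": "BigQuery",
                "operators": ["airflow.providers.google.cloud.operators.bigquery"],
                "hooks": ["airflow.providers.google.cloud.hooks.bigquery"],
            },
        ],
    },
]


def render_markdown(providers):
    """Render one Markdown table per provider package from the shared metadata."""
    lines = []
    for provider in providers:
        lines.append(f"## {provider['package']}")
        lines.append("")
        lines.append("| Integration | Operators | Hooks |")
        lines.append("| --- | --- | --- |")
        for integration in provider["integrations"]:
            operators = ", ".join(f"`{m}`" for m in integration.get("operators", []))
            hooks = ", ".join(f"`{m}`" for m in integration.get("hooks", []))
            lines.append(f"| {integration['name']} | {operators} | {hooks} |")
        lines.append("")
    return "\n".join(lines)


markdown = render_markdown(PROVIDERS)
print(markdown)
```

Because the data lives in one structure, the same input could feed both the per-provider pages and a combined overview, which is the reuse the proposal is after.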





[GitHub] [airflow] ashb commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-731641602


   
   > 
   > > Now I even think that publishing on Github Releases might be easier for us as we won't have to provide credentials.
   > 
   > Yep - that's much better. Just remember this will only work from a master merge or workflow_run (the PR token is read-only).
   > 
   
   In my head I had us only publishing a new release on tags - but we could probably have a "latest" release too, for which we overwrite the release blob.
   
   And if you want to test out the theme built from a PR, we can use the upload-artifact action.





[GitHub] [airflow] mik-laj commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-731570401


   @potiuk The website and theme share some files; more specifically, you must have the site build output files to be able to build the theme. For this reason, moving the theme to a separate repository could be problematic.
   
   We can think about using PyPI, but if this is actually going to be for internal use only and we don't expect users to install this theme, I don't think we should make it easy to find. Publishing to a private package repository should not be a big problem for us.
   
   Now I even think that publishing on GitHub Releases might be easier for us, as we won't have to provide credentials.





[GitHub] [airflow] mik-laj edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-729621052


   I already have the first successful build of full documentation on S3:
   http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/
   For now, only the content for the Google provider is migrated, but we can migrate the rest of the content in a follow-up PR. Today I will open a PR with what I have managed to do so far.





[GitHub] [airflow] mik-laj edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-730688382


   Hello.
   
   Today I would like to discuss the next step - the Sphinx theme for our documentation. This theme is currently being developed in the [`airflow-site`](https://github.com/apache/airflow-site) repository, but the installable theme package is not published anywhere. Quite simply, if you want to build the production documentation, you have to install this theme on your own. This is reasonably OK if we only build documentation once every few months, but it is far from ideal.
   
   The production and development documentation look completely different. This means that if there is an error in the theme, we find out about it only after publishing the documentation, and any change is then much more difficult. This usually means that we have to edit each HTML file individually.
   
   I would like to improve this now: install the theme in Breeze and also provide a way to install it if you want to build the documentation locally. I would not like to publish this package on PyPI, so as not to clutter the public repository with packages that will not be used by other projects.
   
   I think the easiest way is to build the theme with GitHub Actions for airflow-site and then publish it to S3. Then we will be able to install the theme with the command:
   ```
   pip install airflow-sphinx-theme --extra-index-url https://apache-airflow-pypi.s3-website.eu-central-1.amazonaws.com/
   ```
   This looks like a simple task if we use [https://github.com/novemberfiveco/s3pypi](https://github.com/novemberfiveco/s3pypi).
   I was thinking about installing with pip+git:
   ```
   pip install git+https://github.com
   ```
   Unfortunately, this won't work as this theme has a complex build process. We must first build a website to generate the necessary artifacts to build a theme package.
   
   CC: @potiuk @ryw @kaxil 
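
For context, pip's `--extra-index-url` option expects a "simple" repository layout (PEP 503): one directory per normalized project name, containing an HTML page that links to the distribution files. The sketch below shows what such a page looks like. It illustrates the layout only and is not how s3pypi is implemented; the wheel file name is hypothetical.

```python
import html
import re


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def render_project_page(project: str, filenames) -> str:
    """Render the HTML page served at /<normalized-name>/ in a simple index."""
    links = "\n".join(
        f'    <a href="{html.escape(filename)}">{html.escape(filename)}</a><br>'
        for filename in sorted(filenames)
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "  <body>\n"
        f"    <h1>Links for {html.escape(normalize(project))}</h1>\n"
        f"{links}\n"
        "  </body>\n"
        "</html>\n"
    )


# Hypothetical wheel file name for the theme package.
page = render_project_page(
    "airflow_sphinx_theme",
    ["airflow_sphinx_theme-0.0.1-py3-none-any.whl"],
)
print(page)
```

Serving static pages like this from a bucket is what lets pip resolve the package by name and version instead of by a fixed URL.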










[GitHub] [airflow] kaxil commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-732291896


   > I have prepared a PR that publishes the theme package using GitHub Actions.
   > [apache/airflow-site#308](https://github.com/apache/airflow-site/pull/308)
   
   LGTM





[GitHub] [airflow] kaxil closed issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
kaxil closed issue #11423:
URL: https://github.com/apache/airflow/issues/11423


   





   





[GitHub] [airflow] potiuk edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-731567357


   Maybe we can do better than that. Why don't we create a separate repository "apache/airflow-doc-theme" and put the whole theme there? Then we can develop it separately and point to the tags/versions of the code (without even releasing it), the same way as we do with Airflow now:
   
   ```
   pip install https://github.com/apache/airflow-doc-theme/archive/<BRANCH_OR_TAG>.tar.gz#egg=apache-airflow-doc-theme
   ```
   
   This will run setup.py locally, to build the theme. But maybe this is not as complex and can be done? The benefit is that if we decide to move it to PyPI, we can publish pre-built binary themes there similarly to NumPy prebuilt packages (PyPI accepts different variants of releases).





[GitHub] [airflow] ashb edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
ashb edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-731104247


   Sounds great!
   
   Another option might be to publish it as a Release on Github, and then we could install it as
   
   ```
   pip install \
       https://github.com/apache/airflow-site/releases/download/1.0.0/apache_airflow_docs_theme-1.0.0-py3-none-any.whl
   ```
   
   (To test this I uploaded the artifacts to 2.0.0b3 on Airflow: https://github.com/apache/airflow/releases/tag/2.0.0b3)
   
   The advantage of using GitHub is that Actions already has credentials to create releases (I think?) and we then don't need to manage keys for S3. The disadvantage is that we could only point at specific releases, and couldn't do `airflow-docs-theme>=1.2.3`, for instance.
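
The trade-off is easier to see when the install URL is built programmatically. The sketch below assumes that the release tag equals the package version and that the theme is a pure-Python wheel; the helper names are hypothetical.

```python
def wheel_filename(distribution: str, version: str) -> str:
    """Build the file name of a pure-Python wheel (tags py3-none-any)."""
    # Wheel file names use underscores instead of hyphens in the distribution name.
    return f"{distribution.replace('-', '_')}-{version}-py3-none-any.whl"


def release_asset_url(repo: str, tag: str, filename: str) -> str:
    """URL of a file attached to the GitHub release identified by a tag."""
    return f"https://github.com/{repo}/releases/download/{tag}/{filename}"


# Assumption: the release tag is the same string as the package version.
version = "1.0.0"
url = release_asset_url(
    "apache/airflow-site",
    version,
    wheel_filename("apache-airflow-docs-theme", version),
)
print(f"pip install {url}")
```

A direct URL always names exactly one file, so the caller has to choose the version up front. A package index, by contrast, lets pip pick a matching version from a range.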





[GitHub] [airflow] mik-laj edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-727163761


   > In the short term, is the idea to build this into the airflow website next to the other docs? Trying to think what is simplest to get v1 out there. Do we want to provide versioned docs for each provider, I don't think so - just "latest"
   
   We update providers very often, so I think it's worth splitting up these docs as soon as possible. If we are going to publish these documents, we must also make it possible to view archived versions of the documentation, mainly so that users can check whether a given operator is available in a given version or whether they need to update to the latest version.
   
   > We could have sublinks across the top "Airflow" and "Airflow Providers" as a way to navigate to this providers docs?
   
   I would like us to have an index (at the address: https://airflow.apache.org/docs/ ) that will describe all the products we release. For now, my focus is only on Airflow core and the provider packages, but in the future we may add documentation for the rest of the products we release.
   https://github.com/apache/airflow/issues/11152
   
   When the user selects a product, they get a view similar to:
   https://airflow.apache.org/docs/stable/#
   However, there will be some differences for providers: 
   - The search will work for content from the current product and version.
   - The title/breadcrumbs will contain information about the name of the package.
   
   > It does not have to be there for "Day 0" - like when we release 2.0.0 and set of 1.0.0 providers, it can be "no version" but very soon after we have to support versions. And our tooling has to be prepared for that (and have it automated), because keeping it manually updated will be impossible.
   
   I think we should be prepared with the documentation for "Day 0". Otherwise we will have mixed content for different products and versions in one documentation. However, this documentation will not be easily updated, e.g. links will still point to out-of-date documents.
   
   If we do not split the documentation, we will have problems publishing some documents on "Day 0", e.g. the changelog for provider packages.










[GitHub] [airflow] mik-laj commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-732288060


   I have prepared a PR that publishes the theme package using GitHub Actions.
   https://github.com/apache/airflow-site/pull/308










[GitHub] [airflow] mik-laj commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-727164603


   During the split, I would also like to introduce one additional change - migrate the development version of the documentation to the official template. Now all contributors are using documentation that has a different template and sometimes the final documentation is buggy as a result. If everyone used one template, the bugs would be fixed faster.










[GitHub] [airflow] potiuk edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-731568434


   And we could combine both: keeping the theme in a separate repo and making it available as a release as well.
   
   Also - releasing it to PyPI is super easy. I think we should also consider simply releasing it via PyPI. Once we have the right set of artifacts, it is as easy as running "twine upload".
   
   I am not sure why we excluded that so easily. Is there any problem with that, @mik-laj, since this is the standard way of distributing packages? To be honest, I do not think there is any "clutter" of any kind there if it makes our life easier.





[GitHub] [airflow] mik-laj edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-740121141


   Last PR: https://github.com/apache/airflow/pull/12892
   
   It would be nice to have this merged before the RC1 release, but it doesn't affect end users, so we can also merge it later and do one more release manually.





[GitHub] [airflow] mik-laj commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-731572504


   I am also wondering whether publishing on PyPI will oblige us to meet some release requirements. Any user will be able to easily find this theme and install it. If we use a private repository, the files will be available only to the developers of this project. Ideally, we would also be able to configure the full CI/CD so that the package is available without any intervention from us. Even a small extra manual step to publish the artifact would be a pain for us.





[GitHub] [airflow] potiuk commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-731567357


   We can do better than that. Why don't we create a separate repository "apache/airflow-doc-theme" and put the whole theme there? Then we can develop it separately and point to tags/versions of the code (without even releasing it), the same way as we do with airflow now:
   
   ```
   pip install https://github.com/apache/airflow-doc-theme/<BRANCH_OR_TAG>.tar.gz#egg=apache-airflow-doc-theme
   ```





[GitHub] [airflow] ashb commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-731104247


   Sounds great!
   
   Another option might be to publish it as a Release on Github, and then we could install it as
   
   ```
   pip install https://github.com/apache/airflow-site/releases/download/1.0.0/apache_airflow_docs_theme-1.0.0-py3-none-any.whl
   ```
   
   (To test this I uploaded the artifacts to 2.0.0b3 on Airflow: https://github.com/apache/airflow/releases/tag/2.0.0b3)
   
   The advantage of using GitHub is that Actions already has credentials to create releases (I think?), so we don't need to manage keys for S3. The disadvantage is that we could only point at specific releases, and couldn't do `airflow-docs-theme>=1.2.3`, for instance.
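
A minimal sketch of what pinning to a release asset means in practice. The helper name `release_wheel_url` is hypothetical, and the repository, tag and package name are modelled on the command above. The URL encodes one exact version, so pip has no index to resolve a version range against.

```python
def release_wheel_url(repo: str, tag: str, dist_name: str, version: str) -> str:
    """Build the download URL of a pure-Python wheel attached to a GitHub Release."""
    # Wheel file names use underscores in place of hyphens in the distribution name.
    wheel_name = f"{dist_name.replace('-', '_')}-{version}-py3-none-any.whl"
    return f"https://github.com/{repo}/releases/download/{tag}/{wheel_name}"


url = release_wheel_url("apache/airflow-site", "1.0.0", "apache-airflow-docs-theme", "1.0.0")
print(f"pip install {url}")
```

Moving to a newer theme then means changing the tag and version in this URL by hand (or in CI), rather than relying on a specifier such as `>=1.2.3`.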





[GitHub] [airflow] mik-laj commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-729621052


   I already have the first successful build of full documentation on S3:
   http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/
   For now, only the Google content is migrated, but in a follow-up PR we can migrate the rest. Today I will open a PR with what I have managed to do so far.





[GitHub] [airflow] potiuk edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-724280904


   I think eventually we might need a doc per provider version. We can fully automate it - once we automate it for "latest" it will be almost no effort to automate it for "per-provider-version". And it would be rather confusing for people to look at a provider's docs for the latest version while they are using another one.
   
   Just the fact that we agreed to Semver and agreed that we might have breaking changes pretty much implies that we need to have "per-version" documentation. Imagine we have version 1.0.0 of the Google provider and then we introduce 2.0.0 with breaking changes (for example after we migrate to the Google 2.0 Python APIs). We need to provide docs for both versions for quite some time. And it's even likely that we will release a 1.0.1 Google provider with bugfixes for 1.0.0 (though this still waits for #11425 to be completed).
   
   I think we have no choice but to implement all of it, including the possibility of choosing version per provider - this IMHO is pretty much sealed when we agreed to allow for breaking changes for each provider. And it's not even difficult - we can (and should) fully automate it.
   
   It does not have to be there for "Day 0" - when we release 2.0.0 and the set of 1.0.0 providers, it can be "no version", but very soon after that we have to support versions. And our tooling has to be prepared for that (and have it automated), because keeping it manually updated will be impossible.
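
To make the "per-provider-version" idea concrete, here is a small sketch of how tooling might enumerate the documentation directories to publish. Everything in it is an assumption for illustration: the `docs/<package>/<version>/` layout, the `stable` alias, the package name and the function names are not taken from the discussion, and `parse_version` only handles plain numeric versions such as `1.0.1`.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.0.1' into (1, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def docs_paths(releases: Dict[str, List[str]]) -> List[str]:
    """Return one docs directory per released version, plus a 'stable' alias."""
    paths = []
    for package, versions in sorted(releases.items()):
        ordered = sorted(versions, key=parse_version)
        for version in ordered:
            paths.append(f"docs/{package}/{version}/")
        # 'stable' points at the newest release of that provider (assumed convention).
        paths.append(f"docs/{package}/stable/ -> {ordered[-1]}")
    return paths


# The scenario from the comment: 1.0.0, a breaking 2.0.0, and a 1.0.1 bugfix release.
releases = {"apache-airflow-providers-google": ["1.0.0", "2.0.0", "1.0.1"]}
for path in docs_paths(releases):
    print(path)
```

With a list like this generated from the released versions, adding a new provider release would not require any manual change to the docs tooling.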





[GitHub] [airflow] potiuk commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-731620568


   > @potiuk The website and theme will share some files, more specifically you must have the site build output files to be able to build the theme. For this reason, moving this theme to a separate repository could be problematic.
   
   I see. Fine for me.
   
   > Now I even think that publishing on Github Releases might be easier for us as we won't have to provide credentials.
   
   Yep - that's much better. Just remember this will only work from a master merge or workflow_run (the PR token is read-only).
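
A sketch of how a publishing step could guard itself according to the rule above. The function name `can_publish_release` is hypothetical; `GITHUB_EVENT_NAME` and `GITHUB_REF` are environment variables that GitHub Actions sets for every run. Treating a merge to master as a `push` event on `refs/heads/master` is an assumption about how the workflow is triggered.

```python
import os


def can_publish_release(event_name: str, ref: str) -> bool:
    """Return True when the run is expected to hold a write-capable token."""
    if event_name == "workflow_run":
        return True
    # A merge to master shows up as a push to the master branch.
    return event_name == "push" and ref == "refs/heads/master"


event = os.environ.get("GITHUB_EVENT_NAME", "")
ref = os.environ.get("GITHUB_REF", "")
print("publish" if can_publish_release(event, ref) else "skip publishing")
```

Skipping the upload explicitly on pull requests keeps the build green there instead of failing on a permissions error.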
   





[GitHub] [airflow] potiuk commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-706744810


   Hey @mik-laj @kaxil - I guess it would be great to separate out the docs per provider - both generation and publishing. This would only be needed when we go to 2.0, so we have quite some time for that, but I thought you might be the best people to take care of it :)





[GitHub] [airflow] potiuk commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-727889593


   This is cool! And yeah, if we can make it split from day 0, I'd really love that!





[GitHub] [airflow] mik-laj edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-730688382


   Hello.
   
   Today I would like to discuss the next step - the Sphinx theme for our documentation. This theme is currently being developed in the [`airflow-site`](https://github.com/apache/airflow-site) repository, but the installable theme package is not published anywhere. Quite simply, if you want to build the production documentation, you have to install this theme on your own. This is reasonably OK if we only build documentation once every few months, but it is far from ideal.
   
   The production and development documentation look completely different. This means that if there is an error in the theme, we find out about it after publishing the documentation, and any change is then much more difficult. This usually means that we have to edit each HTML file individually.
   
   I would like to improve this now and install the theme in Breeze, and also provide a way to install this theme if you want to build documentation locally. I would not like to publish this package on PyPI, so as not to clutter the public repository with packages that will not be used by other projects.
   
   I think the easiest way is to build the theme in a GitHub Action for airflow-site and then publish the theme to S3. Then we will be able to install the theme with the command:
   ```
   pip install airflow-sphinx-theme --extra-index-url https://apache-airflow-pypi.s3-website.eu-central-1.amazonaws.com/
   ```
   This looks like a simple task if we use [s3pypi](https://github.com/novemberfiveco/s3pypi).
   I was thinking about installing with pip+git:
   ```
   pip install git+https://github.com
   ```
   Unfortunately, this won't work as this theme has a complex build process. We must first build a website to generate the necessary artifacts to build a theme package.
   
   CC: @potiuk @ryw @kaxil 





[GitHub] [airflow] ryw commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
ryw commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-724300717


   Agree - we should ship v1 as "universal docs" since it's the first release for all the providers, but we'll have to address the problem pretty soon as providers start to independently update + release.





[GitHub] [airflow] ryw commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
ryw commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-724227934


   Hi @mik-laj - I reviewed this and chatted w/ @kaxil today, looks good structurally for v1.
   
   In the short term, is the idea to build this into the airflow website next to the other docs? Trying to think what is simplest to get v1 out there. Do we want to provide versioned docs for each provider? I don't think so - just "latest".
   
   We could have sublinks across the top, "Airflow" and "Airflow Providers", as a way to navigate to the providers docs?
   
   Happy to jump on a call to brainstorm.





[GitHub] [airflow] mik-laj edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-722226813


   I worked on this ticket yesterday / today and managed to build documentation for the providers packages.
   https://wicked-army.surge.sh/
   I haven't migrated all the content yet, but the most difficult case - the Google package - has been fully migrated, along with reference documentation for the Python API and configuration.
   https://wicked-army.surge.sh/google/html/index.html
   
   There are two more serious issues that need to be discussed.
   1. **ReadTheDocs**: Unfortunately, we will have to stop using ReadTheDocs to build the documentation. It doesn't allow you to run your own build scripts, and we can only have one documentation set per repository. Besides, it causes various problems over which we have little control. For example, the Python API reference documentation currently does not build properly - https://airflow.readthedocs.io/en/latest/_api/index.html
   We will probably be able to solve this problem quickly if we receive financial support for CI. Then we will be able to build the documentation ourselves and publish it on S3/GCS or elsewhere.
   
   2. **Operators and hooks reference**: This page has information from all providers, so it is not possible to divide it.
   https://airflow.readthedocs.io/en/latest/operators-and-hooks-ref.html
   In its present form, it cannot remain if we want to have separate docs per provider. I propose that we maintain the same information in a YAML file and then reuse it as needed.
   For development purposes, we can generate a markdown file which we will store in the repository.
   For production/website, we can also display this data as markdown on the website, or ... build a complex interface similar to [Terraform Registry](https://registry.terraform.io/browse/providers?tier=official%2Cpartner). It could be fairly simple if we have all the data in YAML and a contributor with React experience.
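
As a rough illustration of the second point, the sketch below renders a combined reference table from per-provider metadata. The data shape, field names and module names are all hypothetical placeholders, and the data is written as a Python literal standing in for what would be loaded from the YAML file, so that the example has no third-party dependencies.

```python
from typing import Dict, List

# Hypothetical shape of the metadata after loading it from YAML.
PROVIDERS: List[Dict] = [
    {
        "package": "google",
        "integrations": [
            {"name": "BigQuery", "hooks": ["bigquery"], "operators": ["bigquery"]},
            {"name": "Cloud Storage", "hooks": ["gcs"], "operators": ["gcs"]},
        ],
    },
]


def render_reference(providers: List[Dict]) -> str:
    """Render the combined operators-and-hooks reference as a Markdown table."""
    lines = ["| Provider | Integration | Hooks | Operators |", "| --- | --- | --- | --- |"]
    for provider in providers:
        for item in provider["integrations"]:
            hooks = ", ".join(item["hooks"])
            operators = ", ".join(item["operators"])
            lines.append(f"| {provider['package']} | {item['name']} | {hooks} | {operators} |")
    return "\n".join(lines)


print(render_reference(PROVIDERS))
```

The same data could feed both the markdown file kept in the repository and a richer website view, since neither depends on the per-provider Sphinx builds.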
   
   CC: @ryw @potiuk @iadi7ya @francescomucio @jward-bw @jhtimmins @kaxil @paolaperaza @pcandoalmeida @xinbinhuang
   
   Related issue: https://github.com/apache/airflow-site/issues/301





[GitHub] [airflow] potiuk commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-722281789


   Will take a look shortly :)





[GitHub] [airflow] potiuk edited a comment on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-731567357


   We can do better than that. Why don't we create a separate repository "apache/airflow-doc-theme" and put the whole theme there? Then we can develop it separately and point to tags/versions of the code (without even releasing it), the same way as we do with airflow now:
   
   ```
   pip install https://github.com/apache/airflow-doc-theme/<BRANCH_OR_TAG>.tar.gz#egg=apache-airflow-doc-theme
   ```





[GitHub] [airflow] mik-laj commented on issue #11423: Separate out documentation building and publishing per provider

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #11423:
URL: https://github.com/apache/airflow/issues/11423#issuecomment-740121141


   Last PR: https://github.com/apache/airflow/pull/12892

