Posted to dev@calcite.apache.org by Fan Liya <li...@gmail.com> on 2022/03/25 04:03:54 UTC

[DISCUSS] Best practice for synchronizing master and site branches

Hi all,

As part of the release process, we need to synchronize the master and
site branches (please see
https://calcite.apache.org/docs/howto.html#making-a-release-candidate).
Usually, the site branch is behind the master branch by some commits.
If the existing commits in the site branch are in the same order as in
the master branch, the task is easy: just switch to the site branch,
and run

git rebase master

However, if some commits are in a different order, it can be tricky.
For example, the master branch may have the following commits (in
order):

A, B, X1, X2, ... , Xn.

and the site branch may have the following commits (in order):

B, A, X1, X2.

Basically we have two choices:

1. We can live with the out-of-order commits, because after
cherry-picking commits X3, X4, ... , Xn to the site branch, the file
contents will be consistent.

The problem is that, since the two branches have diverged, we cannot
use the rebase command. Instead, we have to cherry-pick the commits
one by one, which takes considerable effort. In addition, every
subsequent release will require the same manual cherry-picking.

2. We can make the commit order consistent, which will make
subsequent releases easy.
However, making the commit order consistent requires a git force
push, which carries some risk.
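
To make the two situations concrete, here is a small Python sketch. It is
a toy model, not part of the Calcite tooling: commits are identified by
the labels used above rather than by real hashes, n is assumed to be 5,
and the function names are hypothetical.

```python
from typing import List, Sequence


def is_prefix(site: Sequence[str], master: Sequence[str]) -> bool:
    """Return True when site's history is an exact leading slice of master's."""
    return len(site) <= len(master) and list(master[:len(site)]) == list(site)


def commits_to_cherry_pick(site: Sequence[str], master: Sequence[str]) -> List[str]:
    """Return the commits on master that site lacks, in master order."""
    seen = set(site)
    return [commit for commit in master if commit not in seen]


# The example from this message, with n = 5 (an assumption for illustration).
master = ["A", "B", "X1", "X2", "X3", "X4", "X5"]
site = ["B", "A", "X1", "X2"]

# Same order: site is a leading slice of master, so this is the easy case.
print(is_prefix(["A", "B", "X1", "X2"], master))

# Different order: the branches have diverged in this model.
print(is_prefix(site, master))

# Choice 1 still works because only X3..X5 are missing from site.
print(commits_to_cherry_pick(site, master))
```

In this model, choice 1 leaves site as B, A, X1, ..., X5, so the same
check fails again at the next release; choice 2 rewrites site so that the
prefix check passes from then on.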

So what is the recommended way to do this? Thanks in advance for your feedback!

Best,
Liya Fan

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Francis Chuang <fr...@apache.org>.
Infra has added the GitHub Actions secret to push to the calcite-site
repo [1].

Let's keep the process we have at the moment, but automate it. I'll 
start by making test repos mirroring the 4 calcite-* repos on my account 
to test the automation.

In the meantime, can members of the community let me know which
parts/pages of the Calcite and Avatica websites should only be published
after a release? This is to prevent documentation updates for unreleased 
versions from being published before a release.

Francis

[1] https://issues.apache.org/jira/browse/INFRA-23044

On 30/03/2022 4:21 am, Julian Hyde wrote:
> I have never needed or wanted a versioned Javadoc URL for Calcite. Our APIs tend to grow over time.
> 
> The only requirement I see is that we don’t pollute the javadoc/doc of the latest released version with things that are not yet released. Which would lead to two versions: latest release and head.
> 
> I can see that the implementation might be simpler if we have multiple versions, but let’s be clear that that is not the requirement.
> 
> Julian
> 
> 
>> On Mar 29, 2022, at 6:49 AM, Fan Liya <li...@gmail.com> wrote:
>>
>> I think it is a good idea to provide versioned JavaDocs.
>>
>> However, even if we only provide the JavaDoc of the latest release,
>> there is no need to maintain two branches (IMHO),
>> because the processes of updating the website and JavaDoc are
>> relatively separate processes (according to [1]).
>> With a single branch, it is feasible to update the website regularly,
>> and update JavaDocs only at release times.
>>
>> Best,
>> Liya Fan
>>
>> [1] https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
>>
>> On Tue, 29 Mar 2022 at 17:59, Alessandro Solimando <al...@gmail.com> wrote:
>>>
>>> Hello everyone,
>>> I totally agree on automating the website publication and having a single
>>> branch: the less we do manually, the lower the chance of messing something up.
>>>
>>> I am also in favour of versioned docs on the website; it's confusing to
>>> land on updated pages from an older context like a message from the ML.
>>>
>>> Best regards,
>>> Alessandro
>>>
>>> On Tue, 29 Mar 2022 at 06:44, Francis Chuang <fr...@apache.org>
>>> wrote:
>>>
>>>> Hey Julian,
>>>>
>>>> All very good points. I can definitely see the utility of the javadocs.
>>>> The analogue in Go would be godoc, with the difference being that the
>>>> godoc server automatically crawls the code across all versions to
>>>> generate the documentation.
>>>>
>>>> As an example, see the godoc for protobuf [1]. There is a version
>>>> selector on the top left to look at the documentation for different
>>>> versions of the module / library in question.
>>>>
>>>> You mentioned that you do not want to have a version string in the URL.
>>>> Is there any particular reason for this? For example, if I were to end
>>>> up on the mailing list archives through a google search and there's a
>>>> message linking to the javadoc, it might be more helpful if the javadoc
>>>> was linked to a particular version of the release so that the context
>>>> around the discussion at the time makes more sense.
>>>>
>>>> We can have all javadocs for all releases of Calcite published and have
>>>> a selector to jump between versions, similar to godoc, for example, like
>>>> this javadoc for google cloud with a version selector on the bottom
>>>> right [2]. This would allow users to switch between different versions
>>>> and look at the version of the javadoc that's currently being used in
>>>> their project.
>>>>
>>>> Regarding the documentation on the website itself, would it make sense
>>>> if we have a versioned copy for each release? Currently, we only publish
>>>> the documentation for the latest release, so, if we were to look at
>>>> older messages from the mailing list and follow a link to the
>>>> documentation, the documentation could be incorrect or not relevant to
>>>> the message itself.
>>>>
>>>> Maybe we can have a folder for each release? For example:
>>>> -
>>>> calcite.apache.org/docs/1.30.0/adapter.html#jdbc-connect-string-parameters
>>>> -
>>>> calcite.apache.org/docs/1.29.0/adapter.html#jdbc-connect-string-parameters
>>>>
>>>> This would give each release its own documentation with a unique path.
>>>> For the current unreleased version, we can still put it under the
>>>> version of the next release:
>>>> calcite.apache.org/docs/1.31.0/adapter.html#jdbc-connect-string-parameters
>>>> and
>>>> maybe have a message that says this is an unreleased version like
>>>> elasticsearch [3]. Links to this release's javadoc would work before and
>>>> after the release and would never break.
>>>>
>>>> The upside to this approach is that all documentation (even the
>>>> unreleased version) is published immediately, but it is versioned, so
>>>> there is no confusion. It also means that users of Calcite master would
>>>> be able to look at the docs online. This also simplifies the deployment
>>>> of site as we no longer need the site branch: the website can just be
>>>> built from master.
>>>>
>>>> Francis
>>>>
>>>> [1] https://pkg.go.dev/google.golang.org/protobuf
>>>> [2] https://googleapis.dev/java/google-cloud-asset/latest/index.html
>>>> [3] https://www.elastic.co/guide/en/elastic-stack/master/index.html
>>>>
> 

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Francis Chuang <fr...@apache.org>.
I have implemented automatic site builds for Calcite in a test repo.

See:
- https://github.com/F21/calcite-test/blob/master/.github/workflows/publish-non-release-website-updates.yml
- https://github.com/F21/calcite-test/blob/master/.github/workflows/publish-website-on-release.yml

For the automation that does the cherry-picking to site, I used the 
following rules: 
https://github.com/F21/calcite-test/blob/master/.github/workflows/publish-non-release-website-updates.yml#L7

Community members, please review the rules to see if they are sufficient 
for our use-case.

The rules are implemented as follows:
- Any change to a file under the site folder triggers a cherry-pick to
site and a site build.
- If the change is to a file under site/_docs, do not trigger a
cherry-pick and do not build.
- If the change is to site/_docs/powered_by.md, trigger a cherry-pick
and a site build.

The order of the rules matters because of how GitHub Actions evaluates
include and exclude path patterns:
https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#example-including-and-excluding-paths
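
To illustrate why the order matters, here is a minimal Python sketch of
"last matching pattern wins" evaluation. It is only a model: the RULES
list is a hypothetical rendering of the three rules above (the real
patterns are in the linked workflow file), the function names are made
up, and Python's fnmatch only approximates the glob syntax that GitHub
Actions uses.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Sequence

# Hypothetical patterns mirroring the three rules; "!" marks an exclusion.
RULES = ["site/**", "!site/_docs/**", "site/_docs/powered_by.md"]


def is_included(path: str, rules: Sequence[str] = RULES) -> bool:
    """Apply the rules in order; the last rule that matches the path decides."""
    included = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if fnmatchcase(path, pattern):
            included = not negated
    return included


def should_trigger(changed_files: Iterable[str], rules: Sequence[str] = RULES) -> bool:
    """Trigger when at least one changed file is still included after all rules."""
    return any(is_included(path, rules) for path in changed_files)


for path in ["site/index.md", "site/_docs/howto.md", "site/_docs/powered_by.md"]:
    print(path, is_included(path))
```

If the exclusion were moved to the end of the list, powered_by.md would
be excluded again, which is why the exception has to come last.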

Francis



Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Francis Chuang <fr...@apache.org>.
Thanks for the feedback, everyone. I'll add comments and improve the
structure to make things clearer and more readable. I will also land
similar changes in calcite-avatica and calcite-avatica-go soon so that
we can get the first iteration out.

Francis

On 31/03/2022 6:22 am, Julian Hyde wrote:
> I had a quick look and it looks like clean, well-thought-out code. I couldn’t figure out what it was doing at a high level (e.g. what the generated URLs would look like), so I think some high-level comments would help.
> 
> +1
> 
> Thanks for your excellent work, Francis.
> 
> Julian
> 
> 
>> On Mar 30, 2022, at 11:54 AM, Stamatis Zampetakis <za...@gmail.com> wrote:
>>
>> Hi Francis,
>>
>> I went over the workflows and rules and everything looks good to me. Great
>> work!
>>
>> I'm +1 on merging this to master and I am OK with your suggestions about
>> Avatica.
>>
>> Thanks a lot for moving this forward. It will certainly save us a lot
>> of time in the future.
>>
>> When this goes in we also need to update our documentation. It will be a
>> good opportunity to test that everything is working properly.
>>
>> Best,
>> Stamatis
>>
>> On Wed, Mar 30, 2022, 7:50 AM Francis Chuang <fr...@apache.org>
>> wrote:
>>
>>> Forgot to mention in my last message, but I am now implementing the
>>> automation for calcite-avatica and calcite-avatica-go.
>>>
>>> For those 2 repos, we never used a site branch as we usually push the
>>> site after a release. If there are any small updates to the site
>>> after the release, we just build from master and push it, as
>>> there are usually no unreleased updates to the docs because Avatica
>>> does not get many updates. The same applies to avatica-go.
>>>
>>> Therefore, for calcite-avatica and calcite-avatica-go, I plan to:
>>> - Always build from master if there's an update to site.
>>> - For a release, build from master and build the javadocs and publish.
>>>
>>> I think this should be sufficient for our use-case for now and should
>>> improve the release process and site publishing process significantly.
>>> If we find edge cases in the future, we can deal with those at a later
>>> time.
>>>
>>> Please let me know what you guys think.
>>>
>>> Francis
>>>

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Julian Hyde <jh...@gmail.com>.
I had a quick look and it looks like clean, well-thought-out code. I couldn’t figure out what it was doing at a high level (e.g. what the generated URLs would look like), so I think some high-level comments would help.

+1

Thanks for your excellent work, Francis.

Julian



Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Stamatis Zampetakis <za...@gmail.com>.
Hi Francis,

I went over the workflows and rules and everything looks good to me. Great
work!

I'm +1 on merging this to master and I am OK with your suggestions about
Avatica.

Thanks a lot for moving this forward. It will certainly save us a lot
of time in the future.

When this goes in we also need to update our documentation. It will be a
good opportunity to test that everything is working properly.

Best,
Stamatis

> >>>> [3] https://www.elastic.co/guide/en/elastic-stack/master/index.html
> >>>>
> >
>

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Francis Chuang <fr...@apache.org>.
Forgot to mention in my last message, but I am now implementing the 
automation for calcite-avatica and calcite-avatica-go.

For those 2 repos, we never used a site branch as we usually push the 
site after a release. If there are any small updates to the site that 
occur after the release, we just build from master and push, as 
there are usually no unreleased updates to the docs because avatica does 
not get many updates. The same is true for avatica-go.

Therefore, for calcite-avatica and calcite-avatica-go, I plan to:
- Always build from master if there's an update to site.
- For a release, build from master and build the javadocs and publish.
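
As a rough illustration of the two rules above (not the actual workflow, 
which is still being written), the decision could be sketched in Python 
as follows. The function name plan_site_build, the "site/" path prefix 
and the step names are assumptions made for this example only.

```python
def plan_site_build(changed_paths, is_release):
    """Return the list of build steps for a push to master.

    Hypothetical helper: the "site/" prefix and the step names are
    illustrative assumptions, not taken from the real repositories.
    """
    steps = []
    if is_release:
        # For a release: build the site from master, build the javadocs, publish.
        steps = ["build-site", "build-javadoc", "publish"]
    elif any(path.startswith("site/") for path in changed_paths):
        # Outside a release: only rebuild when the site itself changed.
        steps = ["build-site", "publish"]
    return steps
```

A push that only touches code outside the site directory yields an empty 
list, so nothing would be published between releases unless the site 
itself changes.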

I think this should be sufficient for our use case for now and should 
improve the release and site publishing processes significantly. 
If we find edge cases in the future, we can deal with them at a later time.

Please let me know what you guys think.

Francis

On 30/03/2022 4:21 am, Julian Hyde wrote:
> I have never needed or wanted a versioned Javadoc URL for Calcite. Our APIs tend to grow over time.
> 
> The only requirement I see is that we don’t pollute the javadoc/doc of the latest released version with things that are not yet released. Which would lead to two versions: latest release and head.
> 
> I can see that the implementation might be simpler if we have multiple versions, but let’s be clear that that is not the requirement.
> 
> Julian
> 
> 
>> On Mar 29, 2022, at 6:49 AM, Fan Liya <li...@gmail.com> wrote:
>>
>> I think it is a good idea to provide versioned JavaDocs.
>>
>> However, even if we only provide the JavaDoc of the latest release,
>> there is no need to maintain two branches (IMHO),
>> because the processes of updating the website and JavaDoc are
>> relatively separate processes (according to [1]).
>> With a single branch, it is feasible to update the website regularly,
>> and update JavaDocs only at release times.
>>
>> Best,
>> Liya Fan
>>
>> [1] https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
>>
>> Alessandro Solimando <al...@gmail.com> wrote on Tue, 29 Mar 2022 at 17:59:
>>>
>>> Hello everyone,
>>> I totally agree on automating the website publication and having a single
>>> branch, the less we do manually, the lower the chances to mess something up.
>>>
>>> I am also in favour of versioned docs in the website, it's confusing to
>>> land on updated pages from an older context like a message from the ML.
>>>
>>> Best regards,
>>> Alessandro
>>>
>>> On Tue, 29 Mar 2022 at 06:44, Francis Chuang <fr...@apache.org>
>>> wrote:
>>>
>>>> Hey Julian,
>>>>
>>>> All very good points. I can definitely see the utility of the javadocs.
>>>> The analogue in Go would be godoc, with the difference being that the
>>>> godoc server automatically crawls the code across all versions to
>>>> generate the documentation.
>>>>
>>>> As an example, see the godoc for protobuf [1]. There is a version
>>>> selector on the top left to look at the documentation for different
>>>> versions of the module / library in question.
>>>>
>>>> You mentioned that you do not want to have a version string in the URL.
>>>> Is there any particular reason for this? For example, if I were to end
>>>> up on the mailing list archives through a google search and there's a
>>>> message linking to the javadoc, it might be more helpful if the javadoc
>>>> was linked to a particular version of the release so that the context
>>>> around the discussion at the time makes more sense.
>>>>
>>>> We can have all javadocs for all releases of Calcite published and have
>>>> a selector to jump between versions, similar to godoc, for example, like
>>>> this javadoc for google cloud with a version selector on the bottom
>>>> right [2]. This would allow users to switch between different versions
>>>> and look at the version of the javadoc that's currently being used in
>>>> their project.
>>>>
>>>> Regarding the documentation on the website itself, would it make sense
>>>> if we have a versioned copy for each release? Currently, we only publish
>>>> the documentation for the latest release, so, if we were to look at
>>>> older messages from the mailing list and follow a link to the
>>>> documentation, the documentation could be incorrect or not relevant to
>>>> the message itself.
>>>>
>>>> Maybe we can have a folder for each release? For example:
>>>> -
>>>> calcite.apache.org/docs/1.30.0/adapter.html#jdbc-connect-string-parameters
>>>> -
>>>> calcite.apache.org/docs/1.29.0/adapter.html#jdbc-connect-string-parameters
>>>>
>>>> This would give each release their own documentation with a unique path.
>>>> For the current unreleased version, we can still put it in version of
>>>> the next release:
>>>> calcite.apache.org/docs/1.31.0/adapter.html#jbc-connect-string-parameters
>>>> and
>>>> maybe have a message that says this is an unreleased version like
>>>> elasticsearch [3]. Links to this release's javadoc would work before and
>>>> after the release and would never break.
>>>>
>>>> The upside to this approach is that all documentation (even the
>>>> unreleased version) is published immediately, but they are versioned, so
>>>> there is no confusion. It also means that users of Calcite master would
>>>> be able to look at the docs online. This also simplifies the deployment
>>>> of site as we no longer need the site branch: the website can just be
>>>> built from master.
>>>>
>>>> Francis
>>>>
>>>> [1] https://pkg.go.dev/google.golang.org/protobuf
>>>> [2] https://googleapis.dev/java/google-cloud-asset/latest/index.html
>>>> [3] https://www.elastic.co/guide/en/elastic-stack/master/index.html
>>>>
> 

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Julian Hyde <jh...@gmail.com>.
I have never needed or wanted a versioned Javadoc URL for Calcite. Our APIs tend to grow over time.

The only requirement I see is that we don’t pollute the javadoc/doc of the latest released version with things that are not yet released. Which would lead to two versions: latest release and head.

I can see that the implementation might be simpler if we have multiple versions, but let’s be clear that that is not the requirement.

Julian


> On Mar 29, 2022, at 6:49 AM, Fan Liya <li...@gmail.com> wrote:
> 
> I think it is a good idea to provide versioned JavaDocs.
> 
> However, even if we only provide the JavaDoc of the latest release,
> there is no need to maintain two branches (IMHO),
> because the processes of updating the website and JavaDoc are
> relatively separate processes (according to [1]).
> With a single branch, it is feasible to update the website regularly,
> and update JavaDocs only at release times.
> 
> Best,
> Liya Fan
> 
> [1] https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
> 
> Alessandro Solimando <al...@gmail.com> wrote on Tue, 29 Mar 2022 at 17:59:
>> 
>> Hello everyone,
>> I totally agree on automating the website publication and having a single
>> branch, the less we do manually, the lower the chances to mess something up.
>> 
>> I am also in favour of versioned docs in the website, it's confusing to
>> land on updated pages from an older context like a message from the ML.
>> 
>> Best regards,
>> Alessandro
>> 
>> On Tue, 29 Mar 2022 at 06:44, Francis Chuang <fr...@apache.org>
>> wrote:
>> 
>>> Hey Julian,
>>> 
>>> All very good points. I can definitely see the utility of the javadocs.
>>> The analogue in Go would be godoc, with the difference being that the
>>> godoc server automatically crawls the code across all versions to
>>> generate the documentation.
>>> 
>>> As an example, see the godoc for protobuf [1]. There is a version
>>> selector on the top left to look at the documentation for different
>>> versions of the module / library in question.
>>> 
>>> You mentioned that you do not want to have a version string in the URL.
>>> Is there any particular reason for this? For example, if I were to end
>>> up on the mailing list archives through a google search and there's a
>>> message linking to the javadoc, it might be more helpful if the javadoc
>>> was linked to a particular version of the release so that the context
>>> around the discussion at the time makes more sense.
>>> 
>>> We can have all javadocs for all releases of Calcite published and have
>>> a selector to jump between versions, similar to godoc, for example, like
>>> this javadoc for google cloud with a version selector on the bottom
>>> right [2]. This would allow users to switch between different versions
>>> and look at the version of the javadoc that's currently being used in
>>> their project.
>>> 
>>> Regarding the documentation on the website itself, would it make sense
>>> if we have a versioned copy for each release? Currently, we only publish
>>> the documentation for the latest release, so, if we were to look at
>>> older messages from the mailing list and follow a link to the
>>> documentation, the documentation could be incorrect or not relevant to
>>> the message itself.
>>> 
>>> Maybe we can have a folder for each release? For example:
>>> -
>>> calcite.apache.org/docs/1.30.0/adapter.html#jdbc-connect-string-parameters
>>> -
>>> calcite.apache.org/docs/1.29.0/adapter.html#jdbc-connect-string-parameters
>>> 
>>> This would give each release their own documentation with a unique path.
>>> For the current unreleased version, we can still put it in version of
>>> the next release:
>>> calcite.apache.org/docs/1.31.0/adapter.html#jbc-connect-string-parameters
>>> and
>>> maybe have a message that says this is an unreleased version like
>>> elasticsearch [3]. Links to this release's javadoc would work before and
>>> after the release and would never break.
>>> 
>>> The upside to this approach is that all documentation (even the
>>> unreleased version) is published immediately, but they are versioned, so
>>> there is no confusion. It also means that users of Calcite master would
>>> be able to look at the docs online. This also simplifies the deployment
>>> of site as we no longer need the site branch: the website can just be
>>> built from master.
>>> 
>>> Francis
>>> 
>>> [1] https://pkg.go.dev/google.golang.org/protobuf
>>> [2] https://googleapis.dev/java/google-cloud-asset/latest/index.html
>>> [3] https://www.elastic.co/guide/en/elastic-stack/master/index.html
>>> 


Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Fan Liya <li...@gmail.com>.
I think it is a good idea to provide versioned JavaDocs.

However, even if we only provide the JavaDoc of the latest release,
there is no need to maintain two branches (IMHO),
because the processes of updating the website and JavaDoc are
relatively separate processes (according to [1]).
With a single branch, it is feasible to update the website regularly,
and update JavaDocs only at release times.

Best,
Liya Fan

[1] https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md

Alessandro Solimando <al...@gmail.com> wrote on Tue, 29 Mar 2022 at 17:59:
>
> Hello everyone,
> I totally agree on automating the website publication and having a single
> branch, the less we do manually, the lower the chances to mess something up.
>
> I am also in favour of versioned docs in the website, it's confusing to
> land on updated pages from an older context like a message from the ML.
>
> Best regards,
> Alessandro
>
> On Tue, 29 Mar 2022 at 06:44, Francis Chuang <fr...@apache.org>
> wrote:
>
> > Hey Julian,
> >
> > All very good points. I can definitely see the utility of the javadocs.
> > The analogue in Go would be godoc, with the difference being that the
> > godoc server automatically crawls the code across all versions to
> > generate the documentation.
> >
> > As an example, see the godoc for protobuf [1]. There is a version
> > selector on the top left to look at the documentation for different
> > versions of the module / library in question.
> >
> > You mentioned that you do not want to have a version string in the URL.
> > Is there any particular reason for this? For example, if I were to end
> > up on the mailing list archives through a google search and there's a
> > message linking to the javadoc, it might be more helpful if the javadoc
> > was linked to a particular version of the release so that the context
> > around the discussion at the time makes more sense.
> >
> > We can have all javadocs for all releases of Calcite published and have
> > a selector to jump between versions, similar to godoc, for example, like
> > this javadoc for google cloud with a version selector on the bottom
> > right [2]. This would allow users to switch between different versions
> > and look at the version of the javadoc that's currently being used in
> > their project.
> >
> > Regarding the documentation on the website itself, would it make sense
> > if we have a versioned copy for each release? Currently, we only publish
> > the documentation for the latest release, so, if we were to look at
> > older messages from the mailing list and follow a link to the
> > documentation, the documentation could be incorrect or not relevant to
> > the message itself.
> >
> > Maybe we can have a folder for each release? For example:
> > -
> > calcite.apache.org/docs/1.30.0/adapter.html#jdbc-connect-string-parameters
> > -
> > calcite.apache.org/docs/1.29.0/adapter.html#jdbc-connect-string-parameters
> >
> > This would give each release their own documentation with a unique path.
> > For the current unreleased version, we can still put it in version of
> > the next release:
> > calcite.apache.org/docs/1.31.0/adapter.html#jbc-connect-string-parameters
> > and
> > maybe have a message that says this is an unreleased version like
> > elasticsearch [3]. Links to this release's javadoc would work before and
> > after the release and would never break.
> >
> > The upside to this approach is that all documentation (even the
> > unreleased version) is published immediately, but they are versioned, so
> > there is no confusion. It also means that users of Calcite master would
> > be able to look at the docs online. This also simplifies the deployment
> > of site as we no longer need the site branch: the website can just be
> > built from master.
> >
> > Francis
> >
> > [1] https://pkg.go.dev/google.golang.org/protobuf
> > [2] https://googleapis.dev/java/google-cloud-asset/latest/index.html
> > [3] https://www.elastic.co/guide/en/elastic-stack/master/index.html
> >

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Alessandro Solimando <al...@gmail.com>.
Hello everyone,
I totally agree on automating the website publication and having a single
branch, the less we do manually, the lower the chances to mess something up.

I am also in favour of versioned docs in the website, it's confusing to
land on updated pages from an older context like a message from the ML.

Best regards,
Alessandro

On Tue, 29 Mar 2022 at 06:44, Francis Chuang <fr...@apache.org>
wrote:

> Hey Julian,
>
> All very good points. I can definitely see the utility of the javadocs.
> The analogue in Go would be godoc, with the difference being that the
> godoc server automatically crawls the code across all versions to
> generate the documentation.
>
> As an example, see the godoc for protobuf [1]. There is a version
> selector on the top left to look at the documentation for different
> versions of the module / library in question.
>
> You mentioned that you do not want to have a version string in the URL.
> Is there any particular reason for this? For example, if I were to end
> up on the mailing list archives through a google search and there's a
> message linking to the javadoc, it might be more helpful if the javadoc
> was linked to a particular version of the release so that the context
> around the discussion at the time makes more sense.
>
> We can have all javadocs for all releases of Calcite published and have
> a selector to jump between versions, similar to godoc, for example, like
> this javadoc for google cloud with a version selector on the bottom
> right [2]. This would allow users to switch between different versions
> and look at the version of the javadoc that's currently being used in
> their project.
>
> Regarding the documentation on the website itself, would it make sense
> if we have a versioned copy for each release? Currently, we only publish
> the documentation for the latest release, so, if we were to look at
> older messages from the mailing list and follow a link to the
> documentation, the documentation could be incorrect or not relevant to
> the message itself.
>
> Maybe we can have a folder for each release? For example:
> -
> calcite.apache.org/docs/1.30.0/adapter.html#jdbc-connect-string-parameters
> -
> calcite.apache.org/docs/1.29.0/adapter.html#jdbc-connect-string-parameters
>
> This would give each release their own documentation with a unique path.
> For the current unreleased version, we can still put it in version of
> the next release:
> calcite.apache.org/docs/1.31.0/adapter.html#jbc-connect-string-parameters
> and
> maybe have a message that says this is an unreleased version like
> elasticsearch [3]. Links to this release's javadoc would work before and
> after the release and would never break.
>
> The upside to this approach is that all documentation (even the
> unreleased version) is published immediately, but they are versioned, so
> there is no confusion. It also means that users of Calcite master would
> be able to look at the docs online. This also simplifies the deployment
> of site as we no longer need the site branch: the website can just be
> built from master.
>
> Francis
>
> [1] https://pkg.go.dev/google.golang.org/protobuf
> [2] https://googleapis.dev/java/google-cloud-asset/latest/index.html
> [3] https://www.elastic.co/guide/en/elastic-stack/master/index.html
>

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Francis Chuang <fr...@apache.org>.
Hey Julian,

All very good points. I can definitely see the utility of the javadocs. 
The analogue in Go would be godoc, with the difference being that the 
godoc server automatically crawls the code across all versions to 
generate the documentation.

As an example, see the godoc for protobuf [1]. There is a version 
selector on the top left to look at the documentation for different 
versions of the module / library in question.

You mentioned that you do not want to have a version string in the URL. 
Is there any particular reason for this? For example, if I were to end 
up on the mailing list archives through a google search and there's a 
message linking to the javadoc, it might be more helpful if the javadoc 
was linked to a particular version of the release so that the context 
around the discussion at the time makes more sense.

We can have all javadocs for all releases of Calcite published and have 
a selector to jump between versions, similar to godoc, for example, like 
this javadoc for google cloud with a version selector on the bottom 
right [2]. This would allow users to switch between different versions 
and look at the version of the javadoc that's currently being used in 
their project.

Regarding the documentation on the website itself, would it make sense 
if we have a versioned copy for each release? Currently, we only publish 
the documentation for the latest release, so, if we were to look at 
older messages from the mailing list and follow a link to the 
documentation, the documentation could be incorrect or not relevant to 
the message itself.

Maybe we can have a folder for each release? For example:
- calcite.apache.org/docs/1.30.0/adapter.html#jdbc-connect-string-parameters
- calcite.apache.org/docs/1.29.0/adapter.html#jdbc-connect-string-parameters

This would give each release its own documentation with a unique path. 
For the current unreleased version, we can put it under the version of 
the next release: 
calcite.apache.org/docs/1.31.0/adapter.html#jdbc-connect-string-parameters and 
maybe have a message that says this is an unreleased version like 
elasticsearch [3]. Links to this release's javadoc would work before and 
after the release and would never break.
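
To make the folder-per-release idea concrete, here is a small Python 
sketch of how such URLs could be formed. The helper versioned_doc_url is 
hypothetical; only the calcite.apache.org/docs/<version>/<page>#<anchor> 
layout comes from the example paths above, and the https scheme is an 
assumption.

```python
def versioned_doc_url(version, page, anchor=None):
    """Build a docs URL that lives under a per-release folder.

    Hypothetical helper; the layout mirrors the example paths in this message.
    """
    url = f"https://calcite.apache.org/docs/{version}/{page}"
    if anchor:
        url += f"#{anchor}"
    return url


# Print the same page and anchor for two releases.
for version in ("1.29.0", "1.30.0"):
    print(versioned_doc_url(version, "adapter.html", "jdbc-connect-string-parameters"))
```

Because the version is part of the path, a link sent to the mailing list 
keeps pointing at the documentation as it was for that release.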

The upside to this approach is that all documentation (even the 
unreleased version) is published immediately, but it is versioned, so 
there is no confusion. It also means that users of Calcite master would 
be able to look at the docs online. This also simplifies the deployment 
of the site as we no longer need the site branch: the website can just be 
built from master.

Francis

[1] https://pkg.go.dev/google.golang.org/protobuf
[2] https://googleapis.dev/java/google-cloud-asset/latest/index.html
[3] https://www.elastic.co/guide/en/elastic-stack/master/index.html

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Julian Hyde <jh...@gmail.com>.
To respond to Stamatis’ query: I do use the generated javadoc.  For example, if I am talking about a connection property I might email the following link:

  https://calcite.apache.org/javadocAggregate/org/apache/calcite/config/CalciteConnectionProperty.html#FORCE_DECORRELATE

When I send such links, I would like them to be good forever, not include a particular version string in the URL.

Sending people to the code, as Stamatis suggests, is a bit of an anti-pattern. We want to encourage people to document their interfaces so that they make sense without reading the code. 

I also send links to documentation, for example

  https://calcite.apache.org/docs/adapter.html#jdbc-connect-string-parameters

I see the benefits of only having to maintain one branch (and having the ’site’ branch maintained automatically, or having the web site generated from the master branch). But I also think that the javadoc and documentation should not change until features are published in a release.
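
One way to reconcile these two goals is the whitelist heuristic from Francis's message quoted below: a commit is published early only if every file it touches is in a part of the site that cannot describe unreleased features. A minimal Python sketch, in which the directory names and the function name are assumptions for illustration rather than the real site layout:

```python
# Hypothetical whitelist: site areas assumed never to document unreleased features.
SAFE_PREFIXES = ("site/_posts/", "site/community/")


def is_safe_to_publish(changed_paths):
    """True if the commit touches at least one file and only whitelisted paths."""
    return bool(changed_paths) and all(
        path.startswith(SAFE_PREFIXES) for path in changed_paths
    )
```

A commit that mixes a news post with a change to the reference documentation would fail this check and wait for the next release.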

Julian
 

> On Mar 28, 2022, at 6:44 AM, Fan Liya <li...@gmail.com> wrote:
> 
> Hi all,
> 
> Thanks for the fruitful discussion.
> It seems there is not a "single formula" that is safe for all scenarios.
> 
> In addition, it requires some effort to recognize the particular
> scenario we are in (commits out of order, missing commits, etc.).
> Sometimes it is not reliable to simply check the commit message to
> determine if two commits have the same contents.
> 
> So it would be great if we can simply work on a single branch.
> 
> Best,
> Liya Fan
> 
> 
> Francis Chuang <fr...@apache.org> wrote on Mon, 28 Mar 2022 at 18:06:
>> 
>> If all version specific documentation is siloed into their own
>> respective folders for each version, then this will be much easier to
>> automate as we can just simply build and publish the site on every push
>> to master.
>> 
>> Each version would be in each folder, with the unreleased version being
>> in the devel folder.
>> 
>> Currently, I think the documentation is scattered across various folders
>> and files in the website, so to simply build from every commit to master
>> would mean that users will see stuff only relevant to the unreleased
>> version. This may or may not be confusing.
>> 
>> We can use some heuristics as I mentioned earlier: keep the current way
>> but automate it.
>> - Have a github action that watches every commit, if it only touches a
>> list of whitelisted folders or pages that we know will not be
>> documentation for a future release (news, community, etc), cherry pick
>> it to the site branch
>> - Have a github action that watches for a final release and force site
>> to equal master
>> - On every commit to the site branch, build and publish the site
>> 
>> Francis
>> 
>> On 28/03/2022 8:55 pm, Stamatis Zampetakis wrote:
>>> Having multiple APIs versions in the website has been discussed here [1].
>>> 
>>> Since this work of automation is important and has been postponed many
>>> times in the past I think it is important to get something simple to begin
>>> with and add "new features" like versioned documentation later on and if
>>> there is interest.
>>> 
>>> Building the site from the master on every commit is as simple as it can
>>> get and that's why I brought it up.
>>> Having said that, any other option which goes one step further gets a +1
>>> from me.
>>> 
>>> Best,
>>> Stamatis
>>> 
>>> [1] https://lists.apache.org/thread/l81th3qvdwttgk135nplz983m78d62m7
>>> 
>>> On Mon, Mar 28, 2022 at 11:38 AM Ruben Q L <ru...@gmail.com> wrote:
>>> 
>>>> Would it be clearer if we had different API versions on the site?
>>>> We could have one API link per Calcite version (or at least for the latest
>>>> X versions) + an API link of the current master head (that could be updated
>>>> automatically).
>>>> I think this "multiple API" idea has been already discussed in the past,
>>>> but I could not find the thread.
>>>> 
>>>> 
>>>> 
>>>> On Mon, Mar 28, 2022 at 12:24 AM Francis Chuang <fr...@apache.org>
>>>> wrote:
>>>> 
>>>>> It looks like Infra should be able to give us a token to push to
>>>>> calcite-site from our other calcite-* repos using Github actions [1].
>>>>> 
>>>>> If we can have some consensus regarding whether to keep the site branch
>>>>> and maintain the current process, or to remove it and just publish from
>>>>> master, I can see if I can get the automated site builds moving along.
>>>>> 
>>>>> [1] https://issues.apache.org/jira/browse/INFRA-21453
>>>>> 
>>>>> On 28/03/2022 8:23 am, Stamatis Zampetakis wrote:
>>>>>> It would be great if we manage to wrap up CALCITE-3129 and have an
>>>>>> automated build for the website.
>>>>>> 
>>>>>> The thing that complicates the procedure in general (automated or not)
>>>> is
>>>>>> the fact that we don't want to publish API related changes on the web
>>>>>> before they are officially released.
>>>>>> I understand the benefits for trying to maintain this practice but I
>>>>> would
>>>>>> be willing to sacrifice those for having simpler procedures/scripts.
>>>>>> 
>>>>>> I rarely search for any javadoc (Calcite or other) online because
>>>>> whenever
>>>>>> I need something the IDE fetches it for me. In most cases, the javadoc
>>>>>> won't be enough and I will need to dig in the code which is again
>>>> fetched
>>>>>> automatically by IDE. If nothing works, and the project is open
>>>> source, I
>>>>>> will simply download the respective project and look into the
>>>>> javadoc/code
>>>>>> directly.
>>>>>> 
>>>>>> Apart from that, users will rarely jump to the latest Calcite version
>>>>>> directly so having the corresponding javadoc online might not be very
>>>>>> helpful.
>>>>>> 
>>>>>> Long story short, another option would be to build/update the website
>>>>>> directly after every commit on master or at certain intervals (e.g.,
>>>>> daily)
>>>>>> and not have any other branches to maintain.
>>>>>> 
>>>>>> If there are really people using the published javadoc on the website
>>>>> [1],
>>>>>> I would really like to hear their thoughts about this proposal.
>>>>>> 
>>>>>> Best,
>>>>>> Stamatis
>>>>>> 
>>>>>> [1] https://calcite.apache.org/javadocAggregate/
>>>>>> 
>>>>>> 
>>>>>> On Sat, Mar 26, 2022 at 10:57 PM Francis Chuang <
>>>>> francischuang@apache.org>
>>>>>> wrote:
>>>>>> 
>>>>>>> Ideally, I would like to see that the site builds are automated by CI,
>>>>>>> we still have CALCITE-3129 [1] open.
>>>>>>> 
>>>>>>> My thinking is that if we automate the site building and deployment
>>>>>>> process, we can use the following heuristics:
>>>>>>> - Build the site completely and deploy when a final release tag is
>>>>>>> pushed to the repo.
>>>>>>> - Build the site on a partial basis in all other cases:
>>>>>>>     - Option 1: Check out the last final release tag and apply changes
>>>>>>> to the site that only touch certain whitelisted categories such as
>>>>>>> news and community. This should allow us to not have documentation
>>>>>>> changes for code deployed before the final release. This should then
>>>>>>> allow us to get rid of the site branch.
>>>>>>>     - Option 2: We keep the site branch, but we automate the current
>>>>>>> process. On every commit to master, if it is a change to the files in
>>>>>>> the site directory, we check if the change only touches certain
>>>>>>> whitelisted categories such as news and community. If so, we cherry-pick
>>>>>>> that into the site branch automatically using Github Actions and build
>>>>>>> and deploy the site. When a final release tag is pushed to the repo, we
>>>>>>> use Github Actions to make the master and site branches equal and
>>>>>>> automatically build and deploy the site.
>>>>>>> 
>>>>>>> This would negate the need to build and publish the site manually and
>>>>>>> simplify the process as we always only commit to master. As an added
>>>>>>> bonus, if we keep the site branch but automate the process, maybe we
>>>>>>> can lock the site branch so that only CI can push to it. The downside,
>>>>>>> of course, is that we're relying on heuristics for the partial build, so
>>>>>>> there's some "magic" to it.
>>>>>>> 
>>>>>>> Francis
>>>>>>> 
>>>>>>> 
>>>>>>> [1] https://issues.apache.org/jira/browse/CALCITE-3129
>>>>>>> 
>>>>>>> On 26/03/2022 8:58 am, Stamatis Zampetakis wrote:
>>>>>>>> Hello,
>>>>>>>> 
>>>>>>>> Thanks for starting this discussion Liya. It is important to find
>>>> which
>>>>>>>> parts of the process are unclear and improve them if possible.
>>>>>>>> 
>>>>>>>> The current procedure for updating the website remains unchanged and
>>>> it
>>>>>>> is
>>>>>>>> documented here:
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
>>>>>>>> 
>>>>>>>> If the procedure is not followed, which has happened a few times in
>>>> the
>>>>>>>> past, meaning that someone commits directly in site without
>>>> committing
>>>>> in
>>>>>>>> master then we will have commits in site that may get lost forever.
>>>>>>>> When we discover such commits we should port them to master. The
>>>>>>>> cherry-pick now goes in the opposite direction (from site to master).
>>>>>>>> This is usually discovered/done by the release manager and that's why
>>>>> we
>>>>>>>> have the respective instructions in the howto [1].
>>>>>>>> 
>>>>>>>> After a release we don't care much what happens because master and
>>>> site
>>>>>>>> should be equal. As Francis pointed out this is usually done with a
>>>>> force
>>>>>>>> push.
>>>>>>>> 
>>>>>>>> Regarding Julian's question the commit hashes before the force pushes
>>>>>>> done
>>>>>>>> by Liya are the following (according to commits@calcite):
>>>>>>>> * master -> dcbc493bf699d961427952c5efc047b76d859096
>>>>>>>> * site -> aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Stamatis
>>>>>>>> 
>>>>>>>> [1]
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/_docs/howto.md?plain=1#L696
>>>>>>>> 
>>>>>>>> On Fri, Mar 25, 2022 at 7:36 PM Julian Hyde <jh...@apache.org>
>>>> wrote:
>>>>>>>> 
>>>>>>>>> Does anyone know (or could find out) the SHA of the master and site
>>>>>>>>> branches at the time that Fan attempted to move the site changes
>>>> over?
>>>>>>>>> If so, we could recreate the same environment, and figure out a set
>>>> of
>>>>>>>>> git commands that would have worked then and will work for the next
>>>>>>>>> release manager. This process is safe because we can do these
>>>>>>>>> experiments in a local git sandbox, without pushing to any remote.
>>>>>>>>> 
>>>>>>>>> On Fri, Mar 25, 2022 at 6:09 AM Fan Liya <li...@gmail.com>
>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Francis,
>>>>>>>>>> 
>>>>>>>>>> Thanks for your feedback.
>>>>>>>>>> 
>>>>>>>>>> It seems we should choose option 2.
>>>>>>>>>> In addition, it seems less risky to run "git push --force" commands
>>>>> in
>>>>>>>>>> the site branch.
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Liya Fan
>>>>>>>>>> 
>>>>>>>>>> On Fri, Mar 25, 2022 at 12:14 PM Francis Chuang <fr...@apache.org> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi Liya,
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for bringing this up. We have always done the following
>>>> when
>>>>>>>>>>> committing:
>>>>>>>>>>> 1. Always commit to master.
>>>>>>>>>>> 2. If we need to publish the change to the site now (for example,
>>>>> new
>>>>>>>>>>> committer or announcement), cherry-pick the change into the site
>>>>>>> branch
>>>>>>>>>>> and publish it.
>>>>>>>>>>> 3. After a release, make the site branch the same as master (git
>>>>> reset
>>>>>>>>>>> --hard master) and force push (git push --force origin site).
>>>>>>>>>>> 
>>>>>>>>>>> Francis
>>>>>>>>>>> 
>>>>>>>>>>> On 25/03/2022 3:03 pm, Fan Liya wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>> 
>>>>>>>>>>>> As part of the release process, we need to synchronize the master
>>>>> and
>>>>>>>>>>>> site branches (Please see
>>>>>>>>>>>> 
>>>>>>>>> 
>>>> https://calcite.apache.org/docs/howto.html#making-a-release-candidate
>>>>> ).
>>>>>>>>>>>> Usually, the site is behind the master branch by some commits.
>>>>>>>>>>>> If the existing commits in the site branch are in the same order
>>>> as
>>>>>>>>> in
>>>>>>>>>>>> the master branch, the task is easy: just switch to the site
>>>>> branch,
>>>>>>>>>>>> and run
>>>>>>>>>>>> 
>>>>>>>>>>>> git rebase master
>>>>>>>>>>>> 
>>>>>>>>>>>> However, if some commits are in different orders, it can be
>>>> tricky.
>>>>>>>>>>>> For example, the master branch may have the following commits (in
>>>>>>>>>>>> order):
>>>>>>>>>>>> 
>>>>>>>>>>>> A, B, X1, X2, ... , Xn.
>>>>>>>>>>>> 
>>>>>>>>>>>> and the site branch may have the following commits (in order):
>>>>>>>>>>>> 
>>>>>>>>>>>> B, A, X1, X2.
>>>>>>>>>>>> 
>>>>>>>>>>>> Basically we have two choices:
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. We can live with the out of order commits, because after
>>>>>>>>>>>> cherry-picking commits X3, X4, ... , Xn to the site branch, the
>>>>> file
>>>>>>>>>>>> contents will be consistent.
>>>>>>>>>>>> 
>>>>>>>>>>>> The problem is that, since the two branches have diverged, we
>>>>> cannot
>>>>>>>>>>>> use the rebase command. Instead, we have to manually cherry-pick
>>>>>>>>>>>> commits individually, which requires large effort. In addition,
>>>> for
>>>>>>>>>>>> any subsequent release processes, we have to manually cherry-pick
>>>>>>>>> each
>>>>>>>>>>>> commit.
>>>>>>>>>>>> 
>>>>>>>>>>>> 2. We need to make the commits order consistent, which will make
>>>> it
>>>>>>>>>>>> easy for subsequent releases.
>>>>>>>>>>>> However, the problem is that, to make the commits order
>>>> consistent,
>>>>>>>>>>>> some git force push command is unavoidable, which is risky to
>>>> some
>>>>>>>>>>>> extent.
>>>>>>>>>>>> 
>>>>>>>>>>>> So what is the recommended way to do this? Thanks in advance for
>>>>>>>>> your feedback!
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Liya Fan
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 


Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Fan Liya <li...@gmail.com>.
Hi all,

Thanks for the fruitful discussion.
It seems there is not a "single formula" that is safe for all scenarios.

In addition, it requires some effort to recognize the particular
scenario we are in (commits out of order, missing commits, etc.).
Sometimes it is not reliable to simply check the commit message to
determine if two commits have the same contents.
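
One way to compare commits by content rather than by message is to fingerprint the diff itself, which is the idea behind the real git commands "git patch-id" and "git cherry". The following is a minimal Python sketch of that idea, not git's actual algorithm: the function names patch_fingerprint and same_change, the sample diffs, and the normalization rules are all assumptions made for illustration.

```python
import hashlib


def patch_fingerprint(diff_text: str) -> str:
    """Hash the changed lines of a unified diff, ignoring commit metadata.

    Blob hashes ("index ..."), hunk offsets ("@@ ... @@") and context lines
    are skipped, so the same change applied at a different position in the
    history still yields the same fingerprint.
    """
    kept = []
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff "):
            in_hunk = False          # a new file section starts
        elif line.startswith("@@"):
            in_hunk = True           # hunk body follows
        elif not in_hunk and line.startswith("+++ "):
            kept.append(line)        # remember which file is modified
        elif in_hunk and line[:1] in ("+", "-"):
            # Keep the +/- marker, drop all whitespace in the content.
            kept.append(line[0] + "".join(line[1:].split()))
    return hashlib.sha1("\n".join(kept).encode("utf-8")).hexdigest()


def same_change(diff_a: str, diff_b: str) -> bool:
    """True if both diffs add and remove the same lines in the same files."""
    return patch_fingerprint(diff_a) == patch_fingerprint(diff_b)


# Hypothetical diffs of "the same" commit as it appears on master and on site.
DIFF_ON_MASTER = """\
diff --git a/site/_docs/howto.md b/site/_docs/howto.md
index 1111111..2222222 100644
--- a/site/_docs/howto.md
+++ b/site/_docs/howto.md
@@ -10,3 +10,3 @@
 unchanged line
-old wording
+new wording
"""

DIFF_ON_SITE = """\
diff --git a/site/_docs/howto.md b/site/_docs/howto.md
index 3333333..4444444 100644
--- a/site/_docs/howto.md
+++ b/site/_docs/howto.md
@@ -14,3 +14,3 @@
 unchanged line
-old wording
+new wording
"""

if __name__ == "__main__":
    print(same_change(DIFF_ON_MASTER, DIFF_ON_SITE))
```

The diff text for a commit could be obtained with "git show <sha>"; in practice, "git cherry master site" performs a comparable equivalence check directly and is probably the better tool for a release manager.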

So it would be great if we could simply work on a single branch.

Best,
Liya Fan


On Mon, Mar 28, 2022 at 6:06 PM Francis Chuang <fr...@apache.org> wrote:
>
> If all version specific documentation is siloed into their own
> respective folders for each version, then this will be much easier to
> automate as we can just simply build and publish the site on every push
> to master.
>
> Each version would be in each folder, with the unreleased version being
> in the devel folder.
>
> Currently, I think the documentation is scattered across various folders
> and files in the website, so to simply build from every commit to master
> would mean that users will see stuff only relevant to the unreleased
> version. This may or may not be confusing.
>
> We can use some heuristics as I mentioned earlier: keep the current way
> but automate it.
> - Have a github action that watches every commit, if it only touches a
> list of whitelisted folders or pages that we know will not be
> documentation for a future release (news, community, etc), cherry pick
> it to the site branch
> - Have a github action that watches for a final release and force site
> to equal master
> - On every commit to the site branch, build and publish the site
>
> Francis
>
> On 28/03/2022 8:55 pm, Stamatis Zampetakis wrote:
> > Having multiple API versions on the website has been discussed here [1].
> >
> > Since this work of automation is important and has been postponed many
> > times in the past I think it is important to get something simple to begin
> > with and add "new features" like versioned documentation later on and if
> > there is interest.
> >
> > Building the site from the master on every commit is as simple as it can
> > get and that's why I brought it up.
> > Having said that, any other option which goes one step further gets a +1
> > from me.
> >
> > Best,
> > Stamatis
> >
> > [1] https://lists.apache.org/thread/l81th3qvdwttgk135nplz983m78d62m7
> >
> > On Mon, Mar 28, 2022 at 11:38 AM Ruben Q L <ru...@gmail.com> wrote:
> >
> >> Would it be clearer if we had different API versions on the site?
> >> We could have one API link per Calcite version (or at least for the latest
> >> X versions) + an API link of the current master head (that could be updated
> >> automatically).
> >> I think this "multiple API" idea has been already discussed in the past,
> >> but I could not find the thread.
> >>
> >>
> >>
> >> On Mon, Mar 28, 2022 at 12:24 AM Francis Chuang <fr...@apache.org>
> >> wrote:
> >>
> >>> It looks like Infra should be able to give us a token to push to
> >>> calcite-site from our other calcite-* repos using Github actions [1].
> >>>
> >>> If we can have some consensus regarding whether to keep the site branch
> >>> and maintain the current process, or to remove it and just publish from
> >>> master, I can see if I can get the automated site builds moving along.
> >>>
> >>> [1] https://issues.apache.org/jira/browse/INFRA-21453
> >>>
> >>> On 28/03/2022 8:23 am, Stamatis Zampetakis wrote:
> >>>> It would be great if we manage to wrap up CALCITE-3129 and have an
> >>>> automated build for the website.
> >>>>
> >>>> The thing that complicates the procedure in general (automated or not)
> >> is
> >>>> the fact that we don't want to publish API related changes on the web
> >>>> before they are officially released.
> >>>> I understand the benefits for trying to maintain this practice but I
> >>> would
> >>>> be willing to sacrifice those for having simpler procedures/scripts.
> >>>>
> >>>> I rarely search for any javadoc (Calcite or other) online because
> >>> whenever
> >>>> I need something the IDE fetches it for me. In most cases, the javadoc
> >>>> won't be enough and I will need to dig in the code which is again
> >> fetched
> >>>> automatically by IDE. If nothing works, and the project is open
> >> source, I
> >>>> will simply download the respective project and look into the
> >>> javadoc/code
> >>>> directly.
> >>>>
> >>>> Apart from that, users will rarely jump to the latest Calcite version
> >>>> directly so having the corresponding javadoc online might not be very
> >>>> helpful.
> >>>>
> >>>> Long story short, another option would be to build/update the website
> >>>> directly after every commit on master or at certain intervals (e.g.,
> >>> daily)
> >>>> and not have any other branches to maintain.
> >>>>
> >>>> If there are really people using the published javadoc on the website
> >>> [1],
> >>>> I would really like to hear their thoughts about this proposal.
> >>>>
> >>>> Best,
> >>>> Stamatis
> >>>>
> >>>> [1] https://calcite.apache.org/javadocAggregate/
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Sat, Mar 26, 2022 at 10:57 PM Francis Chuang <
> >>> francischuang@apache.org>
> >>>> wrote:
> >>>>
> >>>>> Ideally, I would like to see that the site builds are automated by CI,
> >>>>> we still have CALCITE-3129 [1] open.
> >>>>>
> >>>>> My thinking is that if we automate the site building and deployment
> >>>>> process, we can use the following heuristics:
> >>>>> - Build the site completely and deploy when a final release tag is
> >>>>> pushed to the repo.
> >>>>> - Build the site on a partial basis in all other cases:
> >>>>>      - Option 1: Check out the last final release tag and apply changes
> >>> to
> >>>>> the site that only touches certain whitelisted categories such as news
> >>>>> and community. This should allow us to not have documentation changes
> >>>>> for code deployed before the final release. This should then allow us
> >> to
> >>>>> get rid of the site branch
> >>>>>      - Option 2: We keep the site branch, but we automate the current
> >>>>> process. On every commit to master, if it is a change to the files in
> >>>>> the site directory, we check if the change only touches certain
> >>>>> whitelisted categories such as news and community. If so, we cherry
> >> pick
> >>>>> that into the site branch automatically using Github Actions and build
> >>>>> and deploy the site. When a final release tag is pushed to the repo,
> >> we
> >>>>> use Github Actions to make the master and site branches equal and
> >>>>> automatically build and deploy the site.
> >>>>>
> >>>>> This would negate the need to build and publish the site manually and
> >>>>> simplify the process as we always only commit to master. As an added
> >>>>> bonus, if we keep the site branch, but automate the process, maybe
> >> we
> >>>>> can lock the site branch so that only CI can push to it. The downside
> >> of
> >>>>> course, is that we're relying on heuristics for the partial build, so
> >>>>> there's some "magic" to it.
> >>>>>
> >>>>> Francis
> >>>>>
> >>>>>
> >>>>> [1] https://issues.apache.org/jira/browse/CALCITE-3129
> >>>>>
> >>>>> On 26/03/2022 8:58 am, Stamatis Zampetakis wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> Thanks for starting this discussion Liya. It is important to find
> >> which
> >>>>>> parts of the process are unclear and improve them if possible.
> >>>>>>
> >>>>>> The current procedure for updating the website remains unchanged and
> >> it
> >>>>> is
> >>>>>> documented here:
> >>>>>>
> >>>>>
> >>>
> >> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
> >>>>>>
> >>>>>> If the procedure is not followed, which has happened a few times in
> >> the
> >>>>>> past, meaning that someone commits directly in site without
> >> committing
> >>> in
> >>>>>> master then we will have commits in site that may get lost forever.
> >>>>>> When we discover such commits we should port them to master. The
> >>>>>> cherry-pick now goes in the opposite direction (from site to master).
> >>>>>> This is usually discovered/done by the release manager and that's why
> >>> we
> >>>>>> have the respective instructions in the howto [1].
> >>>>>>
> >>>>>> After a release we don't care much what happens because master and
> >> site
> >>>>>> should be equal. As Francis pointed out this is usually done with a
> >>> force
> >>>>>> push.
> >>>>>>
> >>>>>> Regarding Julian's question the commit hashes before the force pushes
> >>>>> done
> >>>>>> by Liya are the following (according to commits@calcite):
> >>>>>> * master -> dcbc493bf699d961427952c5efc047b76d859096
> >>>>>> * site -> aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5
> >>>>>>
> >>>>>> Best,
> >>>>>> Stamatis
> >>>>>>
> >>>>>> [1]
> >>>>>>
> >>>>>
> >>>
> >> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/_docs/howto.md?plain=1#L696
> >>>>>>
> >>>>>> On Fri, Mar 25, 2022 at 7:36 PM Julian Hyde <jh...@apache.org>
> >> wrote:
> >>>>>>
> >>>>>>> Does anyone know (or could find out) the SHA of the master and site
> >>>>>>> branches at the time that Fan attempted to move the site changes
> >> over?
> >>>>>>> If so, we could recreate the same environment, and figure out a set
> >> of
> >>>>>>> git commands that would have worked then and will work for the next
> >>>>>>> release manager. This process is safe because we can do these
> >>>>>>> experiments in a local git sandbox, without pushing to any remote.
> >>>>>>>
> >>>>>>> On Fri, Mar 25, 2022 at 6:09 AM Fan Liya <li...@gmail.com>
> >>> wrote:
> >>>>>>>>
> >>>>>>>> Hi Francis,
> >>>>>>>>
> >>>>>>>> Thanks for your feedback.
> >>>>>>>>
> >>>>>>>> It seems we should choose option 2.
> >>>>>>>> In addition, it seems less risky to run "git push --force" commands
> >>> in
> >>>>>>>> the site branch.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Liya Fan
> >>>>>>>>
> >>>>>>>> On Fri, Mar 25, 2022 at 12:14 PM Francis Chuang <fr...@apache.org> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Liya,
> >>>>>>>>>
> >>>>>>>>> Thanks for bringing this up. We have always done the following
> >> when
> >>>>>>>>> committing:
> >>>>>>>>> 1. Always commit to master.
> >>>>>>>>> 2. If we need to publish the change to the site now (for example,
> >>> new
> >>>>>>>>> committer or announcement), cherry-pick the change into the site
> >>>>> branch
> >>>>>>>>> and publish it.
> >>>>>>>>> 3. After a release, make the site branch the same as master (git
> >>> reset
> >>>>>>>>> --hard master) and force push (git push --force origin site).
> >>>>>>>>>
> >>>>>>>>> Francis
> >>>>>>>>>
> >>>>>>>>> On 25/03/2022 3:03 pm, Fan Liya wrote:
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> As part of the release process, we need to synchronize the master
> >>> and
> >>>>>>>>>> site branches (Please see
> >>>>>>>>>>
> >>>>>>>
> >> https://calcite.apache.org/docs/howto.html#making-a-release-candidate
> >>> ).
> >>>>>>>>>> Usually, the site is behind the master branch by some commits.
> >>>>>>>>>> If the existing commits in the site branch are in the same order
> >> as
> >>>>>>> in
> >>>>>>>>>> the master branch, the task is easy: just switch to the site
> >>> branch,
> >>>>>>>>>> and run
> >>>>>>>>>>
> >>>>>>>>>> git rebase master
> >>>>>>>>>>
> >>>>>>>>>> However, if some commits are in different orders, it can be
> >> tricky.
> >>>>>>>>>> For example, the master branch may have the following commits (in
> >>>>>>>>>> order):
> >>>>>>>>>>
> >>>>>>>>>> A, B, X1, X2, ... , Xn.
> >>>>>>>>>>
> >>>>>>>>>> and the site branch may have the following commits (in order):
> >>>>>>>>>>
> >>>>>>>>>> B, A, X1, X2.
> >>>>>>>>>>
> >>>>>>>>>> Basically we have two choices:
> >>>>>>>>>>
> >>>>>>>>>> 1. We can live with the out of order commits, because after
> >>>>>>>>>> cherry-picking commits X3, X4, ... , Xn to the site branch, the
> >>> file
> >>>>>>>>>> contents will be consistent.
> >>>>>>>>>>
> >>>>>>>>>> The problem is that, since the two branches have diverged, we
> >>> cannot
> >>>>>>>>>> use the rebase command. Instead, we have to manually cherry-pick
> >>>>>>>>>> commits individually, which requires large effort. In addition,
> >> for
> >>>>>>>>>> any subsequent release processes, we have to manually cherry-pick
> >>>>>>> each
> >>>>>>>>>> commit.
> >>>>>>>>>>
> >>>>>>>>>> 2. We need to make the commits order consistent, which will make
> >> it
> >>>>>>>>>> easy for subsequent releases.
> >>>>>>>>>> However, the problem is that, to make the commits order
> >> consistent,
> >>>>>>>>>> some git force push command is unavoidable, which is risky to
> >> some
> >>>>>>>>>> extent.
> >>>>>>>>>>
> >>>>>>>>>> So what is the recommended way to do this? Thanks in advance for
> >>>>>>> your feedback!
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Liya Fan
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Francis Chuang <fr...@apache.org>.
If all version-specific documentation were siloed into its own
folder for each version, this would be much easier to automate, as we
could simply build and publish the site on every push to master.

Each version would live in its own folder, with the unreleased version
in the devel folder.

Currently, I think the documentation is scattered across various folders
and files in the website, so simply building from every commit to master
would mean that users see material relevant only to the unreleased
version. This may or may not be confusing.

We can use some heuristics, as I mentioned earlier: keep the current
process but automate it.
- Have a GitHub Action that watches every commit; if a commit only touches
whitelisted folders or pages that we know will not be documentation for
a future release (news, community, etc.), cherry-pick it to the site
branch.
- Have a GitHub Action that watches for a final release and forces site
to equal master.
- On every commit to the site branch, build and publish the site.
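
The three rules above can be sketched as a small decision function. This is only an illustration of the heuristic in Python, not an existing workflow: the whitelist prefixes, the event names, and the action labels are hypothetical and would need to match the real layout of the site directory and the CI setup.

```python
# Hypothetical whitelist: paths assumed never to document an unreleased feature.
SAFE_PREFIXES = ("site/_posts/", "site/community/")


def only_touches_whitelist(changed_paths, safe_prefixes=SAFE_PREFIXES):
    """True if the commit changes at least one file and all are whitelisted."""
    paths = list(changed_paths)
    return bool(paths) and all(p.startswith(safe_prefixes) for p in paths)


def plan_action(event, changed_paths=()):
    """Map a repository event to the step the automation would take.

    event is one of the assumed labels "master_commit", "release_tag"
    or "site_commit".
    """
    if event == "release_tag":
        # Final release: make site identical to master.
        return "reset-site-to-master"
    if event == "site_commit":
        # Anything landing on site gets built and published.
        return "build-and-publish"
    if event == "master_commit" and only_touches_whitelist(changed_paths):
        # News or community change: safe to publish right away.
        return "cherry-pick-to-site"
    # Everything else waits for the next release.
    return "skip"


if __name__ == "__main__":
    print(plan_action("master_commit", ["site/_posts/2022-03-19-release-1.30.0.md"]))
    print(plan_action("master_commit", ["site/_docs/reference.md"]))
```

A commit that mixes a whitelisted file with any other file is skipped on purpose, so that unreleased documentation cannot reach the site branch as a side effect.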

Francis

On 28/03/2022 8:55 pm, Stamatis Zampetakis wrote:
> Having multiple API versions on the website has been discussed here [1].
> 
> Since this work of automation is important and has been postponed many
> times in the past I think it is important to get something simple to begin
> with and add "new features" like versioned documentation later on and if
> there is interest.
> 
> Building the site from the master on every commit is as simple as it can
> get and that's why I brought it up.
> Having said that, any other option which goes one step further gets a +1
> from me.
> 
> Best,
> Stamatis
> 
> [1] https://lists.apache.org/thread/l81th3qvdwttgk135nplz983m78d62m7
> 
> On Mon, Mar 28, 2022 at 11:38 AM Ruben Q L <ru...@gmail.com> wrote:
> 
>> Would it be clearer if we had different API versions on the site?
>> We could have one API link per Calcite version (or at least for the latest
>> X versions) + an API link of the current master head (that could be updated
>> automatically).
>> I think this "multiple API" idea has been already discussed in the past,
>> but I could not find the thread.
>>
>>
>>
>> On Mon, Mar 28, 2022 at 12:24 AM Francis Chuang <fr...@apache.org>
>> wrote:
>>
>>> It looks like Infra should be able to give us a token to push to
>>> calcite-site from our other calcite-* repos using Github actions [1].
>>>
>>> If we can have some consensus regarding whether to keep the site branch
>>> and maintain the current process, or to remove it and just publish from
>>> master, I can see if I can get the automated site builds moving along.
>>>
>>> [1] https://issues.apache.org/jira/browse/INFRA-21453
>>>
>>> On 28/03/2022 8:23 am, Stamatis Zampetakis wrote:
>>>> It would be great if we manage to wrap up CALCITE-3129 and have an
>>>> automated build for the website.
>>>>
>>>> The thing that complicates the procedure in general (automated or not)
>> is
>>>> the fact that we don't want to publish API related changes on the web
>>>> before they are officially released.
>>>> I understand the benefits for trying to maintain this practice but I
>>> would
>>>> be willing to sacrifice those for having simpler procedures/scripts.
>>>>
>>>> I rarely search for any javadoc (Calcite or other) online because
>>> whenever
>>>> I need something the IDE fetches it for me. In most cases, the javadoc
>>>> won't be enough and I will need to dig in the code which is again
>> fetched
>>>> automatically by IDE. If nothing works, and the project is open
>> source, I
>>>> will simply download the respective project and look into the
>>> javadoc/code
>>>> directly.
>>>>
>>>> Apart from that, users will rarely jump to the latest Calcite version
>>>> directly so having the corresponding javadoc online might not be very
>>>> helpful.
>>>>
>>>> Long story short, another option would be to build/update the website
>>>> directly after every commit on master or at certain intervals (e.g.,
>>> daily)
>>>> and not have any other branches to maintain.
>>>>
>>>> If there are really people using the published javadoc on the website
>>> [1],
>>>> I would really like to hear their thoughts about this proposal.
>>>>
>>>> Best,
>>>> Stamatis
>>>>
>>>> [1] https://calcite.apache.org/javadocAggregate/
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Mar 26, 2022 at 10:57 PM Francis Chuang <
>>> francischuang@apache.org>
>>>> wrote:
>>>>
>>>>> Ideally, I would like to see that the site builds are automated by CI,
>>>>> we still have CALCITE-3129 [1] open.
>>>>>
>>>>> My thinking is that if we automate the site building and deployment
>>>>> process, we can use the following heuristics:
>>>>> - Build the site completely and deploy when a final release tag is
>>>>> pushed to the repo.
>>>>> - Build the site on a partial basis in all other cases:
>>>>>      - Option 1: Check out the last final release tag and apply changes
>>> to
>>>>> the site that only touches certain whitelisted categories such as news
>>>>> and community. This should allow us to not have documentation changes
>>>>> for code deployed before the final release. This should then allow us
>> to
>>>>> get rid of the site branch
>>>>>      - Option 2: We keep the site branch, but we automate the current
>>>>> process. On every commit to master, if it is a change to the files in
>>>>> the site directory, we check if the change only touches certain
>>>>> whitelisted categories such as news and community. If so, we cherry
>> pick
>>>>> that into the site branch automatically using Github Actions and build
>>>>> and deploy the site. When a final release tag is pushed to the repo,
>> we
>>>>> use Github Actions to make the master and site branches equal and
>>>>> automatically build and deploy the site.
>>>>>
>>>>> This would negate the need to build and publish the site manually and
>>>>> simplify the process as we always only commit to master. As an added
>>>>> bonus, if we keep the site branch, but automate the process, maybe
>> we
>>>>> can lock the site branch so that only CI can push to it. The downside
>> of
>>>>> course, is that we're relying on heuristics for the partial build, so
>>>>> there's some "magic" to it.
>>>>>
>>>>> Francis
>>>>>
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/CALCITE-3129
>>>>>
>>>>> On 26/03/2022 8:58 am, Stamatis Zampetakis wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Thanks for starting this discussion Liya. It is important to find
>> which
>>>>>> parts of the process are unclear and improve them if possible.
>>>>>>
>>>>>> The current procedure for updating the website remains unchanged and
>> it
>>>>> is
>>>>>> documented here:
>>>>>>
>>>>>
>>>
>> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
>>>>>>
>>>>>> If the procedure is not followed, which has happened a few times in
>> the
>>>>>> past, meaning that someone commits directly in site without
>> committing
>>> in
>>>>>> master then we will have commits in site that may get lost forever.
>>>>>> When we discover such commits we should port them to master. The
>>>>>> cherry-pick now goes in the opposite direction (from site to master).
>>>>>> This is usually discovered/done by the release manager and that's why
>>> we
>>>>>> have the respective instructions in the howto [1].
>>>>>>
>>>>>> After a release we don't care much what happens because master and
>> site
>>>>>> should be equal. As Francis pointed out this is usually done with a
>>> force
>>>>>> push.
>>>>>>
>>>>>> Regarding Julian's question the commit hashes before the force pushes
>>>>> done
>>>>>> by Liya are the following (according to commits@calcite):
>>>>>> * master -> dcbc493bf699d961427952c5efc047b76d859096
>>>>>> * site -> aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5
>>>>>>
>>>>>> Best,
>>>>>> Stamatis
>>>>>>
>>>>>> [1]
>>>>>>
>>>>>
>>>
>> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/_docs/howto.md?plain=1#L696
>>>>>>
>>>>>> On Fri, Mar 25, 2022 at 7:36 PM Julian Hyde <jh...@apache.org>
>> wrote:
>>>>>>
>>>>>>> Does anyone know (or could find out) the SHA of the master and site
>>>>>>> branches at the time that Fan attempted to move the site changes
>> over?
>>>>>>> If so, we could recreate the same environment, and figure out a set
>> of
>>>>>>> git commands that would have worked then and will work for the next
>>>>>>> release manager. This process is safe because we can do these
>>>>>>> experiments in a local git sandbox, without pushing to any remote.
>>>>>>>
>>>>>>> On Fri, Mar 25, 2022 at 6:09 AM Fan Liya <li...@gmail.com>
>>> wrote:
>>>>>>>>
>>>>>>>> Hi Francis,
>>>>>>>>
>>>>>>>> Thanks for your feedback.
>>>>>>>>
>>>>>>>> It seems we should choose option 2.
>>>>>>>> In addition, it seems less risky to run "git push --force" commands
>>> in
>>>>>>>> the site branch.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Liya Fan
>>>>>>>>
>>>>>>>> On Fri, Mar 25, 2022 at 12:14 PM Francis Chuang <fr...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>> Hi Liya,
>>>>>>>>>
>>>>>>>>> Thanks for bringing this up. We have always done the following
>> when
>>>>>>>>> committing:
>>>>>>>>> 1. Always commit to master.
>>>>>>>>> 2. If we need to publish the change to the site now (for example,
>>> new
>>>>>>>>> committer or announcement), cherry-pick the change into the site
>>>>> branch
>>>>>>>>> and publish it.
>>>>>>>>> 3. After a release, make the site branch the same as master (git
>>> reset
>>>>>>>>> --hard master) and force push (git push --force origin site).
>>>>>>>>>
>>>>>>>>> Francis
>>>>>>>>>
>>>>>>>>> On 25/03/2022 3:03 pm, Fan Liya wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> As part of the release process, we need to synchronize the master
>>> and
>>>>>>>>>> site branches (Please see
>>>>>>>>>>
>>>>>>>
>> https://calcite.apache.org/docs/howto.html#making-a-release-candidate
>>> ).
>>>>>>>>>> Usually, the site is behind the master branch by some commits.
>>>>>>>>>> If the existing commits in the site branch are in the same order
>> as
>>>>>>> in
>>>>>>>>>> the master branch, the task is easy: just switch to the site
>>> branch,
>>>>>>>>>> and run
>>>>>>>>>>
>>>>>>>>>> git rebase master
>>>>>>>>>>
>>>>>>>>>> However, if some commits are in different orders, it can be
>> tricky.
>>>>>>>>>> For example, the master branch may have the following commits (in
>>>>>>>>>> order):
>>>>>>>>>>
>>>>>>>>>> A, B, X1, X2, ... , Xn.
>>>>>>>>>>
>>>>>>>>>> and the site branch may have the following commits (in order):
>>>>>>>>>>
>>>>>>>>>> B, A, X1, X2.
>>>>>>>>>>
>>>>>>>>>> Basically we have two choices:
>>>>>>>>>>
>>>>>>>>>> 1. We can live with the out of order commits, because after
>>>>>>>>>> cherry-picking commits X3, X4, ... , Xn to the site branch, the
>>> file
>>>>>>>>>> contents will be consistent.
>>>>>>>>>>
>>>>>>>>>> The problem is that, since the two branches have diverged, we
>>> cannot
>>>>>>>>>> use the rebase command. Instead, we have to manually cherry-pick
>>>>>>>>>> commits individually, which requires large effort. In addition,
>> for
>>>>>>>>>> any subsequent release processes, we have to manually cherry-pick
>>>>>>> each
>>>>>>>>>> commit.
>>>>>>>>>>
>>>>>>>>>> 2. We need to make the commits order consistent, which will make
>> it
>>>>>>>>>> easy for subsequent releases.
>>>>>>>>>> However, the problem is that, to make the commits order
>> consistent,
>>>>>>>>>> some git force push command is unavoidable, which is risky to
>> some
>>>>>>>>>> extent.
>>>>>>>>>>
>>>>>>>>>> So what is the recommended way to do this? Thanks in advance for
>>>>>>> your feedback!
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Liya Fan
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Stamatis Zampetakis <za...@gmail.com>.
Having multiple API versions on the website has been discussed here [1].

Since this automation work is important and has been postponed many
times in the past, I think it is important to get something simple to begin
with and add "new features" like versioned documentation later on, if
there is interest.

Building the site from master on every commit is as simple as it can
get and that's why I brought it up.
Having said that, any other option which goes one step further gets a +1
from me.

Best,
Stamatis

[1] https://lists.apache.org/thread/l81th3qvdwttgk135nplz983m78d62m7

On Mon, Mar 28, 2022 at 11:38 AM Ruben Q L <ru...@gmail.com> wrote:

> Would it be clearer if we had different API versions on the site?
> We could have one API link per Calcite version (or at least for the latest
> X versions) + an API link of the current master head (that could be updated
> automatically).
> I think this "multiple API" idea has been already discussed in the past,
> but I could not find the thread.
>
>
>
> On Mon, Mar 28, 2022 at 12:24 AM Francis Chuang <fr...@apache.org>
> wrote:
>
> > It looks like Infra should be able to give us a token to push to
> > calcite-site from our other calcite-* repos using Github actions [1].
> >
> > If we can have some consensus regarding whether to keep the site branch
> > and maintain the current process, or to remove it and just publish from
> > master, I can see if I can get the automated site builds moving along.
> >
> > [1] https://issues.apache.org/jira/browse/INFRA-21453
> >
> > On 28/03/2022 8:23 am, Stamatis Zampetakis wrote:
> > > It would be great if we manage to wrap up CALCITE-3129 and have an
> > > automated build for the website.
> > >
> > > The thing that complicates the procedure in general (automated or not)
> is
> > > the fact that we don't want to publish API related changes on the web
> > > before they are officially released.
> > > I understand the benefits for trying to maintain this practice but I
> > would
> > > be willing to sacrifice those for having simpler procedures/scripts.
> > >
> > > I rarely search for any javadoc (Calcite or other) online because
> > whenever
> > > I need something the IDE fetches it for me. In most cases, the javadoc
> > > won't be enough and I will need to dig in the code which is again
> fetched
> > > automatically by IDE. If nothing works, and the project is open
> source, I
> > > will simply download the respective project and look into the
> > javadoc/code
> > > directly.
> > >
> > > Apart from that, users will rarely jump to the latest Calcite version
> > > directly so having the corresponding javadoc online might not be very
> > > helpful.
> > >
> > > Long story short, another option would be to build/update the website
> > > directly after every commit on master or at certain intervals (e.g.,
> > daily)
> > > and not have any other branches to maintain.
> > >
> > > If there are really people using the published javadoc on the website
> > [1],
> > > I would really like to hear their thoughts about this proposal.
> > >
> > > Best,
> > > Stamatis
> > >
> > > [1] https://calcite.apache.org/javadocAggregate/
> > >
> > > On Sat, Mar 26, 2022 at 10:57 PM Francis Chuang <
> > francischuang@apache.org>
> > > wrote:
> > >
> > >> Ideally, I would like to see that the site builds are automated by CI,
> > >> we still have CALCITE-3129 [1] open.
> > >>
> > >> My thinking is that if we automate the site building and deployment
> > >> process, we can use the following heuristics:
> > >> - Build the site completely and deploy when a final release tag is
> > >> pushed to the repo.
> > >> - Build the site on a partial basis in all other cases:
> > >>     - Option 1: Check out the last final release tag and apply changes
> > to
> > >> the site that only touches certain whitelisted categories such as news
> > >> and community. This should allow us to not have documentation changes
> > >> for code deployed before the final release. This should then allow us
> to
> > >> get rid of the site branch
> > >>     - Option 2: We keep the site branch, but we automate the current
> > >> process. On every commit to master, if it is a change to the files in
> > >> the site directory, we check if the change only touches certain
> > >> whitelisted categories such as news and community. If so, we cherry
> pick
> > >> that into the site branch automatically using Github Actions and build
> > >> and deploy the site. When a final release tag is pushed to the repo,
> we
> > >> use Github Actions to make the master and site branches equal and
> > >> automatically build and deploy the site.
> > >>
> > >> This would negate the need to build and publish the site manually and
> > >> simplify the process as we always only commit to master. As an added
> > >> bonus, if we keep the site branch, but automate the process, maybe
> we
> > >> can lock the site branch so that only CI can push to it. The downside
> of
> > >> course, is that we're relying on heuristics for the partial build, so
> > >> there's some "magic" to it.
> > >>
> > >> Francis
> > >>
> > >>
> > >> [1] https://issues.apache.org/jira/browse/CALCITE-3129
> > >>
> > >> On 26/03/2022 8:58 am, Stamatis Zampetakis wrote:
> > >>> Hello,
> > >>>
> > >>> Thanks for starting this discussion Liya. It is important to find
> which
> > >>> parts of the process are unclear and improve them if possible.
> > >>>
> > >>> The current procedure for updating the website remains unchanged and
> it
> > >> is
> > >>> documented here:
> > >>>
> > >>
> >
> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
> > >>>
> > >>> If the procedure is not followed, which has happened a few times in
> the
> > >>> past, meaning that someone commits directly in site without
> committing
> > in
> > >>> master then we will have commits in site that may get lost forever.
> > >>> When we discover such commits we should port them to master. The
> > >>> cherry-pick now goes in the opposite direction (from site to master).
> > >>> This is usually discovered/done by the release manager and that's why
> > we
> > >>> have the respective instructions in the howto [1].
> > >>>
> > >>> After a release we don't care much what happens because master and
> site
> > >>> should be equal. As Francis pointed out this is usually done with a
> > force
> > >>> push.
> > >>>
> > >>> Regarding Julian's question the commit hashes before the force pushes
> > >> done
> > >>> by Liya are the following (according to commits@calcite):
> > >>> * master -> dcbc493bf699d961427952c5efc047b76d859096
> > >>> * site -> aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5
> > >>>
> > >>> Best,
> > >>> Stamatis
> > >>>
> > >>> [1]
> > >>>
> > >>
> >
> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/_docs/howto.md?plain=1#L696
> > >>>
> > >>> On Fri, Mar 25, 2022 at 7:36 PM Julian Hyde <jh...@apache.org>
> wrote:
> > >>>
> > >>>> Does anyone know (or could find out) the SHA of the master and site
> > >>>> branches at the time that Fan attempted to move the site changes
> over?
> > >>>> If so, we could recreate the same environment, and figure out a set
> of
> > >>>> git commands that would have worked then and will work for the next
> > >>>> release manager. This process is safe because we can do these
> > >>>> experiments in a local git sandbox, without pushing to any remote.
> > >>>>
> > >>>> On Fri, Mar 25, 2022 at 6:09 AM Fan Liya <li...@gmail.com>
> > wrote:
> > >>>>>
> > >>>>> Hi Francis,
> > >>>>>
> > >>>>> Thanks for your feedback.
> > >>>>>
> > >>>>> It seems we should choose option 2.
> > >>>>> In addition, it seems less risky to run "git push --force" commands
> > in
> > >>>>> the site branch.
> > >>>>>
> > >>>>> Best,
> > >>>>> Liya Fan
> > >>>>>
> > >>>>> On Fri, Mar 25, 2022 at 12:14, Francis Chuang <fr...@apache.org> wrote:
> > >>>>>>
> > >>>>>> Hi Liya,
> > >>>>>>
> > >>>>>> Thanks for bringing this up. We have always done the following
> when
> > >>>>>> committing:
> > >>>>>> 1. Always commit to master.
> > >>>>>> 2. If we need to publish the change to the site now (for example,
> > new
> > >>>>>> committer or announcement), cherry-pick the change into the site
> > >> branch
> > >>>>>> and publish it.
> > >>>>>> 3. After a release, make the site branch the same as master (git
> > reset
> > >>>>>> --hard master) and force push (git push --force origin site).
> > >>>>>>
> > >>>>>> Francis
> > >>>>>>
> > >>>>>> On 25/03/2022 3:03 pm, Fan Liya wrote:
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> As part of the release process, we need to synchronize the master
> > and
> > >>>>>>> site branches (Please see
> > >>>>>>>
> > >>>>
> https://calcite.apache.org/docs/howto.html#making-a-release-candidate
> > ).
> > >>>>>>> Usually, the site is behind the master branch by some commits.
> > >>>>>>> If the existing commits in the site branch are in the same order
> as
> > >>>> in
> > >>>>>>> the master branch, the task is easy: just switch to the site
> > branch,
> > >>>>>>> and run
> > >>>>>>>
> > >>>>>>> git rebase master
> > >>>>>>>
> > >>>>>>> However, if some commits are in different orders, it can be
> tricky.
> > >>>>>>> For example, the master branch may have the following commits (in
> > >>>>>>> order):
> > >>>>>>>
> > >>>>>>> A, B, X1, X2, ... , Xn.
> > >>>>>>>
> > >>>>>>> and the site branch may have the following commits (in order):
> > >>>>>>>
> > >>>>>>> B, A, X1, X2.
> > >>>>>>>
> > >>>>>>> Basically we have two choices:
> > >>>>>>>
> > >>>>>>> 1. We can live with the out of order commits, because after
> > >>>>>>> cherry-picking commits X3, X4, ... , Xn to the site branch, the
> > file
> > >>>>>>> contents will be consistent.
> > >>>>>>>
> > >>>>>>> The problem is that, since the two branches have diverged, we
> > cannot
> > >>>>>>> use the rebase command. Instead, we have to manually cherry-pick
> > >>>>>>> commits individually, which requires large effort. In addition,
> for
> > >>>>>>> any subsequent release processes, we have to manually cherry-pick
> > >>>> each
> > >>>>>>> commit.
> > >>>>>>>
> > >>>>>>> 2. We need to make the commits order consistent, which will make
> it
> > >>>>>>> easy for subsequent releases.
> > >>>>>>> However, the problem is that, to make the commits order
> consistent,
> > >>>>>>> some git force push command is unavoidable, which is risky to
> some
> > >>>>>>> extent.
> > >>>>>>>
> > >>>>>>> So what is the recommended way to do this? Thanks in advance for
> > >>>> your feedback!
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Liya Fan
> > >>>>
> > >>>
> > >>
> > >
> >
>

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Ruben Q L <ru...@gmail.com>.
Would it be clearer if we had different API versions on the site?
We could have one API link per Calcite version (or at least for the latest
X versions) + an API link of the current master head (that could be updated
automatically).
I think this "multiple API" idea has already been discussed in the past,
but I could not find the thread.



On Mon, Mar 28, 2022 at 12:24 AM Francis Chuang <fr...@apache.org>
wrote:

> It looks like Infra should be able to give us a token to push to
> calcite-site from our other calcite-* repos using Github actions [1].
>
> If we can have some consensus regarding whether to keep the site branch
> and maintain the current process, or to remove it and just publish from
> master, I can see if I can get the automated site builds moving along.
>
> [1] https://issues.apache.org/jira/browse/INFRA-21453
>
> On 28/03/2022 8:23 am, Stamatis Zampetakis wrote:
> > It would be great if we manage to wrap up CALCITE-3129 and have an
> > automated build for the website.
> >
> > The thing that complicates the procedure in general (automated or not) is
> > the fact that we don't want to publish API related changes on the web
> > before they are officially released.
> > I understand the benefits for trying to maintain this practice but I
> would
> > be willing to sacrifice those for having simpler procedures/scripts.
> >
> > I rarely search for any javadoc (Calcite or other) online because
> whenever
> > I need something the IDE fetches it for me. In most cases, the javadoc
> > won't be enough and I will need to dig in the code which is again fetched
> > automatically by IDE. If nothing works, and the project is open source, I
> > will simply download the respective project and look into the
> javadoc/code
> > directly.
> >
> > Apart from that, users will rarely jump to the latest Calcite version
> > directly so having the corresponding javadoc online might not be very
> > helpful.
> >
> > Long story short, another option would be to build/update the website
> > directly after every commit on master or at certain intervals (e.g.,
> daily)
> > and not have any other branches to maintain.
> >
> > If there are really people using the published javadoc on the website
> [1],
> > I would really like to hear their thoughts about this proposal.
> >
> > Best,
> > Stamatis
> >
> > [1] https://calcite.apache.org/javadocAggregate/
> >
> > On Sat, Mar 26, 2022 at 10:57 PM Francis Chuang <
> francischuang@apache.org>
> > wrote:
> >
> >> Ideally, I would like to see that the site builds are automated by CI,
> >> we still have CALCITE-3129 [1] open.
> >>
> >> My thinking is that if we automate the site building and deployment
> >> process, we can use the following heuristics:
> >> - Build the site completely and deploy when a final release tag is
> >> pushed to the repo.
> >> - Build the site on a partial basis in all other cases:
> >>     - Option 1: Check out the last final release tag and apply changes
> to
> >> the site that only touches certain whitelisted categories such as news
> >> and community. This should allow us to not have documentation changes
> >> for code deployed before the final release. This should then allow us to
> >> get rid of the site branch
> >>     - Option 2: We keep the site branch, but we automate the current
> >> process. On every commit to master, if it is a change to the files in
> >> the site directory, we check if the change only touches certain
> >> whitelisted categories such as news and community. If so, we cherry pick
> >> that into the site branch automatically using Github Actions and build
> >> and deploy the site. When a final release tag is pushed to the repo, we
> >> use Github Actions to make the master and site branches equal and
> >> automatically build and deploy the site.
> >>
> >> This would negate the need to build and publish the site manually and
> >> simplify the process as we always only commit to master. As an added
> >> bonus, if we keep the site branch, but automate the process, maybe we
> >> can lock the site branch so that only CI can push to it. The downside of
> >> course, is that we're relying on heuristics for the partial build, so
> >> there's some "magic" to it.
> >>
> >> Francis
> >>
> >>
> >> [1] https://issues.apache.org/jira/browse/CALCITE-3129
> >>
> >> On 26/03/2022 8:58 am, Stamatis Zampetakis wrote:
> >>> Hello,
> >>>
> >>> Thanks for starting this discussion Liya. It is important to find which
> >>> parts of the process are unclear and improve them if possible.
> >>>
> >>> The current procedure for updating the website remains unchanged and it
> >> is
> >>> documented here:
> >>>
> >>
> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
> >>>
> >>> If the procedure is not followed, which has happened a few times in the
> >>> past, meaning that someone commits directly in site without committing
> in
> >>> master then we will have commits in site that may get lost forever.
> >>> When we discover such commits we should port them to master. The
> >>> cherry-pick now goes in the opposite direction (from site to master).
> >>> This is usually discovered/done by the release manager and that's why
> we
> >>> have the respective instructions in the howto [1].
> >>>
> >>> After a release we don't care much what happens because master and site
> >>> should be equal. As Francis pointed out this is usually done with a
> force
> >>> push.
> >>>
> >>> Regarding Julian's question the commit hashes before the force pushes
> >> done
> >>> by Liya are the following (according to commits@calcite):
> >>> * master -> dcbc493bf699d961427952c5efc047b76d859096
> >>> * site -> aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5
> >>>
> >>> Best,
> >>> Stamatis
> >>>
> >>> [1]
> >>>
> >>
> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/_docs/howto.md?plain=1#L696
> >>>
> >>> On Fri, Mar 25, 2022 at 7:36 PM Julian Hyde <jh...@apache.org> wrote:
> >>>
> >>>> Does anyone know (or could find out) the SHA of the master and site
> >>>> branches at the time that Fan attempted to move the site changes over?
> >>>> If so, we could recreate the same environment, and figure out a set of
> >>>> git commands that would have worked then and will work for the next
> >>>> release manager. This process is safe because we can do these
> >>>> experiments in a local git sandbox, without pushing to any remote.
> >>>>
> >>>> On Fri, Mar 25, 2022 at 6:09 AM Fan Liya <li...@gmail.com>
> wrote:
> >>>>>
> >>>>> Hi Francis,
> >>>>>
> >>>>> Thanks for your feedback.
> >>>>>
> >>>>> It seems we should choose option 2.
> >>>>> In addition, it seems less risky to run "git push --force" commands
> in
> >>>>> the site branch.
> >>>>>
> >>>>> Best,
> >>>>> Liya Fan
> >>>>>
> >>>>> On Fri, Mar 25, 2022 at 12:14, Francis Chuang <fr...@apache.org> wrote:
> >>>>>>
> >>>>>> Hi Liya,
> >>>>>>
> >>>>>> Thanks for bringing this up. We have always done the following when
> >>>>>> committing:
> >>>>>> 1. Always commit to master.
> >>>>>> 2. If we need to publish the change to the site now (for example,
> new
> >>>>>> committer or announcement), cherry-pick the change into the site
> >> branch
> >>>>>> and publish it.
> >>>>>> 3. After a release, make the site branch the same as master (git
> reset
> >>>>>> --hard master) and force push (git push --force origin site).
> >>>>>>
> >>>>>> Francis
> >>>>>>
> >>>>>> On 25/03/2022 3:03 pm, Fan Liya wrote:
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> As part of the release process, we need to synchronize the master
> and
> >>>>>>> site branches (Please see
> >>>>>>>
> >>>> https://calcite.apache.org/docs/howto.html#making-a-release-candidate
> ).
> >>>>>>> Usually, the site is behind the master branch by some commits.
> >>>>>>> If the existing commits in the site branch are in the same order as
> >>>> in
> >>>>>>> the master branch, the task is easy: just switch to the site
> branch,
> >>>>>>> and run
> >>>>>>>
> >>>>>>> git rebase master
> >>>>>>>
> >>>>>>> However, if some commits are in different orders, it can be tricky.
> >>>>>>> For example, the master branch may have the following commits (in
> >>>>>>> order):
> >>>>>>>
> >>>>>>> A, B, X1, X2, ... , Xn.
> >>>>>>>
> >>>>>>> and the site branch may have the following commits (in order):
> >>>>>>>
> >>>>>>> B, A, X1, X2.
> >>>>>>>
> >>>>>>> Basically we have two choices:
> >>>>>>>
> >>>>>>> 1. We can live with the out of order commits, because after
> >>>>>>> cherry-picking commits X3, X4, ... , Xn to the site branch, the
> file
> >>>>>>> contents will be consistent.
> >>>>>>>
> >>>>>>> The problem is that, since the two branches have diverged, we
> cannot
> >>>>>>> use the rebase command. Instead, we have to manually cherry-pick
> >>>>>>> commits individually, which requires large effort. In addition, for
> >>>>>>> any subsequent release processes, we have to manually cherry-pick
> >>>> each
> >>>>>>> commit.
> >>>>>>>
> >>>>>>> 2. We need to make the commits order consistent, which will make it
> >>>>>>> easy for subsequent releases.
> >>>>>>> However, the problem is that, to make the commits order consistent,
> >>>>>>> some git force push command is unavoidable, which is risky to some
> >>>>>>> extent.
> >>>>>>>
> >>>>>>> So what is the recommended way to do this? Thanks in advance for
> >>>> your feedback!
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Liya Fan
> >>>>
> >>>
> >>
> >
>

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Francis Chuang <fr...@apache.org>.
It looks like Infra should be able to give us a token to push to
calcite-site from our other calcite-* repos using GitHub Actions [1].

If we can have some consensus regarding whether to keep the site branch 
and maintain the current process, or to remove it and just publish from 
master, I can see if I can get the automated site builds moving along.
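
For anyone who wants to try the current process without touching the real
repos, here is a minimal sketch in a throwaway local sandbox. The repository
layout, file names and commit labels (base, A, B, X1) are made up for
illustration; only the "git reset --hard master" and "git push --force origin
site" steps come from the process described in this thread.

```shell
#!/bin/sh
# Local sandbox only: the "origin" remote is a bare repo in a temp directory.
set -eu

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare origin.git
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
git config user.email "sandbox@example.invalid"
git config user.name "Sandbox"
git remote add origin ../origin.git

# Hypothetical helper: create one commit that adds a single file.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit base
git branch site                  # site starts where master starts
commit A
commit B
commit X1                        # master now has: base, A, B, X1

# Simulate the out-of-order site branch: B first, then A.
git checkout -q site
git cherry-pick master~1 master~2 >/dev/null
git push -q origin master site

# Step 3 of the current process: make site equal to master, then force push.
git reset -q --hard master
git push -q --force origin site

master_sha=$(git rev-parse master)
site_sha=$(git rev-parse site)
remote_site_sha=$(git ls-remote origin refs/heads/site | cut -f1)
echo "master:      $master_sha"
echo "site:        $site_sha"
echo "origin/site: $remote_site_sha"
```

Because site has diverged from master in this sandbox, a plain "git push
origin site" after the reset would be rejected as a non-fast-forward update,
which is why the force push is part of the process.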

[1] https://issues.apache.org/jira/browse/INFRA-21453

On 28/03/2022 8:23 am, Stamatis Zampetakis wrote:
> It would be great if we manage to wrap up CALCITE-3129 and have an
> automated build for the website.
> 
> The thing that complicates the procedure in general (automated or not) is
> the fact that we don't want to publish API related changes on the web
> before they are officially released.
> I understand the benefits for trying to maintain this practice but I would
> be willing to sacrifice those for having simpler procedures/scripts.
> 
> I rarely search for any javadoc (Calcite or other) online because whenever
> I need something the IDE fetches it for me. In most cases, the javadoc
> won't be enough and I will need to dig in the code which is again fetched
> automatically by IDE. If nothing works, and the project is open source, I
> will simply download the respective project and look into the javadoc/code
> directly.
> 
> Apart from that, users will rarely jump to the latest Calcite version
> directly so having the corresponding javadoc online might not be very
> helpful.
> 
> Long story short, another option would be to build/update the website
> directly after every commit on master or at certain intervals (e.g., daily)
> and not have any other branches to maintain.
> 
> If there are really people using the published javadoc on the website [1],
> I would really like to hear their thoughts about this proposal.
> 
> Best,
> Stamatis
> 
> [1] https://calcite.apache.org/javadocAggregate/
> 
> On Sat, Mar 26, 2022 at 10:57 PM Francis Chuang <fr...@apache.org>
> wrote:
> 
>> Ideally, I would like to see that the site builds are automated by CI,
>> we still have CALCITE-3129 [1] open.
>>
>> My thinking is that if we automate the site building and deployment
>> process, we can use the following heuristics:
>> - Build the site completely and deploy when a final release tag is
>> pushed to the repo.
>> - Build the site on a partial basis in all other cases:
>>     - Option 1: Check out the last final release tag and apply changes to
>> the site that only touches certain whitelisted categories such as news
>> and community. This should allow us to not have documentation changes
>> for code deployed before the final release. This should then allow us to
>> get rid of the site branch
>>     - Option 2: We keep the site branch, but we automate the current
>> process. On every commit to master, if it is a change to the files in
>> the site directory, we check if the change only touches certain
>> whitelisted categories such as news and community. If so, we cherry pick
>> that into the site branch automatically using Github Actions and build
>> and deploy the site. When a final release tag is pushed to the repo, we
>> use Github Actions to make the master and site branches equal and
>> automatically build and deploy the site.
>>
>> This would negate the need to build and publish the site manually and
>> simplify the process as we always only commit to master. As an added
>> bonus, if we keep the site branch, but automate the process, maybe we
>> can lock the site branch so that only CI can push to it. The downside of
>> course, is that we're relying on heuristics for the partial build, so
>> there's some "magic" to it.
>>
>> Francis
>>
>>
>> [1] https://issues.apache.org/jira/browse/CALCITE-3129
>>
>> On 26/03/2022 8:58 am, Stamatis Zampetakis wrote:
>>> Hello,
>>>
>>> Thanks for starting this discussion Liya. It is important to find which
>>> parts of the process are unclear and improve them if possible.
>>>
>>> The current procedure for updating the website remains unchanged and it
>> is
>>> documented here:
>>>
>> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
>>>
>>> If the procedure is not followed, which has happened a few times in the
>>> past, meaning that someone commits directly in site without committing in
>>> master then we will have commits in site that may get lost forever.
>>> When we discover such commits we should port them to master. The
>>> cherry-pick now goes in the opposite direction (from site to master).
>>> This is usually discovered/done by the release manager and that's why we
>>> have the respective instructions in the howto [1].
>>>
>>> After a release we don't care much what happens because master and site
>>> should be equal. As Francis pointed out this is usually done with a force
>>> push.
>>>
>>> Regarding Julian's question the commit hashes before the force pushes
>> done
>>> by Liya are the following (according to commits@calcite):
>>> * master -> dcbc493bf699d961427952c5efc047b76d859096
>>> * site -> aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5
>>>
>>> Best,
>>> Stamatis
>>>
>>> [1]
>>>
>> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/_docs/howto.md?plain=1#L696
>>>
>>> On Fri, Mar 25, 2022 at 7:36 PM Julian Hyde <jh...@apache.org> wrote:
>>>
>>>> Does anyone know (or could find out) the SHA of the master and site
>>>> branches at the time that Fan attempted to move the site changes over?
>>>> If so, we could recreate the same environment, and figure out a set of
>>>> git commands that would have worked then and will work for the next
>>>> release manager. This process is safe because we can do these
>>>> experiments in a local git sandbox, without pushing to any remote.
>>>>
>>>> On Fri, Mar 25, 2022 at 6:09 AM Fan Liya <li...@gmail.com> wrote:
>>>>>
>>>>> Hi Francis,
>>>>>
>>>>> Thanks for your feedback.
>>>>>
>>>>> It seems we should choose option 2.
>>>>> In addition, it seems less risky to run "git push --force" commands in
>>>>> the site branch.
>>>>>
>>>>> Best,
>>>>> Liya Fan
>>>>>
>>>>> On Fri, Mar 25, 2022 at 12:14, Francis Chuang <fr...@apache.org> wrote:
>>>>>>
>>>>>> Hi Liya,
>>>>>>
>>>>>> Thanks for bringing this up. We have always done the following when
>>>>>> committing:
>>>>>> 1. Always commit to master.
>>>>>> 2. If we need to publish the change to the site now (for example, new
>>>>>> committer or announcement), cherry-pick the change into the site
>> branch
>>>>>> and publish it.
>>>>>> 3. After a release, make the site branch the same as master (git reset
>>>>>> --hard master) and force push (git push --force origin site).
>>>>>>
>>>>>> Francis
>>>>>>
>>>>>> On 25/03/2022 3:03 pm, Fan Liya wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> As part of the release process, we need to synchronize the master and
>>>>>>> site branches (Please see
>>>>>>>
>>>> https://calcite.apache.org/docs/howto.html#making-a-release-candidate).
>>>>>>> Usually, the site is behind the master branch by some commits.
>>>>>>> If the existing commits in the site branch are in the same order as
>>>> in
>>>>>>> the master branch, the task is easy: just switch to the site branch,
>>>>>>> and run
>>>>>>>
>>>>>>> git rebase master
>>>>>>>
>>>>>>> However, if some commits are in different orders, it can be tricky.
>>>>>>> For example, the master branch may have the following commits (in
>>>>>>> order):
>>>>>>>
>>>>>>> A, B, X1, X2, ... , Xn.
>>>>>>>
>>>>>>> and the site branch may have the following commits (in order):
>>>>>>>
>>>>>>> B, A, X1, X2.
>>>>>>>
>>>>>>> Basically we have two choices:
>>>>>>>
>>>>>>> 1. We can live with the out of order commits, because after
>>>>>>> cherry-picking commits X3, X4, ... , Xn to the site branch, the file
>>>>>>> contents will be consistent.
>>>>>>>
>>>>>>> The problem is that, since the two branches have diverged, we cannot
>>>>>>> use the rebase command. Instead, we have to manually cherry-pick
>>>>>>> commits individually, which requires large effort. In addition, for
>>>>>>> any subsequent release processes, we have to manually cherry-pick
>>>> each
>>>>>>> commit.
>>>>>>>
>>>>>>> 2. We need to make the commits order consistent, which will make it
>>>>>>> easy for subsequent releases.
>>>>>>> However, the problem is that, to make the commits order consistent,
>>>>>>> some git force push command is unavoidable, which is risky to some
>>>>>>> extent.
>>>>>>>
>>>>>>> So what is the recommended way to do this? Thanks in advance for
>>>> your feedback!
>>>>>>>
>>>>>>> Best,
>>>>>>> Liya Fan
>>>>
>>>
>>
> 

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Stamatis Zampetakis <za...@gmail.com>.
It would be great if we manage to wrap up CALCITE-3129 and have an
automated build for the website.

The thing that complicates the procedure in general (automated or not) is
the fact that we don't want to publish API related changes on the web
before they are officially released.
I understand the benefits of trying to maintain this practice but I would
be willing to sacrifice those for having simpler procedures/scripts.
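
The release-gating rule above is what the whitelist heuristic in Francis's
proposal (quoted below) would have to enforce. A rough sketch of such a check
follows. The whitelisted prefixes "site/_posts/" and "site/community/", the
example file paths, and the helper name only_whitelisted are assumptions made
for illustration and are not taken from any existing Calcite script.

```shell
#!/bin/sh
# Hypothetical whitelist of path prefixes that may be published before a release.
WHITELIST="site/_posts/ site/community/"

# Reads changed file paths on stdin, one per line. Succeeds only if every
# path starts with one of the whitelisted prefixes. Empty input also succeeds,
# so a real job should check separately that the commit changed something.
only_whitelisted() {
  while IFS= read -r path; do
    ok=no
    for prefix in $WHITELIST; do
      case "$path" in
        "$prefix"*) ok=yes ;;
      esac
    done
    [ "$ok" = yes ] || return 1
  done
  return 0
}

# Illustrative inputs; in CI the list could come from
#   git diff --name-only HEAD~1 HEAD
if printf '%s\n' site/_posts/example-announcement.md | only_whitelisted; then
  news_only=yes
else
  news_only=no
fi

if printf '%s\n' site/_posts/example-announcement.md core/src/Example.java | only_whitelisted; then
  mixed=yes
else
  mixed=no
fi

echo "news-only commit publishable: $news_only"
echo "mixed commit publishable:     $mixed"
```

A commit that touches anything outside the whitelist is held back as a whole,
which keeps the rule simple at the cost of delaying some news items that are
bundled with code changes.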

I rarely search for any javadoc (Calcite or other) online because whenever
I need something the IDE fetches it for me. In most cases, the javadoc
won't be enough and I will need to dig in the code which is again fetched
automatically by the IDE. If nothing works, and the project is open source, I
will simply download the respective project and look into the javadoc/code
directly.

Apart from that, users will rarely jump to the latest Calcite version
directly so having the corresponding javadoc online might not be very
helpful.

Long story short, another option would be to build/update the website
directly after every commit on master or at certain intervals (e.g., daily)
and not have any other branches to maintain.

If there are really people using the published javadoc on the website [1],
I would really like to hear their thoughts about this proposal.

Best,
Stamatis

[1] https://calcite.apache.org/javadocAggregate/

On Sat, Mar 26, 2022 at 10:57 PM Francis Chuang <fr...@apache.org>
wrote:

> Ideally, I would like to see that the site builds are automated by CI,
> we still have CALCITE-3129 [1] open.
>
> My thinking is that if we automate the site building and deployment
> process, we can use the following heuristics:
> - Build the site completely and deploy when a final release tag is
> pushed to the repo.
> - Build the site on a partial basis in all other cases:
>    - Option 1: Check out the last final release tag and apply changes to
> the site that only touches certain whitelisted categories such as news
> and community. This should allow us to not have documentation changes
> for code deployed before the final release. This should then allow us to
> get rid of the site branch
>    - Option 2: We keep the site branch, but we automate the current
> process. On every commit to master, if it is a change to the files in
> the site directory, we check if the change only touches certain
> whitelisted categories such as news and community. If so, we cherry pick
> that into the site branch automatically using Github Actions and build
> and deploy the site. When a final release tag is pushed to the repo, we
> use Github Actions to make the master and site branches equal and
> automatically build and deploy the site.
>
> This would negate the need to build and publish the site manually and
> simplify the process as we always only commit to master. As an added
> bonus, if we keep the site branch, but automate the process, maybe we
> can lock the site branch so that only CI can push to it. The downside of
> course, is that we're relying on heuristics for the partial build, so
> there's some "magic" to it.
>
> Francis
>
>
> [1] https://issues.apache.org/jira/browse/CALCITE-3129

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Francis Chuang <fr...@apache.org>.
Ideally, I would like to see the site builds automated by CI; we still
have CALCITE-3129 [1] open.

My thinking is that if we automate the site building and deployment
process, we can use the following heuristics:
- Build the site completely and deploy it when a final release tag is
pushed to the repo.
- Build the site on a partial basis in all other cases:
   - Option 1: Check out the last final release tag and apply only the
site changes that touch certain whitelisted categories, such as news
and community. This way, documentation changes for unreleased code are
not deployed before the final release, and we can get rid of the site
branch.
   - Option 2: We keep the site branch, but we automate the current
process. On every commit to master that changes files in the site
directory, we check whether the change touches only certain whitelisted
categories, such as news and community. If so, we cherry-pick it into
the site branch automatically using GitHub Actions, then build and
deploy the site. When a final release tag is pushed to the repo, we
use GitHub Actions to make the master and site branches equal and
automatically build and deploy the site.
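
The whitelist check in Option 2 could look roughly like the sketch
below. It is only an illustration: the directory names in
WHITELISTED_DIRS and the function names are assumptions, not something
agreed in this thread, and the real list would have to match the actual
layout of the site directory.

```python
from pathlib import PurePosixPath

# Hypothetical whitelist: directories under site/ assumed to hold the news
# and community pages. Adjust to the real layout before relying on it.
WHITELISTED_DIRS = ("site/_posts", "site/community")


def is_whitelisted(path):
    """Return True if the path lies inside one of the whitelisted directories."""
    parents = PurePosixPath(path).parents
    return any(PurePosixPath(d) in parents for d in WHITELISTED_DIRS)


def should_cherry_pick(changed_paths):
    """A commit qualifies only if it changes files and all are whitelisted."""
    return bool(changed_paths) and all(is_whitelisted(p) for p in changed_paths)
```

A commit that touches a news post and, say, site/_docs/howto.md would
be rejected, so documentation for unreleased code stays off the site
branch until the release.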

This would remove the need to build and publish the site manually and
simplify the process, as we would only ever commit to master. As an
added bonus, if we keep the site branch but automate the process, maybe
we can lock the site branch so that only CI can push to it. The
downside, of course, is that we're relying on heuristics for the
partial build, so there's some "magic" to it.

Francis


[1] https://issues.apache.org/jira/browse/CALCITE-3129

On 26/03/2022 8:58 am, Stamatis Zampetakis wrote:
> Hello,
> 
> Thanks for starting this discussion Liya. It is important to find which
> parts of the process are unclear and improve them if possible.
> 
> The current procedure for updating the website remains unchanged and it is
> documented here:
> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md
> 
> If the procedure is not followed, which has happened a few times in the
> past, meaning that someone commits directly in site without committing in
> master then we will have commits in site that may get lost forever.
> When we discover such commits we should port them to master. The
> cherry-pick now goes in the opposite direction (from site to master).
> This is usually discovered/done by the release manager and that's why we
> have the respective instructions in the howto [1].
> 
> After a release we don't care much what happens because master and site
> should be equal. As Francis pointed out this is usually done with a force
> push.
> 
> Regarding Julian's question the commit hashes before the force pushes done
> by Liya are the following (according to commits@calcite):
> * master -> dcbc493bf699d961427952c5efc047b76d859096
> * site -> aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5
> 
> Best,
> Stamatis
> 
> [1]
> https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/_docs/howto.md?plain=1#L696

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Stamatis Zampetakis <za...@gmail.com>.
Hello,

Thanks for starting this discussion Liya. It is important to find which
parts of the process are unclear and improve them if possible.

The current procedure for updating the website remains unchanged and it is
documented here:
https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/README.md

If the procedure is not followed, that is, if someone commits directly to
site without committing to master (which has happened a few times in the
past), then site will contain commits that may get lost forever.
When we discover such commits we should port them to master; the
cherry-pick then goes in the opposite direction (from site to master).
This is usually discovered and done by the release manager, which is why
we have the respective instructions in the howto [1].
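
One way to spot such commits is `git cherry master site`, which marks
with "+" every commit on site that has no equivalent patch on master,
and with "-" every commit that already has one. The helper below is a
minimal sketch that parses that output; the function name and the
sample hashes in the usage note are made up for illustration.

```python
from typing import List


def commits_missing_upstream(cherry_output: str) -> List[str]:
    """Parse the output of `git cherry [-v] master site`.

    Lines starting with "+" name commits on site with no equivalent patch
    on master; lines starting with "-" already have an equivalent there.
    """
    missing = []
    for line in cherry_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "+":
            missing.append(parts[1])
    return missing
```

For example, given the output "+ aaa111" followed by "- bbb222", only
aaa111 would be reported as a candidate to cherry-pick from site to
master.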

After a release, we don't care much what happens, because master and site
should be equal. As Francis pointed out, this is usually done with a force
push.

Regarding Julian's question, the commit hashes before the force pushes done
by Liya are the following (according to commits@calcite):
* master -> dcbc493bf699d961427952c5efc047b76d859096
* site -> aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/a6a1e2cef332893fd90286098869c56529e052c3/site/_docs/howto.md?plain=1#L696

On Fri, Mar 25, 2022 at 7:36 PM Julian Hyde <jh...@apache.org> wrote:

> Does anyone know (or could find out) the SHA of the master and site
> branches at the time that Fan attempted to move the site changes over?
> If so, we could recreate the same environment, and figure out a set of
> git commands that would have worked then and will work for the next
> release manager. This process is safe because we can do these
> experiments in a local git sandbox, without pushing to any remote.

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Julian Hyde <jh...@apache.org>.
Does anyone know (or could find out) the SHA of the master and site
branches at the time that Fan attempted to move the site changes over?
If so, we could recreate the same environment, and figure out a set of
git commands that would have worked then and will work for the next
release manager. This process is safe because we can do these
experiments in a local git sandbox, without pushing to any remote.
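
A sandbox along these lines could be set up with the commands generated
below. This is a sketch under assumptions: the two SHAs are the ones
reported elsewhere in this thread, the branch names old-master and
old-site and the directory name are hypothetical, and a fresh clone may
not contain the old site commit if no ref reaches it any more after the
force push (in that case it has to be obtained some other way).

```python
import shlex

# SHAs reported in this thread for the branches before the force pushes.
MASTER_SHA = "dcbc493bf699d961427952c5efc047b76d859096"
SITE_SHA = "aa9dfc7dbc64c784040cf20ed168016ae3b9c2c5"


def sandbox_commands(master_sha, site_sha, workdir="calcite-sandbox"):
    """Return git commands (as argv lists) for a local-only experiment."""
    git = ["git", "-C", workdir]
    return [
        ["git", "clone", "https://github.com/apache/calcite.git", workdir],
        # Hypothetical local branch names pinning the two historical states.
        git + ["branch", "old-master", master_sha],
        git + ["branch", "old-site", site_sha],
        git + ["checkout", "old-site"],
    ]


# Print the commands instead of running them; nothing is pushed anywhere.
for argv in sandbox_commands(MASTER_SHA, SITE_SHA):
    print(shlex.join(argv))
```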

On Fri, Mar 25, 2022 at 6:09 AM Fan Liya <li...@gmail.com> wrote:
>
> Hi Francis,
>
> Thanks for your feedback.
>
> It seems we should choose option 2.
> In addition, it seems less risky to run "git push --force" commands in
> the site branch.
>
> Best,
> Liya Fan

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Fan Liya <li...@gmail.com>.
Hi Francis,

Thanks for your feedback.

It seems we should choose option 2.
In addition, it seems less risky to run "git push --force" commands on
the site branch.

Best,
Liya Fan

Francis Chuang <fr...@apache.org> wrote on Fri, Mar 25, 2022 at 12:14:
>
> Hi Liya,
>
> Thanks for bringing this up. We have always done the following when
> committing:
> 1. Always commit to master.
> 2. If we need to publish the change to the site now (for example, new
> committer or announcement), cherry-pick the change into the site branch
> and publish it.
> 3. After a release, make the site branch the same as master (git reset
> --hard master) and force push (git push --force origin site).
>
> Francis

Re: [DISCUSS] Best practice for synchronizing master and site branches

Posted by Francis Chuang <fr...@apache.org>.
Hi Liya,

Thanks for bringing this up. We have always done the following when 
committing:
1. Always commit to master.
2. If we need to publish the change to the site now (for example, new 
committer or announcement), cherry-pick the change into the site branch 
and publish it.
3. After a release, make the site branch the same as master (git reset 
--hard master) and force push (git push --force origin site).
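
Step 3 can be written down as a small dry-run helper, sketched below.
The git commands are the ones named above; the function names, the
dry_run switch and the explicit checkout of site are assumptions added
for illustration.

```python
import subprocess


def sync_site_commands(remote="origin"):
    """Commands for step 3: point site at master, then force push it."""
    return [
        ["git", "checkout", "site"],
        ["git", "reset", "--hard", "master"],
        ["git", "push", "--force", remote, "site"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Dry run: prints the three commands and changes nothing.
run(sync_site_commands())
```

Running it with dry_run=False inside a checkout would discard any
commits that exist only on site, so it makes sense to check for those
first.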

Francis

On 25/03/2022 3:03 pm, Fan Liya wrote:
> Hi all,
> 
> As part of the release process, we need to synchronize the master and
> site branches (Please see
> https://calcite.apache.org/docs/howto.html#making-a-release-candidate).
> Usually, the site is behind the master branch by some commits.
> If the existing commits in the site branch are in the same order as in
> the master branch, the task is easy: just switch to the site branch,
> and run
> 
> git rebase master
> 
> However, if some commits are in different orders, it can be tricky.
> For example, the master branch may have the following commits (in
> order):
> 
> A, B, X1, X2, ... , Xn.
> 
> and the site branch may have the following commits (in order):
> 
> B, A, X1, X2.
> 
> Basically we have two choices:
> 
> 1. We can live with the out of order commits, because after
> cherry-picking commits X3, X4, ... , Xn to the site branch, the file
> contents will be consistent.
> 
> The problem is that, since the two branches have diverged, we cannot
> use the rebase command. Instead, we have to manually cherry-pick
> commits individually, which requires large effort. In addition, for
> any subsequent release processes, we have to manually cherry-pick each
> commit.
> 
> 2. We need to make the commits order consistent, which will make it
> easy for subsequent releases.
> However, the problem is that, to make the commits order consistent,
> some git force push command is unavoidable, which is risky to some
> extent.
> 
> So what is the recommended way to do this? Thanks in advance for your feedback!
> 
> Best,
> Liya Fan