Posted to dev@camel.apache.org by Zoran Regvart <zo...@regvart.com> on 2020/11/02 12:00:27 UTC

git squash on the asf-site of camel-website repository

Hi Cameleers,
When cloned, the camel-website repository is 1.3GB in size. I think
that's because of the large number of commits in the `asf-site`
branch. As a reminder, when we build the website we publish it by
pushing the resulting files to the `asf-site` branch.

I think it would help if we were to squash the commits there. This
would, of course, mean we would lose the history on that branch.

I was thinking we would keep the last 10 commits unsquashed, and
squash all older commits (apart from the initial one), with something
like:

git -c core.editor="sed -i 2,/$(git log --skip=10 -1 --pretty=format:%h)/s/^pick/squash/" \
    rebase --interactive 1586f65bf7f24784dc99e22aff08e44c7dbb1920

That `sed` would skip the first line and then replace "pick" with
"squash" on every following line, up to and including the line for
the 11th most recent commit (the hash printed by that `git log`).
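
A minimal sketch of what that `sed` expression does to a rebase todo
list, run against a made-up file rather than a real rebase. The commit
hashes, the messages and the `last_to_squash` variable are hypothetical
stand-ins; `sed -i` without a suffix assumes GNU sed.

```shell
# Hypothetical rebase todo list: oldest commit first, which is the
# order `git rebase --interactive` uses.
todo=$(mktemp)
cat > "$todo" <<'EOF'
pick aaa1111 Website updated (1)
pick bbb2222 Website updated (2)
pick ccc3333 Website updated (3)
pick ddd4444 Website updated (4)
pick eee5555 Website updated (5)
EOF

# Stand-in for the hash printed by
# `git log --skip=10 -1 --pretty=format:%h`.
last_to_squash=ccc3333

# Same expression as in the proposal: from line 2 up to the first line
# that mentions the hash, turn a leading "pick" into "squash".
sed -i "2,/$last_to_squash/s/^pick/squash/" "$todo"
cat "$todo"
```

Line 1 stays "pick", lines 2 and 3 become "squash", and lines 4 and 5
stay "pick", so the older commits are folded into the first one while
the newest ones are kept as they are.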

I'd put this as a step in the deploy part of the pipeline[1].

WDYT?

zoran

[1] https://github.com/apache/camel-website/blob/8cafa694e13b72d3013b7de2b956da73f55ca2b4/Jenkinsfile#L89
-- 
Zoran Regvart

Re: git squash on the asf-site of camel-website repository

Posted by David Jencks <da...@gmail.com>.
Thanks!

> On Nov 5, 2020, at 1:42 AM, Zoran Regvart <zo...@regvart.com> wrote:
> 
> Hi David,
> I can also take a look into that, it'll most likely be a shell script
> that generates that commit message. I'll put it on the list of things
> to do[1].
> 
> zoran
> 
> [1] https://issues.apache.org/jira/browse/CAMEL-15816
> 
> 
> 
> -- 
> Zoran Regvart


Re: git squash on the asf-site of camel-website repository

Posted by Zoran Regvart <zo...@regvart.com>.
Hi David,
I can also take a look into that, it'll most likely be a shell script
that generates that commit message. I'll put it on the list of things
to do[1].
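
A minimal sketch of such a script, assuming each contributing
repository is available as a local checkout. The directory names
`source-a` and `source-b`, the message layout and the file name
`commit-message.txt` are hypothetical; the real build may fetch its
sources differently.

```shell
# Throwaway repositories standing in for the content sources.
work=$(mktemp -d)
for name in source-a source-b; do
  git init -q "$work/$name"
  git -C "$work/$name" config user.email "ci@example.invalid"
  git -C "$work/$name" config user.name "CI"
  echo "$name" > "$work/$name/README"
  git -C "$work/$name" add -A
  git -C "$work/$name" commit -q -m "Initial content"
done

# Write one line per source: directory name, branch, full commit OID.
msg="$work/commit-message.txt"
{
  echo "Website updated"
  echo
  for dir in "$work"/source-*; do
    branch=$(git -C "$dir" rev-parse --abbrev-ref HEAD)
    oid=$(git -C "$dir" rev-parse HEAD)
    echo "$(basename "$dir") $branch $oid"
  done
} > "$msg"
cat "$msg"
```

The publish step could then pass the file to `git commit -F`, so each
commit on `asf-site` names the exact inputs it was built from.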

zoran

[1] https://issues.apache.org/jira/browse/CAMEL-15816

On Wed, Nov 4, 2020 at 1:04 AM David Jencks <da...@gmail.com> wrote:
>
> OK, that’s good, but what about the commit OIDs of all the other repos/branches contributing to the website?
>
> David Jencks


-- 
Zoran Regvart

Re: git squash on the asf-site of camel-website repository

Posted by David Jencks <da...@gmail.com>.
OK, that’s good, but what about the commit OIDs of all the other repos/branches contributing to the website?

David Jencks

> On Nov 3, 2020, at 1:00 AM, Zoran Regvart <zo...@regvart.com> wrote:
> 
> Hi David,
> 
> On Tue, Nov 3, 2020 at 6:19 AM David Jencks <da...@gmail.com> wrote:
>> ... it would be helpful to record the git commit OIDs for each branch in the build each time the site is published.
> 
> I've fixed[1] the linkage between the `master` and the `asf-site`
> branch, that should be correct now. So if you look at the last commit
> message[2], it now points to the commit on `master`[3] that triggered
> publishing.
> 
> zoran
> 
> [1] https://github.com/apache/camel-website/commit/8d4e742abe9d2b85d05e53a89283d07cad338399
> [2] https://github.com/apache/camel-website/commit/a6a133be40afa767ee1e9de90187ecf12b7ce720
> [3] https://github.com/apache/camel-website/commit/5f7813774aac2db57aefc29b73df0b4537f5307c
> -- 
> Zoran Regvart


Re: git squash on the asf-site of camel-website repository

Posted by Zoran Regvart <zo...@regvart.com>.
Hi David,

On Tue, Nov 3, 2020 at 6:19 AM David Jencks <da...@gmail.com> wrote:
> ... it would be helpful to record the git commit OIDs for each branch in the build each time the site is published.

I've fixed[1] the linkage between the `master` and the `asf-site`
branch, that should be correct now. So if you look at the last commit
message[2], it now points to the commit on `master`[3] that triggered
publishing.

zoran

[1] https://github.com/apache/camel-website/commit/8d4e742abe9d2b85d05e53a89283d07cad338399
[2] https://github.com/apache/camel-website/commit/a6a133be40afa767ee1e9de90187ecf12b7ce720
[3] https://github.com/apache/camel-website/commit/5f7813774aac2db57aefc29b73df0b4537f5307c
-- 
Zoran Regvart

Re: git squash on the asf-site of camel-website repository

Posted by David Jencks <da...@gmail.com>.
Well, actually you are claiming that the website build is reproducible. I don’t think there’s any evidence for that, but perhaps it doesn’t matter.

However, for the Antora portion, it would be helpful to record the git commit OIDs for each branch in the build each time the site is published.

David Jencks

> On Nov 2, 2020, at 4:14 PM, David Jencks <da...@gmail.com> wrote:
> 
> Perhaps I don’t know how often the website is published. If it’s published after every git update, then indeed history is not so essential. In that case, perhaps doing `git commit --amend --no-edit` for each update would be fine (and squashing all earlier commits).
> 
> Personally I’d avoid any SVN solution. The git + asf-site branch solution seems pretty nice to me.
> 
> David Jencks
> 


Re: git squash on the asf-site of camel-website repository

Posted by David Jencks <da...@gmail.com>.
Perhaps I don’t know how often the website is published. If it’s published after every git update, then indeed history is not so essential. In that case, perhaps doing `git commit --amend --no-edit` for each update would be fine (and squashing all earlier commits).
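
A minimal sketch of that amend-on-every-publish idea, run in a
throwaway repository rather than the real `asf-site` branch. The file
name, commit message and committer identity are hypothetical.

```shell
# Throwaway repository standing in for a checkout of `asf-site`.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.email "ci@example.invalid"
git config user.name "CI"

# First publish.
echo "build 1" > index.html
git add -A
git commit -q -m "Website updated"

# Later publish: stage the new content and fold it into the existing
# commit, keeping the commit message unchanged.
echo "build 2" > index.html
git add -A
git commit -q --amend --no-edit

# The branch still holds a single commit.
git rev-list --count HEAD
```

Because amending rewrites the tip commit, a real pipeline would have to
force-push the branch (for example `git push --force origin asf-site`),
and the branch would have to allow forced updates.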

Personally I’d avoid any SVN solution. The git + asf-site branch solution seems pretty nice to me.

David Jencks

> On Nov 2, 2020, at 2:49 PM, Zoran Regvart <zo...@regvart.com> wrote:
> 
> Hi David,
> And at any point we can recreate the state of the `asf-site` branch by
> rebuilding and pushing the content of the `public` directory, so I
> don't see history of that being important as it is a pure byproduct of
> the state on the `master` branch.
> 
> I don't think we ever look at history on the `asf-site` branch, at
> least I haven't, if someone has and has a compelling use case I'm
> willing to go back on this.
> 
> I've been looking at INFRA wiki and I've found a way to not use the
> `asf-site` branch in the same git repository, it seems that we can
> push to SVN directly[1], that needs to be investigated. I'm not
> certain what drawbacks it entails.
> 
> zoran
> 
> [1] https://cwiki.apache.org/confluence/display/INFRA/Publish+a+huge+project+website+without+checking+it+into+Git
> 
> 
> 
> -- 
> Zoran Regvart


Re: git squash on the asf-site of camel-website repository

Posted by Zoran Regvart <zo...@regvart.com>.
Hi David,
And at any point we can recreate the state of the `asf-site` branch by
rebuilding and pushing the content of the `public` directory, so I
don't see history of that being important as it is a pure byproduct of
the state on the `master` branch.

I don't think we ever look at history on the `asf-site` branch, at
least I haven't, if someone has and has a compelling use case I'm
willing to go back on this.

I've been looking at INFRA wiki and I've found a way to not use the
`asf-site` branch in the same git repository, it seems that we can
push to SVN directly[1], that needs to be investigated. I'm not
certain what drawbacks it entails.

zoran

[1] https://cwiki.apache.org/confluence/display/INFRA/Publish+a+huge+project+website+without+checking+it+into+Git

On Mon, Nov 2, 2020 at 11:13 PM David Jencks <da...@gmail.com> wrote:
>
> I don’t think killing the history is a good idea at all, and I’m not sure what infra would think about it.  Perhaps you could suggest that cloning with depth 1 would be appropriate?
>
> Personally I think that having a separate camel-site repo with just the published site with all history would make even more sense.
>
> David Jencks


-- 
Zoran Regvart

Re: git squash on the asf-site of camel-website repository

Posted by David Jencks <da...@gmail.com>.
I don’t think killing the history is a good idea at all, and I’m not sure what infra would think about it.  Perhaps you could suggest that cloning with depth 1 would be appropriate?
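
A minimal sketch of the depth-1 clone, using a throwaway local
repository as the remote so it can be run anywhere. The paths and
commit messages are hypothetical; the `file://` URL is used because git
ignores `--depth` for plain local paths.

```shell
# Throwaway "remote" with three commits of history.
src=$(mktemp -d)
git init -q "$src"
git -C "$src" config user.email "ci@example.invalid"
git -C "$src" config user.name "CI"
for i in 1 2 3; do
  echo "build $i" > "$src/index.html"
  git -C "$src" add -A
  git -C "$src" commit -q -m "Website updated ($i)"
done

# Shallow clone: only the newest commit is transferred.
dst=$(mktemp -d)
git clone -q --depth 1 "file://$src" "$dst/site"
git -C "$dst/site" rev-list --count HEAD
```

Against the real repository the equivalent would be along the lines of
`git clone --depth 1 https://github.com/apache/camel-website.git`.
`--depth` implies `--single-branch`, so the history of `asf-site` is
not fetched at all.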

Personally I think that having a separate camel-site repo with just the published site with all history would make even more sense.

David Jencks

> On Nov 2, 2020, at 4:04 AM, Claus Ibsen <cl...@gmail.com> wrote:
> 
> Hi Zoran
> 
> Yeah, it's fine with me.
> 
> 
> 
> 
> -- 
> Claus Ibsen
> -----------------
> http://davsclaus.com @davsclaus
> Camel in Action 2: https://www.manning.com/ibsen2


Re: git squash on the asf-site of camel-website repository

Posted by Claus Ibsen <cl...@gmail.com>.
Hi Zoran

Yeah, it's fine with me.




-- 
Claus Ibsen
-----------------
http://davsclaus.com @davsclaus
Camel in Action 2: https://www.manning.com/ibsen2

Re: git squash on the asf-site of camel-website repository

Posted by Zoran Regvart <zo...@regvart.com>.
Hi Cameleers,
so this has been done on the latest build of the website and a fresh
clone of the website is now 569MB (178MB in .git), so about half of
the previous size.

zoran




-- 
Zoran Regvart