Posted to dev@bloodhound.apache.org by Jure Zitnik <ju...@digiverse.si> on 2013/03/13 11:47:46 UTC

[BEP-0003] Multi product repositories

Hi,

One thing that has not yet been discussed in relation to multi-products 
is VCS repositories.

The current implementation does not 'productize' repositories; all 
repositories are global. In my opinion this should be changed to have 
per-product repositories. In practice this would mean adding the 
'repository' table to the translated table list.

Olemis, on the other hand, suggested (in another thread) that all 
repositories should be global and have 'soft links' to products, which 
would allow a repository to be reused in different product contexts.

I would suggest we go with the first option (per-product repositories) 
for now and add support for 'global' repositories and 'soft links' at 
a later stage.

Any comments/suggestions on this?

Cheers,
Jure


Re: [BEP-0003] Multi product repositories

Posted by Jure Zitnik <ju...@digiverse.si>.
On 3/14/13 9:35 PM, Olemis Lang wrote:
> On 3/14/13, Jure Zitnik <ju...@digiverse.si> wrote:
>> On 3/14/13 10:13 AM, Peter Koželj wrote:
>>> Maybe there is a solution in between pure soft links and repository table
>>> translation.
>>> I would go with links which is classical many-to-many table between
>>> products and repositories.
>>> We can then translate the repository table based on the links table
>>> instead
>>> of the repository table itself.
>>>
>>> If we can pull this off on the SQL translation level we get "global
>>> repositories" with translated view in product context almost for free.
>> Looking at the trac code, we might actually get away w/o introducing new
>> tables (or changes to the translator layer) by just leveraging current
>> 'repository' table.
>>
> Well , the fact is that it might not be desired that users on a given
> product will see code in all repositories ... therefore the soft links
> to limit the scope of sharing .

Yes, that's what we're trying to achieve.

>> afaics the table is a simple attribute (name+value) table for each of
>> the repositories. So one of the possible solutions for the global
>> repositories + soft links would be to simply add another repository
>> attribute that would represent a list of products that 'soft link' the
>> repository...
> /me thinking of queries involving product repos ... what will they look like ?

So, the technical details: the 'repository' table has 3 columns, 'id', 
'name' and 'value'. 'id' is the repository key; 'name' + 'value' hold the 
repository attributes. At the moment the following attributes are used: 
'name', 'type', 'url', 'description', 'dir', 'hidden', 'alias'. 
Soft links could be implemented with another repository attribute 
named, let's say, 'product'. For each product that references a 
specific repository, a new attribute row would be added to that 
repository, with 'name' set to 'product' and 'value' set to the 
product prefix.
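
As a rough illustration of what queries involving product repos could 
look like under this scheme, here is a minimal sketch using Python's 
sqlite3 module. The three-column layout comes from the description 
above; the product prefixes ('P1', 'P2'), the repository names and the 
helper names are made up for the example. The sketch declares no 
primary key; if the real table enforces uniqueness on ('id', 'name'), 
one row per product would not fit and the 'product' value would have to 
hold a list instead.

```python
import sqlite3


def make_db():
    """Build an in-memory stand-in for the 'repository' attribute table."""
    db = sqlite3.connect(":memory:")
    # Assumption: no primary key, so several 'product' rows per id are allowed.
    db.execute("CREATE TABLE repository (id INTEGER, name TEXT, value TEXT)")
    rows = [
        (1, "name", "shared-svn"),
        (1, "type", "svn"),
        (1, "product", "P1"),   # soft link: repository 1 -> product P1
        (1, "product", "P2"),   # soft link: repository 1 -> product P2
        (2, "name", "p2-git"),
        (2, "type", "git"),
        (2, "product", "P2"),   # soft link: repository 2 -> product P2
    ]
    db.executemany("INSERT INTO repository VALUES (?, ?, ?)", rows)
    return db


def repos_for_product(db, prefix):
    """Return the names of repositories soft-linked to the given product."""
    cur = db.execute(
        "SELECT n.value FROM repository AS n "
        "JOIN repository AS p ON p.id = n.id "
        "WHERE n.name = 'name' AND p.name = 'product' AND p.value = ? "
        "ORDER BY n.value",
        (prefix,),
    )
    return [row[0] for row in cur]


db = make_db()
print(repos_for_product(db, "P1"))
print(repos_for_product(db, "P2"))
```

The self-join is the price of keeping everything in the attribute 
table: one alias picks the 'name' row, the other picks the matching 
'product' row for the same repository id.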

Our implementations of 'DbRepositoryProvider' and 'RepositoryManager' 
would use that ('product') attribute to filter visible/soft-linked 
repositories based on the environment. I would assume that custom 
implementations of those two classes would cover the Trac/BH code 
(assuming it uses those two classes to access repository information).
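
The filtering idea could be sketched roughly as follows. This is plain 
Python, not Trac or Bloodhound code: 'ProductRepositoryFilter', 
'FakeProvider' and the 'products' key are hypothetical names, and the 
wrapped provider is only assumed to yield (name, info) pairs from a 
get_repositories() method.

```python
class ProductRepositoryFilter:
    """Hypothetical wrapper that hides repositories not linked to a product."""

    def __init__(self, provider, product_prefix):
        self.provider = provider
        self.product_prefix = product_prefix

    def get_repositories(self):
        # Pass through only repositories whose (assumed) 'products'
        # attribute lists the current product prefix.
        for reponame, info in self.provider.get_repositories():
            if self.product_prefix in info.get("products", ()):
                yield reponame, info


class FakeProvider:
    """Stand-in for a global provider that knows about every repository."""

    def get_repositories(self):
        yield "shared-svn", {"type": "svn", "products": ["P1", "P2"]}
        yield "p2-git", {"type": "git", "products": ["P2"]}
        yield "unlinked", {"type": "svn"}  # no soft links at all


visible = ProductRepositoryFilter(FakeProvider(), "P1")
print([name for name, _ in visible.get_repositories()])
```

In this sketch a repository without any soft link is invisible in every 
product context; whether such repositories should instead stay visible 
everywhere is a policy choice the thread leaves open.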

The drawback of this approach is that it doesn't cover 3rd-party 
plugins that access the 'repository' table(s) directly. Such plugins, 
running within a specific product environment, would still see all 
other repositories, as their SQL queries wouldn't get translated...

> [...]
> + ... in any case
>
> PS: I'm not sure of whether we might want to track *now* fork
> relationships among repos , and web UI shortcuts to do such things .
>
I don't understand what you meant there.

Cheers,
Jure


Re: [BEP-0003] Multi product repositories

Posted by Olemis Lang <ol...@gmail.com>.
On 3/14/13, Jure Zitnik <ju...@digiverse.si> wrote:
> On 3/14/13 10:13 AM, Peter Koželj wrote:
>> Maybe there is a solution in between pure soft links and repository table
>> translation.
>> I would go with links which is classical many-to-many table between
>> products and repositories.
>> We can then translate the repository table based on the links table
>> instead
>> of the repository table itself.
>>
>> If we can pull this off on the SQL translation level we get "global
>> repositories" with translated view in product context almost for free.
>
> Looking at the trac code, we might actually get away w/o introducing new
> tables (or changes to the translator layer) by just leveraging current
> 'repository' table.
>

Well, the fact is that it might not be desirable for users of a given
product to see code in all repositories ... hence the soft links,
to limit the scope of sharing.

;)

> afaics the table is a simple attribute (name+value) table for each of
> the repositories. So one of the possible solutions for the global
> repositories + soft links would be to simply add another repository
> attribute that would represent a list of products that 'soft link' the
> repository...

/me thinking of queries involving product repos ... what will they look like ?

> no database schema/translator changes required... other
> changes would include (at least) modifying
> 'trac.versioncontrol.api.DbRepositoryProvider' and
> 'trac.versioncontrol.api.RepositoryManager' to properly 'filter'
> repositories based on the environment ...
>

+ ... in any case

PS: I'm not sure of whether we might want to track *now* fork
relationships among repos , and web UI shortcuts to do such things .

-- 
Regards,

Olemis.

Re: [BEP-0003] Multi product repositories

Posted by Jure Zitnik <ju...@digiverse.si>.
On 3/14/13 10:13 AM, Peter Koželj wrote:
> Maybe there is a solution in between pure soft links and repository table
> translation.
> I would go with links which is classical many-to-many table between
> products and repositories.
> We can then translate the repository table based on the links table instead
> of the repository table itself.
>
> If we can pull this off on the SQL translation level we get "global
> repositories" with translated view in product context almost for free.
Looking at the Trac code, we might actually get away without introducing 
new tables (or changes to the translator layer) by just leveraging the 
current 'repository' table.

afaics the table is a simple attribute (name+value) table for each of 
the repositories. So one possible solution for global repositories + 
soft links would be to simply add another repository attribute that 
represents the list of products that 'soft link' the repository... no 
database schema/translator changes required... other changes would 
include (at least) modifying 
'trac.versioncontrol.api.DbRepositoryProvider' and 
'trac.versioncontrol.api.RepositoryManager' to properly 'filter' 
repositories based on the environment ...

Cheers,
Jure


Re: [BEP-0003] Multi product repositories

Posted by Peter Koželj <pe...@digiverse.si>.
Maybe there is a solution in between pure soft links and repository table
translation.
I would go with links stored in a classic many-to-many table between
products and repositories.
We can then translate the repository table based on the links table instead
of the repository table itself.

If we can pull this off at the SQL translation level, we get "global
repositories" with a translated view in product context almost for free.
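
A minimal sketch of the links-table idea, using Python's sqlite3 
module. The 'product_repository' table, its column names and the 
sample data are assumptions made for illustration, and the hard-coded 
join only stands in for what a translator might emit when a query 
reads 'repository' in a product context.

```python
import sqlite3


def make_db():
    """Global 'repository' table plus a hypothetical many-to-many links table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE repository (id INTEGER, name TEXT, value TEXT);
        CREATE TABLE product_repository (
            product TEXT NOT NULL,
            repository_id INTEGER NOT NULL,
            PRIMARY KEY (product, repository_id)
        );
    """)
    db.executemany("INSERT INTO repository VALUES (?, ?, ?)", [
        (1, "name", "shared-svn"), (1, "type", "svn"),
        (2, "name", "p2-git"), (2, "type", "git"),
    ])
    db.executemany("INSERT INTO product_repository VALUES (?, ?)", [
        ("P1", 1), ("P2", 1), ("P2", 2),
    ])
    return db


# What "SELECT id, name, value FROM repository" might be rewritten to
# in a product context: the same rows, restricted through the links table.
TRANSLATED = (
    "SELECT r.id, r.name, r.value FROM repository AS r "
    "JOIN product_repository AS l ON l.repository_id = r.id "
    "WHERE l.product = ?"
)


def repository_rows(db, product):
    """Return the repository attribute rows visible to one product."""
    return sorted(db.execute(TRANSLATED, (product,)).fetchall())


db = make_db()
print(repository_rows(db, "P1"))
print(len(repository_rows(db, "P2")))
```

Repository data is stored once, and each product only gets a 
restricted view of it, so code that queries the table directly would be 
covered as long as its SQL passes through the translator.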

Peter

On 13 March 2013 16:43, Olemis Lang <ol...@gmail.com> wrote:

> On 3/13/13, Branko Čibej <br...@wandisco.com> wrote:
> > On 13.03.2013 11:47, Jure Zitnik wrote:
> >> Hi,
> >>
>
> :)
>
> >> one thing that has not been discussed in relation to multi-products
> >> yet are VCS repositories.
> >>
>
> they should be global , and SQL queries will not be translated
>
> >> Current implementation does not 'productize' repositories, all
> >> repositories are global.
>
> that's correct
>
> >> In my opinion this should be changed to have
> >> per product repositories. Practically this would mean adding
> >> 'repository' table to the translated table list.
> >>
>
> -1
>
> >> Olemis on the other hand suggested (in another thread) that all
> >> repositories should be global and have 'soft links' to products which
> >> would allow for repository reusal in different product contexts.
> >>
>
> I recall ...
>
> >> I would suggest we go with the first option (per-product repositories)
> >> for now and add support for 'global' repositories and 'soft links' at
> >> the later stage.
> >>
> [...]
> >
> > IIRC, a "repository" in Trac implies an index of all changes.
>
> +1
>
> > With
> > per-product repositories, you'd be duplicating all those indexes, which
> > can cause serious database size explosion.
> >
>
> ... and also an unnecessary repetition of changeset caching procedure
> wasting valuable CPU time thus seriously degrading overall performance
> for no good reason .
>
> > Consider, for example, the ASF: there is one repository with well over a
> > million revisions and ~100 "products" stored in it, so you're looking at
> > two orders of magnitude more data for repository indexing if you
> > duplicate the repos in a bloodhound instance for each project.
> >
>
> This is in fact O(p * r) ... and since r > p will be a
> frequent scenario, that is at least quadratic in p
>
> --
> Regards,
>
> Olemis.
>

Re: [BEP-0003] Multi product repositories

Posted by Olemis Lang <ol...@gmail.com>.
On 3/13/13, Branko Čibej <br...@wandisco.com> wrote:
> On 13.03.2013 16:43, Olemis Lang wrote:
>> -1
>
> Please stop throwing around "-1" like that. A veto has a very specific
> meaning at the ASF, and none of your "-1" apply. I know you don't
> necessarily mean these as vetoes, but a casual reader of our lists who
> does know what "-1" means would be concerned about how frequently these
> appear.
>

OK, it was not my intention to veto anything (I do not think I could
even do such a thing). /me was just expressing my opinion ... I didn't
see it that way.

I apologize if this caused any trouble before.

-- 
Regards,

Olemis.

Re: [BEP-0003] Multi product repositories

Posted by Branko Čibej <br...@wandisco.com>.
On 13.03.2013 16:43, Olemis Lang wrote:
> -1

Please stop throwing around "-1" like that. A veto has a very specific
meaning at the ASF, and none of your "-1" apply. I know you don't
necessarily mean these as vetoes, but a casual reader of our lists who
does know what "-1" means would be concerned about how frequently these
appear.

It's quite sufficient to explain why you object to an idea, as you do
later in the message.

Thanks,

-- Brane

-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com


Re: [BEP-0003] Multi product repositories

Posted by Olemis Lang <ol...@gmail.com>.
On 3/13/13, Branko Čibej <br...@wandisco.com> wrote:
> On 13.03.2013 11:47, Jure Zitnik wrote:
>> Hi,
>>

:)

>> one thing that has not been discussed in relation to multi-products
>> yet are VCS repositories.
>>

they should be global , and SQL queries will not be translated

>> Current implementation does not 'productize' repositories, all
>> repositories are global.

that's correct

>> In my opinion this should be changed to have
>> per product repositories. Practically this would mean adding
>> 'repository' table to the translated table list.
>>

-1

>> Olemis on the other hand suggested (in another thread) that all
>> repositories should be global and have 'soft links' to products which
>> would allow for repository reusal in different product contexts.
>>

I recall ...

>> I would suggest we go with the first option (per-product repositories)
>> for now and add support for 'global' repositories and 'soft links' at
>> the later stage.
>>
[...]
>
> IIRC, a "repository" in Trac implies an index of all changes.

+1

> With
> per-product repositories, you'd be duplicating all those indexes, which
> can cause serious database size explosion.
>

... and also an unnecessary repetition of the changeset caching
procedure, wasting valuable CPU time and thus seriously degrading
overall performance for no good reason.

> Consider, for example, the ASF: there is one repository with well over a
> million revisions and ~100 "products" stored in it, so you're looking at
> two orders of magnitude more data for repository indexing if you
> duplicate the repos in a bloodhound instance for each project.
>

This is in fact O(p * r) ... and since r > p will be a
frequent scenario, that is at least quadratic in p

-- 
Regards,

Olemis.

Re: [BEP-0003] Multi product repositories

Posted by Branko Čibej <br...@wandisco.com>.
On 13.03.2013 11:47, Jure Zitnik wrote:
> Hi,
>
> one thing that has not been discussed in relation to multi-products
> yet are VCS repositories.
>
> Current implementation does not 'productize' repositories, all
> repositories are global. In my opinion this should be changed to have
> per product repositories. Practically this would mean adding
> 'repository' table to the translated table list.
>
> Olemis on the other hand suggested (in another thread) that all
> repositories should be global and have 'soft links' to products which
> would allow for repository reusal in different product contexts.
>
> I would suggest we go with the first option (per-product repositories)
> for now and add support for 'global' repositories and 'soft links' at
> the later stage.
>
> Any comments/suggestions on this?

IIRC, a "repository" in Trac implies an index of all changes. With
per-product repositories, you'd be duplicating all those indexes, which
can cause serious database size explosion.

Consider, for example, the ASF: there is one repository with well over a
million revisions and ~100 "products" stored in it, so you're looking at
two orders of magnitude more data for repository indexing if you
duplicate the repos in a bloodhound instance for each project.
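
A back-of-the-envelope version of that estimate, as a small Python 
sketch. It assumes, as a simplification, one index row per revision and 
uses the round figures above (1,000,000 revisions, 100 products); the 
function name is made up for the example.

```python
def index_rows(revisions, products, per_product):
    """Rough count of index rows; one row per revision is a simplification."""
    return revisions * products if per_product else revisions


shared = index_rows(1_000_000, 100, per_product=False)
duplicated = index_rows(1_000_000, 100, per_product=True)
ratio = duplicated // shared

print(shared, duplicated, ratio)
```

Under these assumptions the per-product layout stores 100 times as 
many rows, which is the "two orders of magnitude" mentioned above.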

-- Brane


-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com