Posted to dev@archiva.apache.org by Martin <ma...@apache.org> on 2017/06/12 20:06:51 UTC

maven-indexer / Lucene

Hi,

The Lucene version depends on the maven indexer, but I'm not sure about the
current state of maven-indexer. Its version has not changed since sometime in 2013.

There have been commits on the master branch since then, and the Lucene version has
been changed too, but no releases were tagged.
Does it make sense to switch to maven-indexer 6.0-SNAPSHOT?

As far as I know, newer Lucene versions bring new compact index formats, but I'm
not sure whether this is relevant for the maven indexes.

Cheers

Martin

Re: maven-indexer / Lucene

Posted by Martin <ma...@apache.org>.
I think we should merge the branch into master once we have decided which way to go.

Are you referring to the intermittent failures of the store-jcr module in your answer,
or to other issues?
I saw the intermittent failures too and suspected a race condition,
because JCR Oak is very asynchronous by design.

Greetings

Martin

On Saturday, 24 June 2017, 01:50:40 CEST, Olivier Lamy wrote:
> well the issue is non compatible version of Lucene for Maven Indexer and
> Oak (well I can try push a patch to Oak for upgrading...)
> 
> On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
> > Hi
> > Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus bridge.
> > I'm working on it in the branch ( feature/jcr_oak )
> > Not sure why but I have intermittent failure with store-jcr module.
> > I definitely agree on the upgrade.
> > Well we can simply detect it's not oak compatible and schedule a full
> > reindex (maybe with a message in logs and ui?)
> > But we need to be sure we can still read central index and not sure about
> > possible lucene conflict with oak and maven indexer.
> > We can work on this branch? (I created a Jenkins job for it
> > https://builds.apache.org/view/A-D/view/Archiva/job/
> > archiva-jcr-oak-branch/)
> > If you prefer master I would say no worries neither.
> > Something else to look at is upgrading maven-core etc...
> > Anyway
> > Cheers
> > Olivier
> > 
> > On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
> >> Hi,
> >> 
> >> upgrading the maven indexer leads to some major changes.
> >> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit sticks
> >> to
> >> the old 3.x version and, as I see it, they will not move to a newer
> >> version.
> >> There is Jackrabbit Oak as alternative.
> >> I tried a proof of concept and could replace the jackrabbit
> >> implementation of
> >> metadata-store-jcr with a oak implementation. At least I got the unit
> >> tests of
> >> this module all to pass.
> >> But switching to Oak has some drawbacks:
> >> - The repository format changed and we must provide a way to migrate
> >> (either
> >> migrate the existing repository or create a new one by reindexing)
> >> - The lucene version used is newer but does not match to the version from
> >> the
> >> maven-indexer dependencies. There may come up some incompatibilities that
> >> are
> >> not solvable without using a modified version of one of the both. Or
> >> there may
> >> be the possibility to switch to solr (as separate component) and get rid
> >> of
> >> the lucene dependencies for jcr inside the archiva project.
> >> 
> >> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
> >> - The Plexus-Sisu-Bridge does not work as before.
> >> - We must migrate from the NexusIndexer to the indexer API.
> >> 
> >> So switching to the new indexer and oak means more work as expected and
> >> some
> >> risks regarding new incompatibility problems. And I think this cannot be
> >> done
> >> without broken master builds for some time period.
> >> 
> >> So, what should we do? I think maven indexer is one of the core
> >> components of
> >> archiva, and we should utilize the 3.x-version to  migrate to the new
> >> indexer
> >> version, even if this means switching to jcr oak. Otherwise it would mean
> >> to
> >> stick to the old version for the next years.
> >> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I hope
> >> you
> >> can provide  useful help.
> >> 
> >> I committed the PoC to the branch feature/jcr_oak. There are some modules
> >> where the tests do not pass (mainly because of the indexer API changes).
> >> 
> >> Any comments?
> >> 
> >> Cheers
> >> 
> >> Martin
> >> 
> >> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> >> > forget it but we need to ensure we can read maven index files....
> >> > 
> >> > On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
> >> > > Hi,
> >> > > Remember jackrabbit depends on Lucene as well so upgrading Lucene can be a
> >> > > problem here.
> >> > > Regarding maven-indexer yes we can depend on a snapshot until the release.
> >> > > I can release it ;-)
> >> > > 
> >> > > On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
> >> > >> Hi,
> >> > >> 
> >> > >> the lucene version depends on the maven indexer. But I'm not sure about
> >> > >> the
> >> > >> current state of maven-indexer. The version has not changed since some
> >> > >> 2013.
> >> > >> 
> >> > >> There are commits on the master branch since then, and the lucene version
> >> > >> has
> >> > >> been changed too, but no releases were tagged.
> >> > >> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
> >> > >> 
> >> > >> As I know there are new compact index formats with new lucene versions
> >> > >> but I'm
> >> > >> not sure if this is relevant for the maven indexes.
> >> > >> 
> >> > >> Cheers
> >> > >> 
> >> > >> Martin
> >> > > 
> >> > > --
> >> > > Olivier Lamy
> >> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
> > 
> > --
> > Olivier Lamy
> > http://twitter.com/olamy | http://linkedin.com/in/olamy



Re: maven-indexer / Lucene

Posted by Martin <ma...@apache.org>.
Hi,

I got it running on my local machine now (I had to fight some issues with old packages in my
local mvn repository).
So the shaded Lucene is now in maven-indexer master, if I see it correctly.

We have a dependency problem with the Guava version: the Selenium tests need Guava 22.0,
but JCR Oak runs only with Guava 15.0.
Currently I have a (poor) workaround: setting version 22.0 for the webtests in test scope. That
should work because the webtest module is not included in the normal build.
But I would prefer to change to the newer version for the whole project. I will try to find
out what we can do about it.

Greetings

Martin

On Saturday, 19 August 2017, 13:42:03 CEST, Olivier Lamy wrote:
> Hi
> So I have merged to master :-)
> 
> On 18 August 2017 at 01:22, Martin Stockhammer <ma...@apache.org> wrote:
> 
> > Hi Olivier,
> >
> > great! I will look at it. I will give you feedback the next days.
> > And yes I have to optimize the jcr oak part and stabilize it. I will work
> > on it.
> >
> > Greetings
> >
> > Martin
> >
> >
> >
> >
> > On 15 August 2017 at 11:30:04 CEST, Olivier Lamy <ol...@apache.org> wrote:
> > >Hi
> > >Took a bit of time but I finally get the branch working :-)
> > >branch: feature/jcr_oak
> > >Let me know what do you think of?
> > >Well I guess there are still some optimisations to do for jcr oak
> > >I can see some logs:
> > >21:02:39.559 [1071] [main] WARN  oak.query.QueryImpl - Traversal query (query without index): SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /* oak-internal */; consider creating an index
> > >21:02:39.563 [328] [main] WARN  plugins.index.Cursors$TraversingCursor - Traversed 1000 nodes with filter Filter(query=SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /* oak-internal */, path=*, property=[jcr:uuid=[21232f29-7a57-35a7-8389-4a0e4a801fc3]]); consider creating an index or changing the query

Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
Hi
So I have merged to master :-)

On 18 August 2017 at 01:22, Martin Stockhammer <ma...@apache.org> wrote:

> Hi Olivier,
>
> great! I will look at it. I will give you feedback the next days.
> And yes I have to optimize the jcr oak part and stabilize it. I will work
> on it.
>
> Greetings
>
> Martin
>
>
>
>
> On 15 August 2017 at 11:30:04 CEST, Olivier Lamy <ol...@apache.org> wrote:
> >Hi
> >Took a bit of time but I finally get the branch working :-)
> >branch: feature/jcr_oak
> >Let me know what do you think of?
> >Well I guess there are still some optimisations to do for jcr oak
> >I can see some logs:
> >21:02:39.559 [1071] [main] WARN  oak.query.QueryImpl - Traversal query (query without index): SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /* oak-internal */; consider creating an index
> >21:02:39.563 [328] [main] WARN  plugins.index.Cursors$TraversingCursor - Traversed 1000 nodes with filter Filter(query=SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /* oak-internal */, path=*, property=[jcr:uuid=[21232f29-7a57-35a7-8389-4a0e4a801fc3]]); consider creating an index or changing the query
> >
> >
> >
> >
> >
> >On 8 July 2017 at 06:22, Martin <ma...@apache.org> wrote:
> >
> >> Hi Olivier,
> >>
> >> great!
> >> For my understanding: The dependency to lucene in the pom of
> >indexer-core
> >> is
> >> still there, but the lucene packages are moved to the
> >> ...maven.index.shaded...
> >> package? You develop indexer-core with the standard lucene packages
> >and the
> >> shading is executed during the build of the indexer package?
> >>
> >> I think that may solve our dependency problem.
> >>
> >> I still got errors in the maven-indexer module, but I think the
> >status is
> >> still "work in progress". I don't want to interfere too much with
> >your
> >> changes.
> >>
> >> I'm not sure, if we should keep the JCR Oak as metadata
> >implementation. I
> >> think OrientDB may be a feasible alternative: Embeddable,  Graph
> >database,
> >> Lucene index optional and may be omitted, Apache License. And with
> >JCR Oak
> >> we
> >> also have to convert the existing metadata index.
> >>
> >> But one step after the other. If we agree that the shaded indexer
> >works, we
> >> should merge only the maven indexer changes to the master branch
> >without
> >> the
> >> JCR/lucene update and change the JCR and or lucene afterwards.
> >>
> >> Greetings
> >>
> >> Martin
> >>
> >> On Friday, 7 July 2017, 09:23:24 CEST, Olivier Lamy wrote:
> >> > So the repo contains a branch feature/jar_shaded_lucene here
> >> >
> >https://git1-us-west.apache.org/repos/asf?p=maven-indexer.git;a=summary
> >> > and I pushed what I started for Archiva in the branch called
> >> feature/jcr_oak
> >> > So in order to test it you need to build first maven-indexer from
> >the
> >> > branch feature/jar_shaded_lucene
> >> >
> >> > On 6 July 2017 at 22:31, Olivier Lamy <ol...@apache.org> wrote:
> >> > > I will try to share the work I did tomorrow in a branch
> >> > >
> >> > > On Thu, 6 Jul 2017 at 7:48 pm, Martin Stockhammer
> ><martin_s@apache.org
> >> >
> >> > >
> >> > > wrote:
> >> > >> We have different lucene (incompatible) dependencies that
> >prevents us
> >> to
> >> > >> update the maven indexer and/or jackrabbit. And this will happen
> >again
> >> > >> with
> >> > >> each upgrade from one of these two packages in the future.
> >> > >> So would be really good if we can find a solution that removes
> >one of
> >> the
> >> > >> lucene dependencies.
> >> > >>
> >> > >> Greetings
> >> > >>
> >> > >> Martin
> >> > >>
> >> > >>
> >> > >> On 6 July 2017 at 09:36:06 CEST, Chris Graham <chrisgwarp@gmail.com> wrote:
> >> > >>
> >> > >> >Can I please an obvious/stupid question?
> >> > >> >
> >> > >> >What is driving this need for change?
> >> > >> >
> >> > >> >From a quick read of the thread above, all of the options
> >appear to
> >> > >> >introduce a lot of breaking changes, and a whole lot more
> >> uncertainty.
> >> > >> >
> >> > >> >So, what is so broken that it is driving these changes?
> >> > >> >
> >> > >> >Sent from my iPhone
> >> > >> >
> >> > >> >> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <ol...@apache.org>
> >wrote:
> >> > >> >>
> >> > >> >> Yup.
> >> > >> >> The idea is to have an extra jar produced by the
> >maven-indexer with
> >> > >> >
> >> > >> >shaded
> >> > >> >
> >> > >> >> lucene version.
> >> > >> >> So the lucene classes (version used by Maven indexer) will be
> >> > >> >
> >> > >> >relocated in
> >> > >> >
> >> > >> >> a package called org.apache.maven.index.shaded.lucene (such
> >> > >> >> org.apache.maven.index.shaded.lucene.search.BooleanClause )
> >> > >> >> Then you exclude lucene dependencies used by maven indexer
> >and
> >> voila.
> >> > >> >> The voila is a bit optimistic and not so ezy but anyway
> >working on
> >> it
> >> > >> >
> >> > >> >ATM.
> >> > >> >
> >> > >> >>> On 6 July 2017 at 07:08, Martin <ma...@apache.org> wrote:
> >> > >> >>>
> >> > >> >>> What do you mean exactly by shading? Moving to another
> >package
> >> name?
> >> > >> >>>
> >> > >> >>> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
> >> > >> >>>> maybe an option is to use some shading?
> >> > >> >>>> I'm thinking of shading lucene packages used by maven
> >indexer. I
> >> > >> >
> >> > >> >can
> >> > >> >
> >> > >> >>> easily
> >> > >> >>>
> >> > >> >>>> provide a build for that.
> >> > >> >>>> WDYT?
> >> > >> >>>>
> >> > >> >>>>> On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org>
> >> wrote:
> >> > >> >>>>> Hi
> >> > >> >>>>> graph/document storage could be convenient (but not
> >possible
> >> with
> >> > >> >>>
> >> > >> >>> neo4j as
> >> > >> >>>
> >> > >> >>>>> it's GPL license [1])
> >> > >> >>>>> well we can add solr as an additional webapp with our
> >jetty
> >> > >> >>>
> >> > >> >>> distribution
> >> > >> >>>
> >> > >> >>>>> but this will be a pain for users who want to use tomcat
> >or any
> >> > >> >
> >> > >> >other
> >> > >> >
> >> > >> >>>>> servlet container...
> >> > >> >>>>> we still need to investigate a new storage model :-)
> >> > >> >>>>>
> >> > >> >>>>> Olivier
> >> > >> >>>>> [1] https://neo4j.com/licensing/
> >> > >> >>>>>
> >> > >> >>>>>> On 25 June 2017 at 06:26, Martin <ma...@apache.org>
> >wrote:
> >> > >> >>>>>> Yes, you are right. The lucene dependency causes a lot of
> >> trouble
> >> > >> >
> >> > >> >and
> >> > >> >
> >> > >> >>>>>> will
> >> > >> >>>>>> cause headaches with each version change of one of the
> >> > >> >
> >> > >> >dependencies.
> >> > >> >
> >> > >> >>>>>> What are the requirements for a replacement?
> >> > >> >>>>>> - We want to store hierarchical data?
> >> > >> >>>>>> - We want to store metadata for nodes ?
> >> > >> >>>>>> - Fulltext search (only metadata or for artifacts too?)
> >> > >> >>>>>> - Blob / Artifact storage (I don't think so, but not so
> >> familiar
> >> > >> >
> >> > >> >with
> >> > >> >
> >> > >> >>> the
> >> > >> >>>
> >> > >> >>>>>> archiva artifact model)?
> >> > >> >>>>>>
> >> > >> >>>>>> Maybe some graph database may be an alternative. Don't
> >know if
> >> > >> >
> >> > >> >the
> >> > >> >
> >> > >> >>>>>> license of
> >> > >> >>>>>> neo4j is compatible to the apache license, and I think it
> >> brings
> >> > >> >>>
> >> > >> >>> lucene
> >> > >> >>>
> >> > >> >>>>>> as
> >> > >> >>>>>> dependency too. I will have a look.
> >> > >> >>>>>> Problem is, if there is fulltext search needed, I think,
> >for
> >> most
> >> > >> >
> >> > >> >of
> >> > >> >
> >> > >> >>> the
> >> > >> >>>
> >> > >> >>>>>> frameworks we get a lucene dependency, if it's embedded.
> >> > >> >>>>>>
> >> > >> >>>>>> Other alternatives:
> >> > >> >>>>>> - Implement fulltext search by our own (index of the
> >metadata
> >> > >> >
> >> > >> >stored
> >> > >> >
> >> > >> >>> via
> >> > >> >>>
> >> > >> >>>>>> the
> >> > >> >>>>>> archiva api) and use the lucene dependency that comes
> >from the
> >> > >> >>>>>> maven-indexer
> >> > >> >>>>>> - Jcr Oak with Solr. Solr is not embedded, must run as
> >its own
> >> > >> >>>>>> application
> >> > >> >>>>>> (war).
> >> > >> >>>>>>
> >> > >> >>>>>> Greetings
> >> > >> >>>>>>
> >> > >> >>>>>> Martin
> >> > >> >>>>>>
> >> > >> >>>>>> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> >> > >> >>>>>>> well this gonna be a pain.
> >> > >> >>>>>>> IMHO we need to find a new alternative to jcr oak.
> >> > >> >>>>>>> And something not using Lucene as it's a real pain to
> >have
> >> > >> >
> >> > >> >different
> >> > >> >
> >> > >> >>>>>>> librairies using lucene as they do not update in the
> >same time
> >> > >> >
> >> > >> >(and
> >> > >> >
> >> > >> >>>>>> Lucene
> >> > >> >>>>>>
> >> > >> >>>>>>> break backward compat so quickly...)
> >> > >> >>>>>>> Any ideas? I'd like to have something embedded (but with
> >a
> >> > >> >
> >> > >> >possible
> >> > >> >
> >> > >> >>>>>>> external server configuration).
> >> > >> >>>>>>> There is currently a Cassandra implementation. I was not
> >> > >> >
> >> > >> >satisfied
> >> > >> >
> >> > >> >>>>>>> about
> >> > >> >>>>>>> performance but I guess I did that 4yo ago so can be
> >improved
> >> > >> >
> >> > >> >for
> >> > >> >
> >> > >> >>> sure
> >> > >> >>>
> >> > >> >>>>>> :-)
> >> > >> >>>>>> :
> >> > >> >>>>>>> Maybe orientdb?
> >> > >> >>>>>>> What else?
> >> > >> >>>>>>>
> >> > >> >>>>>>>> On 24 June 2017 at 09:50, Olivier Lamy
> ><ol...@apache.org>
> >> > >> >
> >> > >> >wrote:
> >> > >> >>>>>>>> well the issue is non compatible version of Lucene for
> >Maven
> >> > >> >>>
> >> > >> >>> Indexer
> >> > >> >>>
> >> > >> >>>>>> and
> >> > >> >>>>>>
> >> > >> >>>>>>>> Oak (well I can try push a patch to Oak for
> >upgrading...)
> >> > >> >>>>>>>>
> >> > >> >>>>>>>>> On 24 June 2017 at 08:41, Olivier Lamy
> ><ol...@apache.org>
> >> > >> >
> >> > >> >wrote:
> >> > >> >>>>>>>>> Hi
> >> > >> >>>>>>>>> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus
> >> bridge.
> >> > >> >>>>>>>>> I'm working on it in the branch ( feature/jcr_oak )
> >> > >> >>>>>>>>> Not sure why but I have intermittent failure with
> >store-jcr
> >> > >> >>>
> >> > >> >>> module.
> >> > >> >>>
> >> > >> >>>>>>>>> I definitely agree on the upgrade.
> >> > >> >>>>>>>>> Well we can simply detect it's not oak compatible and
> >> schedule
> >> > >> >
> >> > >> >a
> >> > >> >
> >> > >> >>>>>>>>> full
> >> > >> >>>>>>>>> reindex (maybe with a message in logs and ui?)
> >> > >> >>>>>>>>> But we need to be sure we can still read central index
> >and
> >> not
> >> > >> >>>
> >> > >> >>> sure
> >> > >> >>>
> >> > >> >>>>>> about
> >> > >> >>>>>>
> >> > >> >>>>>>>>> possible lucene conflict with oak and maven indexer.
> >> > >> >>>>>>>>> We can work on this branch? (I created a Jenkins job
> >for it
> >> > >> >>>>>>>>>
> >https://builds.apache.org/view/A-D/view/Archiva/job/archi
> >> > >> >>>>>>>>> va-jcr-oak-branch/)
> >> > >> >>>>>>>>> If you prefer master I would say no worries neither.
> >> > >> >>>>>>>>> Something else to look at is upgrading maven-core
> >etc...
> >> > >> >>>>>>>>> Anyway
> >> > >> >>>>>>>>> Cheers
> >> > >> >>>>>>>>> Olivier
> >> > >> >>>>>>>>>
> >> > >> >>>>>>>>>> On 22 June 2017 at 19:16, Martin
> ><ma...@apache.org>
> >> wrote:
> >> > >> >>>>>>>>>> Hi,
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> upgrading the maven indexer leads to some major
> >changes.
> >> > >> >>>>>>>>>> Lucene is used by maven-indexer and also by
> >jackrabbit.
> >> > >> >>>
> >> > >> >>> Jackrabbit
> >> > >> >>>
> >> > >> >>>>>>>>>> sticks to
> >> > >> >>>>>>>>>> the old 3.x version and, as I see it, they will not
> >move
> >> to a
> >> > >> >>>
> >> > >> >>> newer
> >> > >> >>>
> >> > >> >>>>>>>>>> version.
> >> > >> >>>>>>>>>> There is Jackrabbit Oak as alternative.
> >> > >> >>>>>>>>>> I tried a proof of concept and could replace the
> >jackrabbit
> >> > >> >>>>>>>>>> implementation of
> >> > >> >>>>>>>>>> metadata-store-jcr with a oak implementation. At
> >least I
> >> got
> >> > >> >
> >> > >> >the
> >> > >> >
> >> > >> >>>>>> unit
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> tests of
> >> > >> >>>>>>>>>> this module all to pass.
> >> > >> >>>>>>>>>> But switching to Oak has some drawbacks:
> >> > >> >>>>>>>>>> - The repository format changed and we must provide a
> >way
> >> to
> >> > >> >>>>>>>>>> migrate
> >> > >> >>>>>>>>>> (either
> >> > >> >>>>>>>>>> migrate the existing repository or create a new one
> >by
> >> > >> >>>
> >> > >> >>> reindexing)
> >> > >> >>>
> >> > >> >>>>>>>>>> - The lucene version used is newer but does not match
> >to
> >> the
> >> > >> >>>>>>>>>> version
> >> > >> >>>>>>>>>> from the
> >> > >> >>>>>>>>>> maven-indexer dependencies. There may come up some
> >> > >> >>>>>>>>>> incompatibilities
> >> > >> >>>>>>>>>> that are
> >> > >> >>>>>>>>>> not solvable without using a modified version of one
> >of the
> >> > >> >>>
> >> > >> >>> both.
> >> > >> >>>
> >> > >> >>>>>>>>>> Or
> >> > >> >>>>>>>>>> there may
> >> > >> >>>>>>>>>> be the possibility to switch to solr (as separate
> >> component)
> >> > >> >
> >> > >> >and
> >> > >> >
> >> > >> >>>>>> get rid
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> of
> >> > >> >>>>>>>>>> the lucene dependencies for jcr inside the archiva
> >project.
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> Switching to maven-indexer 6.0-SNAPSHOT means some
> >changes
> >> > >> >
> >> > >> >too:
> >> > >> >>>>>>>>>> - The Plexus-Sisu-Bridge does not work as before.
> >> > >> >>>>>>>>>> - We must migrate from the NexusIndexer to the
> >indexer API.
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> So switching to the new indexer and oak means more
> >work as
> >> > >> >>>
> >> > >> >>> expected
> >> > >> >>>
> >> > >> >>>>>> and
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> some
> >> > >> >>>>>>>>>> risks regarding new incompatibility problems. And I
> >think
> >> > >> >
> >> > >> >this
> >> > >> >
> >> > >> >>>>>> cannot be
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> done
> >> > >> >>>>>>>>>> without broken master builds for some time period.
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> So, what should we do? I think maven indexer is one
> >of the
> >> > >> >
> >> > >> >core
> >> > >> >
> >> > >> >>>>>>>>>> components of
> >> > >> >>>>>>>>>> archiva, and we should utilize the 3.x-version to
> >migrate
> >> to
> >> > >> >>>
> >> > >> >>> the
> >> > >> >>>
> >> > >> >>>>>> new
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> indexer
> >> > >> >>>>>>>>>> version, even if this means switching to jcr oak.
> >Otherwise
> >> > >> >
> >> > >> >it
> >> > >> >
> >> > >> >>>>>>>>>> would
> >> > >> >>>>>>>>>> mean to
> >> > >> >>>>>>>>>> stick to the old version for the next years.
> >> > >> >>>>>>>>>> @Olivier, regarding the maven-indexer / sisu-Bridge
> >API
> >> > >> >>>
> >> > >> >>> changes, I
> >> > >> >>>
> >> > >> >>>>>> hope
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> you
> >> > >> >>>>>>>>>> can provide  useful help.
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> I committed the PoC to the branch feature/jcr_oak.
> >There
> >> are
> >> > >> >>>
> >> > >> >>> some
> >> > >> >>>
> >> > >> >>>>>>>>>> modules
> >> > >> >>>>>>>>>> where the tests do not pass (mainly because of the
> >indexer
> >> > >> >
> >> > >> >API
> >> > >> >
> >> > >> >>>>>> changes).
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> Any comments?
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> Cheers
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> Martin
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> >> > >> >>>>>>>>>>> forget it but we need to ensure we can read maven
> >index
> >> > >> >>>
> >> > >> >>> files....
> >> > >> >>>
> >> > >> >>>>>>>>>>> On 13 June 2017 at 17:06, Olivier Lamy
> ><ol...@apache.org>
> >> > >> >>>
> >> > >> >>> wrote:
> >> > >> >>>>>>>>>>>> Hi,
> >> > >> >>>>>>>>>>>> Remember jackrabbit depends on Lucene as well so
> >> upgrading
> >> > >> >>>>>>
> >> > >> >>>>>> Lucene
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> can be a
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>> problem here.
> >> > >> >>>>>>>>>>>> Regarding maven-indexer yes we can depend on a
> >snapshot
> >> > >> >>>
> >> > >> >>> until
> >> > >> >>>
> >> > >> >>>>>> the
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> release.
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>> I can release it ;-)
> >> > >> >>>>>>>>>>>>
> >> > >> >>>>>>>>>>>> On 13 June 2017 at 06:06, Martin
> ><ma...@apache.org>
> >> > >> >>>
> >> > >> >>> wrote:
> >> > >> >>>>>>>>>>>>> Hi,
> >> > >> >>>>>>>>>>>>>
> >> > >> >>>>>>>>>>>>> the lucene version depends on the maven indexer.
> >But I'm
> >> > >> >>>
> >> > >> >>> not
> >> > >> >>>
> >> > >> >>>>>> sure
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> about
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>>> the
> >> > >> >>>>>>>>>>>>> current state of maven-indexer. The version has
> >not
> >> > >> >
> >> > >> >changed
> >> > >> >
> >> > >> >>>>>> since
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> some
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>>> 2013.
> >> > >> >>>>>>>>>>>>>
> >> > >> >>>>>>>>>>>>> There are commits on the master branch since then,
> >and
> >> the
> >> > >> >>>>>>
> >> > >> >>>>>> lucene
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> version
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>>> has
> >> > >> >>>>>>>>>>>>> been changed too, but no releases were tagged.
> >> > >> >>>>>>>>>>>>> Does it make sense to switch to the maven-indexer
> >> > >> >>>>>>>>>>>>> 6.0-SNAPSHOT?
> >> > >> >>>>>>>>>>>>>
> >> > >> >>>>>>>>>>>>> As I know there are new compact index formats with
> >new
> >> > >> >>>
> >> > >> >>> lucene
> >> > >> >>>
> >> > >> >>>>>>>>>> versions
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>>> but I'm
> >> > >> >>>>>>>>>>>>> not sure if this is relevant for the maven
> >indexes.
> >> > >> >>>>>>>>>>>>>
> >> > >> >>>>>>>>>>>>> Cheers
> >> > >> >>>>>>>>>>>>>
> >> > >> >>>>>>>>>>>>> Martin
> >> > >> >>>>>>>>>>>>
> >> > >> >>>>>>>>>>>> --
> >> > >> >>>>>>>>>>>> Olivier Lamy
> >> > >> >>>>>>>>>>>> http://twitter.com/olamy |
> >http://linkedin.com/in/olamy
> >> > >> >>>>>>>>>
> >> > >> >>>>>>>>> --
> >> > >> >>>>>>>>> Olivier Lamy
> >> > >> >>>>>>>>> http://twitter.com/olamy |
> >http://linkedin.com/in/olamy
> >> > >> >>>>>>>>
> >> > >> >>>>>>>> --
> >> > >> >>>>>>>> Olivier Lamy
> >> > >> >>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> > >> >>>>>
> >> > >> >>>>> --
> >> > >> >>>>> Olivier Lamy
> >> > >> >>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> > >> >>
> >> > >> >> --
> >> > >> >> Olivier Lamy
> >> > >> >> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> > >>
> >> > >> --
> >> > >> This message was sent from my Android device with K-9 Mail.
> >> > >
> >> > > --
> >> > > Olivier Lamy
> >> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
> >>
> >>
> >>
> >
> >
> >--
> >Olivier Lamy
> >http://twitter.com/olamy | http://linkedin.com/in/olamy
>
> --
> This message was sent from my Android device with K-9 Mail.
>



-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy
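
Editorial sketch (not part of the original messages): the shading idea quoted
above relocates the Lucene classes used by maven-indexer into the package
org.apache.maven.index.shaded.lucene, so that they no longer clash with the
Lucene version pulled in by Jackrabbit or Oak. The snippet below only
illustrates the name mapping that such a relocation implies. The class
ShadedNames and the method relocate are hypothetical names for this example;
the real relocation is done by the build, not by code like this.

```java
public class ShadedNames {

    // Root package of the unshaded Lucene classes.
    static final String LUCENE_PREFIX = "org.apache.lucene.";

    // Relocation target named in the thread.
    static final String SHADED_PREFIX = "org.apache.maven.index.shaded.lucene.";

    /**
     * Returns the relocated name for a Lucene class, or the name unchanged
     * when the class is not under the Lucene root package.
     */
    static String relocate(String className) {
        if (className.startsWith(LUCENE_PREFIX)) {
            return SHADED_PREFIX + className.substring(LUCENE_PREFIX.length());
        }
        return className;
    }

    public static void main(String[] args) {
        // A Lucene class moves into the shaded package.
        System.out.println(relocate("org.apache.lucene.search.BooleanClause"));
        // Classes outside the Lucene package are left alone.
        System.out.println(relocate("org.apache.jackrabbit.oak.Oak"));
    }
}
```

Because the shaded copy lives under a different package name, both Lucene
versions can sit on the same classpath; code that talks to the indexer then has
to import the relocated names instead of the original ones.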

Re: maven-indexer / Lucene

Posted by Martin Stockhammer <ma...@apache.org>.
Hi Olivier,

great! I will look at it and give you feedback in the next few days.
And yes, I have to optimize and stabilize the JCR Oak part. I will work on it.

Greetings

Martin




On 15 August 2017 at 11:30:04 CEST, Olivier Lamy <ol...@apache.org> wrote:
>Hi
>Took a bit of time but I finally get the branch working :-)
>branch: feature/jcr_oak
>Let me know what do you think of?
>Well I guess there are still some optimisations to do for jcr oak
>I can see some logs:
>21:02:39.559 [1071] [main] WARN  oak.query.QueryImpl - Traversal query (query without index): SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /* oak-internal */; consider creating an index
>21:02:39.563 [328] [main] WARN  plugins.index.Cursors$TraversingCursor - Traversed 1000 nodes with filter Filter(query=SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /* oak-internal */, path=*, property=[jcr:uuid=[21232f29-7a57-35a7-8389-4a0e4a801fc3]]); consider creating an index or changing the query
>
>
>
>
>
>On 8 July 2017 at 06:22, Martin <ma...@apache.org> wrote:
>
>> Hi Olivier,
>>
>> great!
>> For my understanding: The dependency to lucene in the pom of
>indexer-core
>> is
>> still there, but the lucene packages are moved to the
>> ...maven.index.shaded...
>> package? You develop indexer-core with the standard lucene packages
>and the
>> shading is executed during the build of the indexer package?
>>
>> I think that may solve our dependency problem.
>>
>> I still got errors in the maven-indexer module, but I think the
>status is
>> still "work in progress". I don't want to interfere too much with
>your
>> changes.
>>
>> I'm not sure, if we should keep the JCR Oak as metadata
>implementation. I
>> think OrientDB may be a feasible alternative: Embeddable,  Graph
>database,
>> Lucene index optional and may be omitted, Apache License. And with
>JCR Oak
>> we
>> also have to convert the existing metadata index.
>>
>> But one step after the other. If we agree that the shaded indexer
>works, we
>> should merge only the maven indexer changes to the master branch
>without
>> the
>> JCR/lucene update and change the JCR and or lucene afterwards.
>>
>> Greetings
>>
>> Martin
>>
>> On Friday, 7 July 2017, 09:23:24 CEST, Olivier Lamy wrote:
>> > So the repo contains a branch feature/jar_shaded_lucene here
>> >
>https://git1-us-west.apache.org/repos/asf?p=maven-indexer.git;a=summary
>> > and I pushed what I started for Archiva in the branch called
>> feature/jcr_oak
>> > So in order to test it you need to build first maven-indexer from
>the
>> > branch feature/jar_shaded_lucene
>> >
>> > On 6 July 2017 at 22:31, Olivier Lamy <ol...@apache.org> wrote:
>> > > I will try to share the work I did tomorrow in a branch
>> > >
>> > > On Thu, 6 Jul 2017 at 7:48 pm, Martin Stockhammer
><martin_s@apache.org
>> >
>> > >
>> > > wrote:
>> > >> We have different lucene (incompatible) dependencies that
>prevents us
>> to
>> > >> update the maven indexer and/or jackrabbit. And this will happen
>again
>> > >> with
>> > >> each upgrade from one of these two packages in the future.
>> > >> So would be really good if we can find a solution that removes
>one of
>> the
>> > >> lucene dependencies.
>> > >>
>> > >> Greetings
>> > >>
>> > >> Martin
>> > >>
>> > >>
>> > >> On 6 July 2017 at 09:36:06 CEST, Chris Graham <chrisgwarp@gmail.com> wrote:
>> > >>
>> > >> >Can I please an obvious/stupid question?
>> > >> >
>> > >> >What is driving this need for change?
>> > >> >
>> > >> >From a quick read of the thread above, all of the options
>appear to
>> > >> >introduce a lot of breaking changes, and a whole lot more
>> uncertainty.
>> > >> >
>> > >> >So, what is so broken that it is driving these changes?
>> > >> >
>> > >> >Sent from my iPhone
>> > >> >
>> > >> >> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <ol...@apache.org>
>wrote:
>> > >> >>
>> > >> >> Yup.
>> > >> >> The idea is to have an extra jar produced by the
>maven-indexer with
>> > >> >
>> > >> >shaded
>> > >> >
>> > >> >> lucene version.
>> > >> >> So the lucene classes (version used by Maven indexer) will be
>> > >> >
>> > >> >relocated in
>> > >> >
>> > >> >> a package called org.apache.maven.index.shaded.lucene (such
>> > >> >> org.apache.maven.index.shaded.lucene.search.BooleanClause )
>> > >> >> Then you exclude lucene dependencies used by maven indexer
>and
>> voila.
>> > >> >> The voila is a bit optimistic and not so ezy but anyway
>working on
>> it
>> > >> >
>> > >> >ATM.
>> > >> >
>> > >> >>> On 6 July 2017 at 07:08, Martin <ma...@apache.org> wrote:
>> > >> >>>
>> > >> >>> What do you mean exactly by shading? Moving to another
>package
>> name?
>> > >> >>>
>> > >> >>> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
>> > >> >>>> maybe an option is to use some shading?
>> > >> >>>> I'm thinking of shading lucene packages used by maven
>indexer. I
>> > >> >
>> > >> >can
>> > >> >
>> > >> >>> easily
>> > >> >>>
>> > >> >>>> provide a build for that.
>> > >> >>>> WDYT?
>> > >> >>>>
>> > >> >>>>> On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org>
>> wrote:
>> > >> >>>>> Hi
>> > >> >>>>> graph/document storage could be convenient (but not
>possible
>> with
>> > >> >>>
>> > >> >>> neo4j as
>> > >> >>>
>> > >> >>>>> it's GPL license [1])
>> > >> >>>>> well we can add solr as an additional webapp with our
>jetty
>> > >> >>>
>> > >> >>> distribution
>> > >> >>>
>> > >> >>>>> but this will be a pain for users who want to use tomcat
>or any
>> > >> >
>> > >> >other
>> > >> >
>> > >> >>>>> servlet container...
>> > >> >>>>> we still need to investigate a new storage model :-)
>> > >> >>>>>
>> > >> >>>>> Olivier
>> > >> >>>>> [1] https://neo4j.com/licensing/
>> > >> >>>>>
>> > >> >>>>>> On 25 June 2017 at 06:26, Martin <ma...@apache.org>
>wrote:
>> > >> >>>>>> Yes, you are right. The lucene dependency causes a lot of
>> trouble
>> > >> >
>> > >> >and
>> > >> >
>> > >> >>>>>> will
>> > >> >>>>>> cause headaches with each version change of one of the
>> > >> >
>> > >> >dependencies.
>> > >> >
>> > >> >>>>>> What are the requirements for a replacement?
>> > >> >>>>>> - We want to store hierarchical data?
>> > >> >>>>>> - We want to store metadata for nodes ?
>> > >> >>>>>> - Fulltext search (only metadata or for artifacts too?)
>> > >> >>>>>> - Blob / Artifact storage (I don't think so, but not so
>> familiar
>> > >> >
>> > >> >with
>> > >> >
>> > >> >>> the
>> > >> >>>
>> > >> >>>>>> archiva artifact model)?
>> > >> >>>>>>
>> > >> >>>>>> Maybe some graph database may be an alternative. Don't
>know if
>> > >> >
>> > >> >the
>> > >> >
>> > >> >>>>>> license of
>> > >> >>>>>> neo4j is compatible to the apache license, and I think it
>> brings
>> > >> >>>
>> > >> >>> lucene
>> > >> >>>
>> > >> >>>>>> as
>> > >> >>>>>> dependency too. I will have a look.
>> > >> >>>>>> Problem is, if there is fulltext search needed, I think,
>for
>> most
>> > >> >
>> > >> >of
>> > >> >
>> > >> >>> the
>> > >> >>>
>> > >> >>>>>> frameworks we get a lucene dependency, if it's embedded.
>> > >> >>>>>>
>> > >> >>>>>> Other alternatives:
>> > >> >>>>>> - Implement fulltext search by our own (index of the
>metadata
>> > >> >
>> > >> >stored
>> > >> >
>> > >> >>> via
>> > >> >>>
>> > >> >>>>>> the
>> > >> >>>>>> archiva api) and use the lucene dependency that comes
>from the
>> > >> >>>>>> maven-indexer
>> > >> >>>>>> - Jcr Oak with Solr. Solr is not embedded, must run as
>its own
>> > >> >>>>>> application
>> > >> >>>>>> (war).
>> > >> >>>>>>
>> > >> >>>>>> Greetings
>> > >> >>>>>>
>> > >> >>>>>> Martin
>> > >> >>>>>>
>> > >> >>>>>> Am Samstag, 24. Juni 2017, 14:05:26 CEST schrieb Olivier
>Lamy:
>> > >> >>>>>>> well this gonna be a pain.
>> > >> >>>>>>> IMHO we need to find a new alternative to jcr oak.
>> > >> >>>>>>> And something not using Lucene as it's a real pain to
>have
>> > >> >
>> > >> >different
>> > >> >
>> > >> >>>>>>> librairies using lucene as they do not update in the
>same time
>> > >> >
>> > >> >(and
>> > >> >
>> > >> >>>>>> Lucene
>> > >> >>>>>>
>> > >> >>>>>>> break backward compat so quickly...)
>> > >> >>>>>>> Any ideas? I'd like to have something embedded (but with
>a
>> > >> >
>> > >> >possible
>> > >> >
>> > >> >>>>>>> external server configuration).
>> > >> >>>>>>> There is currently a Cassandra implementation. I was not
>> > >> >
>> > >> >satisfied
>> > >> >
>> > >> >>>>>>> about
>> > >> >>>>>>> performance but I guess I did that 4yo ago so can be
>improved
>> > >> >
>> > >> >for
>> > >> >
>> > >> >>> sure
>> > >> >>>
>> > >> >>>>>> :-)
>> > >> >>>>>> :
>> > >> >>>>>>> Maybe orientdb?
>> > >> >>>>>>> What else?
>> > >> >>>>>>>
>> > >> >>>>>>>> On 24 June 2017 at 09:50, Olivier Lamy
><ol...@apache.org>
>> > >> >
>> > >> >wrote:
>> > >> >>>>>>>> well the issue is non compatible version of Lucene for
>Maven
>> > >> >>>
>> > >> >>> Indexer
>> > >> >>>
>> > >> >>>>>> and
>> > >> >>>>>>
>> > >> >>>>>>>> Oak (well I can try push a patch to Oak for
>upgrading...)
>> > >> >>>>>>>>
>> > >> >>>>>>>>> On 24 June 2017 at 08:41, Olivier Lamy
><ol...@apache.org>
>> > >> >
>> > >> >wrote:
>> > >> >>>>>>>>> Hi
>> > >> >>>>>>>>> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus
>> bridge.
>> > >> >>>>>>>>> I'm working on it in the branch ( feature/jcr_oak )
>> > >> >>>>>>>>> Not sure why but I have intermittent failure with
>store-jcr
>> > >> >>>
>> > >> >>> module.
>> > >> >>>
>> > >> >>>>>>>>> I definitely agree on the upgrade.
>> > >> >>>>>>>>> Well we can simply detect it's not oak compatible and
>> schedule
>> > >> >
>> > >> >a
>> > >> >
>> > >> >>>>>>>>> full
>> > >> >>>>>>>>> reindex (maybe with a message in logs and ui?)
>> > >> >>>>>>>>> But we need to be sure we can still read central index
>and
>> not
>> > >> >>>
>> > >> >>> sure
>> > >> >>>
>> > >> >>>>>> about
>> > >> >>>>>>
>> > >> >>>>>>>>> possible lucene conflict with oak and maven indexer.
>> > >> >>>>>>>>> We can work on this branch? (I created a Jenkins job
>for it
>> > >> >>>>>>>>>
>https://builds.apache.org/view/A-D/view/Archiva/job/archi
>> > >> >>>>>>>>> va-jcr-oak-branch/)
>> > >> >>>>>>>>> If you prefer master I would say no worries neither.
>> > >> >>>>>>>>> Something else to look at is upgrading maven-core
>etc...
>> > >> >>>>>>>>> Anyway
>> > >> >>>>>>>>> Cheers
>> > >> >>>>>>>>> Olivier
>> > >> >>>>>>>>>
>> > >> >>>>>>>>>> On 22 June 2017 at 19:16, Martin
><ma...@apache.org>
>> wrote:
>> > >> >>>>>>>>>> Hi,
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> upgrading the maven indexer leads to some major
>changes.
>> > >> >>>>>>>>>> Lucene is used by maven-indexer and also by
>jackrabbit.
>> > >> >>>
>> > >> >>> Jackrabbit
>> > >> >>>
>> > >> >>>>>>>>>> sticks to
>> > >> >>>>>>>>>> the old 3.x version and, as I see it, they will not
>move
>> to a
>> > >> >>>
>> > >> >>> newer
>> > >> >>>
>> > >> >>>>>>>>>> version.
>> > >> >>>>>>>>>> There is Jackrabbit Oak as alternative.
>> > >> >>>>>>>>>> I tried a proof of concept and could replace the
>jackrabbit
>> > >> >>>>>>>>>> implementation of
>> > >> >>>>>>>>>> metadata-store-jcr with a oak implementation. At
>least I
>> got
>> > >> >
>> > >> >the
>> > >> >
>> > >> >>>>>> unit
>> > >> >>>>>>
>> > >> >>>>>>>>>> tests of
>> > >> >>>>>>>>>> this module all to pass.
>> > >> >>>>>>>>>> But switching to Oak has some drawbacks:
>> > >> >>>>>>>>>> - The repository format changed and we must provide a
>way
>> to
>> > >> >>>>>>>>>> migrate
>> > >> >>>>>>>>>> (either
>> > >> >>>>>>>>>> migrate the existing repository or create a new one
>by
>> > >> >>>
>> > >> >>> reindexing)
>> > >> >>>
>> > >> >>>>>>>>>> - The lucene version used is newer but does not match
>to
>> the
>> > >> >>>>>>>>>> version
>> > >> >>>>>>>>>> from the
>> > >> >>>>>>>>>> maven-indexer dependencies. There may come up some
>> > >> >>>>>>>>>> incompatibilities
>> > >> >>>>>>>>>> that are
>> > >> >>>>>>>>>> not solvable without using a modified version of one
>of the
>> > >> >>>
>> > >> >>> both.
>> > >> >>>
>> > >> >>>>>>>>>> Or
>> > >> >>>>>>>>>> there may
>> > >> >>>>>>>>>> be the possibility to switch to solr (as separate
>> component)
>> > >> >
>> > >> >and
>> > >> >
>> > >> >>>>>> get rid
>> > >> >>>>>>
>> > >> >>>>>>>>>> of
>> > >> >>>>>>>>>> the lucene dependencies for jcr inside the archiva
>project.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> Switching to maven-indexer 6.0-SNAPSHOT means some
>changes
>> > >> >
>> > >> >too:
>> > >> >>>>>>>>>> - The Plexus-Sisu-Bridge does not work as before.
>> > >> >>>>>>>>>> - We must migrate from the NexusIndexer to the
>indexer API.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> So switching to the new indexer and oak means more
>work as
>> > >> >>>
>> > >> >>> expected
>> > >> >>>
>> > >> >>>>>> and
>> > >> >>>>>>
>> > >> >>>>>>>>>> some
>> > >> >>>>>>>>>> risks regarding new incompatibility problems. And I
>think
>> > >> >
>> > >> >this
>> > >> >
>> > >> >>>>>> cannot be
>> > >> >>>>>>
>> > >> >>>>>>>>>> done
>> > >> >>>>>>>>>> without broken master builds for some time period.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> So, what should we do? I think maven indexer is one
>of the
>> > >> >
>> > >> >core
>> > >> >
>> > >> >>>>>>>>>> components of
>> > >> >>>>>>>>>> archiva, and we should utilize the 3.x-version to 
>migrate
>> to
>> > >> >>>
>> > >> >>> the
>> > >> >>>
>> > >> >>>>>> new
>> > >> >>>>>>
>> > >> >>>>>>>>>> indexer
>> > >> >>>>>>>>>> version, even if this means switching to jcr oak.
>Otherwise
>> > >> >
>> > >> >it
>> > >> >
>> > >> >>>>>>>>>> would
>> > >> >>>>>>>>>> mean to
>> > >> >>>>>>>>>> stick to the old version for the next years.
>> > >> >>>>>>>>>> @Olivier, regarding the maven-indexer / sisu-Bridge
>API
>> > >> >>>
>> > >> >>> changes, I
>> > >> >>>
>> > >> >>>>>> hope
>> > >> >>>>>>
>> > >> >>>>>>>>>> you
>> > >> >>>>>>>>>> can provide  useful help.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> I committed the PoC to the branch feature/jcr_oak.
>There
>> are
>> > >> >>>
>> > >> >>> some
>> > >> >>>
>> > >> >>>>>>>>>> modules
>> > >> >>>>>>>>>> where the tests do not pass (mainly because of the
>indexer
>> > >> >
>> > >> >API
>> > >> >
>> > >> >>>>>> changes).
>> > >> >>>>>>
>> > >> >>>>>>>>>> Any comments?
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> Cheers
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> Martin
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> Am Dienstag, 13. Juni 2017, 09:07:35 CEST schrieb
>Olivier
>> > >> >
>> > >> >Lamy:
>> > >> >>>>>>>>>>> forget it but we need to ensure we can read maven
>index
>> > >> >>>
>> > >> >>> files....
>> > >> >>>
>> > >> >>>>>>>>>>> On 13 June 2017 at 17:06, Olivier Lamy
><ol...@apache.org>
>> > >> >>>
>> > >> >>> wrote:
>> > >> >>>>>>>>>>>> Hi,
>> > >> >>>>>>>>>>>> Remember jackrabbit depends on Lucene as well so
>> upgrading
>> > >> >>>>>>
>> > >> >>>>>> Lucene
>> > >> >>>>>>
>> > >> >>>>>>>>>> can be a
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>> problem here.
>> > >> >>>>>>>>>>>> Regarding maven-indexer yes we can depend on a
>snapshot
>> > >> >>>
>> > >> >>> until
>> > >> >>>
>> > >> >>>>>> the
>> > >> >>>>>>
>> > >> >>>>>>>>>> release.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>> I can release it ;-)
>> > >> >>>>>>>>>>>>
>> > >> >>>>>>>>>>>> On 13 June 2017 at 06:06, Martin
><ma...@apache.org>
>> > >> >>>
>> > >> >>> wrote:
>> > >> >>>>>>>>>>>>> Hi,
>> > >> >>>>>>>>>>>>>
>> > >> >>>>>>>>>>>>> the lucene version depends on the maven indexer.
>But I'm
>> > >> >>>
>> > >> >>> not
>> > >> >>>
>> > >> >>>>>> sure
>> > >> >>>>>>
>> > >> >>>>>>>>>> about
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>>> the
>> > >> >>>>>>>>>>>>> current state of maven-indexer. The version has
>not
>> > >> >
>> > >> >changed
>> > >> >
>> > >> >>>>>> since
>> > >> >>>>>>
>> > >> >>>>>>>>>> some
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>>> 2013.
>> > >> >>>>>>>>>>>>>
>> > >> >>>>>>>>>>>>> There are commits on the master branch since then,
>and
>> the
>> > >> >>>>>>
>> > >> >>>>>> lucene
>> > >> >>>>>>
>> > >> >>>>>>>>>> version
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>>> has
>> > >> >>>>>>>>>>>>> been changed too, but no releases were tagged.
>> > >> >>>>>>>>>>>>> Does it make sense to switch to the maven-indexer
>> > >> >>>>>>>>>>>>> 6.0-SNAPSHOT?
>> > >> >>>>>>>>>>>>>
>> > >> >>>>>>>>>>>>> As I know there are new compact index formats with
>new
>> > >> >>>
>> > >> >>> lucene
>> > >> >>>
>> > >> >>>>>>>>>> versions
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>>> but I'm
>> > >> >>>>>>>>>>>>> not sure if this is relevant for the maven
>indexes.
>> > >> >>>>>>>>>>>>>
>> > >> >>>>>>>>>>>>> Cheers
>> > >> >>>>>>>>>>>>>
>> > >> >>>>>>>>>>>>> Martin
>> > >> >>>>>>>>>>>>
>> > >> >>>>>>>>>>>> --
>> > >> >>>>>>>>>>>> Olivier Lamy
>> > >> >>>>>>>>>>>> http://twitter.com/olamy |
>http://linkedin.com/in/olamy
>> > >> >>>>>>>>>
>> > >> >>>>>>>>> --
>> > >> >>>>>>>>> Olivier Lamy
>> > >> >>>>>>>>> http://twitter.com/olamy |
>http://linkedin.com/in/olamy
>> > >> >>>>>>>>
>> > >> >>>>>>>> --
>> > >> >>>>>>>> Olivier Lamy
>> > >> >>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> > >> >>>>>
>> > >> >>>>> --
>> > >> >>>>> Olivier Lamy
>> > >> >>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> > >> >>
>> > >> >> --
>> > >> >> Olivier Lamy
>> > >> >> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> > >>
>> > >> --
>> > >> Diese Nachricht wurde von meinem Android-Gerät mit K-9 Mail
>gesendet.
>> > >
>> > > --
>> > > Olivier Lamy
>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
>>
>>
>>
>
>
>-- 
>Olivier Lamy
>http://twitter.com/olamy | http://linkedin.com/in/olamy


Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
Hi,
It took a bit of time, but I finally got the branch working :-)
Branch: feature/jcr_oak
Let me know what you think.
I guess there are still some optimisations to do for JCR Oak; I can see
warnings like these in the logs:
21:02:39.559 [1071] [main] WARN  oak.query.QueryImpl - Traversal query
(query without index): SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /*
oak-internal */; consider creating an index
21:02:39.563 [328] [main] WARN  plugins.index.Cursors$TraversingCursor -
Traversed 1000 nodes with filter Filter(query=SELECT * FROM [nt:base] WHERE
[jcr:uuid] = $id /* oak-internal */, path=*,
property=[jcr:uuid=[21232f29-7a57-35a7-8389-4a0e4a801fc3]]); consider
creating an index or changing the query







-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Martin <ma...@apache.org>.
Hi Olivier,

Great!
Just to check my understanding: the dependency on Lucene in the POM of
indexer-core is still there, but the Lucene packages are relocated to the
...maven.index.shaded... package? So you develop indexer-core against the
standard Lucene packages, and the shading is executed during the build of the
indexer package?

I think that may solve our dependency problem.
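
For readers unfamiliar with shading, the relocation discussed in this thread
could be configured roughly as follows with the maven-shade-plugin. This is
only a sketch and not the actual configuration from the
feature/jar_shaded_lucene branch: the classifier name "shaded-lucene" is an
assumption made for illustration, and only the target package
org.apache.maven.index.shaded.lucene is taken from the thread.

```xml
<!-- Sketch for the <build><plugins> section of the indexer-core POM. -->
<!-- Assumption: this is not the configuration used on the actual branch. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- Pin a plugin version appropriate for your build here. -->
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!-- Attach the shaded jar as an extra artifact next to the normal one. -->
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <!-- Hypothetical classifier name, chosen only for illustration. -->
        <shadedClassifierName>shaded-lucene</shadedClassifierName>
        <!-- Besides the project's own classes, bundle only the Lucene jars. -->
        <artifactSet>
          <includes>
            <include>org.apache.lucene:*</include>
          </includes>
        </artifactSet>
        <!-- Rewrite org.apache.lucene.* to the shaded package, including all
             references to it from the indexer classes. -->
        <relocations>
          <relocation>
            <pattern>org.apache.lucene</pattern>
            <shadedPattern>org.apache.maven.index.shaded.lucene</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Under these assumptions, a consumer such as Archiva would depend on the
artifact with that classifier and exclude the org.apache.lucene dependencies
of the indexer. A class like BooleanClause would then be referenced as
org.apache.maven.index.shaded.lucene.search.BooleanClause, and the
unrelocated Lucene on the classpath would be left to Jackrabbit or Oak.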

I still got errors in the maven-indexer module, but I assume the status is
still "work in progress". I don't want to interfere too much with your
changes.

I'm not sure whether we should keep JCR Oak as the metadata implementation. I
think OrientDB may be a feasible alternative: it is embeddable, it is a graph
database, its Lucene index is optional and can be omitted, and it is under the
Apache License. And with JCR Oak we would also have to convert the existing
metadata index.

But one step after the other. If we agree that the shaded indexer works, we
should merge only the maven-indexer changes to the master branch, without the
JCR/Lucene update, and change JCR and/or Lucene afterwards.

Greetings

Martin

On Friday, 7 July 2017, 09:23:24 CEST, Olivier Lamy wrote:
> So the repo contains a branch feature/jar_shaded_lucene here:
> https://git1-us-west.apache.org/repos/asf?p=maven-indexer.git;a=summary
> and I pushed what I started for Archiva to the branch called feature/jcr_oak.
> So in order to test it, you first need to build maven-indexer from the
> branch feature/jar_shaded_lucene.
> 
> On 6 July 2017 at 22:31, Olivier Lamy <ol...@apache.org> wrote:
> > I will try to share the work I did tomorrow in a branch
> > 
> > On Thu, 6 Jul 2017 at 7:48 pm, Martin Stockhammer <ma...@apache.org> wrote:
> >> We have different (incompatible) lucene dependencies that prevent us from
> >> updating the maven indexer and/or jackrabbit. And this will happen again
> >> with each upgrade of one of these two packages in the future.
> >> So it would be really good if we can find a solution that removes one of
> >> the lucene dependencies.
> >> 
> >> Greetings
> >> 
> >> Martin
> >> 
> >> 
> >> On 6 July 2017, 09:36:06 CEST, Chris Graham <chrisgwarp@gmail.com> wrote:
> >> 
> >> >Can I please ask an obvious/stupid question?
> >> >
> >> >What is driving this need for change?
> >> >
> >> >From a quick read of the thread above, all of the options appear to
> >> >introduce a lot of breaking changes, and a whole lot more uncertainty.
> >> >
> >> >So, what is so broken that it is driving these changes?
> >> >
> >> >Sent from my iPhone
> >> >
> >> >> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <ol...@apache.org> wrote:
> >> >> 
> >> >> Yup.
> >> >> The idea is to have an extra jar produced by the maven-indexer with a
> >> >> shaded lucene version.
> >> >> So the lucene classes (version used by Maven indexer) will be relocated
> >> >> into a package called org.apache.maven.index.shaded.lucene (such as
> >> >> org.apache.maven.index.shaded.lucene.search.BooleanClause).
> >> >> Then you exclude the lucene dependencies used by maven indexer and voila.
> >> >> The voila is a bit optimistic and not so easy, but anyway I'm working on
> >> >> it ATM.
> >> >
> >> >>> On 6 July 2017 at 07:08, Martin <ma...@apache.org> wrote:
> >> >>> 
> >> >>> What do you mean exactly by shading? Moving to another package name?
> >> >>> 
> >> >>> Am Mittwoch, 5. Juli 2017, 01:19:17 CEST schrieb Olivier Lamy:
> >> >>>> maybe an option is to use some shading?
> >> >>>> I'm thinking of shading lucene packages used by maven indexer. I
> >> >
> >> >can
> >> >
> >> >>> easily
> >> >>> 
> >> >>>> provide a build for that.
> >> >>>> WDYT?
> >> >>>> 
> >> >>>>> On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org> wrote:
> >> >>>>> Hi
> >> >>>>> graph/document storage could be convenient (but not possible with
> >> >>> 
> >> >>> neo4j as
> >> >>> 
> >> >>>>> it's GPL license [1])
> >> >>>>> well we can add solr as an additional webapp with our jetty
> >> >>> 
> >> >>> distribution
> >> >>> 
> >> >>>>> but this will be a pain for users who want to use tomcat or any
> >> >
> >> >other
> >> >
> >> >>>>> servlet container...
> >> >>>>> we still need to investigate a new storage model :-)
> >> >>>>> 
> >> >>>>> Olivier
> >> >>>>> [1] https://neo4j.com/licensing/
> >> >>>>> 
> >> >>>>>> On 25 June 2017 at 06:26, Martin <ma...@apache.org> wrote:
> >> >>>>>> Yes, you are right. The lucene dependency causes a lot of trouble
> >> >
> >> >and
> >> >
> >> >>>>>> will
> >> >>>>>> cause headaches with each version change of one of the
> >> >
> >> >dependencies.
> >> >
> >> >>>>>> What are the requirements for a replacement?
> >> >>>>>> - We want to store hierarchical data?
> >> >>>>>> - We want to store metadata for nodes ?
> >> >>>>>> - Fulltext search (only metadata or for artifacts too?)
> >> >>>>>> - Blob / Artifact storage (I don't think so, but not so familiar
> >> >
> >> >with
> >> >
> >> >>> the
> >> >>> 
> >> >>>>>> archiva artifact model)?
> >> >>>>>> 
> >> >>>>>> Maybe some graph database may be an alternative. Don't know if
> >> >
> >> >the
> >> >
> >> >>>>>> license of
> >> >>>>>> neo4j is compatible to the apache license, and I think it brings
> >> >>> 
> >> >>> lucene
> >> >>> 
> >> >>>>>> as
> >> >>>>>> dependency too. I will have a look.
> >> >>>>>> Problem is, if there is fulltext search needed, I think, for most
> >> >
> >> >of
> >> >
> >> >>> the
> >> >>> 
> >> >>>>>> frameworks we get a lucene dependency, if it's embedded.
> >> >>>>>> 
> >> >>>>>> Other alternatives:
> >> >>>>>> - Implement fulltext search by our own (index of the metadata
> >> >
> >> >stored
> >> >
> >> >>> via
> >> >>> 
> >> >>>>>> the
> >> >>>>>> archiva api) and use the lucene dependency that comes from the
> >> >>>>>> maven-indexer
> >> >>>>>> - Jcr Oak with Solr. Solr is not embedded, must run as its own
> >> >>>>>> application
> >> >>>>>> (war).
> >> >>>>>> 
> >> >>>>>> Greetings
> >> >>>>>> 
> >> >>>>>> Martin
> >> >>>>>> 
> >> >>>>>> Am Samstag, 24. Juni 2017, 14:05:26 CEST schrieb Olivier Lamy:
> >> >>>>>>> well this gonna be a pain.
> >> >>>>>>> IMHO we need to find a new alternative to jcr oak.
> >> >>>>>>> And something not using Lucene as it's a real pain to have
> >> >
> >> >different
> >> >
> >> >>>>>>> librairies using lucene as they do not update in the same time
> >> >
> >> >(and
> >> >
> >> >>>>>> Lucene
> >> >>>>>> 
> >> >>>>>>> break backward compat so quickly...)
> >> >>>>>>> Any ideas? I'd like to have something embedded (but with a
> >> >
> >> >possible
> >> >
> >> >>>>>>> external server configuration).
> >> >>>>>>> There is currently a Cassandra implementation. I was not
> >> >
> >> >satisfied
> >> >
> >> >>>>>>> about
> >> >>>>>>> performance but I guess I did that 4yo ago so can be improved
> >> >
> >> >for
> >> >
> >> >>> sure
> >> >>> 
> >> >>>>>> :-)
> >> >>>>>> :
> >> >>>>>>> Maybe orientdb?
> >> >>>>>>> What else?
> >> >>>>>>> 
> >> >>>>>>>> On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org>
> >> >
> >> >wrote:
> >> >>>>>>>> well the issue is non compatible version of Lucene for Maven
> >> >>> 
> >> >>> Indexer
> >> >>> 
> >> >>>>>> and
> >> >>>>>> 
> >> >>>>>>>> Oak (well I can try push a patch to Oak for upgrading...)
> >> >>>>>>>> 
> >> >>>>>>>>> On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org>
> >> >
> >> >wrote:
> >> >>>>>>>>> Hi
> >> >>>>>>>>> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus bridge.
> >> >>>>>>>>> I'm working on it in the branch ( feature/jcr_oak )
> >> >>>>>>>>> Not sure why but I have intermittent failure with store-jcr
> >> >>> 
> >> >>> module.
> >> >>> 
> >> >>>>>>>>> I definitely agree on the upgrade.
> >> >>>>>>>>> Well we can simply detect it's not oak compatible and schedule
> >> >
> >> >a
> >> >
> >> >>>>>>>>> full
> >> >>>>>>>>> reindex (maybe with a message in logs and ui?)
> >> >>>>>>>>> But we need to be sure we can still read central index and not
> >> >>> 
> >> >>> sure
> >> >>> 
> >> >>>>>> about
> >> >>>>>> 
> >> >>>>>>>>> possible lucene conflict with oak and maven indexer.
> >> >>>>>>>>> We can work on this branch? (I created a Jenkins job for it
> >> >>>>>>>>> https://builds.apache.org/view/A-D/view/Archiva/job/archi
> >> >>>>>>>>> va-jcr-oak-branch/)
> >> >>>>>>>>> If you prefer master I would say no worries neither.
> >> >>>>>>>>> Something else to look at is upgrading maven-core etc...
> >> >>>>>>>>> Anyway
> >> >>>>>>>>> Cheers
> >> >>>>>>>>> Olivier
> >> >>>>>>>>> 
> >> >>>>>>>>>> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
> >> >>>>>>>>>> Hi,
> >> >>>>>>>>>> 
> >> >>>>>>>>>> upgrading the maven indexer leads to some major changes.
> >> >>>>>>>>>> Lucene is used by maven-indexer and also by jackrabbit.
> >> >>> 
> >> >>> Jackrabbit
> >> >>> 
> >> >>>>>>>>>> sticks to
> >> >>>>>>>>>> the old 3.x version and, as I see it, they will not move to a
> >> >>> 
> >> >>> newer
> >> >>> 
> >> >>>>>>>>>> version.
> >> >>>>>>>>>> There is Jackrabbit Oak as alternative.
> >> >>>>>>>>>> I tried a proof of concept and could replace the jackrabbit
> >> >>>>>>>>>> implementation of
> >> >>>>>>>>>> metadata-store-jcr with a oak implementation. At least I got
> >> >
> >> >the
> >> >
> >> >>>>>> unit
> >> >>>>>> 
> >> >>>>>>>>>> tests of
> >> >>>>>>>>>> this module all to pass.
> >> >>>>>>>>>> But switching to Oak has some drawbacks:
> >> >>>>>>>>>> - The repository format changed and we must provide a way to
> >> >>>>>>>>>> migrate
> >> >>>>>>>>>> (either
> >> >>>>>>>>>> migrate the existing repository or create a new one by
> >> >>> 
> >> >>> reindexing)
> >> >>> 
> >> >>>>>>>>>> - The lucene version used is newer but does not match to the
> >> >>>>>>>>>> version
> >> >>>>>>>>>> from the
> >> >>>>>>>>>> maven-indexer dependencies. There may come up some
> >> >>>>>>>>>> incompatibilities
> >> >>>>>>>>>> that are
> >> >>>>>>>>>> not solvable without using a modified version of one of the
> >> >>> 
> >> >>> both.
> >> >>> 
> >> >>>>>>>>>> Or
> >> >>>>>>>>>> there may
> >> >>>>>>>>>> be the possibility to switch to solr (as separate component)
> >> >
> >> >and
> >> >
> >> >>>>>> get rid
> >> >>>>>> 
> >> >>>>>>>>>> of
> >> >>>>>>>>>> the lucene dependencies for jcr inside the archiva project.
> >> >>>>>>>>>> 
> >> >>>>>>>>>> Switching to maven-indexer 6.0-SNAPSHOT means some changes
> >> >
> >> >too:
> >> >>>>>>>>>> - The Plexus-Sisu-Bridge does not work as before.
> >> >>>>>>>>>> - We must migrate from the NexusIndexer to the indexer API.
> >> >>>>>>>>>> 
> >> >>>>>>>>>> So switching to the new indexer and oak means more work as
> >> >>> 
> >> >>> expected
> >> >>> 
> >> >>>>>> and
> >> >>>>>> 
> >> >>>>>>>>>> some
> >> >>>>>>>>>> risks regarding new incompatibility problems. And I think
> >> >
> >> >this
> >> >
> >> >>>>>> cannot be
> >> >>>>>> 
> >> >>>>>>>>>> done
> >> >>>>>>>>>> without broken master builds for some time period.
> >> >>>>>>>>>> 
> >> >>>>>>>>>> So, what should we do? I think maven indexer is one of the
> >> >
> >> >core
> >> >
> >> >>>>>>>>>> components of
> >> >>>>>>>>>> archiva, and we should utilize the 3.x-version to  migrate to
> >> >>> 
> >> >>> the
> >> >>> 
> >> >>>>>> new
> >> >>>>>> 
> >> >>>>>>>>>> indexer
> >> >>>>>>>>>> version, even if this means switching to jcr oak. Otherwise
> >> >
> >> >it
> >> >
> >> >>>>>>>>>> would
> >> >>>>>>>>>> mean to
> >> >>>>>>>>>> stick to the old version for the next years.
> >> >>>>>>>>>> @Olivier, regarding the maven-indexer / sisu-Bridge API
> >> >>> 
> >> >>> changes, I
> >> >>> 
> >> >>>>>> hope
> >> >>>>>> 
> >> >>>>>>>>>> you
> >> >>>>>>>>>> can provide  useful help.
> >> >>>>>>>>>> 
> >> >>>>>>>>>> I committed the PoC to the branch feature/jcr_oak. There are
> >> >>> 
> >> >>> some
> >> >>> 
> >> >>>>>>>>>> modules
> >> >>>>>>>>>> where the tests do not pass (mainly because of the indexer
> >> >
> >> >API
> >> >
> >> >>>>>> changes).
> >> >>>>>> 
> >> >>>>>>>>>> Any comments?
> >> >>>>>>>>>> 
> >> >>>>>>>>>> Cheers
> >> >>>>>>>>>> 
> >> >>>>>>>>>> Martin
> >> >>>>>>>>>> 
> >> >>>>>>>>>> Am Dienstag, 13. Juni 2017, 09:07:35 CEST schrieb Olivier
> >> >
> >> >Lamy:
> >> >>>>>>>>>>> forget it but we need to ensure we can read maven index
> >> >>> 
> >> >>> files....
> >> >>> 
> >> >>>>>>>>>>> On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org>
> >> >>> 
> >> >>> wrote:
> >> >>>>>>>>>>>> Hi,
> >> >>>>>>>>>>>> Remember jackrabbit depends on Lucene as well so upgrading
> >> >>>>>> 
> >> >>>>>> Lucene
> >> >>>>>> 
> >> >>>>>>>>>> can be a
> >> >>>>>>>>>> 
> >> >>>>>>>>>>>> problem here.
> >> >>>>>>>>>>>> Regarding maven-indexer yes we can depend on a snapshot
> >> >>> 
> >> >>> until
> >> >>> 
> >> >>>>>> the
> >> >>>>>> 
> >> >>>>>>>>>> release.
> >> >>>>>>>>>> 
> >> >>>>>>>>>>>> I can release it ;-)
> >> >>>>>>>>>>>> 
> >> >>>>>>>>>>>> On 13 June 2017 at 06:06, Martin <ma...@apache.org>
> >> >>> 
> >> >>> wrote:
> >> >>>>>>>>>>>>> Hi,
> >> >>>>>>>>>>>>> 
> >> >>>>>>>>>>>>> the lucene version depends on the maven indexer. But I'm
> >> >>> 
> >> >>> not
> >> >>> 
> >> >>>>>> sure
> >> >>>>>> 
> >> >>>>>>>>>> about
> >> >>>>>>>>>> 
> >> >>>>>>>>>>>>> the
> >> >>>>>>>>>>>>> current state of maven-indexer. The version has not
> >> >
> >> >changed
> >> >
> >> >>>>>> since
> >> >>>>>> 
> >> >>>>>>>>>> some
> >> >>>>>>>>>> 
> >> >>>>>>>>>>>>> 2013.
> >> >>>>>>>>>>>>> 
> >> >>>>>>>>>>>>> There are commits on the master branch since then, and the
> >> >>>>>> 
> >> >>>>>> lucene
> >> >>>>>> 
> >> >>>>>>>>>> version
> >> >>>>>>>>>> 
> >> >>>>>>>>>>>>> has
> >> >>>>>>>>>>>>> been changed too, but no releases were tagged.
> >> >>>>>>>>>>>>> Does it make sense to switch to the maven-indexer
> >> >>>>>>>>>>>>> 6.0-SNAPSHOT?
> >> >>>>>>>>>>>>> 
> >> >>>>>>>>>>>>> As I know there are new compact index formats with new
> >> >>> 
> >> >>> lucene
> >> >>> 
> >> >>>>>>>>>> versions
> >> >>>>>>>>>> 
> >> >>>>>>>>>>>>> but I'm
> >> >>>>>>>>>>>>> not sure if this is relevant for the maven indexes.
> >> >>>>>>>>>>>>> 
> >> >>>>>>>>>>>>> Cheers
> >> >>>>>>>>>>>>> 
> >> >>>>>>>>>>>>> Martin
> >> >>>>>>>>>>>> 
> >> >>>>>>>>>>>> --
> >> >>>>>>>>>>>> Olivier Lamy
> >> >>>>>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> >>>>>>>>> 
> >> >>>>>>>>> --
> >> >>>>>>>>> Olivier Lamy
> >> >>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> >>>>>>>> 
> >> >>>>>>>> --
> >> >>>>>>>> Olivier Lamy
> >> >>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> >>>>> 
> >> >>>>> --
> >> >>>>> Olivier Lamy
> >> >>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> >> 
> >> >> --
> >> >> Olivier Lamy
> >> >> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> 
> >> --
> >> Diese Nachricht wurde von meinem Android-Gerät mit K-9 Mail gesendet.
> > 
> > --
> > Olivier Lamy
> > http://twitter.com/olamy | http://linkedin.com/in/olamy



Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
So the repo contains a branch feature/jar_shaded_lucene here
https://git1-us-west.apache.org/repos/asf?p=maven-indexer.git;a=summary
and I pushed what I started for Archiva to the branch called feature/jcr_oak.
So in order to test it, you first need to build maven-indexer from the
branch feature/jar_shaded_lucene.
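
For anyone following along, here is a rough sketch of what the relocation
discussed in the quoted thread below amounts to at the class-name level. It is
only an illustration: the class ShadedLuceneNames and its relocate method are
hypothetical and exist in neither maven-indexer nor Archiva. The real relocation
happens at build time, when the shaded jar is produced. The only names taken
from the thread are the target package org.apache.maven.index.shaded.lucene and
the BooleanClause example.

```java
// Hypothetical helper for illustration only; not part of maven-indexer or Archiva.
final class ShadedLuceneNames {

    // Package prefix of the original Lucene classes.
    static final String LUCENE_PREFIX = "org.apache.lucene.";

    // Relocated package prefix, as named in the thread.
    static final String SHADED_PREFIX = "org.apache.maven.index.shaded.lucene.";

    // Maps a Lucene class name to its relocated name.
    // Class names outside org.apache.lucene are returned unchanged.
    static String relocate(String className) {
        if (className.startsWith(LUCENE_PREFIX)) {
            return SHADED_PREFIX + className.substring(LUCENE_PREFIX.length());
        }
        return className;
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.lucene.search.BooleanClause",
            "java.lang.String"
        };
        for (String name : names) {
            System.out.println(name + " -> " + relocate(name));
        }
    }
}
```

Because the relocated classes live under a different package, they should be
able to sit on the same classpath as the Lucene version pulled in by Oak or
Jackrabbit, which is the point of the approach. Code in Archiva that touches
Lucene types through the indexer would then have to import the relocated names.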



On 6 July 2017 at 22:31, Olivier Lamy <ol...@apache.org> wrote:

> I will try to share the work I did tomorrow in a branch
>
> On Thu, 6 Jul 2017 at 7:48 pm, Martin Stockhammer <ma...@apache.org>
> wrote:
>
>> We have different (incompatible) lucene dependencies that prevent us from
>> updating the maven indexer and/or jackrabbit. And this will happen again with
>> each upgrade of one of these two packages in the future.
>> So it would be really good if we could find a solution that removes one of
>> the lucene dependencies.
>>
>> Greetings
>>
>> Martin
>>
>>
>> On 6 July 2017 at 09:36:06 CEST, Chris Graham <chrisgwarp@gmail.com> wrote:
>> >Can I please ask an obvious/stupid question?
>> >
>> >What is driving this need for change?
>> >
>> >From a quick read of the thread above, all of the options appear to
>> >introduce a lot of breaking changes, and a whole lot more uncertainty.
>> >
>> >So, what is so broken that it is driving these changes?
>> >
>> >Sent from my iPhone
>> >
>> >> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <ol...@apache.org> wrote:
>> >>
>> >> Yup.
>> >> The idea is to have an extra jar produced by the maven-indexer with a
>> >> shaded lucene version.
>> >> So the lucene classes (the version used by Maven indexer) will be relocated
>> >> into a package called org.apache.maven.index.shaded.lucene (such as
>> >> org.apache.maven.index.shaded.lucene.search.BooleanClause).
>> >> Then you exclude the lucene dependencies used by maven indexer and voila.
>> >> The voila is a bit optimistic and not so easy, but anyway I'm working on it
>> >> ATM.
>> >>
>> >>
>> >>> On 6 July 2017 at 07:08, Martin <ma...@apache.org> wrote:
>> >>>
>> >>> What do you mean exactly by shading? Moving to another package name?
>> >>>
>> >>> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
>> >>>> maybe an option is to use some shading?
>> >>>> I'm thinking of shading the lucene packages used by maven indexer. I can
>> >>>> easily provide a build for that.
>> >>>> WDYT?
>> >>>>
>> >>>>> On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org> wrote:
>> >>>>> Hi
>> >>>>> graph/document storage could be convenient (but not possible with
>> >>>>> neo4j, as it's GPL licensed [1])
>> >>>>> well, we can add solr as an additional webapp with our jetty
>> >>>>> distribution, but this will be a pain for users who want to use tomcat
>> >>>>> or any other servlet container...
>> >>>>> we still need to investigate a new storage model :-)
>> >>>>>
>> >>>>> Olivier
>> >>>>> [1] https://neo4j.com/licensing/
>> >>>>>
>> >>>>>> On 25 June 2017 at 06:26, Martin <ma...@apache.org> wrote:
>> >>>>>> Yes, you are right. The lucene dependency causes a lot of trouble and
>> >>>>>> will cause headaches with each version change of one of the
>> >>>>>> dependencies.
>> >>>>>> What are the requirements for a replacement?
>> >>>>>> - We want to store hierarchical data?
>> >>>>>> - We want to store metadata for nodes?
>> >>>>>> - Fulltext search (only metadata or for artifacts too?)
>> >>>>>> - Blob / artifact storage (I don't think so, but I'm not so familiar
>> >>>>>>   with the archiva artifact model)?
>> >>>>>>
>> >>>>>> Maybe some graph database may be an alternative. I don't know if the
>> >>>>>> license of neo4j is compatible with the apache license, and I think it
>> >>>>>> brings lucene as a dependency too. I will have a look.
>> >>>>>> The problem is, if fulltext search is needed, I think that for most of
>> >>>>>> the frameworks we get a lucene dependency, if it's embedded.
>> >>>>>>
>> >>>>>> Other alternatives:
>> >>>>>> - Implement fulltext search on our own (index of the metadata stored
>> >>>>>>   via the archiva api) and use the lucene dependency that comes from
>> >>>>>>   the maven-indexer
>> >>>>>> - Jcr Oak with Solr. Solr is not embedded, must run as its own
>> >>>>>>   application (war).
>> >>>>>>
>> >>>>>> Greetings
>> >>>>>>
>> >>>>>> Martin
>> >>>>>>
>> >>>>>> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
>> >>>>>>> well, this is gonna be a pain.
>> >>>>>>> IMHO we need to find a new alternative to jcr oak.
>> >>>>>>> And something not using Lucene, as it's a real pain to have different
>> >>>>>>> libraries using lucene when they do not update at the same time (and
>> >>>>>>> Lucene breaks backward compat so quickly...)
>> >>>>>>> Any ideas? I'd like to have something embedded (but with a possible
>> >>>>>>> external server configuration).
>> >>>>>>> There is currently a Cassandra implementation. I was not satisfied
>> >>>>>>> with the performance, but I guess I did that 4 years ago, so it can
>> >>>>>>> be improved for sure :-)
>> >>>>>>> Maybe orientdb?
>> >>>>>>> What else?
>> >>>>>>>
>> >>>>>>>> On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org> wrote:
>> >>>>>>>> well, the issue is incompatible versions of Lucene for Maven Indexer
>> >>>>>>>> and Oak (well, I can try to push a patch to Oak for upgrading...)
>> >>>>>>>>
>> >>>>>>>>> On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
>> >>>>>>>>> Hi
>> >>>>>>>>> Maven Indexer 6.0-SNAPSHOT doesn't need the plexus bridge anymore.
>> >>>>>>>>> I'm working on it in the branch ( feature/jcr_oak )
>> >>>>>>>>> Not sure why, but I have intermittent failures with the store-jcr
>> >>>>>>>>> module.
>> >>>>>>>>> I definitely agree on the upgrade.
>> >>>>>>>>> Well, we can simply detect it's not oak compatible and schedule a
>> >>>>>>>>> full reindex (maybe with a message in logs and ui?)
>> >>>>>>>>> But we need to be sure we can still read the central index, and I'm
>> >>>>>>>>> not sure about a possible lucene conflict between oak and maven
>> >>>>>>>>> indexer.
>> >>>>>>>>> We can work on this branch? (I created a Jenkins job for it
>> >>>>>>>>> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
>> >>>>>>>>> If you prefer master, I would say no worries either.
>> >>>>>>>>> Something else to look at is upgrading maven-core etc...
>> >>>>>>>>> Anyway
>> >>>>>>>>> Cheers
>> >>>>>>>>> Olivier
>> >>>>>>>>>
>> >>>>>>>>>> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
>> >>>>>>>>>> Hi,
>> >>>>>>>>>>
>> >>>>>>>>>> upgrading the maven indexer leads to some major changes.
>> >>>>>>>>>> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit
>> >>>>>>>>>> sticks to the old 3.x version and, as I see it, they will not move
>> >>>>>>>>>> to a newer version.
>> >>>>>>>>>> There is Jackrabbit Oak as an alternative.
>> >>>>>>>>>> I tried a proof of concept and could replace the jackrabbit
>> >>>>>>>>>> implementation of metadata-store-jcr with an oak implementation.
>> >>>>>>>>>> At least I got all the unit tests of this module to pass.
>> >>>>>>>>>> But switching to Oak has some drawbacks:
>> >>>>>>>>>> - The repository format changed and we must provide a way to
>> >>>>>>>>>>   migrate (either migrate the existing repository or create a new
>> >>>>>>>>>>   one by reindexing)
>> >>>>>>>>>> - The lucene version used is newer but does not match the version
>> >>>>>>>>>>   from the maven-indexer dependencies. Some incompatibilities may
>> >>>>>>>>>>   come up that are not solvable without using a modified version
>> >>>>>>>>>>   of one of the two. Or there may be the possibility to switch to
>> >>>>>>>>>>   solr (as a separate component) and get rid of the lucene
>> >>>>>>>>>>   dependencies for jcr inside the archiva project.
>> >>>>>>>>>>
>> >>>>>>>>>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
>> >>>>>>>>>> - The Plexus-Sisu-Bridge does not work as before.
>> >>>>>>>>>> - We must migrate from the NexusIndexer to the indexer API.
>> >>>>>>>>>>
>> >>>>>>>>>> So switching to the new indexer and oak means more work than
>> >>>>>>>>>> expected and some risks regarding new incompatibility problems.
>> >>>>>>>>>> And I think this cannot be done without broken master builds for
>> >>>>>>>>>> some time period.
>> >>>>>>>>>>
>> >>>>>>>>>> So, what should we do? I think maven indexer is one of the core
>> >>>>>>>>>> components of archiva, and we should use the 3.x version to
>> >>>>>>>>>> migrate to the new indexer version, even if this means switching
>> >>>>>>>>>> to jcr oak. Otherwise it would mean sticking to the old version
>> >>>>>>>>>> for the next years.
>> >>>>>>>>>> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I
>> >>>>>>>>>> hope you can provide useful help.
>> >>>>>>>>>>
>> >>>>>>>>>> I committed the PoC to the branch feature/jcr_oak. There are some
>> >>>>>>>>>> modules where the tests do not pass (mainly because of the indexer
>> >>>>>>>>>> API changes).
>> >>>>>>>>>>
>> >>>>>>>>>> Any comments?
>> >>>>>>>>>>
>> >>>>>>>>>> Cheers
>> >>>>>>>>>>
>> >>>>>>>>>> Martin
>> >>>>>>>>>>
>> >>>>>>>>>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
>> >>>>>>>>>>> forget it but we need to ensure we can read maven index files....
>> >>>>>>>>>>>
>> >>>>>>>>>>> On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
>> >>>>>>>>>>>> Hi,
>> >>>>>>>>>>>> Remember jackrabbit depends on Lucene as well, so upgrading
>> >>>>>>>>>>>> Lucene can be a problem here.
>> >>>>>>>>>>>> Regarding maven-indexer, yes, we can depend on a snapshot until
>> >>>>>>>>>>>> the release.
>> >>>>>>>>>>>> I can release it ;-)
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
>> >>>>>>>>>>>>> Hi,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> the lucene version depends on the maven indexer. But I'm not
>> >>>>>>>>>>>>> sure about the current state of maven-indexer. The version has
>> >>>>>>>>>>>>> not changed since about 2013.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> There are commits on the master branch since then, and the
>> >>>>>>>>>>>>> lucene version has been changed too, but no releases were
>> >>>>>>>>>>>>> tagged.
>> >>>>>>>>>>>>> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> As far as I know there are new compact index formats with new
>> >>>>>>>>>>>>> lucene versions, but I'm not sure if this is relevant for the
>> >>>>>>>>>>>>> maven indexes.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Cheers
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Martin
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> --
>> >>>>>>>>>>>> Olivier Lamy
>> >>>>>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> Olivier Lamy
>> >>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Olivier Lamy
>> >>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> >>>>>
>> >>>>> --
>> >>>>> Olivier Lamy
>> >>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >> --
>> >> Olivier Lamy
>> >> http://twitter.com/olamy | http://linkedin.com/in/olamy
>>
>> --
>> This message was sent from my Android device with K-9 Mail.
>
> --
> Olivier Lamy
> http://twitter.com/olamy | http://linkedin.com/in/olamy
>



-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
I will try to share the work I did tomorrow in a branch

On Thu, 6 Jul 2017 at 7:48 pm, Martin Stockhammer <ma...@apache.org>
wrote:

> We have different (incompatible) lucene dependencies that prevent us from
> updating the maven indexer and/or jackrabbit. And this will happen again with
> each upgrade of one of these two packages in the future.
> So it would be really good if we could find a solution that removes one of
> the lucene dependencies.
>
> Greetings
>
> Martin
>
>
> On 6 July 2017 at 09:36:06 CEST, Chris Graham <ch...@gmail.com> wrote:
> >Can I please ask an obvious/stupid question?
> >
> >What is driving this need for change?
> >
> >From a quick read of the thread above, all of the options appear to
> >introduce a lot of breaking changes, and a whole lot more uncertainty.
> >
> >So, what is so broken that it is driving these changes?
> >
> >Sent from my iPhone
> >
> >> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <ol...@apache.org> wrote:
> >>
> >> Yup.
> >> The idea is to have an extra jar produced by the maven-indexer with a
> >> shaded lucene version.
> >> So the lucene classes (the version used by Maven indexer) will be relocated
> >> into a package called org.apache.maven.index.shaded.lucene (such as
> >> org.apache.maven.index.shaded.lucene.search.BooleanClause).
> >> Then you exclude the lucene dependencies used by maven indexer and voila.
> >> The voila is a bit optimistic and not so easy, but anyway I'm working on it
> >> ATM.
> >>
> >>
> >>> On 6 July 2017 at 07:08, Martin <ma...@apache.org> wrote:
> >>>
> >>> What do you mean exactly by shading? Moving to another package name?
> >>>
> >>> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
> >>>> maybe an option is to use some shading?
> >>>> I'm thinking of shading the lucene packages used by maven indexer. I can
> >>>> easily provide a build for that.
> >>>> WDYT?
> >>>>
> >>>>> On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org> wrote:
> >>>>> Hi
> >>>>> graph/document storage could be convenient (but not possible with
> >>>>> neo4j, as it's GPL licensed [1])
> >>>>> well, we can add solr as an additional webapp with our jetty
> >>>>> distribution, but this will be a pain for users who want to use tomcat
> >>>>> or any other servlet container...
> >>>>> we still need to investigate a new storage model :-)
> >>>>>
> >>>>> Olivier
> >>>>> [1] https://neo4j.com/licensing/
> >>>>>
> >>>>>> On 25 June 2017 at 06:26, Martin <ma...@apache.org> wrote:
> >>>>>> Yes, you are right. The lucene dependency causes a lot of trouble and
> >>>>>> will cause headaches with each version change of one of the
> >>>>>> dependencies.
> >>>>>> What are the requirements for a replacement?
> >>>>>> - We want to store hierarchical data?
> >>>>>> - We want to store metadata for nodes?
> >>>>>> - Fulltext search (only metadata or for artifacts too?)
> >>>>>> - Blob / artifact storage (I don't think so, but I'm not so familiar
> >>>>>>   with the archiva artifact model)?
> >>>>>>
> >>>>>> Maybe some graph database may be an alternative. I don't know if the
> >>>>>> license of neo4j is compatible with the apache license, and I think it
> >>>>>> brings lucene as a dependency too. I will have a look.
> >>>>>> The problem is, if fulltext search is needed, I think that for most of
> >>>>>> the frameworks we get a lucene dependency, if it's embedded.
> >>>>>>
> >>>>>> Other alternatives:
> >>>>>> - Implement fulltext search on our own (index of the metadata stored
> >>>>>>   via the archiva api) and use the lucene dependency that comes from
> >>>>>>   the maven-indexer
> >>>>>> - Jcr Oak with Solr. Solr is not embedded, must run as its own
> >>>>>>   application (war).
> >>>>>>
> >>>>>> Greetings
> >>>>>>
> >>>>>> Martin
> >>>>>>
> >>>>>> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> >>>>>>> well, this is gonna be a pain.
> >>>>>>> IMHO we need to find a new alternative to jcr oak.
> >>>>>>> And something not using Lucene, as it's a real pain to have different
> >>>>>>> libraries using lucene when they do not update at the same time (and
> >>>>>>> Lucene breaks backward compat so quickly...)
> >>>>>>> Any ideas? I'd like to have something embedded (but with a possible
> >>>>>>> external server configuration).
> >>>>>>> There is currently a Cassandra implementation. I was not satisfied
> >>>>>>> with the performance, but I guess I did that 4 years ago, so it can
> >>>>>>> be improved for sure :-)
> >>>>>>> Maybe orientdb?
> >>>>>>> What else?
> >>>>>>>
> >>>>>>>> On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org> wrote:
> >>>>>>>> well, the issue is incompatible versions of Lucene for Maven Indexer
> >>>>>>>> and Oak (well, I can try to push a patch to Oak for upgrading...)
> >>>>>>>>
> >>>>>>>>> On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
> >>>>>>>>> Hi
> >>>>>>>>> Maven Indexer 6.0-SNAPSHOT doesn't need the plexus bridge anymore.
> >>>>>>>>> I'm working on it in the branch ( feature/jcr_oak )
> >>>>>>>>> Not sure why, but I have intermittent failures with the store-jcr
> >>>>>>>>> module.
> >>>>>>>>> I definitely agree on the upgrade.
> >>>>>>>>> Well, we can simply detect it's not oak compatible and schedule a
> >>>>>>>>> full reindex (maybe with a message in logs and ui?)
> >>>>>>>>> But we need to be sure we can still read the central index, and I'm
> >>>>>>>>> not sure about a possible lucene conflict between oak and maven
> >>>>>>>>> indexer.
> >>>>>>>>> We can work on this branch? (I created a Jenkins job for it
> >>>>>>>>> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
> >>>>>>>>> If you prefer master, I would say no worries either.
> >>>>>>>>> Something else to look at is upgrading maven-core etc...
> >>>>>>>>> Anyway
> >>>>>>>>> Cheers
> >>>>>>>>> Olivier
> >>>>>>>>>
> >>>>>>>>>> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> upgrading the maven indexer leads to some major changes.
> >>>>>>>>>> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit
> >>>>>>>>>> sticks to the old 3.x version and, as I see it, they will not move
> >>>>>>>>>> to a newer version.
> >>>>>>>>>> There is Jackrabbit Oak as an alternative.
> >>>>>>>>>> I tried a proof of concept and could replace the jackrabbit
> >>>>>>>>>> implementation of metadata-store-jcr with an oak implementation.
> >>>>>>>>>> At least I got all the unit tests of this module to pass.
> >>>>>>>>>> But switching to Oak has some drawbacks:
> >>>>>>>>>> - The repository format changed and we must provide a way to
> >>>>>>>>>>   migrate (either migrate the existing repository or create a new
> >>>>>>>>>>   one by reindexing)
> >>>>>>>>>> - The lucene version used is newer but does not match the version
> >>>>>>>>>>   from the maven-indexer dependencies. Some incompatibilities may
> >>>>>>>>>>   come up that are not solvable without using a modified version
> >>>>>>>>>>   of one of the two. Or there may be the possibility to switch to
> >>>>>>>>>>   solr (as a separate component) and get rid of the lucene
> >>>>>>>>>>   dependencies for jcr inside the archiva project.
> >>>>>>>>>>
> >>>>>>>>>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
> >>>>>>>>>> - The Plexus-Sisu-Bridge does not work as before.
> >>>>>>>>>> - We must migrate from the NexusIndexer to the indexer API.
> >>>>>>>>>>
> >>>>>>>>>> So switching to the new indexer and oak means more work than
> >>>>>>>>>> expected and some risks regarding new incompatibility problems.
> >>>>>>>>>> And I think this cannot be done without broken master builds for
> >>>>>>>>>> some time period.
> >>>>>>>>>>
> >>>>>>>>>> So, what should we do? I think maven indexer is one of the core
> >>>>>>>>>> components of archiva, and we should use the 3.x version to
> >>>>>>>>>> migrate to the new indexer version, even if this means switching
> >>>>>>>>>> to jcr oak. Otherwise it would mean sticking to the old version
> >>>>>>>>>> for the next years.
> >>>>>>>>>> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I
> >>>>>>>>>> hope you can provide useful help.
> >>>>>>>>>>
> >>>>>>>>>> I committed the PoC to the branch feature/jcr_oak. There are some
> >>>>>>>>>> modules where the tests do not pass (mainly because of the indexer
> >>>>>>>>>> API changes).
> >>>>>>>>>>
> >>>>>>>>>> Any comments?
> >>>>>>>>>>
> >>>>>>>>>> Cheers
> >>>>>>>>>>
> >>>>>>>>>> Martin
> >>>>>>>>>>
> >>>>>>>>>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> >>>>>>>>>>> forget it but we need to ensure we can read maven index
> >>> files....
> >>>>>>>>>>>
> >>>>>>>>>>> On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org>
> >>> wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>> Remember jackrabbit depends on Lucene as well so upgrading
> >>>>>>
> >>>>>> Lucene
> >>>>>>
> >>>>>>>>>> can be a
> >>>>>>>>>>
> >>>>>>>>>>>> problem here.
> >>>>>>>>>>>> Regarding maven-indexer yes we can depend on a snapshot
> >>> until
> >>>>>>
> >>>>>> the
> >>>>>>
> >>>>>>>>>> release.
> >>>>>>>>>>
> >>>>>>>>>>>> I can release it ;-)
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 13 June 2017 at 06:06, Martin <ma...@apache.org>
> >>> wrote:
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> the lucene version depends on the maven indexer. But I'm
> >>> not
> >>>>>>
> >>>>>> sure
> >>>>>>
> >>>>>>>>>> about
> >>>>>>>>>>
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>> current state of maven-indexer. The version has not
> >changed
> >>>>>>
> >>>>>> since
> >>>>>>
> >>>>>>>>>> some
> >>>>>>>>>>
> >>>>>>>>>>>>> 2013.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> There are commits on the master branch since then, and the
> >>>>>>
> >>>>>> lucene
> >>>>>>
> >>>>>>>>>> version
> >>>>>>>>>>
> >>>>>>>>>>>>> has
> >>>>>>>>>>>>> been changed too, but no releases were tagged.
> >>>>>>>>>>>>> Does it make sense to switch to the maven-indexer
> >>>>>>>>>>>>> 6.0-SNAPSHOT?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> As I know there are new compact index formats with new
> >>> lucene
> >>>>>>>>>>
> >>>>>>>>>> versions
> >>>>>>>>>>
> >>>>>>>>>>>>> but I'm
> >>>>>>>>>>>>> not sure if this is relevant for the maven indexes.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheers
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Martin
> >>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>> Olivier Lamy
> >>>>>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Olivier Lamy
> >>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Olivier Lamy
> >>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >>>>>
> >>>>> --
> >>>>> Olivier Lamy
> >>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >>>
> >>>
> >>>
> >>
> >>
> >> --
> >> Olivier Lamy
> >> http://twitter.com/olamy | http://linkedin.com/in/olamy
>
> --
> This message was sent from my Android device with K-9 Mail.

-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Martin Stockhammer <ma...@apache.org>.
We have different, incompatible Lucene dependencies that prevent us from updating the maven indexer and/or jackrabbit. And this will happen again with each future upgrade of either of these two packages.
So it would be really good if we could find a solution that removes one of the Lucene dependencies.

Greetings

Martin


On 6 July 2017 09:36:06 CEST, Chris Graham <ch...@gmail.com> wrote:
>Can I please ask an obvious/stupid question?
>
>What is driving this need for change?
>
>From a quick read of the thread above, all of the options appear to
>introduce a lot of breaking changes, and a whole lot more uncertainty.
>
>So, what is so broken that it is driving these changes?
>
>Sent from my iPhone

-- 
This message was sent from my Android device with K-9 Mail.

Re: maven-indexer / Lucene

Posted by Chris Graham <ch...@gmail.com>.
Can I please ask an obvious/stupid question?

What is driving this need for change?

From a quick read of the thread above, all of the options appear to introduce a lot of breaking changes, and a whole lot more uncertainty.

So, what is so broken that it is driving these changes?

Sent from my iPhone

> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <ol...@apache.org> wrote:
> 
> Yup.
> The idea is to have an extra jar produced by the maven-indexer with a
> shaded lucene version.
> So the lucene classes (version used by Maven indexer) will be relocated
> into a package called org.apache.maven.index.shaded.lucene (such as
> org.apache.maven.index.shaded.lucene.search.BooleanClause)
> Then you exclude the lucene dependencies used by maven indexer and voila.
> The voila is a bit optimistic and not so easy but anyway working on it ATM.

Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
Yup.
The idea is to have the maven-indexer build produce an extra jar with a
shaded Lucene version.
So the Lucene classes (the version used by Maven Indexer) will be relocated
into a package called org.apache.maven.index.shaded.lucene (such as
org.apache.maven.index.shaded.lucene.search.BooleanClause).
Then you exclude the Lucene dependencies used by Maven Indexer and voila.
The voila is a bit optimistic and it's not so easy, but anyway I'm working on
it ATM.
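
To make the relocation concrete, here is a minimal Java sketch of the name
mapping. It assumes that every class below org.apache.lucene simply moves
below org.apache.maven.index.shaded.lucene. ShadedNames and relocate are
hypothetical names used only for illustration; they are not part of
maven-indexer, and the real rewriting would be done by the build on the
class files, not by code like this.

```java
// Hypothetical helper, NOT part of maven-indexer: it only illustrates how
// a package relocation rewrites fully qualified class names.
final class ShadedNames {
    // Root package of the unshaded Lucene classes.
    static final String ORIGINAL_PREFIX = "org.apache.lucene";
    // Relocated root package named in this mail.
    static final String SHADED_PREFIX = "org.apache.maven.index.shaded.lucene";

    // Returns the relocated name for Lucene classes and leaves every
    // other class name untouched.
    static String relocate(String className) {
        boolean isLucene = className.equals(ORIGINAL_PREFIX)
                || className.startsWith(ORIGINAL_PREFIX + ".");
        if (isLucene) {
            return SHADED_PREFIX + className.substring(ORIGINAL_PREFIX.length());
        }
        return className;
    }

    public static void main(String[] args) {
        // A Lucene class is moved below the shaded package.
        System.out.println(relocate("org.apache.lucene.search.BooleanClause"));
        // A non-Lucene class keeps its name.
        System.out.println(relocate("org.apache.jackrabbit.oak.api.Root"));
    }
}
```

If the shaded jar works out as described, only the indexer side would see the
relocated names, while Oak would keep loading its own org.apache.lucene
classes, so the two Lucene versions would no longer share class names.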


On 6 July 2017 at 07:08, Martin <ma...@apache.org> wrote:

> What do you mean exactly by shading? Moving to another package name?
>
> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
> > maybe an option is to use some shading?
> > I'm thinking of shading the Lucene packages used by maven indexer. I can
> > easily provide a build for that.
> > WDYT?
> >
> > On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org> wrote:
> > > Hi
> > > graph/document storage could be convenient (but not possible with
> > > neo4j as it's GPL licensed [1])
> > > well we can add solr as an additional webapp with our jetty
> > > distribution but this will be a pain for users who want to use tomcat
> > > or any other servlet container...
> > > we still need to investigate a new storage model :-)
> > >
> > > Olivier
> > > [1] https://neo4j.com/licensing/
> > >
> > > On 25 June 2017 at 06:26, Martin <ma...@apache.org> wrote:
> > >> Yes, you are right. The lucene dependency causes a lot of trouble and
> > >> will cause headaches with each version change of one of the
> > >> dependencies.
> > >> What are the requirements for a replacement?
> > >> - We want to store hierarchical data?
> > >> - We want to store metadata for nodes?
> > >> - Fulltext search (only metadata or for artifacts too?)
> > >> - Blob / Artifact storage (I don't think so, but I'm not so familiar
> > >>   with the archiva artifact model)?
> > >>
> > >> Maybe some graph database may be an alternative. I don't know if the
> > >> license of neo4j is compatible with the apache license, and I think it
> > >> brings lucene as a dependency too. I will have a look.
> > >> The problem is, if fulltext search is needed, I think that with most
> > >> of the embedded frameworks we get a lucene dependency.
> > >>
> > >> Other alternatives:
> > >> - Implement fulltext search on our own (an index of the metadata stored
> > >>   via the archiva api) and use the lucene dependency that comes from
> > >>   the maven-indexer
> > >> - Jcr Oak with Solr. Solr is not embedded, it must run as its own
> > >>   application (war).
> > >>
> > >> Greetings
> > >>
> > >> Martin
> > >>
> > >> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> > >> > well this is gonna be a pain.
> > >> > IMHO we need to find a new alternative to jcr oak.
> > >> > And something not using Lucene, as it's a real pain to have different
> > >> > libraries using lucene: they do not update at the same time (and
> > >> > Lucene breaks backward compat so quickly...)
> > >> > Any ideas? I'd like to have something embedded (but with a possible
> > >> > external server configuration).
> > >> > There is currently a Cassandra implementation. I was not satisfied
> > >> > with its performance, but I guess I did that 4 years ago so it can be
> > >> > improved for sure :-)
> > >> > Maybe orientdb?
> > >> > What else?
> > >> >
> > >> > On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org> wrote:
> > >> > > well the issue is non compatible version of Lucene for Maven
> Indexer
> > >>
> > >> and
> > >>
> > >> > > Oak (well I can try push a patch to Oak for upgrading...)
> > >> > >
> > >> > > On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
> > >> > >> Hi
> > >> > >> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus bridge.
> > >> > >> I'm working on it in the branch ( feature/jcr_oak )
> > >> > >> Not sure why but I have intermittent failure with store-jcr
> module.
> > >> > >> I definitely agree on the upgrade.
> > >> > >> Well we can simply detect it's not oak compatible and schedule a
> > >> > >> full
> > >> > >> reindex (maybe with a message in logs and ui?)
> > >> > >> But we need to be sure we can still read central index and not
> sure
> > >>
> > >> about
> > >>
> > >> > >> possible lucene conflict with oak and maven indexer.
> > >> > >> We can work on this branch? (I created a Jenkins job for it
> > >> > >> https://builds.apache.org/view/A-D/view/Archiva/job/archi
> > >> > >> va-jcr-oak-branch/)
> > >> > >> If you prefer master I would say no worries neither.
> > >> > >> Something else to look at is upgrading maven-core etc...
> > >> > >> Anyway
> > >> > >> Cheers
> > >> > >> Olivier
> > >> > >>
> > >> > >> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
> > >> > >>> Hi,
> > >> > >>>
> > >> > >>> upgrading the maven indexer leads to some major changes.
> > >> > >>> Lucene is used by maven-indexer and also by jackrabbit.
> Jackrabbit
> > >> > >>> sticks to
> > >> > >>> the old 3.x version and, as I see it, they will not move to a
> newer
> > >> > >>> version.
> > >> > >>> There is Jackrabbit Oak as alternative.
> > >> > >>> I tried a proof of concept and could replace the jackrabbit
> > >> > >>> implementation of
> > >> > >>> metadata-store-jcr with a oak implementation. At least I got the
> > >>
> > >> unit
> > >>
> > >> > >>> tests of
> > >> > >>> this module all to pass.
> > >> > >>> But switching to Oak has some drawbacks:
> > >> > >>> - The repository format changed and we must provide a way to
> > >> > >>> migrate
> > >> > >>> (either
> > >> > >>> migrate the existing repository or create a new one by
> reindexing)
> > >> > >>> - The lucene version used is newer but does not match to the
> > >> > >>> version
> > >> > >>> from the
> > >> > >>> maven-indexer dependencies. There may come up some
> > >> > >>> incompatibilities
> > >> > >>> that are
> > >> > >>> not solvable without using a modified version of one of the
> both.
> > >> > >>> Or
> > >> > >>> there may
> > >> > >>> be the possibility to switch to solr (as separate component) and
> > >>
> > >> get rid
> > >>
> > >> > >>> of
> > >> > >>> the lucene dependencies for jcr inside the archiva project.
> > >> > >>>
> > >> > >>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
> > >> > >>> - The Plexus-Sisu-Bridge does not work as before.
> > >> > >>> - We must migrate from the NexusIndexer to the indexer API.
> > >> > >>>
> > >> > >>> So switching to the new indexer and oak means more work as
> expected
> > >>
> > >> and
> > >>
> > >> > >>> some
> > >> > >>> risks regarding new incompatibility problems. And I think this
> > >>
> > >> cannot be
> > >>
> > >> > >>> done
> > >> > >>> without broken master builds for some time period.
> > >> > >>>
> > >> > >>> So, what should we do? I think maven indexer is one of the core
> > >> > >>> components of
> > >> > >>> archiva, and we should utilize the 3.x-version to  migrate to
> the
> > >>
> > >> new
> > >>
> > >> > >>> indexer
> > >> > >>> version, even if this means switching to jcr oak. Otherwise it
> > >> > >>> would
> > >> > >>> mean to
> > >> > >>> stick to the old version for the next years.
> > >> > >>> @Olivier, regarding the maven-indexer / sisu-Bridge API
> changes, I
> > >>
> > >> hope
> > >>
> > >> > >>> you
> > >> > >>> can provide  useful help.
> > >> > >>>
> > >> > >>> I committed the PoC to the branch feature/jcr_oak. There are
> some
> > >> > >>> modules
> > >> > >>> where the tests do not pass (mainly because of the indexer API
> > >>
> > >> changes).
> > >>
> > >> > >>> Any comments?
> > >> > >>>
> > >> > >>> Cheers
> > >> > >>>
> > >> > >>> Martin
> > >> > >>>
> > >> > >>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> > >> > >>> > forget it but we need to ensure we can read maven index
> files....
> > >> > >>> >
> > >> > >>> > On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org>
> wrote:
> > >> > >>> > > Hi,
> > >> > >>> > > Remember jackrabbit depends on Lucene as well so upgrading
> > >>
> > >> Lucene
> > >>
> > >> > >>> can be a
> > >> > >>>
> > >> > >>> > > problem here.
> > >> > >>> > > Regarding maven-indexer yes we can depend on a snapshot
> until
> > >>
> > >> the
> > >>
> > >> > >>> release.
> > >> > >>>
> > >> > >>> > > I can release it ;-)
> > >> > >>> > >
> > >> > >>> > > On 13 June 2017 at 06:06, Martin <ma...@apache.org>
> wrote:
> > >> > >>> > >> Hi,
> > >> > >>> > >>
> > >> > >>> > >> the lucene version depends on the maven indexer. But I'm
> not
> > >>
> > >> sure
> > >>
> > >> > >>> about
> > >> > >>>
> > >> > >>> > >> the
> > >> > >>> > >> current state of maven-indexer. The version has not changed
> > >>
> > >> since
> > >>
> > >> > >>> some
> > >> > >>>
> > >> > >>> > >> 2013.
> > >> > >>> > >>
> > >> > >>> > >> There are commits on the master branch since then, and the
> > >>
> > >> lucene
> > >>
> > >> > >>> version
> > >> > >>>
> > >> > >>> > >> has
> > >> > >>> > >> been changed too, but no releases were tagged.
> > >> > >>> > >> Does it make sense to switch to the maven-indexer
> > >> > >>> > >> 6.0-SNAPSHOT?
> > >> > >>> > >>
> > >> > >>> > >> As I know there are new compact index formats with new
> lucene
> > >> > >>>
> > >> > >>> versions
> > >> > >>>
> > >> > >>> > >> but I'm
> > >> > >>> > >> not sure if this is relevant for the maven indexes.
> > >> > >>> > >>
> > >> > >>> > >> Cheers
> > >> > >>> > >>
> > >> > >>> > >> Martin
> > >> > >>> > >
> > >> > >>> > > --
> > >> > >>> > > Olivier Lamy
> > >> > >>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >> > >>
> > >> > >> --
> > >> > >> Olivier Lamy
> > >> > >> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >> > >
> > >> > > --
> > >> > > Olivier Lamy
> > >> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >
> > > --
> > > Olivier Lamy
> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
>
>
>


-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Martin <ma...@apache.org>.
What do you mean exactly by shading? Moving to another package name?

On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
> maybe an option is to use some shading?
> I'm thinking of shading lucene packages used by maven indexer. I can easily
> provide a build for that.
> WDYT?
> 
> On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org> wrote:
> > Hi
> > graph/document storage could be convenient (but not possible with neo4j as
> > it's GPL license [1])
> > well we can add solr as an additional webapp with our jetty distribution
> > but this will be a pain for users who want to use tomcat or any other
> > servlet container...
> > we still need to investigate a new storage model :-)
> > 
> > Olivier
> > [1] https://neo4j.com/licensing/
> > 
> > On 25 June 2017 at 06:26, Martin <ma...@apache.org> wrote:
> >> Yes, you are right. The lucene dependency causes a lot of trouble and
> >> will
> >> cause headaches with each version change of one of the dependencies.
> >> What are the requirements for a replacement?
> >> - We want to store hierarchical data?
> >> - We want to store metadata for nodes ?
> >> - Fulltext search (only metadata or for artifacts too?)
> >> - Blob / Artifact storage (I don't think so, but not so familiar with the
> >> archiva artifact model)?
> >> 
> >> Maybe some graph database may be an alternative. Don't know if the
> >> license of
> >> neo4j is compatible to the apache license, and I think it brings lucene
> >> as
> >> dependency too. I will have a look.
> >> Problem is, if there is fulltext search needed, I think, for most of the
> >> frameworks we get a lucene dependency, if it's embedded.
> >> 
> >> Other alternatives:
> >> - Implement fulltext search by our own (index of the metadata stored via
> >> the
> >> archiva api) and use the lucene dependency that comes from the
> >> maven-indexer
> >> - Jcr Oak with Solr. Solr is not embedded, must run as its own
> >> application
> >> (war).
> >> 
> >> Greetings
> >> 
> >> Martin
> >> 
> >> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> >> > well this gonna be a pain.
> >> > IMHO we need to find a new alternative to jcr oak.
> >> > And something not using Lucene as it's a real pain to have different
> >> > librairies using lucene as they do not update in the same time (and
> >> 
> >> Lucene
> >> 
> >> > break backward compat so quickly...)
> >> > Any ideas? I'd like to have something embedded (but with a possible
> >> > external server configuration).
> >> > There is currently a Cassandra implementation. I was not satisfied
> >> > about
> >> > performance but I guess I did that 4yo ago so can be improved for sure :-)
> >> > Maybe orientdb?
> >> > What else?
> >> > 
> >> > On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org> wrote:
> >> > > well the issue is non compatible version of Lucene for Maven Indexer
> >> 
> >> and
> >> 
> >> > > Oak (well I can try push a patch to Oak for upgrading...)
> >> > > 
> >> > > On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
> >> > >> Hi
> >> > >> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus bridge.
> >> > >> I'm working on it in the branch ( feature/jcr_oak )
> >> > >> Not sure why but I have intermittent failure with store-jcr module.
> >> > >> I definitely agree on the upgrade.
> >> > >> Well we can simply detect it's not oak compatible and schedule a
> >> > >> full
> >> > >> reindex (maybe with a message in logs and ui?)
> >> > >> But we need to be sure we can still read central index and not sure
> >> 
> >> about
> >> 
> >> > >> possible lucene conflict with oak and maven indexer.
> >> > >> We can work on this branch? (I created a Jenkins job for it
> >> > >> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
> >> > >> If you prefer master I would say no worries neither.
> >> > >> Something else to look at is upgrading maven-core etc...
> >> > >> Anyway
> >> > >> Cheers
> >> > >> Olivier
> >> > >> 
> >> > >> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
> >> > >>> Hi,
> >> > >>> 
> >> > >>> upgrading the maven indexer leads to some major changes.
> >> > >>> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit
> >> > >>> sticks to
> >> > >>> the old 3.x version and, as I see it, they will not move to a newer
> >> > >>> version.
> >> > >>> There is Jackrabbit Oak as alternative.
> >> > >>> I tried a proof of concept and could replace the jackrabbit
> >> > >>> implementation of
> >> > >>> metadata-store-jcr with a oak implementation. At least I got the
> >> 
> >> unit
> >> 
> >> > >>> tests of
> >> > >>> this module all to pass.
> >> > >>> But switching to Oak has some drawbacks:
> >> > >>> - The repository format changed and we must provide a way to
> >> > >>> migrate
> >> > >>> (either
> >> > >>> migrate the existing repository or create a new one by reindexing)
> >> > >>> - The lucene version used is newer but does not match to the
> >> > >>> version
> >> > >>> from the
> >> > >>> maven-indexer dependencies. There may come up some
> >> > >>> incompatibilities
> >> > >>> that are
> >> > >>> not solvable without using a modified version of one of the both.
> >> > >>> Or
> >> > >>> there may
> >> > >>> be the possibility to switch to solr (as separate component) and
> >> 
> >> get rid
> >> 
> >> > >>> of
> >> > >>> the lucene dependencies for jcr inside the archiva project.
> >> > >>> 
> >> > >>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
> >> > >>> - The Plexus-Sisu-Bridge does not work as before.
> >> > >>> - We must migrate from the NexusIndexer to the indexer API.
> >> > >>> 
> >> > >>> So switching to the new indexer and oak means more work as expected
> >> 
> >> and
> >> 
> >> > >>> some
> >> > >>> risks regarding new incompatibility problems. And I think this
> >> 
> >> cannot be
> >> 
> >> > >>> done
> >> > >>> without broken master builds for some time period.
> >> > >>> 
> >> > >>> So, what should we do? I think maven indexer is one of the core
> >> > >>> components of
> >> > >>> archiva, and we should utilize the 3.x-version to  migrate to the
> >> 
> >> new
> >> 
> >> > >>> indexer
> >> > >>> version, even if this means switching to jcr oak. Otherwise it
> >> > >>> would
> >> > >>> mean to
> >> > >>> stick to the old version for the next years.
> >> > >>> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I
> >> 
> >> hope
> >> 
> >> > >>> you
> >> > >>> can provide  useful help.
> >> > >>> 
> >> > >>> I committed the PoC to the branch feature/jcr_oak. There are some
> >> > >>> modules
> >> > >>> where the tests do not pass (mainly because of the indexer API
> >> 
> >> changes).
> >> 
> >> > >>> Any comments?
> >> > >>> 
> >> > >>> Cheers
> >> > >>> 
> >> > >>> Martin
> >> > >>> 
> >> > >>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> >> > >>> > forget it but we need to ensure we can read maven index files....
> >> > >>> > 
> >> > >>> > On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
> >> > >>> > > Hi,
> >> > >>> > > Remember jackrabbit depends on Lucene as well so upgrading
> >> 
> >> Lucene
> >> 
> >> > >>> can be a
> >> > >>> 
> >> > >>> > > problem here.
> >> > >>> > > Regarding maven-indexer yes we can depend on a snapshot until
> >> 
> >> the
> >> 
> >> > >>> release.
> >> > >>> 
> >> > >>> > > I can release it ;-)
> >> > >>> > > 
> >> > >>> > > On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
> >> > >>> > >> Hi,
> >> > >>> > >> 
> >> > >>> > >> the lucene version depends on the maven indexer. But I'm not
> >> 
> >> sure
> >> 
> >> > >>> about
> >> > >>> 
> >> > >>> > >> the
> >> > >>> > >> current state of maven-indexer. The version has not changed
> >> 
> >> since
> >> 
> >> > >>> some
> >> > >>> 
> >> > >>> > >> 2013.
> >> > >>> > >> 
> >> > >>> > >> There are commits on the master branch since then, and the
> >> 
> >> lucene
> >> 
> >> > >>> version
> >> > >>> 
> >> > >>> > >> has
> >> > >>> > >> been changed too, but no releases were tagged.
> >> > >>> > >> Does it make sense to switch to the maven-indexer
> >> > >>> > >> 6.0-SNAPSHOT?
> >> > >>> > >> 
> >> > >>> > >> As I know there are new compact index formats with new lucene
> >> > >>> 
> >> > >>> versions
> >> > >>> 
> >> > >>> > >> but I'm
> >> > >>> > >> not sure if this is relevant for the maven indexes.
> >> > >>> > >> 
> >> > >>> > >> Cheers
> >> > >>> > >> 
> >> > >>> > >> Martin
> >> > >>> > > 
> >> > >>> > > --
> >> > >>> > > Olivier Lamy
> >> > >>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> > >> 
> >> > >> --
> >> > >> Olivier Lamy
> >> > >> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> > > 
> >> > > --
> >> > > Olivier Lamy
> >> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
> > 
> > --
> > Olivier Lamy
> > http://twitter.com/olamy | http://linkedin.com/in/olamy



Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
Maybe an option is to use some shading?
I'm thinking of shading the Lucene packages used by maven-indexer. I can easily
provide a build for that.
WDYT?
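
To make the shading idea concrete, here is a rough sketch of what a relocation does to class names. It is not the maven-shade-plugin implementation, only an illustration of the renaming rule. The class `RelocationSketch` and the target package `org.apache.maven.index.shaded.lucene` are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrates the package renaming that a shade-style relocation performs. */
final class RelocationSketch {

    // Relocation rules: original package prefix -> shaded package prefix.
    private final Map<String, String> rules = new LinkedHashMap<>();

    RelocationSketch relocate(String fromPrefix, String toPrefix) {
        rules.put(fromPrefix, toPrefix);
        return this;
    }

    /** Returns the relocated class name, or the name unchanged if no rule matches. */
    String apply(String className) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            String from = rule.getKey();
            // Match whole package segments only, so "org.apache.lucenex" is not touched.
            if (className.equals(from) || className.startsWith(from + ".")) {
                return rule.getValue() + className.substring(from.length());
            }
        }
        return className;
    }

    public static void main(String[] args) {
        // Hypothetical target package; a real build would pick its own name.
        RelocationSketch shade = new RelocationSketch()
                .relocate("org.apache.lucene", "org.apache.maven.index.shaded.lucene");
        System.out.println(shade.apply("org.apache.lucene.index.IndexWriter"));
        System.out.println(shade.apply("org.apache.jackrabbit.oak.Oak"));
    }
}
```

With the Lucene classes used by maven-indexer moved to their own package, Oak could presumably keep its own Lucene version on the same classpath without a clash.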


On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org> wrote:

> Hi
> graph/document storage could be convenient (but not possible with neo4j as
> it's GPL license [1])
> well we can add solr as an additional webapp with our jetty distribution
> but this will be a pain for users who want to use tomcat or any other
> servlet container...
> we still need to investigate a new storage model :-)
>
> Olivier
> [1] https://neo4j.com/licensing/
>
> On 25 June 2017 at 06:26, Martin <ma...@apache.org> wrote:
>
>> Yes, you are right. The lucene dependency causes a lot of trouble and will
>> cause headaches with each version change of one of the dependencies.
>> What are the requirements for a replacement?
>> - We want to store hierarchical data?
>> - We want to store metadata for nodes ?
>> - Fulltext search (only metadata or for artifacts too?)
>> - Blob / Artifact storage (I don't think so, but not so familiar with the
>> archiva artifact model)?
>>
>> Maybe some graph database may be an alternative. Don't know if the
>> license of
>> neo4j is compatible to the apache license, and I think it brings lucene as
>> dependency too. I will have a look.
>> Problem is, if there is fulltext search needed, I think, for most of the
>> frameworks we get a lucene dependency, if it's embedded.
>>
>> Other alternatives:
>> - Implement fulltext search by our own (index of the metadata stored via
>> the
>> archiva api) and use the lucene dependency that comes from the
>> maven-indexer
>> - Jcr Oak with Solr. Solr is not embedded, must run as its own application
>> (war).
>>
>> Greetings
>>
>> Martin
>>
>>
>>
>> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
>> > well this gonna be a pain.
>> > IMHO we need to find a new alternative to jcr oak.
>> > And something not using Lucene as it's a real pain to have different
>> > librairies using lucene as they do not update in the same time (and
>> Lucene
>> > break backward compat so quickly...)
>> > Any ideas? I'd like to have something embedded (but with a possible
>> > external server configuration).
>> > There is currently a Cassandra implementation. I was not satisfied about
>> > performance but I guess I did that 4yo ago so can be improved for sure
>> :-)
>> > Maybe orientdb?
>> > What else?
>> >
>> > On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org> wrote:
>> > > well the issue is non compatible version of Lucene for Maven Indexer
>> and
>> > > Oak (well I can try push a patch to Oak for upgrading...)
>> > >
>> > > On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
>> > >> Hi
>> > >> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus bridge.
>> > >> I'm working on it in the branch ( feature/jcr_oak )
>> > >> Not sure why but I have intermittent failure with store-jcr module.
>> > >> I definitely agree on the upgrade.
>> > >> Well we can simply detect it's not oak compatible and schedule a full
>> > >> reindex (maybe with a message in logs and ui?)
>> > >> But we need to be sure we can still read central index and not sure
>> about
>> > >> possible lucene conflict with oak and maven indexer.
>> > >> We can work on this branch? (I created a Jenkins job for it
>> > >> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
>> > >> If you prefer master I would say no worries neither.
>> > >> Something else to look at is upgrading maven-core etc...
>> > >> Anyway
>> > >> Cheers
>> > >> Olivier
>> > >>
>> > >> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
>> > >>> Hi,
>> > >>>
>> > >>> upgrading the maven indexer leads to some major changes.
>> > >>> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit
>> > >>> sticks to
>> > >>> the old 3.x version and, as I see it, they will not move to a newer
>> > >>> version.
>> > >>> There is Jackrabbit Oak as alternative.
>> > >>> I tried a proof of concept and could replace the jackrabbit
>> > >>> implementation of
>> > >>> metadata-store-jcr with a oak implementation. At least I got the
>> unit
>> > >>> tests of
>> > >>> this module all to pass.
>> > >>> But switching to Oak has some drawbacks:
>> > >>> - The repository format changed and we must provide a way to migrate
>> > >>> (either
>> > >>> migrate the existing repository or create a new one by reindexing)
>> > >>> - The lucene version used is newer but does not match to the version
>> > >>> from the
>> > >>> maven-indexer dependencies. There may come up some incompatibilities
>> > >>> that are
>> > >>> not solvable without using a modified version of one of the both. Or
>> > >>> there may
>> > >>> be the possibility to switch to solr (as separate component) and
>> get rid
>> > >>> of
>> > >>> the lucene dependencies for jcr inside the archiva project.
>> > >>>
>> > >>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
>> > >>> - The Plexus-Sisu-Bridge does not work as before.
>> > >>> - We must migrate from the NexusIndexer to the indexer API.
>> > >>>
>> > >>> So switching to the new indexer and oak means more work as expected
>> and
>> > >>> some
>> > >>> risks regarding new incompatibility problems. And I think this
>> cannot be
>> > >>> done
>> > >>> without broken master builds for some time period.
>> > >>>
>> > >>> So, what should we do? I think maven indexer is one of the core
>> > >>> components of
>> > >>> archiva, and we should utilize the 3.x-version to  migrate to the
>> new
>> > >>> indexer
>> > >>> version, even if this means switching to jcr oak. Otherwise it would
>> > >>> mean to
>> > >>> stick to the old version for the next years.
>> > >>> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I
>> hope
>> > >>> you
>> > >>> can provide  useful help.
>> > >>>
>> > >>> I committed the PoC to the branch feature/jcr_oak. There are some
>> > >>> modules
>> > >>> where the tests do not pass (mainly because of the indexer API
>> changes).
>> > >>>
>> > >>> Any comments?
>> > >>>
>> > >>> Cheers
>> > >>>
>> > >>> Martin
>> > >>>
>> > >>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
>> > >>> > forget it but we need to ensure we can read maven index files....
>> > >>> >
>> > >>> > On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
>> > >>> > > Hi,
>> > >>> > > Remember jackrabbit depends on Lucene as well so upgrading
>> Lucene
>> > >>>
>> > >>> can be a
>> > >>>
>> > >>> > > problem here.
>> > >>> > > Regarding maven-indexer yes we can depend on a snapshot until
>> the
>> > >>>
>> > >>> release.
>> > >>>
>> > >>> > > I can release it ;-)
>> > >>> > >
>> > >>> > > On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
>> > >>> > >> Hi,
>> > >>> > >>
>> > >>> > >> the lucene version depends on the maven indexer. But I'm not
>> sure
>> > >>>
>> > >>> about
>> > >>>
>> > >>> > >> the
>> > >>> > >> current state of maven-indexer. The version has not changed
>> since
>> > >>>
>> > >>> some
>> > >>>
>> > >>> > >> 2013.
>> > >>> > >>
>> > >>> > >> There are commits on the master branch since then, and the
>> lucene
>> > >>>
>> > >>> version
>> > >>>
>> > >>> > >> has
>> > >>> > >> been changed too, but no releases were tagged.
>> > >>> > >> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
>> > >>> > >>
>> > >>> > >> As I know there are new compact index formats with new lucene
>> > >>>
>> > >>> versions
>> > >>>
>> > >>> > >> but I'm
>> > >>> > >> not sure if this is relevant for the maven indexes.
>> > >>> > >>
>> > >>> > >> Cheers
>> > >>> > >>
>> > >>> > >> Martin
>> > >>> > >
>> > >>> > > --
>> > >>> > > Olivier Lamy
>> > >>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
>> > >>
>> > >> --
>> > >> Olivier Lamy
>> > >> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> > >
>> > > --
>> > > Olivier Lamy
>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
>>
>>
>>
>
>
> --
> Olivier Lamy
> http://twitter.com/olamy | http://linkedin.com/in/olamy
>



-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
Hi,
graph/document storage could be convenient (but not possible with neo4j, as
it is GPL-licensed [1]).
We could add Solr as an additional webapp with our Jetty distribution,
but this would be a pain for users who want to use Tomcat or any other
servlet container...
We still need to investigate a new storage model :-)

Olivier
[1] https://neo4j.com/licensing/

On 25 June 2017 at 06:26, Martin <ma...@apache.org> wrote:

> Yes, you are right. The lucene dependency causes a lot of trouble and will
> cause headaches with each version change of one of the dependencies.
> What are the requirements for a replacement?
> - We want to store hierarchical data?
> - We want to store metadata for nodes ?
> - Fulltext search (only metadata or for artifacts too?)
> - Blob / Artifact storage (I don't think so, but not so familiar with the
> archiva artifact model)?
>
> Maybe some graph database may be an alternative. Don't know if the license
> of
> neo4j is compatible to the apache license, and I think it brings lucene as
> dependency too. I will have a look.
> Problem is, if there is fulltext search needed, I think, for most of the
> frameworks we get a lucene dependency, if it's embedded.
>
> Other alternatives:
> - Implement fulltext search by our own (index of the metadata stored via
> the
> archiva api) and use the lucene dependency that comes from the
> maven-indexer
> - Jcr Oak with Solr. Solr is not embedded, must run as its own application
> (war).
>
> Greetings
>
> Martin
>
>
>
> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> > well this gonna be a pain.
> > IMHO we need to find a new alternative to jcr oak.
> > And something not using Lucene as it's a real pain to have different
> > librairies using lucene as they do not update in the same time (and
> Lucene
> > break backward compat so quickly...)
> > Any ideas? I'd like to have something embedded (but with a possible
> > external server configuration).
> > There is currently a Cassandra implementation. I was not satisfied about
> > performance but I guess I did that 4yo ago so can be improved for sure
> :-)
> > Maybe orientdb?
> > What else?
> >
> > On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org> wrote:
> > > well the issue is non compatible version of Lucene for Maven Indexer
> and
> > > Oak (well I can try push a patch to Oak for upgrading...)
> > >
> > > On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
> > >> Hi
> > >> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus bridge.
> > >> I'm working on it in the branch ( feature/jcr_oak )
> > >> Not sure why but I have intermittent failure with store-jcr module.
> > >> I definitely agree on the upgrade.
> > >> Well we can simply detect it's not oak compatible and schedule a full
> > >> reindex (maybe with a message in logs and ui?)
> > >> But we need to be sure we can still read central index and not sure
> about
> > >> possible lucene conflict with oak and maven indexer.
> > >> We can work on this branch? (I created a Jenkins job for it
> > >> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
> > >> If you prefer master I would say no worries neither.
> > >> Something else to look at is upgrading maven-core etc...
> > >> Anyway
> > >> Cheers
> > >> Olivier
> > >>
> > >> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
> > >>> Hi,
> > >>>
> > >>> upgrading the maven indexer leads to some major changes.
> > >>> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit
> > >>> sticks to
> > >>> the old 3.x version and, as I see it, they will not move to a newer
> > >>> version.
> > >>> There is Jackrabbit Oak as alternative.
> > >>> I tried a proof of concept and could replace the jackrabbit
> > >>> implementation of
> > >>> metadata-store-jcr with a oak implementation. At least I got the unit
> > >>> tests of
> > >>> this module all to pass.
> > >>> But switching to Oak has some drawbacks:
> > >>> - The repository format changed and we must provide a way to migrate
> > >>> (either
> > >>> migrate the existing repository or create a new one by reindexing)
> > >>> - The lucene version used is newer but does not match to the version
> > >>> from the
> > >>> maven-indexer dependencies. There may come up some incompatibilities
> > >>> that are
> > >>> not solvable without using a modified version of one of the both. Or
> > >>> there may
> > >>> be the possibility to switch to solr (as separate component) and get
> rid
> > >>> of
> > >>> the lucene dependencies for jcr inside the archiva project.
> > >>>
> > >>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
> > >>> - The Plexus-Sisu-Bridge does not work as before.
> > >>> - We must migrate from the NexusIndexer to the indexer API.
> > >>>
> > >>> So switching to the new indexer and oak means more work as expected
> and
> > >>> some
> > >>> risks regarding new incompatibility problems. And I think this
> cannot be
> > >>> done
> > >>> without broken master builds for some time period.
> > >>>
> > >>> So, what should we do? I think maven indexer is one of the core
> > >>> components of
> > >>> archiva, and we should utilize the 3.x-version to  migrate to the new
> > >>> indexer
> > >>> version, even if this means switching to jcr oak. Otherwise it would
> > >>> mean to
> > >>> stick to the old version for the next years.
> > >>> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I
> hope
> > >>> you
> > >>> can provide  useful help.
> > >>>
> > >>> I committed the PoC to the branch feature/jcr_oak. There are some
> > >>> modules
> > >>> where the tests do not pass (mainly because of the indexer API
> changes).
> > >>>
> > >>> Any comments?
> > >>>
> > >>> Cheers
> > >>>
> > >>> Martin
> > >>>
> > >>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> > >>> > forget it but we need to ensure we can read maven index files....
> > >>> >
> > >>> > On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
> > >>> > > Hi,
> > >>> > > Remember jackrabbit depends on Lucene as well so upgrading Lucene
> > >>>
> > >>> can be a
> > >>>
> > >>> > > problem here.
> > >>> > > Regarding maven-indexer yes we can depend on a snapshot until the
> > >>>
> > >>> release.
> > >>>
> > >>> > > I can release it ;-)
> > >>> > >
> > >>> > > On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
> > >>> > >> Hi,
> > >>> > >>
> > >>> > >> the lucene version depends on the maven indexer. But I'm not
> sure
> > >>>
> > >>> about
> > >>>
> > >>> > >> the
> > >>> > >> current state of maven-indexer. The version has not changed
> since
> > >>>
> > >>> some
> > >>>
> > >>> > >> 2013.
> > >>> > >>
> > >>> > >> There are commits on the master branch since then, and the
> lucene
> > >>>
> > >>> version
> > >>>
> > >>> > >> has
> > >>> > >> been changed too, but no releases were tagged.
> > >>> > >> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
> > >>> > >>
> > >>> > >> As I know there are new compact index formats with new lucene
> > >>>
> > >>> versions
> > >>>
> > >>> > >> but I'm
> > >>> > >> not sure if this is relevant for the maven indexes.
> > >>> > >>
> > >>> > >> Cheers
> > >>> > >>
> > >>> > >> Martin
> > >>> > >
> > >>> > > --
> > >>> > > Olivier Lamy
> > >>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >>
> > >> --
> > >> Olivier Lamy
> > >> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >
> > > --
> > > Olivier Lamy
> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
>
>
>


-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Martin <ma...@apache.org>.
Yes, you are right. The Lucene dependency causes a lot of trouble and will
cause headaches with each version change of one of the dependencies.
What are the requirements for a replacement?
- Do we want to store hierarchical data?
- Do we want to store metadata for nodes?
- Fulltext search (only for metadata, or for artifacts too)?
- Blob / artifact storage (I don't think so, but I'm not so familiar with the
archiva artifact model)?
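
As a discussion aid, the first three requirements could be sketched as a tiny in-memory store. This is only an assumption about what a replacement would need to offer, not the existing archiva metadata API. The class `MetadataStoreSketch` and its methods are hypothetical, and the search is a naive substring match rather than a real fulltext index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** In-memory sketch: hierarchical paths, per-node metadata, naive metadata search. */
final class MetadataStoreSketch {

    // Path -> metadata. The sorted map keeps all descendants of a path contiguous.
    private final TreeMap<String, Map<String, String>> nodes = new TreeMap<>();

    void put(String path, Map<String, String> metadata) {
        nodes.put(path, new TreeMap<>(metadata));
    }

    /** Direct children of a path (descendants exactly one level below it). */
    List<String> children(String parent) {
        String prefix = parent.endsWith("/") ? parent : parent + "/";
        List<String> result = new ArrayList<>();
        for (String path : nodes.tailMap(prefix, false).keySet()) {
            if (!path.startsWith(prefix)) {
                break; // left the subtree, nothing further can match
            }
            if (path.indexOf('/', prefix.length()) < 0) {
                result.add(path);
            }
        }
        return result;
    }

    /** Case-insensitive substring match over all metadata values. */
    List<String> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> node : nodes.entrySet()) {
            for (String value : node.getValue().values()) {
                if (value.toLowerCase(Locale.ROOT).contains(needle)) {
                    hits.add(node.getKey());
                    break; // report each node once
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        MetadataStoreSketch store = new MetadataStoreSketch();
        store.put("/org/apache/archiva",
                Collections.singletonMap("description", "Repository manager"));
        store.put("/org/apache/archiva/archiva-core",
                Collections.singletonMap("description", "Core module"));
        System.out.println(store.children("/org/apache/archiva"));
        System.out.println(store.search("repository"));
    }
}
```

Anything beyond this, such as tokenized or ranked search, is where an embedded framework would most likely pull in Lucene again.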

Maybe a graph database would be an alternative. I don't know if the license of
neo4j is compatible with the Apache license, and I think it brings in Lucene as
a dependency too. I will have a look.
The problem is that if fulltext search is needed, I think most of the
embedded frameworks will give us a Lucene dependency.

Other alternatives:
- Implement fulltext search on our own (an index of the metadata stored via the
archiva API) and use the Lucene dependency that comes with maven-indexer.
- JCR Oak with Solr. Solr is not embedded; it must run as its own application
(war).

Greetings

Martin



On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> well this gonna be a pain.
> IMHO we need to find a new alternative to jcr oak.
> And something not using Lucene as it's a real pain to have different
> librairies using lucene as they do not update in the same time (and Lucene
> break backward compat so quickly...)
> Any ideas? I'd like to have something embedded (but with a possible
> external server configuration).
> There is currently a Cassandra implementation. I was not satisfied about
> performance but I guess I did that 4yo ago so can be improved for sure :-)
> Maybe orientdb?
> What else?
> 
> On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org> wrote:
> > well the issue is non compatible version of Lucene for Maven Indexer and
> > Oak (well I can try push a patch to Oak for upgrading...)
> > 
> > On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
> >> Hi
> >> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus bridge.
> >> I'm working on it in the branch ( feature/jcr_oak )
> >> Not sure why but I have intermittent failure with store-jcr module.
> >> I definitely agree on the upgrade.
> >> Well we can simply detect it's not oak compatible and schedule a full
> >> reindex (maybe with a message in logs and ui?)
> >> But we need to be sure we can still read central index and not sure about
> >> possible lucene conflict with oak and maven indexer.
> >> We can work on this branch? (I created a Jenkins job for it
> >> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
> >> If you prefer master I would say no worries neither.
> >> Something else to look at is upgrading maven-core etc...
> >> Anyway
> >> Cheers
> >> Olivier
> >> 
> >> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
> >>> Hi,
> >>> 
> >>> upgrading the maven indexer leads to some major changes.
> >>> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit
> >>> sticks to
> >>> the old 3.x version and, as I see it, they will not move to a newer
> >>> version.
> >>> There is Jackrabbit Oak as alternative.
> >>> I tried a proof of concept and could replace the jackrabbit
> >>> implementation of
> >>> metadata-store-jcr with a oak implementation. At least I got the unit
> >>> tests of
> >>> this module all to pass.
> >>> But switching to Oak has some drawbacks:
> >>> - The repository format changed and we must provide a way to migrate
> >>> (either
> >>> migrate the existing repository or create a new one by reindexing)
> >>> - The lucene version used is newer but does not match to the version
> >>> from the
> >>> maven-indexer dependencies. There may come up some incompatibilities
> >>> that are
> >>> not solvable without using a modified version of one of the both. Or
> >>> there may
> >>> be the possibility to switch to solr (as separate component) and get rid
> >>> of
> >>> the lucene dependencies for jcr inside the archiva project.
> >>> 
> >>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
> >>> - The Plexus-Sisu-Bridge does not work as before.
> >>> - We must migrate from the NexusIndexer to the indexer API.
> >>> 
> >>> So switching to the new indexer and oak means more work as expected and
> >>> some
> >>> risks regarding new incompatibility problems. And I think this cannot be
> >>> done
> >>> without broken master builds for some time period.
> >>> 
> >>> So, what should we do? I think maven indexer is one of the core
> >>> components of
> >>> archiva, and we should utilize the 3.x-version to  migrate to the new
> >>> indexer
> >>> version, even if this means switching to jcr oak. Otherwise it would
> >>> mean to
> >>> stick to the old version for the next years.
> >>> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I hope
> >>> you
> >>> can provide  useful help.
> >>> 
> >>> I committed the PoC to the branch feature/jcr_oak. There are some
> >>> modules
> >>> where the tests do not pass (mainly because of the indexer API changes).
> >>> 
> >>> Any comments?
> >>> 
> >>> Cheers
> >>> 
> >>> Martin
> >>> 
> >>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> >>> > forget it but we need to ensure we can read maven index files....
> >>> > 
> >>> > On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
> >>> > > Hi,
> >>> > > Remember jackrabbit depends on Lucene as well so upgrading Lucene can be a
> >>> > > problem here.
> >>> > > Regarding maven-indexer yes we can depend on a snapshot until the release.
> >>> > > I can release it ;-)
> >>> > > 
> >>> > > On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
> >>> > >> Hi,
> >>> > >> 
> >>> > >> the lucene version depends on the maven indexer. But I'm not sure about
> >>> > >> the current state of maven-indexer. The version has not changed since
> >>> > >> some 2013.
> >>> > >> 
> >>> > >> There are commits on the master branch since then, and the lucene version
> >>> > >> has been changed too, but no releases were tagged.
> >>> > >> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
> >>> > >> 
> >>> > >> As I know there are new compact index formats with new lucene versions
> >>> > >> but I'm not sure if this is relevant for the maven indexes.
> >>> > >> 
> >>> > >> Cheers
> >>> > >> 
> >>> > >> Martin
> >>> > > 
> >>> > > --
> >>> > > Olivier Lamy
> >>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> 
> >> --
> >> Olivier Lamy
> >> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > 
> > --
> > Olivier Lamy
> > http://twitter.com/olamy | http://linkedin.com/in/olamy



Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
Well, this is going to be a pain.
IMHO we need to find an alternative to JCR Oak,
and something that does not use Lucene: it's a real pain to have different
libraries depending on Lucene, as they do not upgrade at the same time (and
Lucene breaks backward compatibility so quickly...).
Any ideas? I'd like to have something embedded (but with a possible
external server configuration).
There is currently a Cassandra implementation. I was not satisfied with its
performance, but I did that about 4 years ago, so it can surely be improved :-)
Maybe OrientDB?
What else?



On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org> wrote:

> well the issue is non compatible version of Lucene for Maven Indexer and
> Oak (well I can try push a patch to Oak for upgrading...)
>
> On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
>
>> Hi
>> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus bridge.
>> I'm working on it in the branch ( feature/jcr_oak )
>> Not sure why but I have intermittent failure with store-jcr module.
>> I definitely agree on the upgrade.
>> Well we can simply detect it's not oak compatible and schedule a full
>> reindex (maybe with a message in logs and ui?)
>> But we need to be sure we can still read central index and not sure about
>> possible lucene conflict with oak and maven indexer.
>> We can work on this branch? (I created a Jenkins job for it
>> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
>> If you prefer master I would say no worries neither.
>> Something else to look at is upgrading maven-core etc...
>> Anyway
>> Cheers
>> Olivier
>>
>>
>>
>> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
>>
>>> Hi,
>>>
>>> upgrading the maven indexer leads to some major changes.
>>> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit
>>> sticks to
>>> the old 3.x version and, as I see it, they will not move to a newer
>>> version.
>>> There is Jackrabbit Oak as alternative.
>>> I tried a proof of concept and could replace the jackrabbit
>>> implementation of
>>> metadata-store-jcr with a oak implementation. At least I got the unit
>>> tests of
>>> this module all to pass.
>>> But switching to Oak has some drawbacks:
>>> - The repository format changed and we must provide a way to migrate
>>> (either
>>> migrate the existing repository or create a new one by reindexing)
>>> - The lucene version used is newer but does not match to the version
>>> from the
>>> maven-indexer dependencies. There may come up some incompatibilities
>>> that are
>>> not solvable without using a modified version of one of the both. Or
>>> there may
>>> be the possibility to switch to solr (as separate component) and get rid
>>> of
>>> the lucene dependencies for jcr inside the archiva project.
>>>
>>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
>>> - The Plexus-Sisu-Bridge does not work as before.
>>> - We must migrate from the NexusIndexer to the indexer API.
>>>
>>> So switching to the new indexer and oak means more work as expected and
>>> some
>>> risks regarding new incompatibility problems. And I think this cannot be
>>> done
>>> without broken master builds for some time period.
>>>
>>> So, what should we do? I think maven indexer is one of the core
>>> components of
>>> archiva, and we should utilize the 3.x-version to  migrate to the new
>>> indexer
>>> version, even if this means switching to jcr oak. Otherwise it would
>>> mean to
>>> stick to the old version for the next years.
>>> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I hope
>>> you
>>> can provide  useful help.
>>>
>>> I committed the PoC to the branch feature/jcr_oak. There are some modules
>>> where the tests do not pass (mainly because of the indexer API changes).
>>>
>>> Any comments?
>>>
>>> Cheers
>>>
>>> Martin
>>>
>>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
>>> > forget it but we need to ensure we can read maven index files....
>>> >
>>> > On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
>>> > > Hi,
>>> > > Remember jackrabbit depends on Lucene as well so upgrading Lucene can be a
>>> > > problem here.
>>> > > Regarding maven-indexer yes we can depend on a snapshot until the release.
>>> > > I can release it ;-)
>>> > >
>>> > > On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
>>> > >> Hi,
>>> > >>
>>> > >> the lucene version depends on the maven indexer. But I'm not sure about
>>> > >> the current state of maven-indexer. The version has not changed since
>>> > >> some 2013.
>>> > >>
>>> > >> There are commits on the master branch since then, and the lucene version
>>> > >> has been changed too, but no releases were tagged.
>>> > >> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
>>> > >>
>>> > >> As I know there are new compact index formats with new lucene versions
>>> > >> but I'm not sure if this is relevant for the maven indexes.
>>> > >>
>>> > >> Cheers
>>> > >>
>>> > >> Martin
>>> > >
>>> > > --
>>> > > Olivier Lamy
>>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
>>>
>>>
>>>
>>
>>
>> --
>> Olivier Lamy
>> http://twitter.com/olamy | http://linkedin.com/in/olamy
>>
>
>
>
> --
> Olivier Lamy
> http://twitter.com/olamy | http://linkedin.com/in/olamy
>



-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
Well, the issue is incompatible Lucene versions between Maven Indexer and
Oak (I can try to push a patch to Oak to upgrade it...).

On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:

> Hi
> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus bridge.
> I'm working on it in the branch ( feature/jcr_oak )
> Not sure why but I have intermittent failure with store-jcr module.
> I definitely agree on the upgrade.
> Well we can simply detect it's not oak compatible and schedule a full
> reindex (maybe with a message in logs and ui?)
> But we need to be sure we can still read central index and not sure about
> possible lucene conflict with oak and maven indexer.
> We can work on this branch? (I created a Jenkins job for it
> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
> If you prefer master I would say no worries neither.
> Something else to look at is upgrading maven-core etc...
> Anyway
> Cheers
> Olivier
>
>
>
> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
>
>> Hi,
>>
>> upgrading the maven indexer leads to some major changes.
>> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit sticks
>> to
>> the old 3.x version and, as I see it, they will not move to a newer
>> version.
>> There is Jackrabbit Oak as alternative.
>> I tried a proof of concept and could replace the jackrabbit
>> implementation of
>> metadata-store-jcr with a oak implementation. At least I got the unit
>> tests of
>> this module all to pass.
>> But switching to Oak has some drawbacks:
>> - The repository format changed and we must provide a way to migrate
>> (either
>> migrate the existing repository or create a new one by reindexing)
>> - The lucene version used is newer but does not match to the version from
>> the
>> maven-indexer dependencies. There may come up some incompatibilities that
>> are
>> not solvable without using a modified version of one of the both. Or
>> there may
>> be the possibility to switch to solr (as separate component) and get rid
>> of
>> the lucene dependencies for jcr inside the archiva project.
>>
>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
>> - The Plexus-Sisu-Bridge does not work as before.
>> - We must migrate from the NexusIndexer to the indexer API.
>>
>> So switching to the new indexer and oak means more work as expected and
>> some
>> risks regarding new incompatibility problems. And I think this cannot be
>> done
>> without broken master builds for some time period.
>>
>> So, what should we do? I think maven indexer is one of the core
>> components of
>> archiva, and we should utilize the 3.x-version to  migrate to the new
>> indexer
>> version, even if this means switching to jcr oak. Otherwise it would mean
>> to
>> stick to the old version for the next years.
>> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I hope
>> you
>> can provide  useful help.
>>
>> I committed the PoC to the branch feature/jcr_oak. There are some modules
>> where the tests do not pass (mainly because of the indexer API changes).
>>
>> Any comments?
>>
>> Cheers
>>
>> Martin
>>
>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
>> > forget it but we need to ensure we can read maven index files....
>> >
>> > On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
>> > > Hi,
>> > > Remember jackrabbit depends on Lucene as well so upgrading Lucene can be a
>> > > problem here.
>> > > Regarding maven-indexer yes we can depend on a snapshot until the release.
>> > > I can release it ;-)
>> > >
>> > > On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
>> > >> Hi,
>> > >>
>> > >> the lucene version depends on the maven indexer. But I'm not sure about
>> > >> the current state of maven-indexer. The version has not changed since
>> > >> some 2013.
>> > >>
>> > >> There are commits on the master branch since then, and the lucene version
>> > >> has been changed too, but no releases were tagged.
>> > >> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
>> > >>
>> > >> As I know there are new compact index formats with new lucene versions
>> > >> but I'm not sure if this is relevant for the maven indexes.
>> > >>
>> > >> Cheers
>> > >>
>> > >> Martin
>> > >
>> > > --
>> > > Olivier Lamy
>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
>>
>>
>>
>
>
> --
> Olivier Lamy
> http://twitter.com/olamy | http://linkedin.com/in/olamy
>



-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
Hi
Maven Indexer 6.0-SNAPSHOT no longer needs the plexus bridge.
I'm working on it in the branch ( feature/jcr_oak )
Not sure why, but I get intermittent failures with the store-jcr module.
I definitely agree on the upgrade.
Well, we can simply detect that the existing repository is not Oak compatible and
schedule a full reindex (maybe with a message in the logs and UI?)
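
A minimal sketch of that detection idea, assuming the store directory carries a small marker file that records its format. The class name RepositoryFormatCheck, the marker file name, and the "oak" value are all hypothetical; nothing here is taken from Archiva, Jackrabbit, or Oak.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not an Archiva or Oak class. */
final class RepositoryFormatCheck {
    // Assumed marker file name and format value, chosen only for this sketch.
    static final String MARKER_FILE = "store-format.txt";
    static final String EXPECTED_FORMAT = "oak";

    /** Returns true when the marker is missing or records a different format. */
    static boolean needsFullReindex(Path storeDir) throws IOException {
        Path marker = storeDir.resolve(MARKER_FILE);
        if (!Files.isRegularFile(marker)) {
            // No marker: treat the store as created by an older implementation.
            return true;
        }
        String format = new String(Files.readAllBytes(marker), StandardCharsets.UTF_8).trim();
        return !EXPECTED_FORMAT.equals(format);
    }

    /** Records the current format, to be called once a full reindex has completed. */
    static void markCurrent(Path storeDir) throws IOException {
        Files.createDirectories(storeDir);
        Files.write(storeDir.resolve(MARKER_FILE),
                EXPECTED_FORMAT.getBytes(StandardCharsets.UTF_8));
    }
}
```

At startup, a caller could log a warning and queue the reindex when needsFullReindex returns true, then call markCurrent after the reindex finishes.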
But we need to be sure we can still read the central index, and I'm not sure about
a possible Lucene conflict between Oak and Maven Indexer.
Can we work on this branch? (I created a Jenkins job for it:
https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
If you prefer master, no worries either.
Something else to look at is upgrading maven-core etc...
Anyway
Cheers
Olivier



On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:

> Hi,
>
> upgrading the maven indexer leads to some major changes.
> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit sticks
> to
> the old 3.x version and, as I see it, they will not move to a newer
> version.
> There is Jackrabbit Oak as alternative.
> I tried a proof of concept and could replace the jackrabbit implementation
> of
> metadata-store-jcr with a oak implementation. At least I got the unit
> tests of
> this module all to pass.
> But switching to Oak has some drawbacks:
> - The repository format changed and we must provide a way to migrate
> (either
> migrate the existing repository or create a new one by reindexing)
> - The lucene version used is newer but does not match to the version from
> the
> maven-indexer dependencies. There may come up some incompatibilities that
> are
> not solvable without using a modified version of one of the both. Or there
> may
> be the possibility to switch to solr (as separate component) and get rid of
> the lucene dependencies for jcr inside the archiva project.
>
> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
> - The Plexus-Sisu-Bridge does not work as before.
> - We must migrate from the NexusIndexer to the indexer API.
>
> So switching to the new indexer and oak means more work as expected and
> some
> risks regarding new incompatibility problems. And I think this cannot be
> done
> without broken master builds for some time period.
>
> So, what should we do? I think maven indexer is one of the core components
> of
> archiva, and we should utilize the 3.x-version to  migrate to the new
> indexer
> version, even if this means switching to jcr oak. Otherwise it would mean
> to
> stick to the old version for the next years.
> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I hope you
> can provide  useful help.
>
> I committed the PoC to the branch feature/jcr_oak. There are some modules
> where the tests do not pass (mainly because of the indexer API changes).
>
> Any comments?
>
> Cheers
>
> Martin
>
> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> > forget it but we need to ensure we can read maven index files....
> >
> > On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
> > > Hi,
> > > > Remember jackrabbit depends on Lucene as well so upgrading Lucene can be a
> > > > problem here.
> > > > Regarding maven-indexer yes we can depend on a snapshot until the release.
> > > > I can release it ;-)
> > >
> > > On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
> > >> Hi,
> > >>
> > > >> the lucene version depends on the maven indexer. But I'm not sure about
> > > >> the current state of maven-indexer. The version has not changed since
> > > >> some 2013.
> > > >>
> > > >> There are commits on the master branch since then, and the lucene version
> > > >> has been changed too, but no releases were tagged.
> > > >> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
> > > >>
> > > >> As I know there are new compact index formats with new lucene versions
> > > >> but I'm not sure if this is relevant for the maven indexes.
> > >>
> > >> Cheers
> > >>
> > >> Martin
> > >
> > > --
> > > Olivier Lamy
> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
>
>
>


-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Martin <ma...@apache.org>.
Hi,

Upgrading the maven indexer leads to some major changes.
Lucene is used by maven-indexer and also by Jackrabbit. Jackrabbit sticks to
the old 3.x version and, as I see it, will not move to a newer version.
Jackrabbit Oak is an alternative.
I tried a proof of concept and was able to replace the Jackrabbit implementation of
metadata-store-jcr with an Oak implementation. At least I got all unit tests of
this module to pass.
But switching to Oak has some drawbacks:
- The repository format changed and we must provide a way to migrate (either
migrate the existing repository or create a new one by reindexing)
- The Lucene version used is newer but does not match the version from the
maven-indexer dependencies. Incompatibilities may come up that cannot be
solved without using a modified version of one of the two. Alternatively, it may
be possible to switch to Solr (as a separate component) and get rid of
the Lucene dependencies for JCR inside the Archiva project.
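
When two dependencies pull in different Lucene versions, it can help to see which jar actually supplies a given class at runtime. The following is only a diagnostic sketch using standard JDK calls; the class name ClasspathProbe and its return strings are hypothetical and not part of Archiva, maven-indexer, or Oak.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper for illustration only. */
final class ClasspathProbe {
    static final String NOT_FOUND = "not on classpath";
    static final String UNKNOWN = "unknown location";

    /** Reports where the named class was loaded from, without initializing it. */
    static String locate(String className) {
        try {
            Class<?> type = Class.forName(
                    className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // Typical for classes that come from the JDK itself.
                return UNKNOWN;
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is absent, or present but not loadable.
            return NOT_FOUND;
        }
    }
}
```

For example, printing ClasspathProbe.locate("org.apache.lucene.util.Version") from inside the application would show which Lucene jar is picked up first, which may help when comparing the maven-indexer and Oak dependency trees.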

Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
- The Plexus-Sisu-Bridge does not work as before. 
- We must migrate from the NexusIndexer to the indexer API.

So switching to the new indexer and Oak means more work than expected and some
risk of new incompatibility problems. And I think this cannot be done
without breaking the master build for some period of time.

So, what should we do? I think maven indexer is one of the core components of
Archiva, and we should use the 3.x version to migrate to the new indexer
version, even if this means switching to JCR Oak. Otherwise we would
be stuck with the old version for the next few years.
@Olivier, regarding the maven-indexer / sisu-Bridge API changes, I hope you
can help.

I committed the PoC to the branch feature/jcr_oak. There are some modules 
where the tests do not pass (mainly because of the indexer API changes).

Any comments?

Cheers

Martin

On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> forget it but we need to ensure we can read maven index files....
> 
> On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
> > Hi,
> > Remember jackrabbit depends on Lucene as well so upgrading Lucene can be a
> > problem here.
> > Regarding maven-indexer yes we can depend on a snapshot until the release.
> > I can release it ;-)
> > 
> > On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
> >> Hi,
> >> 
> >> the lucene version depends on the maven indexer. But I'm not sure about
> >> the
> >> current state of maven-indexer. The version has not changed since some
> >> 2013.
> >> 
> >> There are commits on the master branch since then, and the lucene version
> >> has
> >> been changed too, but no releases were tagged.
> >> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
> >> 
> >> As I know there are new compact index formats with new lucene versions
> >> but I'm
> >> not sure if this is relevant for the maven indexes.
> >> 
> >> Cheers
> >> 
> >> Martin
> > 
> > --
> > Olivier Lamy
> > http://twitter.com/olamy | http://linkedin.com/in/olamy



Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
forget it but we need to ensure we can read maven index files....

On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:

> Hi,
> Remember jackrabbit depends on Lucene as well so upgrading Lucene can be a
> problem here.
> Regarding maven-indexer yes we can depend on a snapshot until the release.
> I can release it ;-)
>
>
> On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
>
>> Hi,
>>
>> the lucene version depends on the maven indexer. But I'm not sure about
>> the
>> current state of maven-indexer. The version has not changed since some
>> 2013.
>>
>> There are commits on the master branch since then, and the lucene version
>> has
>> been changed too, but no releases were tagged.
>> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
>>
>> As I know there are new compact index formats with new lucene versions
>> but I'm
>> not sure if this is relevant for the maven indexes.
>>
>> Cheers
>>
>> Martin
>>
>
>
>
> --
> Olivier Lamy
> http://twitter.com/olamy | http://linkedin.com/in/olamy
>



-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
Hi,
Remember jackrabbit depends on Lucene as well so upgrading Lucene can be a
problem here.
Regarding maven-indexer yes we can depend on a snapshot until the release.
I can release it ;-)


On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:

> Hi,
>
> the lucene version depends on the maven indexer. But I'm not sure about the
> current state of maven-indexer. The version has not changed since some
> 2013.
>
> There are commits on the master branch since then, and the lucene version
> has
> been changed too, but no releases were tagged.
> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
>
> As I know there are new compact index formats with new lucene versions but
> I'm
> not sure if this is relevant for the maven indexes.
>
> Cheers
>
> Martin
>



-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy