Posted to dev@archiva.apache.org by Olivier Lamy <ol...@apache.org> on 2017/08/15 09:30:04 UTC

Re: maven-indexer / Lucene

Hi
It took a bit of time, but I finally got the branch working :-)
branch: feature/jcr_oak
Let me know what you think.
I guess there are still some optimisations to do for JCR Oak.
I can see warnings like these in the logs:
21:02:39.559 [1071] [main] WARN  oak.query.QueryImpl - Traversal query (query without index): SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /* oak-internal */; consider creating an index
21:02:39.563 [328] [main] WARN  plugins.index.Cursors$TraversingCursor - Traversed 1000 nodes with filter Filter(query=SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /* oak-internal */, path=*, property=[jcr:uuid=[21232f29-7a57-35a7-8389-4a0e4a801fc3]]); consider creating an index or changing the query
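
The warnings above say that the jcr:uuid lookup ran without an index, so nodes were visited one by one. As a rough illustration of why that matters, here is a minimal sketch using plain Java collections. It is not Oak's API: the class, the Node type, and all method names are hypothetical, and the node data is made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: models a "traversal" lookup versus an "indexed" lookup.
// Nothing here is Oak API; every name below is a made-up placeholder.
final class UuidLookupSketch {

    static final class Node {
        final String uuid;
        final String path;

        Node(String uuid, String path) {
            this.uuid = uuid;
            this.path = path;
        }
    }

    /** Creates count nodes with uuids "id-0" .. "id-(count-1)". */
    static List<Node> sampleNodes(int count) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            nodes.add(new Node("id-" + i, "/node/" + i));
        }
        return nodes;
    }

    /** Scans node by node until the uuid matches; returns how many nodes were visited. */
    static int nodesVisitedByTraversal(List<Node> nodes, String uuid) {
        int visited = 0;
        for (Node node : nodes) {
            visited++;
            if (node.uuid.equals(uuid)) {
                break;
            }
        }
        return visited;
    }

    /** Builds a uuid -> node map once; this plays the role of a property index. */
    static Map<String, Node> buildIndex(List<Node> nodes) {
        Map<String, Node> index = new HashMap<>();
        for (Node node : nodes) {
            index.put(node.uuid, node);
        }
        return index;
    }

    public static void main(String[] args) {
        List<Node> nodes = sampleNodes(1000);
        String wanted = "id-999";
        System.out.println("traversal visited " + nodesVisitedByTraversal(nodes, wanted) + " nodes");
        System.out.println("index lookup found " + buildIndex(nodes).get(wanted).path);
    }
}
```

In the worst case the scan touches every node, while the map answers with a single lookup once it has been built. How the corresponding index is defined in Oak is not covered by this sketch.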





On 8 July 2017 at 06:22, Martin <ma...@apache.org> wrote:

> Hi Olivier,
>
> great!
> For my understanding: the dependency on Lucene in the pom of indexer-core
> is still there, but the Lucene packages are moved to the
> ...maven.index.shaded... package? You develop indexer-core with the
> standard Lucene packages, and the shading is executed during the build of
> the indexer package?
>
> I think that may solve our dependency problem.
>
> I still got errors in the maven-indexer module, but I think the status is
> still "work in progress". I don't want to interfere too much with your
> changes.
>
> I'm not sure whether we should keep JCR Oak as the metadata implementation.
> I think OrientDB may be a feasible alternative: embeddable, graph database,
> Lucene index optional (it may be omitted), Apache License. And with JCR Oak
> we also have to convert the existing metadata index.
>
> But one step after the other. If we agree that the shaded indexer works, we
> should merge only the maven-indexer changes to the master branch, without
> the JCR/Lucene update, and change JCR and/or Lucene afterwards.
>
> Greetings
>
> Martin
>
> On Friday, 7 July 2017, 09:23:24 CEST, Olivier Lamy wrote:
> > So the repo contains a branch feature/jar_shaded_lucene here
> > https://git1-us-west.apache.org/repos/asf?p=maven-indexer.git;a=summary
> > and I pushed what I started for Archiva in the branch called feature/jcr_oak.
> > So in order to test it, you first need to build maven-indexer from the
> > branch feature/jar_shaded_lucene.
> >
> > On 6 July 2017 at 22:31, Olivier Lamy <ol...@apache.org> wrote:
> > > I will try to share the work I did tomorrow in a branch
> > >
> > > On Thu, 6 Jul 2017 at 7:48 pm, Martin Stockhammer <martin_s@apache.org> wrote:
> > >> We have different (incompatible) Lucene dependencies that prevent us
> > >> from updating the maven indexer and/or Jackrabbit. And this will happen
> > >> again with each upgrade of one of these two packages in the future.
> > >> So it would be really good if we could find a solution that removes one
> > >> of the Lucene dependencies.
> > >>
> > >> Greetings
> > >>
> > >> Martin
> > >>
> > >>
> > >> On 6 July 2017 at 09:36:06 CEST, Chris Graham <chrisgwarp@gmail.com> wrote:
> > >>
> > >> >Can I please ask an obvious/stupid question?
> > >> >
> > >> >What is driving this need for change?
> > >> >
> > >> >From a quick read of the thread above, all of the options appear to
> > >> >introduce a lot of breaking changes, and a whole lot more uncertainty.
> > >> >
> > >> >So, what is so broken that it is driving these changes?
> > >> >
> > >> >Sent from my iPhone
> > >> >
> > >> >> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <ol...@apache.org> wrote:
> > >> >>
> > >> >> Yup.
> > >> >> The idea is to have an extra jar produced by maven-indexer with a
> > >> >> shaded Lucene version.
> > >> >> So the Lucene classes (the version used by Maven Indexer) will be
> > >> >> relocated into a package called org.apache.maven.index.shaded.lucene
> > >> >> (such as org.apache.maven.index.shaded.lucene.search.BooleanClause).
> > >> >> Then you exclude the Lucene dependencies used by maven-indexer and voila.
> > >> >> The voila is a bit optimistic and not so easy, but anyway I'm working
> > >> >> on it ATM.
> > >> >
> > >> >>> On 6 July 2017 at 07:08, Martin <ma...@apache.org> wrote:
> > >> >>>
> > >> >>> What do you mean exactly by shading? Moving to another package name?
> > >> >>>
> > >> >>> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
> > >> >>>> Maybe an option is to use some shading?
> > >> >>>> I'm thinking of shading the Lucene packages used by maven-indexer. I
> > >> >>>> can easily provide a build for that.
> > >> >>>> WDYT?
> > >> >>>>
> > >> >>>>> On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org> wrote:
> > >> >>>>> Hi
> > >> >>>>> Graph/document storage could be convenient (but not possible with
> > >> >>>>> Neo4j, as it is GPL-licensed [1]).
> > >> >>>>> Well, we can add Solr as an additional webapp with our Jetty
> > >> >>>>> distribution, but this will be a pain for users who want to use
> > >> >>>>> Tomcat or any other servlet container...
> > >> >>>>> We still need to investigate a new storage model :-)
> > >> >>>>>
> > >> >>>>> Olivier
> > >> >>>>> [1] https://neo4j.com/licensing/
> > >> >>>>>
> > >> >>>>>> On 25 June 2017 at 06:26, Martin <ma...@apache.org> wrote:
> > >> >>>>>> Yes, you are right. The Lucene dependency causes a lot of trouble and
> > >> >>>>>> will cause headaches with each version change of one of the
> > >> >>>>>> dependencies.
> > >> >>>>>> What are the requirements for a replacement?
> > >> >>>>>> - We want to store hierarchical data?
> > >> >>>>>> - We want to store metadata for nodes?
> > >> >>>>>> - Fulltext search (only metadata or for artifacts too?)
> > >> >>>>>> - Blob / artifact storage (I don't think so, but I'm not so familiar
> > >> >>>>>>   with the Archiva artifact model)?
> > >> >>>>>>
> > >> >>>>>> Maybe some graph database may be an alternative. I don't know if the
> > >> >>>>>> license of Neo4j is compatible with the Apache License, and I think it
> > >> >>>>>> brings Lucene as a dependency too. I will have a look.
> > >> >>>>>> The problem is: if fulltext search is needed, I think that with most
> > >> >>>>>> of the frameworks we get a Lucene dependency, if it's embedded.
> > >> >>>>>>
> > >> >>>>>> Other alternatives:
> > >> >>>>>> - Implement fulltext search on our own (index of the metadata stored
> > >> >>>>>>   via the Archiva API) and use the Lucene dependency that comes from
> > >> >>>>>>   maven-indexer
> > >> >>>>>> - JCR Oak with Solr. Solr is not embedded and must run as its own
> > >> >>>>>>   application (war).
> > >> >>>>>>
> > >> >>>>>> Greetings
> > >> >>>>>>
> > >> >>>>>> Martin
> > >> >>>>>>
> > >> >>>>>> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> > >> >>>>>>> Well, this is going to be a pain.
> > >> >>>>>>> IMHO we need to find a new alternative to JCR Oak,
> > >> >>>>>>> and something not using Lucene, as it's a real pain to have different
> > >> >>>>>>> libraries using Lucene when they do not update at the same time (and
> > >> >>>>>>> Lucene breaks backward compatibility so quickly...)
> > >> >>>>>>> Any ideas? I'd like to have something embedded (but with a possible
> > >> >>>>>>> external server configuration).
> > >> >>>>>>> There is currently a Cassandra implementation. I was not satisfied
> > >> >>>>>>> with its performance, but I did that about 4 years ago, so it can be
> > >> >>>>>>> improved for sure :-)
> > >> >>>>>>> Maybe orientdb?
> > >> >>>>>>> What else?
> > >> >>>>>>>
> > >> >>>>>>>> On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org> wrote:
> > >> >>>>>>>> Well, the issue is incompatible versions of Lucene for Maven Indexer
> > >> >>>>>>>> and Oak (well, I can try to push a patch to Oak for upgrading...)
> > >> >>>>>>>>
> > >> >>>>>>>>> On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
> > >> >>>>>>>>> Hi
> > >> >>>>>>>>> Maven Indexer 6.0-SNAPSHOT no longer needs the plexus bridge.
> > >> >>>>>>>>> I'm working on it in the branch (feature/jcr_oak).
> > >> >>>>>>>>> Not sure why, but I have intermittent failures with the store-jcr
> > >> >>>>>>>>> module.
> > >> >>>>>>>>> I definitely agree on the upgrade.
> > >> >>>>>>>>> Well, we can simply detect that it's not Oak compatible and schedule
> > >> >>>>>>>>> a full reindex (maybe with a message in the logs and UI?)
> > >> >>>>>>>>> But we need to be sure we can still read the central index, and I am
> > >> >>>>>>>>> not sure about a possible Lucene conflict between Oak and maven
> > >> >>>>>>>>> indexer.
> > >> >>>>>>>>> Can we work on this branch? (I created a Jenkins job for it:
> > >> >>>>>>>>> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
> > >> >>>>>>>>> If you prefer master, no worries either.
> > >> >>>>>>>>> Something else to look at is upgrading maven-core etc...
> > >> >>>>>>>>> Anyway
> > >> >>>>>>>>> Cheers
> > >> >>>>>>>>> Olivier
> > >> >>>>>>>>>
> > >> >>>>>>>>>> On 22 June 2017 at 19:16, Martin <ma...@apache.org> wrote:
> > >> >>>>>>>>>> Hi,
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> upgrading the maven indexer leads to some major changes.
> > >> >>>>>>>>>> Lucene is used by maven-indexer and also by Jackrabbit. Jackrabbit
> > >> >>>>>>>>>> sticks to the old 3.x version and, as I see it, they will not move
> > >> >>>>>>>>>> to a newer version.
> > >> >>>>>>>>>> There is Jackrabbit Oak as an alternative.
> > >> >>>>>>>>>> I tried a proof of concept and could replace the Jackrabbit
> > >> >>>>>>>>>> implementation of metadata-store-jcr with an Oak implementation.
> > >> >>>>>>>>>> At least I got all the unit tests of this module to pass.
> > >> >>>>>>>>>> But switching to Oak has some drawbacks:
> > >> >>>>>>>>>> - The repository format changed and we must provide a way to
> > >> >>>>>>>>>>   migrate (either migrate the existing repository or create a new
> > >> >>>>>>>>>>   one by reindexing)
> > >> >>>>>>>>>> - The Lucene version used is newer but does not match the version
> > >> >>>>>>>>>>   from the maven-indexer dependencies. There may be some
> > >> >>>>>>>>>>   incompatibilities that are not solvable without using a modified
> > >> >>>>>>>>>>   version of one of the two. Or there may be the possibility to
> > >> >>>>>>>>>>   switch to Solr (as a separate component) and get rid of the
> > >> >>>>>>>>>>   Lucene dependencies for JCR inside the Archiva project.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
> > >> >>>>>>>>>> - The Plexus-Sisu-Bridge does not work as before.
> > >> >>>>>>>>>> - We must migrate from the NexusIndexer to the indexer API.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> So switching to the new indexer and Oak means more work than
> > >> >>>>>>>>>> expected and some risks regarding new incompatibility problems.
> > >> >>>>>>>>>> And I think this cannot be done without broken master builds for
> > >> >>>>>>>>>> some time period.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> So, what should we do? I think maven indexer is one of the core
> > >> >>>>>>>>>> components of Archiva, and we should use the 3.x version to
> > >> >>>>>>>>>> migrate to the new indexer version, even if this means switching
> > >> >>>>>>>>>> to JCR Oak. Otherwise it would mean sticking to the old version
> > >> >>>>>>>>>> for the next years.
> > >> >>>>>>>>>> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I
> > >> >>>>>>>>>> hope you can provide useful help.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> I committed the PoC to the branch feature/jcr_oak. There are some
> > >> >>>>>>>>>> modules where the tests do not pass (mainly because of the indexer
> > >> >>>>>>>>>> API changes).
> > >> >>>>>>>>>> Any comments?
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> Cheers
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> Martin
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> > >> >>>>>>>>>>> forget it but we need to ensure we can read maven index files....
> > >> >>>>>>>>>>> On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
> > >> >>>>>>>>>>>> Hi,
> > >> >>>>>>>>>>>> Remember Jackrabbit depends on Lucene as well, so upgrading Lucene
> > >> >>>>>>>>>>>> can be a problem here.
> > >> >>>>>>>>>>>> Regarding maven-indexer: yes, we can depend on a snapshot until
> > >> >>>>>>>>>>>> the release.
> > >> >>>>>>>>>>>> I can release it ;-)
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> On 13 June 2017 at 06:06, Martin <ma...@apache.org> wrote:
> > >> >>>>>>>>>>>>> Hi,
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> the Lucene version depends on the maven indexer. But I'm not
> > >> >>>>>>>>>>>>> sure about the current state of maven-indexer. The version has
> > >> >>>>>>>>>>>>> not changed since sometime in 2013.
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> There are commits on the master branch since then, and the
> > >> >>>>>>>>>>>>> Lucene version has been changed too, but no releases were tagged.
> > >> >>>>>>>>>>>>> Does it make sense to switch to the maven-indexer 6.0-SNAPSHOT?
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> As far as I know there are new compact index formats with new
> > >> >>>>>>>>>>>>> Lucene versions, but I'm not sure if this is relevant for the
> > >> >>>>>>>>>>>>> maven indexes.
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> Cheers
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> Martin
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> --
> > >> >>>>>>>>>>>> Olivier Lamy
> > >> >>>>>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >> >>>>>>>>>
> > >> >>>>>>>>> --
> > >> >>>>>>>>> Olivier Lamy
> > >> >>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >> >>>>>>>>
> > >> >>>>>>>> --
> > >> >>>>>>>> Olivier Lamy
> > >> >>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >> >>>>>
> > >> >>>>> --
> > >> >>>>> Olivier Lamy
> > >> >>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >> >>
> > >> >> --
> > >> >> Olivier Lamy
> > >> >> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >>
> > >> --
> > >> This message was sent from my Android device with K-9 Mail.
> > >
> > > --
> > > Olivier Lamy
> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
>
>
>


-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Martin <ma...@apache.org>.
Hi,

I now have it running on my local machine (I had to fight some issues with old packages in my
local mvn repository).
So the shaded Lucene is now in the maven-indexer master, if I see it correctly.
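
For readers new to shading: the relocation discussed in this thread renames the Lucene packages so that two Lucene versions can coexist on one class path. The following is only a minimal sketch of the name mapping, assuming the original prefix is org.apache.lucene and the target prefix is the org.apache.maven.index.shaded.lucene package named earlier in the thread. The class and method names are hypothetical, and the real shading step rewrites bytecode during the build rather than strings at runtime.

```java
// Illustration only: shows how a relocated class name is derived from the original one.
// RelocationSketch and relocate() are made-up names, not part of maven-indexer.
final class RelocationSketch {

    // Assumed original Lucene package prefix.
    static final String ORIGINAL_PREFIX = "org.apache.lucene.";

    // Target prefix as named in this thread.
    static final String SHADED_PREFIX = "org.apache.maven.index.shaded.lucene.";

    /** Returns the relocated name for Lucene classes and leaves all other names untouched. */
    static String relocate(String className) {
        if (className.startsWith(ORIGINAL_PREFIX)) {
            return SHADED_PREFIX + className.substring(ORIGINAL_PREFIX.length());
        }
        return className;
    }

    public static void main(String[] args) {
        System.out.println(relocate("org.apache.lucene.search.BooleanClause"));
        System.out.println(relocate("java.util.List"));
    }
}
```

Code that talks to the shaded indexer would then import the relocated names, while Oak keeps using its own unrelocated Lucene classes.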

We have a dependency problem with the Guava version. The Selenium tests need Guava 22.0,
and JCR Oak runs only with Guava 15.0.
Currently I have a (poor) workaround: setting 22.0 for the webtests in test scope. That
should work because the webtest module is not included in the normal build.
But I would prefer to change to the newer version for the whole project. I will try to find
out what we can do about it.
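
When chasing a version conflict like this, it can help to check which jar a class is actually loaded from at runtime. The following is a small diagnostic sketch using only the JDK. The class and method names are hypothetical, and the two class names checked in main are just examples that may or may not be on your class path.

```java
import java.security.CodeSource;

// Diagnostic sketch: reports where a class was loaded from.
// ClassOriginSketch and originOf() are made-up names for illustration.
final class ClassOriginSketch {

    /** Returns the location (usually a jar URL) that provides the given class. */
    static String originOf(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> type = Class.forName(className, false, ClassOriginSketch.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader typically have no code source.
            return source == null ? "no code source (JDK class)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on the class path";
        }
    }

    public static void main(String[] args) {
        // Example class names; replace them with whatever you are investigating.
        System.out.println("Guava:  " + originOf("com.google.common.collect.ImmutableList"));
        System.out.println("Lucene: " + originOf("org.apache.lucene.search.BooleanClause"));
    }
}
```

Run inside the failing module's test class path, the printed jar name shows which Guava version was picked up there.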

Greetings

Martin

On Saturday, 19 August 2017, 13:42:03 CEST, Olivier Lamy wrote:
> Hi
> So I have merged to master :-)
> 
> On 18 August 2017 at 01:22, Martin Stockhammer <ma...@apache.org> wrote:
> 
> > Hi Olivier,
> >
> > great! I will look at it. I will give you feedback the next days.
> > And yes I have to optimize the jcr oak part and stabilize it. I will work
> > on it.
> >
> > Greetings
> >
> > Martin
> >
> >
> >
> >
> > On 15 August 2017 at 11:30:04 CEST, Olivier Lamy <ol...@apache.org> wrote:
> > >Hi
> > >Took a bit of time but I finally get the branch working :-)
> > >branch: feature/jcr_oak
> > >Let me know what do you think of?
> > >Well I guess there are still some optimisations to do for jcr oak
> > >I can see some logs:
> > >21:02:39.559 [1071] [main] WARN  oak.query.QueryImpl - Traversal query
> > >(query without index): SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id
> > >/*
> > >oak-internal */; consider creating an index
> > >21:02:39.563 [328] [main] WARN  plugins.index.Cursors$TraversingCursor
> > >-
> > >Traversed 1000 nodes with filter Filter(query=SELECT * FROM [nt:base]
> > >WHERE
> > >[jcr:uuid] = $id /* oak-internal */, path=*,
> > >property=[jcr:uuid=[21232f29-7a57-35a7-8389-4a0e4a801fc3]]); consider
> > >creating an index or changing the query


Re: maven-indexer / Lucene

Posted by Olivier Lamy <ol...@apache.org>.
Hi
So I have merged to master :-)

On 18 August 2017 at 01:22, Martin Stockhammer <ma...@apache.org> wrote:

> Hi Olivier,
>
> great! I will look at it. I will give you feedback the next days.
> And yes I have to optimize the jcr oak part and stabilize it. I will work
> on it.
>
> Greetings
>
> Martin
>
>
>
>
> On 15 August 2017 at 11:30:04 CEST, Olivier Lamy <ol...@apache.org> wrote:
> >Hi
> >Took a bit of time but I finally get the branch working :-)
> >branch: feature/jcr_oak
> >Let me know what do you think of?
> >Well I guess there are still some optimisations to do for jcr oak
> >I can see some logs:
> >21:02:39.559 [1071] [main] WARN  oak.query.QueryImpl - Traversal query
> >(query without index): SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id
> >/*
> >oak-internal */; consider creating an index
> >21:02:39.563 [328] [main] WARN  plugins.index.Cursors$TraversingCursor
> >-
> >Traversed 1000 nodes with filter Filter(query=SELECT * FROM [nt:base]
> >WHERE
> >[jcr:uuid] = $id /* oak-internal */, path=*,
> >property=[jcr:uuid=[21232f29-7a57-35a7-8389-4a0e4a801fc3]]); consider
> >creating an index or changing the query
> >
> >
> >
> >
> >
> >On 8 July 2017 at 06:22, Martin <ma...@apache.org> wrote:
> >
> >> Hi Olivier,
> >>
> >> great!
> >> For my understanding: The dependency to lucene in the pom of
> >indexer-core
> >> is
> >> still there, but the lucene packages are moved to the
> >> ...maven.index.shaded...
> >> package? You develop indexer-core with the standard lucene packages
> >and the
> >> shading is executed during the build of the indexer package?
> >>
> >> I think that may solve our dependency problem.
> >>
> >> I still got errors in the maven-indexer module, but I think the
> >status is
> >> still "work in progress". I don't want to interfere too much with
> >your
> >> changes.
> >>
> >> I'm not sure, if we should keep the JCR Oak as metadata
> >implementation. I
> >> think OrientDB may be a feasible alternative: Embeddable,  Graph
> >database,
> >> Lucene index optional and may be omitted, Apache License. And with
> >JCR Oak
> >> we
> >> also have to convert the existing metadata index.
> >>
> >> But one step after the other. If we agree that the shaded indexer
> >works, we
> >> should merge only the maven indexer changes to the master branch
> >without
> >> the
> >> JCR/lucene update and change the JCR and or lucene afterwards.
> >>
> >> Greetings
> >>
> >> Martin
> >>
> >> On Friday, 7 July 2017, 09:23:24 CEST, Olivier Lamy wrote:
> >> > So the repo contains a branch feature/jar_shaded_lucene here
> >> >
> >https://git1-us-west.apache.org/repos/asf?p=maven-indexer.git;a=summary
> >> > and I pushed what I started for Archiva in the branch called
> >> feature/jcr_oak
> >> > So in order to test it you need to build first maven-indexer from
> >the
> >> > branch feature/jar_shaded_lucene
> >> >
> >> > On 6 July 2017 at 22:31, Olivier Lamy <ol...@apache.org> wrote:
> >> > > I will try to share the work I did tomorrow in a branch
> >> > >
> >> > > On Thu, 6 Jul 2017 at 7:48 pm, Martin Stockhammer
> ><martin_s@apache.org
> >> >
> >> > >
> >> > > wrote:
> >> > >> We have different lucene (incompatible) dependencies that
> >prevents us
> >> to
> >> > >> update the maven indexer and/or jackrabbit. And this will happen
> >again
> >> > >> with
> >> > >> each upgrade from one of these two packages in the future.
> >> > >> So would be really good if we can find a solution that removes
> >one of
> >> the
> >> > >> lucene dependencies.
> >> > >>
> >> > >> Greetings
> >> > >>
> >> > >> Martin
> >> > >>
> >> > >>
> >> > >> On 6 July 2017 at 09:36:06 CEST, Chris Graham <chrisgwarp@gmail.com> wrote:
> >> > >>
> >> > >> >Can I please ask an obvious/stupid question?
> >> > >> >
> >> > >> >What is driving this need for change?
> >> > >> >
> >> > >> >From a quick read of the thread above, all of the options
> >appear to
> >> > >> >introduce a lot of breaking changes, and a whole lot more
> >> uncertainty.
> >> > >> >
> >> > >> >So, what is so broken that it is driving these changes?
> >> > >> >
> >> > >> >Sent from my iPhone
> >> > >> >
> >> > >> >> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <ol...@apache.org>
> >wrote:
> >> > >> >>
> >> > >> >> Yup.
> >> > >> >> The idea is to have an extra jar produced by the
> >maven-indexer with
> >> > >> >
> >> > >> >shaded
> >> > >> >
> >> > >> >> lucene version.
> >> > >> >> So the lucene classes (version used by Maven indexer) will be
> >> > >> >
> >> > >> >relocated in
> >> > >> >
> >> > >> >> a package called org.apache.maven.index.shaded.lucene (such
> >> > >> >> org.apache.maven.index.shaded.lucene.search.BooleanClause )
> >> > >> >> Then you exclude lucene dependencies used by maven indexer
> >and
> >> voila.
> >> > >> >> The voila is a bit optimistic and not so ezy but anyway
> >working on
> >> it
> >> > >> >
> >> > >> >ATM.
> >> > >> >
> >> > >> >>> On 6 July 2017 at 07:08, Martin <ma...@apache.org> wrote:
> >> > >> >>>
> >> > >> >>> What do you mean exactly by shading? Moving to another
> >package
> >> name?
> >> > >> >>>
> >> > >> >>> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
> >> > >> >>>> maybe an option is to use some shading?
> >> > >> >>>> I'm thinking of shading lucene packages used by maven
> >indexer. I
> >> > >> >
> >> > >> >can
> >> > >> >
> >> > >> >>> easily
> >> > >> >>>
> >> > >> >>>> provide a build for that.
> >> > >> >>>> WDYT?
> >> > >> >>>>
> >> > >> >>>>> On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org>
> >> wrote:
> >> > >> >>>>> Hi
> >> > >> >>>>> graph/document storage could be convenient (but not
> >possible
> >> with
> >> > >> >>>
> >> > >> >>> neo4j as
> >> > >> >>>
> >> > >> >>>>> it's GPL license [1])
> >> > >> >>>>> well we can add solr as an additional webapp with our
> >jetty
> >> > >> >>>
> >> > >> >>> distribution
> >> > >> >>>
> >> > >> >>>>> but this will be a pain for users who want to use tomcat
> >or any
> >> > >> >
> >> > >> >other
> >> > >> >
> >> > >> >>>>> servlet container...
> >> > >> >>>>> we still need to investigate a new storage model :-)
> >> > >> >>>>>
> >> > >> >>>>> Olivier
> >> > >> >>>>> [1] https://neo4j.com/licensing/
> >> > >> >>>>>
> >> > >> >>>>>> On 25 June 2017 at 06:26, Martin <ma...@apache.org>
> >wrote:
> >> > >> >>>>>> Yes, you are right. The lucene dependency causes a lot of
> >> trouble
> >> > >> >
> >> > >> >and
> >> > >> >
> >> > >> >>>>>> will
> >> > >> >>>>>> cause headaches with each version change of one of the
> >> > >> >
> >> > >> >dependencies.
> >> > >> >
> >> > >> >>>>>> What are the requirements for a replacement?
> >> > >> >>>>>> - We want to store hierarchical data?
> >> > >> >>>>>> - We want to store metadata for nodes ?
> >> > >> >>>>>> - Fulltext search (only metadata or for artifacts too?)
> >> > >> >>>>>> - Blob / Artifact storage (I don't think so, but not so
> >> familiar
> >> > >> >
> >> > >> >with
> >> > >> >
> >> > >> >>> the
> >> > >> >>>
> >> > >> >>>>>> archiva artifact model)?
> >> > >> >>>>>>
> >> > >> >>>>>> Maybe some graph database may be an alternative. Don't
> >know if
> >> > >> >
> >> > >> >the
> >> > >> >
> >> > >> >>>>>> license of
> >> > >> >>>>>> neo4j is compatible to the apache license, and I think it
> >> brings
> >> > >> >>>
> >> > >> >>> lucene
> >> > >> >>>
> >> > >> >>>>>> as
> >> > >> >>>>>> dependency too. I will have a look.
> >> > >> >>>>>> Problem is, if there is fulltext search needed, I think,
> >for
> >> most
> >> > >> >
> >> > >> >of
> >> > >> >
> >> > >> >>> the
> >> > >> >>>
> >> > >> >>>>>> frameworks we get a lucene dependency, if it's embedded.
> >> > >> >>>>>>
> >> > >> >>>>>> Other alternatives:
> >> > >> >>>>>> - Implement fulltext search by our own (index of the
> >metadata
> >> > >> >
> >> > >> >stored
> >> > >> >
> >> > >> >>> via
> >> > >> >>>
> >> > >> >>>>>> the
> >> > >> >>>>>> archiva api) and use the lucene dependency that comes
> >from the
> >> > >> >>>>>> maven-indexer
> >> > >> >>>>>> - Jcr Oak with Solr. Solr is not embedded, must run as
> >its own
> >> > >> >>>>>> application
> >> > >> >>>>>> (war).
> >> > >> >>>>>>
> >> > >> >>>>>> Greetings
> >> > >> >>>>>>
> >> > >> >>>>>> Martin
> >> > >> >>>>>>
> >> > >> >>>>>> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> >> > >> >>>>>>> well this gonna be a pain.
> >> > >> >>>>>>> IMHO we need to find a new alternative to jcr oak.
> >> > >> >>>>>>> And something not using Lucene as it's a real pain to
> >have
> >> > >> >
> >> > >> >different
> >> > >> >
> >> > >> >>>>>>> librairies using lucene as they do not update in the
> >same time
> >> > >> >
> >> > >> >(and
> >> > >> >
> >> > >> >>>>>> Lucene
> >> > >> >>>>>>
> >> > >> >>>>>>> break backward compat so quickly...)
> >> > >> >>>>>>> Any ideas? I'd like to have something embedded (but with
> >a
> >> > >> >
> >> > >> >possible
> >> > >> >
> >> > >> >>>>>>> external server configuration).
> >> > >> >>>>>>> There is currently a Cassandra implementation. I was not
> >> > >> >
> >> > >> >satisfied
> >> > >> >
> >> > >> >>>>>>> about
> >> > >> >>>>>>> performance but I guess I did that 4yo ago so can be
> >improved
> >> > >> >
> >> > >> >for
> >> > >> >
> >> > >> >>> sure
> >> > >> >>>
> >> > >> >>>>>> :-)
> >> > >> >>>>>> :
> >> > >> >>>>>>> Maybe orientdb?
> >> > >> >>>>>>> What else?
> >> > >> >>>>>>>
> >> > >> >>>>>>>> On 24 June 2017 at 09:50, Olivier Lamy
> ><ol...@apache.org>
> >> > >> >
> >> > >> >wrote:
> >> > >> >>>>>>>> well the issue is non compatible version of Lucene for
> >Maven
> >> > >> >>>
> >> > >> >>> Indexer
> >> > >> >>>
> >> > >> >>>>>> and
> >> > >> >>>>>>
> >> > >> >>>>>>>> Oak (well I can try push a patch to Oak for
> >upgrading...)
> >> > >> >>>>>>>>
> >> > >> >>>>>>>>> On 24 June 2017 at 08:41, Olivier Lamy
> ><ol...@apache.org>
> >> > >> >
> >> > >> >wrote:
> >> > >> >>>>>>>>> Hi
> >> > >> >>>>>>>>> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus
> >> bridge.
> >> > >> >>>>>>>>> I'm working on it in the branch ( feature/jcr_oak )
> >> > >> >>>>>>>>> Not sure why but I have intermittent failure with
> >store-jcr
> >> > >> >>>
> >> > >> >>> module.
> >> > >> >>>
> >> > >> >>>>>>>>> I definitely agree on the upgrade.
> >> > >> >>>>>>>>> Well we can simply detect it's not oak compatible and
> >> schedule
> >> > >> >
> >> > >> >a
> >> > >> >
> >> > >> >>>>>>>>> full
> >> > >> >>>>>>>>> reindex (maybe with a message in logs and ui?)
> >> > >> >>>>>>>>> But we need to be sure we can still read central index
> >and
> >> not
> >> > >> >>>
> >> > >> >>> sure
> >> > >> >>>
> >> > >> >>>>>> about
> >> > >> >>>>>>
> >> > >> >>>>>>>>> possible lucene conflict with oak and maven indexer.
> >> > >> >>>>>>>>> We can work on this branch? (I created a Jenkins job
> >for it
> >> > >> >>>>>>>>>
> >https://builds.apache.org/view/A-D/view/Archiva/job/archi
> >> > >> >>>>>>>>> va-jcr-oak-branch/)
> >> > >> >>>>>>>>> If you prefer master I would say no worries neither.
> >> > >> >>>>>>>>> Something else to look at is upgrading maven-core
> >etc...
> >> > >> >>>>>>>>> Anyway
> >> > >> >>>>>>>>> Cheers
> >> > >> >>>>>>>>> Olivier
> >> > >> >>>>>>>>>
> >> > >> >>>>>>>>>> On 22 June 2017 at 19:16, Martin
> ><ma...@apache.org>
> >> wrote:
> >> > >> >>>>>>>>>> Hi,
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> upgrading the maven indexer leads to some major
> >changes.
> >> > >> >>>>>>>>>> Lucene is used by maven-indexer and also by
> >jackrabbit.
> >> > >> >>>
> >> > >> >>> Jackrabbit
> >> > >> >>>
> >> > >> >>>>>>>>>> sticks to
> >> > >> >>>>>>>>>> the old 3.x version and, as I see it, they will not
> >move
> >> to a
> >> > >> >>>
> >> > >> >>> newer
> >> > >> >>>
> >> > >> >>>>>>>>>> version.
> >> > >> >>>>>>>>>> There is Jackrabbit Oak as alternative.
> >> > >> >>>>>>>>>> I tried a proof of concept and could replace the
> >jackrabbit
> >> > >> >>>>>>>>>> implementation of
> >> > >> >>>>>>>>>> metadata-store-jcr with a oak implementation. At
> >least I
> >> got
> >> > >> >
> >> > >> >the
> >> > >> >
> >> > >> >>>>>> unit
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> tests of
> >> > >> >>>>>>>>>> this module all to pass.
> >> > >> >>>>>>>>>> But switching to Oak has some drawbacks:
> >> > >> >>>>>>>>>> - The repository format changed and we must provide a
> >way
> >> to
> >> > >> >>>>>>>>>> migrate
> >> > >> >>>>>>>>>> (either
> >> > >> >>>>>>>>>> migrate the existing repository or create a new one
> >by
> >> > >> >>>
> >> > >> >>> reindexing)
> >> > >> >>>
> >> > >> >>>>>>>>>> - The lucene version used is newer but does not match
> >to
> >> the
> >> > >> >>>>>>>>>> version
> >> > >> >>>>>>>>>> from the
> >> > >> >>>>>>>>>> maven-indexer dependencies. There may come up some
> >> > >> >>>>>>>>>> incompatibilities
> >> > >> >>>>>>>>>> that are
> >> > >> >>>>>>>>>> not solvable without using a modified version of one
> >of the
> >> > >> >>>
> >> > >> >>> both.
> >> > >> >>>
> >> > >> >>>>>>>>>> Or
> >> > >> >>>>>>>>>> there may
> >> > >> >>>>>>>>>> be the possibility to switch to solr (as separate
> >> component)
> >> > >> >
> >> > >> >and
> >> > >> >
> >> > >> >>>>>> get rid
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> of
> >> > >> >>>>>>>>>> the lucene dependencies for jcr inside the archiva
> >project.
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> Switching to maven-indexer 6.0-SNAPSHOT means some
> >changes
> >> > >> >
> >> > >> >too:
> >> > >> >>>>>>>>>> - The Plexus-Sisu-Bridge does not work as before.
> >> > >> >>>>>>>>>> - We must migrate from the NexusIndexer to the
> >indexer API.
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> So switching to the new indexer and oak means more
> >work as
> >> > >> >>>
> >> > >> >>> expected
> >> > >> >>>
> >> > >> >>>>>> and
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> some
> >> > >> >>>>>>>>>> risks regarding new incompatibility problems. And I
> >think
> >> > >> >
> >> > >> >this
> >> > >> >
> >> > >> >>>>>> cannot be
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> done
> >> > >> >>>>>>>>>> without broken master builds for some time period.
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> So, what should we do? I think maven indexer is one
> >of the
> >> > >> >
> >> > >> >core
> >> > >> >
> >> > >> >>>>>>>>>> components of
> >> > >> >>>>>>>>>> archiva, and we should utilize the 3.x-version to
> >migrate
> >> to
> >> > >> >>>
> >> > >> >>> the
> >> > >> >>>
> >> > >> >>>>>> new
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> indexer
> >> > >> >>>>>>>>>> version, even if this means switching to jcr oak.
> >Otherwise
> >> > >> >
> >> > >> >it
> >> > >> >
> >> > >> >>>>>>>>>> would
> >> > >> >>>>>>>>>> mean to
> >> > >> >>>>>>>>>> stick to the old version for the next years.
> >> > >> >>>>>>>>>> @Olivier, regarding the maven-indexer / sisu-Bridge
> >API
> >> > >> >>>
> >> > >> >>> changes, I
> >> > >> >>>
> >> > >> >>>>>> hope
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> you
> >> > >> >>>>>>>>>> can provide  useful help.
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> I committed the PoC to the branch feature/jcr_oak.
> >There
> >> are
> >> > >> >>>
> >> > >> >>> some
> >> > >> >>>
> >> > >> >>>>>>>>>> modules
> >> > >> >>>>>>>>>> where the tests do not pass (mainly because of the
> >indexer
> >> > >> >
> >> > >> >API
> >> > >> >
> >> > >> >>>>>> changes).
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> Any comments?
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> Cheers
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> Martin
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> >> > >> >>>>>>>>>>> forget it but we need to ensure we can read maven
> >index
> >> > >> >>>
> >> > >> >>> files....
> >> > >> >>>
> >> > >> >>>>>>>>>>> On 13 June 2017 at 17:06, Olivier Lamy
> ><ol...@apache.org>
> >> > >> >>>
> >> > >> >>> wrote:
> >> > >> >>>>>>>>>>>> Hi,
> >> > >> >>>>>>>>>>>> Remember jackrabbit depends on Lucene as well so
> >> upgrading
> >> > >> >>>>>>
> >> > >> >>>>>> Lucene
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> can be a
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>> problem here.
> >> > >> >>>>>>>>>>>> Regarding maven-indexer yes we can depend on a
> >snapshot
> >> > >> >>>
> >> > >> >>> until
> >> > >> >>>
> >> > >> >>>>>> the
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> release.
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>> I can release it ;-)
> >> > >> >>>>>>>>>>>>
> >> > >> >>>>>>>>>>>> On 13 June 2017 at 06:06, Martin
> ><ma...@apache.org>
> >> > >> >>>
> >> > >> >>> wrote:
> >> > >> >>>>>>>>>>>>> Hi,
> >> > >> >>>>>>>>>>>>>
> >> > >> >>>>>>>>>>>>> the lucene version depends on the maven indexer.
> >But I'm
> >> > >> >>>
> >> > >> >>> not
> >> > >> >>>
> >> > >> >>>>>> sure
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> about
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>>> the
> >> > >> >>>>>>>>>>>>> current state of maven-indexer. The version has
> >not
> >> > >> >
> >> > >> >changed
> >> > >> >
> >> > >> >>>>>> since
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> some
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>>> 2013.
> >> > >> >>>>>>>>>>>>>
> >> > >> >>>>>>>>>>>>> There are commits on the master branch since then,
> >and
> >> the
> >> > >> >>>>>>
> >> > >> >>>>>> lucene
> >> > >> >>>>>>
> >> > >> >>>>>>>>>> version
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>>> has
> >> > >> >>>>>>>>>>>>> been changed too, but no releases were tagged.
> >> > >> >>>>>>>>>>>>> Does it make sense to switch to the maven-indexer
> >> > >> >>>>>>>>>>>>> 6.0-SNAPSHOT?
> >> > >> >>>>>>>>>>>>>
> >> > >> >>>>>>>>>>>>> As I know there are new compact index formats with
> >new
> >> > >> >>>
> >> > >> >>> lucene
> >> > >> >>>
> >> > >> >>>>>>>>>> versions
> >> > >> >>>>>>>>>>
> >> > >> >>>>>>>>>>>>> but I'm
> >> > >> >>>>>>>>>>>>> not sure if this is relevant for the maven
> >indexes.
> >> > >> >>>>>>>>>>>>>
> >> > >> >>>>>>>>>>>>> Cheers
> >> > >> >>>>>>>>>>>>>
> >> > >> >>>>>>>>>>>>> Martin
> >> > >> >>>>>>>>>>>>
> >> > >> >>>>>>>>>>>> --
> >> > >> >>>>>>>>>>>> Olivier Lamy
> >> > >> >>>>>>>>>>>> http://twitter.com/olamy |
> >http://linkedin.com/in/olamy
> >> > >> >>>>>>>>>
> >> > >> >>>>>>>>> --
> >> > >> >>>>>>>>> Olivier Lamy
> >> > >> >>>>>>>>> http://twitter.com/olamy |
> >http://linkedin.com/in/olamy
> >> > >> >>>>>>>>
> >> > >> >>>>>>>> --
> >> > >> >>>>>>>> Olivier Lamy
> >> > >> >>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> > >> >>>>>
> >> > >> >>>>> --
> >> > >> >>>>> Olivier Lamy
> >> > >> >>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> > >> >>
> >> > >> >> --
> >> > >> >> Olivier Lamy
> >> > >> >> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >> > >>
> >> > >> --
> >> > >> This message was sent from my Android device with K-9 Mail.
> >> > >
> >> > > --
> >> > > Olivier Lamy
> >> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
> >>
> >>
> >>
> >
> >
> >--
> >Olivier Lamy
> >http://twitter.com/olamy | http://linkedin.com/in/olamy
>
> --
> This message was sent from my Android device with K-9 Mail.
>



-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: maven-indexer / Lucene

Posted by Martin Stockhammer <ma...@apache.org>.
Hi Olivier,

Great! I will look at it and give you feedback in the next few days.
And yes, I still have to optimize and stabilize the JCR Oak part. I will work on it.

Greetings

Martin




On 15 August 2017 at 11:30:04 CEST, Olivier Lamy <ol...@apache.org> wrote:
>Hi
>Took a bit of time but I finally got the branch working :-)
>branch: feature/jcr_oak
>Let me know what you think of it.
>Well, I guess there are still some optimisations to do for JCR Oak.
>I can see some logs:
>21:02:39.559 [1071] [main] WARN  oak.query.QueryImpl - Traversal query
>(query without index): SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id
>/* oak-internal */; consider creating an index
>21:02:39.563 [328] [main] WARN  plugins.index.Cursors$TraversingCursor -
>Traversed 1000 nodes with filter Filter(query=SELECT * FROM [nt:base]
>WHERE [jcr:uuid] = $id /* oak-internal */, path=*,
>property=[jcr:uuid=[21232f29-7a57-35a7-8389-4a0e4a801fc3]]); consider
>creating an index or changing the query
>
>
>
>
>
>On 8 July 2017 at 06:22, Martin <ma...@apache.org> wrote:
>
>> Hi Olivier,
>>
>> great!
>> For my understanding: The dependency to lucene in the pom of
>indexer-core
>> is
>> still there, but the lucene packages are moved to the
>> ...maven.index.shaded...
>> package? You develop indexer-core with the standard lucene packages
>and the
>> shading is executed during the build of the indexer package?
>>
>> I think that may solve our dependency problem.
>>
>> I still got errors in the maven-indexer module, but I think the
>status is
>> still "work in progress". I don't want to interfere too much with
>your
>> changes.
>>
>> I'm not sure, if we should keep the JCR Oak as metadata
>implementation. I
>> think OrientDB may be a feasible alternative: Embeddable,  Graph
>database,
>> Lucene index optional and may be omitted, Apache License. And with
>JCR Oak
>> we
>> also have to convert the existing metadata index.
>>
>> But one step after the other. If we agree that the shaded indexer
>works, we
>> should merge only the maven indexer changes to the master branch
>without
>> the
>> JCR/lucene update and change the JCR and or lucene afterwards.
>>
>> Greetings
>>
>> Martin
>>
>> On Friday, 7 July 2017, 09:23:24 CEST, Olivier Lamy wrote:
>> > So the repo contains a branch feature/jar_shaded_lucene here
>> >
>https://git1-us-west.apache.org/repos/asf?p=maven-indexer.git;a=summary
>> > and I pushed what I started for Archiva in the branch called
>> feature/jcr_oak
>> > So in order to test it you need to build first maven-indexer from
>the
>> > branch feature/jar_shaded_lucene
>> >
>> > On 6 July 2017 at 22:31, Olivier Lamy <ol...@apache.org> wrote:
>> > > I will try to share the work I did tomorrow in a branch
>> > >
>> > > On Thu, 6 Jul 2017 at 7:48 pm, Martin Stockhammer
><martin_s@apache.org
>> >
>> > >
>> > > wrote:
>> > >> We have different, incompatible Lucene dependencies that prevent us from
>> > >> updating the maven indexer and/or Jackrabbit. This will happen again with
>> > >> each upgrade of either of these two packages in the future.
>> > >> So it would be really good if we could find a solution that removes one of
>> > >> the Lucene dependencies.
>> > >>
>> > >> Greetings
>> > >>
>> > >> Martin
>> > >>
>> > >>
>> > >> On 6 July 2017 at 09:36:06 CEST, Chris Graham <chrisgwarp@gmail.com> wrote:
>> > >>
>> > >> >Can I please ask an obvious/stupid question?
>> > >> >
>> > >> >What is driving this need for change?
>> > >> >
>> > >> >From a quick read of the thread above, all of the options
>appear to
>> > >> >introduce a lot of breaking changes, and a whole lot more
>> uncertainty.
>> > >> >
>> > >> >So, what is so broken that it is driving these changes?
>> > >> >
>> > >> >Sent from my iPhone
>> > >> >
>> > >> >> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <ol...@apache.org>
>wrote:
>> > >> >>
>> > >> >> Yup.
>> > >> >> The idea is to have an extra jar produced by maven-indexer with a
>> > >> >> shaded Lucene version.
>> > >> >> So the Lucene classes (the version used by Maven Indexer) will be
>> > >> >> relocated into a package called org.apache.maven.index.shaded.lucene
>> > >> >> (such as org.apache.maven.index.shaded.lucene.search.BooleanClause).
>> > >> >> Then you exclude the Lucene dependencies used by Maven Indexer and voila.
>> > >> >> The voila is a bit optimistic and not so easy, but anyway I am working
>> > >> >> on it ATM.
>> > >> >
>> > >> >>> On 6 July 2017 at 07:08, Martin <ma...@apache.org> wrote:
>> > >> >>>
>> > >> >>> What do you mean exactly by shading? Moving to another
>package
>> name?
>> > >> >>>
>> > >> >>> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
>> > >> >>>> maybe an option is to use some shading?
>> > >> >>>> I'm thinking of shading lucene packages used by maven
>indexer. I
>> > >> >
>> > >> >can
>> > >> >
>> > >> >>> easily
>> > >> >>>
>> > >> >>>> provide a build for that.
>> > >> >>>> WDYT?
>> > >> >>>>
>> > >> >>>>> On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org>
>> wrote:
>> > >> >>>>> Hi
>> > >> >>>>> graph/document storage could be convenient (but not
>possible
>> with
>> > >> >>>
>> > >> >>> neo4j as
>> > >> >>>
>> > >> >>>>> it's GPL license [1])
>> > >> >>>>> well we can add solr as an additional webapp with our
>jetty
>> > >> >>>
>> > >> >>> distribution
>> > >> >>>
>> > >> >>>>> but this will be a pain for users who want to use tomcat
>or any
>> > >> >
>> > >> >other
>> > >> >
>> > >> >>>>> servlet container...
>> > >> >>>>> we still need to investigate a new storage model :-)
>> > >> >>>>>
>> > >> >>>>> Olivier
>> > >> >>>>> [1] https://neo4j.com/licensing/
>> > >> >>>>>
>> > >> >>>>>> On 25 June 2017 at 06:26, Martin <ma...@apache.org>
>wrote:
>> > >> >>>>>> Yes, you are right. The lucene dependency causes a lot of
>> trouble
>> > >> >
>> > >> >and
>> > >> >
>> > >> >>>>>> will
>> > >> >>>>>> cause headaches with each version change of one of the
>> > >> >
>> > >> >dependencies.
>> > >> >
>> > >> >>>>>> What are the requirements for a replacement?
>> > >> >>>>>> - We want to store hierarchical data?
>> > >> >>>>>> - We want to store metadata for nodes ?
>> > >> >>>>>> - Fulltext search (only metadata or for artifacts too?)
>> > >> >>>>>> - Blob / Artifact storage (I don't think so, but not so
>> familiar
>> > >> >
>> > >> >with
>> > >> >
>> > >> >>> the
>> > >> >>>
>> > >> >>>>>> archiva artifact model)?
>> > >> >>>>>>
>> > >> >>>>>> Maybe some graph database may be an alternative. Don't
>know if
>> > >> >
>> > >> >the
>> > >> >
>> > >> >>>>>> license of
>> > >> >>>>>> neo4j is compatible to the apache license, and I think it
>> brings
>> > >> >>>
>> > >> >>> lucene
>> > >> >>>
>> > >> >>>>>> as
>> > >> >>>>>> dependency too. I will have a look.
>> > >> >>>>>> Problem is, if there is fulltext search needed, I think,
>for
>> most
>> > >> >
>> > >> >of
>> > >> >
>> > >> >>> the
>> > >> >>>
>> > >> >>>>>> frameworks we get a lucene dependency, if it's embedded.
>> > >> >>>>>>
>> > >> >>>>>> Other alternatives:
>> > >> >>>>>> - Implement fulltext search by our own (index of the
>metadata
>> > >> >
>> > >> >stored
>> > >> >
>> > >> >>> via
>> > >> >>>
>> > >> >>>>>> the
>> > >> >>>>>> archiva api) and use the lucene dependency that comes
>from the
>> > >> >>>>>> maven-indexer
>> > >> >>>>>> - Jcr Oak with Solr. Solr is not embedded, must run as
>its own
>> > >> >>>>>> application
>> > >> >>>>>> (war).
>> > >> >>>>>>
>> > >> >>>>>> Greetings
>> > >> >>>>>>
>> > >> >>>>>> Martin
>> > >> >>>>>>
>> > >> >>>>>> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
>> > >> >>>>>>> well this gonna be a pain.
>> > >> >>>>>>> IMHO we need to find a new alternative to jcr oak.
>> > >> >>>>>>> And something not using Lucene, as it's a real pain to have
>> > >> >>>>>>> different libraries using Lucene when they do not update at the
>> > >> >>>>>>> same time (and Lucene breaks backward compatibility so quickly...)
>> > >> >>>>>>> Any ideas? I'd like to have something embedded (but with
>a
>> > >> >
>> > >> >possible
>> > >> >
>> > >> >>>>>>> external server configuration).
>> > >> >>>>>>> There is currently a Cassandra implementation. I was not satisfied
>> > >> >>>>>>> with its performance, but I did that about 4 years ago, so it can
>> > >> >>>>>>> surely be improved :-)
>> > >> >>>>>>> Maybe orientdb?
>> > >> >>>>>>> What else?
>> > >> >>>>>>>
>> > >> >>>>>>>> On 24 June 2017 at 09:50, Olivier Lamy
><ol...@apache.org>
>> > >> >
>> > >> >wrote:
>> > >> >>>>>>>> well the issue is non compatible version of Lucene for
>Maven
>> > >> >>>
>> > >> >>> Indexer
>> > >> >>>
>> > >> >>>>>> and
>> > >> >>>>>>
>> > >> >>>>>>>> Oak (well I can try push a patch to Oak for
>upgrading...)
>> > >> >>>>>>>>
>> > >> >>>>>>>>> On 24 June 2017 at 08:41, Olivier Lamy
><ol...@apache.org>
>> > >> >
>> > >> >wrote:
>> > >> >>>>>>>>> Hi
>> > >> >>>>>>>>> Maven Indexer 6.0-SNAPSHOT doesn't need anymore plexus
>> bridge.
>> > >> >>>>>>>>> I'm working on it in the branch ( feature/jcr_oak )
>> > >> >>>>>>>>> Not sure why but I have intermittent failure with
>store-jcr
>> > >> >>>
>> > >> >>> module.
>> > >> >>>
>> > >> >>>>>>>>> I definitely agree on the upgrade.
>> > >> >>>>>>>>> Well we can simply detect it's not oak compatible and
>> schedule
>> > >> >
>> > >> >a
>> > >> >
>> > >> >>>>>>>>> full
>> > >> >>>>>>>>> reindex (maybe with a message in logs and ui?)
>> > >> >>>>>>>>> But we need to be sure we can still read central index
>and
>> not
>> > >> >>>
>> > >> >>> sure
>> > >> >>>
>> > >> >>>>>> about
>> > >> >>>>>>
>> > >> >>>>>>>>> possible lucene conflict with oak and maven indexer.
>> > >> >>>>>>>>> We can work on this branch? (I created a Jenkins job
>for it
>> > >> >>>>>>>>>
>https://builds.apache.org/view/A-D/view/Archiva/job/archi
>> > >> >>>>>>>>> va-jcr-oak-branch/)
>> > >> >>>>>>>>> If you prefer master I would say no worries neither.
>> > >> >>>>>>>>> Something else to look at is upgrading maven-core
>etc...
>> > >> >>>>>>>>> Anyway
>> > >> >>>>>>>>> Cheers
>> > >> >>>>>>>>> Olivier
>> > >> >>>>>>>>>
>> > >> >>>>>>>>>> On 22 June 2017 at 19:16, Martin
><ma...@apache.org>
>> wrote:
>> > >> >>>>>>>>>> Hi,
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> upgrading the maven indexer leads to some major
>changes.
>> > >> >>>>>>>>>> Lucene is used by maven-indexer and also by
>jackrabbit.
>> > >> >>>
>> > >> >>> Jackrabbit
>> > >> >>>
>> > >> >>>>>>>>>> sticks to
>> > >> >>>>>>>>>> the old 3.x version and, as I see it, they will not
>move
>> to a
>> > >> >>>
>> > >> >>> newer
>> > >> >>>
>> > >> >>>>>>>>>> version.
>> > >> >>>>>>>>>> There is Jackrabbit Oak as alternative.
>> > >> >>>>>>>>>> I tried a proof of concept and could replace the
>jackrabbit
>> > >> >>>>>>>>>> implementation of
>> > >> >>>>>>>>>> metadata-store-jcr with a oak implementation. At
>least I
>> got
>> > >> >
>> > >> >the
>> > >> >
>> > >> >>>>>> unit
>> > >> >>>>>>
>> > >> >>>>>>>>>> tests of
>> > >> >>>>>>>>>> this module all to pass.
>> > >> >>>>>>>>>> But switching to Oak has some drawbacks:
>> > >> >>>>>>>>>> - The repository format changed and we must provide a
>way
>> to
>> > >> >>>>>>>>>> migrate
>> > >> >>>>>>>>>> (either
>> > >> >>>>>>>>>> migrate the existing repository or create a new one
>by
>> > >> >>>
>> > >> >>> reindexing)
>> > >> >>>
>> > >> >>>>>>>>>> - The lucene version used is newer but does not match
>to
>> the
>> > >> >>>>>>>>>> version
>> > >> >>>>>>>>>> from the
>> > >> >>>>>>>>>> maven-indexer dependencies. There may come up some
>> > >> >>>>>>>>>> incompatibilities
>> > >> >>>>>>>>>> that are
>> > >> >>>>>>>>>> not solvable without using a modified version of one
>of the
>> > >> >>>
>> > >> >>> both.
>> > >> >>>
>> > >> >>>>>>>>>> Or
>> > >> >>>>>>>>>> there may
>> > >> >>>>>>>>>> be the possibility to switch to solr (as separate
>> component)
>> > >> >
>> > >> >and
>> > >> >
>> > >> >>>>>> get rid
>> > >> >>>>>>
>> > >> >>>>>>>>>> of
>> > >> >>>>>>>>>> the lucene dependencies for jcr inside the archiva
>project.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> Switching to maven-indexer 6.0-SNAPSHOT means some
>changes
>> > >> >
>> > >> >too:
>> > >> >>>>>>>>>> - The Plexus-Sisu-Bridge does not work as before.
>> > >> >>>>>>>>>> - We must migrate from the NexusIndexer to the
>indexer API.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> So switching to the new indexer and oak means more
>work as
>> > >> >>>
>> > >> >>> expected
>> > >> >>>
>> > >> >>>>>> and
>> > >> >>>>>>
>> > >> >>>>>>>>>> some
>> > >> >>>>>>>>>> risks regarding new incompatibility problems. And I
>think
>> > >> >
>> > >> >this
>> > >> >
>> > >> >>>>>> cannot be
>> > >> >>>>>>
>> > >> >>>>>>>>>> done
>> > >> >>>>>>>>>> without broken master builds for some time period.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> So, what should we do? I think maven indexer is one
>of the
>> > >> >
>> > >> >core
>> > >> >
>> > >> >>>>>>>>>> components of
>> > >> >>>>>>>>>> archiva, and we should utilize the 3.x-version to 
>migrate
>> to
>> > >> >>>
>> > >> >>> the
>> > >> >>>
>> > >> >>>>>> new
>> > >> >>>>>>
>> > >> >>>>>>>>>> indexer
>> > >> >>>>>>>>>> version, even if this means switching to jcr oak.
>Otherwise
>> > >> >
>> > >> >it
>> > >> >
>> > >> >>>>>>>>>> would
>> > >> >>>>>>>>>> mean to
>> > >> >>>>>>>>>> stick to the old version for the next years.
>> > >> >>>>>>>>>> @Olivier, regarding the maven-indexer / sisu-Bridge
>API
>> > >> >>>
>> > >> >>> changes, I
>> > >> >>>
>> > >> >>>>>> hope
>> > >> >>>>>>
>> > >> >>>>>>>>>> you
>> > >> >>>>>>>>>> can provide  useful help.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> I committed the PoC to the branch feature/jcr_oak.
>There
>> are
>> > >> >>>
>> > >> >>> some
>> > >> >>>
>> > >> >>>>>>>>>> modules
>> > >> >>>>>>>>>> where the tests do not pass (mainly because of the
>indexer
>> > >> >
>> > >> >API
>> > >> >
>> > >> >>>>>> changes).
>> > >> >>>>>>
>> > >> >>>>>>>>>> Any comments?
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> Cheers
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> Martin
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
>> > >> >>>>>>>>>>> forget it but we need to ensure we can read maven
>index
>> > >> >>>
>> > >> >>> files....
>> > >> >>>
>> > >> >>>>>>>>>>> On 13 June 2017 at 17:06, Olivier Lamy
><ol...@apache.org>
>> > >> >>>
>> > >> >>> wrote:
>> > >> >>>>>>>>>>>> Hi,
>> > >> >>>>>>>>>>>> Remember jackrabbit depends on Lucene as well so
>> upgrading
>> > >> >>>>>>
>> > >> >>>>>> Lucene
>> > >> >>>>>>
>> > >> >>>>>>>>>> can be a
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>> problem here.
>> > >> >>>>>>>>>>>> Regarding maven-indexer yes we can depend on a
>snapshot
>> > >> >>>
>> > >> >>> until
>> > >> >>>
>> > >> >>>>>> the
>> > >> >>>>>>
>> > >> >>>>>>>>>> release.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>> I can release it ;-)
>> > >> >>>>>>>>>>>>
>> > >> >>>>>>>>>>>> On 13 June 2017 at 06:06, Martin
><ma...@apache.org>
>> > >> >>>
>> > >> >>> wrote:
>> > >> >>>>>>>>>>>>> Hi,
>> > >> >>>>>>>>>>>>>
>> > >> >>>>>>>>>>>>> the lucene version depends on the maven indexer.
>But I'm
>> > >> >>>
>> > >> >>> not
>> > >> >>>
>> > >> >>>>>> sure
>> > >> >>>>>>
>> > >> >>>>>>>>>> about
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>>> the
>> > >> >>>>>>>>>>>>> current state of maven-indexer. The version has
>not
>> > >> >
>> > >> >changed
>> > >> >
>> > >> >>>>>> since
>> > >> >>>>>>
>> > >> >>>>>>>>>> some
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>>> 2013.
>> > >> >>>>>>>>>>>>>
>> > >> >>>>>>>>>>>>> There are commits on the master branch since then,
>and
>> the
>> > >> >>>>>>
>> > >> >>>>>> lucene
>> > >> >>>>>>
>> > >> >>>>>>>>>> version
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>>> has
>> > >> >>>>>>>>>>>>> been changed too, but no releases were tagged.
>> > >> >>>>>>>>>>>>> Does it make sense to switch to the maven-indexer
>> > >> >>>>>>>>>>>>> 6.0-SNAPSHOT?
>> > >> >>>>>>>>>>>>>
>> > >> >>>>>>>>>>>>> As I know there are new compact index formats with
>new
>> > >> >>>
>> > >> >>> lucene
>> > >> >>>
>> > >> >>>>>>>>>> versions
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>>>>> but I'm
>> > >> >>>>>>>>>>>>> not sure if this is relevant for the maven
>indexes.
>> > >> >>>>>>>>>>>>>
>> > >> >>>>>>>>>>>>> Cheers
>> > >> >>>>>>>>>>>>>
>> > >> >>>>>>>>>>>>> Martin
>> > >> >>>>>>>>>>>>
>> > >> >>>>>>>>>>>> --
>> > >> >>>>>>>>>>>> Olivier Lamy
>> > >> >>>>>>>>>>>> http://twitter.com/olamy |
>http://linkedin.com/in/olamy
>> > >> >>>>>>>>>
>> > >> >>>>>>>>> --
>> > >> >>>>>>>>> Olivier Lamy
>> > >> >>>>>>>>> http://twitter.com/olamy |
>http://linkedin.com/in/olamy
>> > >> >>>>>>>>
>> > >> >>>>>>>> --
>> > >> >>>>>>>> Olivier Lamy
>> > >> >>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> > >> >>>>>
>> > >> >>>>> --
>> > >> >>>>> Olivier Lamy
>> > >> >>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> > >> >>
>> > >> >> --
>> > >> >> Olivier Lamy
>> > >> >> http://twitter.com/olamy | http://linkedin.com/in/olamy
>> > >>
>> > >> --
>> > >> This message was sent from my Android device with K-9 Mail.
>> > >
>> > > --
>> > > Olivier Lamy
>> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
>>
>>
>>
>
>
>-- 
>Olivier Lamy
>http://twitter.com/olamy | http://linkedin.com/in/olamy

-- 
This message was sent from my Android device with K-9 Mail.