Posted to solr-user@lucene.apache.org by Jamie Johnson <je...@gmail.com> on 2011/12/06 21:14:45 UTC

Solr Lucene Index Version

Is there a way to specify the index version Solr uses?  We're
currently using SolrCloud, but with the index format changing it would be
preferable to be able to specify a particular index format to avoid
having to do a complete reindex.  Is this possible?

Re: Solr Lucene Index Version

Posted by Jamie Johnson <je...@gmail.com>.
Thanks Robert.  I'll watch them all.  Any others that are good to keep track of?

On Thu, Dec 8, 2011 at 1:25 PM, Robert Muir <rc...@gmail.com> wrote:
> On Thu, Dec 8, 2011 at 12:55 PM, Jamie Johnson <je...@gmail.com> wrote:
>> Thanks Andrzej.  I'll continue to follow the portable format JIRA
>> along with 3622, are there any others that you're aware of that are
>> blockers that would be useful to watch?
>>
>
> There is a lot to be done, particularly norms and deleted documents.
> Some progress on norms has been made in LUCENE-3606 (moved to codec,
> SimpleText implementation),
> but it's really a stop-gap measure until LUCENE-3622 and LUCENE-3074
> are finished; then norms can be implemented via the IndexDocValues APIs.
>
> I haven't really investigated deleted documents yet, but it should be
> feasible after LUCENE-3606.
>
> There still remain things like the fact that the codec cannot
> control how the compound file format is encoded, and other minor
> issues.
>
> --
> lucidimagination.com

Re: Solr Lucene Index Version

Posted by Robert Muir <rc...@gmail.com>.
On Thu, Dec 8, 2011 at 12:55 PM, Jamie Johnson <je...@gmail.com> wrote:
> Thanks Andrzej.  I'll continue to follow the portable format JIRA
> along with 3622, are there any others that you're aware of that are
> blockers that would be useful to watch?
>

There is a lot to be done, particularly norms and deleted documents.
Some progress on norms has been made in LUCENE-3606 (moved to codec,
SimpleText implementation),
but it's really a stop-gap measure until LUCENE-3622 and LUCENE-3074
are finished; then norms can be implemented via the IndexDocValues APIs.

I haven't really investigated deleted documents yet, but it should be
feasible after LUCENE-3606.

There still remain things like the fact that the codec cannot
control how the compound file format is encoded, and other minor
issues.

-- 
lucidimagination.com

Re: Solr Lucene Index Version

Posted by Jamie Johnson <je...@gmail.com>.
Thanks Andrzej.  I'll continue to follow the portable format JIRA
along with 3622.  Are there any others that you're aware of that are
blockers and would be useful to watch?

On Thu, Dec 8, 2011 at 10:49 AM, Andrzej Bialecki <ab...@getopt.org> wrote:
> On 08/12/2011 14:50, Jamie Johnson wrote:
>>
>> Mark,
>>
>> Agreed that Replication wouldn't help, I was dreaming that there was
>> some intermediate format used in replication.
>>
>> Ideally you are right, I could just reindex the data and go on with
>> life, but my case is not so simple.  Currently we have some set of
>> processes which is run against the raw artifact to index things of
>> interest within the text document.  I don't believe (and I need to
>> check with the folks who wrote this) that I have an easy way to do
>> this currently but this would be my preference.
>>
>> Andrzej,
>>
>> Isn't the codec stuff merged with trunk now?  Admittedly I know very
>> little about Lucene's index format but I'd be willing to be a guinea
>> pig if you needed a tester.
>
>
> The bulk of the work described in LUCENE-2621 has been done by Robert Muir (big
> thanks!!) and merged with trunk, but I think there may still be some parts
> missing - see LUCENE-3622.
>
>
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>

Re: Solr Lucene Index Version

Posted by Andrzej Bialecki <ab...@getopt.org>.
On 08/12/2011 14:50, Jamie Johnson wrote:
> Mark,
>
> Agreed that Replication wouldn't help, I was dreaming that there was
> some intermediate format used in replication.
>
> Ideally you are right, I could just reindex the data and go on with
> life, but my case is not so simple.  Currently we have some set of
> processes which is run against the raw artifact to index things of
> interest within the text document.  I don't believe (and I need to
> check with the folks who wrote this) that I have an easy way to do
> this currently but this would be my preference.
>
> Andrzej,
>
> Isn't the codec stuff merged with trunk now?  Admittedly I know very
> little about Lucene's index format but I'd be willing to be a guinea
> pig if you needed a tester.

The bulk of the work described in LUCENE-2621 has been done by Robert Muir
(big thanks!!) and merged with trunk, but I think there may still be
some parts missing - see LUCENE-3622.


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Solr Lucene Index Version

Posted by Jamie Johnson <je...@gmail.com>.
Thanks Robert.  I'll continue to watch the Jira and try not to bother
folks about this.  Again greatly appreciate the insight.

On Thu, Dec 8, 2011 at 11:31 AM, Robert Muir <rc...@gmail.com> wrote:
> On Thu, Dec 8, 2011 at 10:46 AM, Mark Miller <ma...@gmail.com> wrote:
>>
>> On Dec 8, 2011, at 8:50 AM, Jamie Johnson wrote:
>>
>>> Isn't the codec stuff merged with trunk now?
>>
>> Robert merged this recently AFAIK.
>>
>
> True, but that issue only moved the majority of the rest of the index
> (stored fields, term vectors, fieldinfos, etc.) to the codec.
>
> There is more work in progress/to be done before the format is really
> extensible, particularly the long TODO list at the end of
> https://issues.apache.org/jira/browse/LUCENE-2621
>
> --
> lucidimagination.com

Re: Solr Lucene Index Version

Posted by Robert Muir <rc...@gmail.com>.
On Thu, Dec 8, 2011 at 10:46 AM, Mark Miller <ma...@gmail.com> wrote:
>
> On Dec 8, 2011, at 8:50 AM, Jamie Johnson wrote:
>
>> Isn't the codec stuff merged with trunk now?
>
> Robert merged this recently AFAIK.
>

True, but that issue only moved the majority of the rest of the index
(stored fields, term vectors, fieldinfos, etc.) to the codec.

There is more work in progress/to be done before the format is really
extensible, particularly the long TODO list at the end of
https://issues.apache.org/jira/browse/LUCENE-2621

-- 
lucidimagination.com

Re: Solr Lucene Index Version

Posted by Mark Miller <ma...@gmail.com>.
On Dec 8, 2011, at 8:50 AM, Jamie Johnson wrote:

> Isn't the codec stuff merged with trunk now? 

Robert merged this recently AFAIK.

- Mark Miller
lucidimagination.com

Re: Solr Lucene Index Version

Posted by Jamie Johnson <je...@gmail.com>.
Mark,

Agreed that replication wouldn't help; I was dreaming that there was
some intermediate format used in replication.

Ideally you are right, I could just reindex the data and go on with
life, but my case is not so simple.  Currently we have a set of
processes that run against the raw artifact to index things of
interest within the text document.  I don't believe (and I need to
check with the folks who wrote this) that I have an easy way to
reindex currently, but this would be my preference.

Andrzej,

Isn't the codec stuff merged with trunk now?  Admittedly I know very
little about Lucene's index format but I'd be willing to be a guinea
pig if you needed a tester.


On Thu, Dec 8, 2011 at 5:34 AM, Andrzej Bialecki <ab...@getopt.org> wrote:
> On 08/12/2011 05:00, Mark Miller wrote:
>>
>> Replication just copies the index, so I'm not sure how this would help
>> offhand?
>>
>> With SolrCloud this is a breeze - just fire up another replica for a shard
>> and the current index will replicate to it.
>>
>> If you were willing to export the data to some portable format and then
>> pull it back in, why not just store the original data and reindex?
>
>
> This was actually one of the situations that motivated that jira issue -
> there are scenarios where reindexing, or keeping the original data, is very
> costly, in terms of space, time, I/O, pre-processing costs, curating,
> merging, etc, etc...
>
> The good news is that once the recent work on the codecs is merged with the
> trunk then we can revisit this issue and implement it with much less effort
> than before - we could even start by modifying SimpleTextCodec to be more
> lenient, and proceed from there.
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>

Re: Solr Lucene Index Version

Posted by Andrzej Bialecki <ab...@getopt.org>.
On 08/12/2011 05:00, Mark Miller wrote:
> Replication just copies the index, so I'm not sure how this would help offhand?
>
> With SolrCloud this is a breeze - just fire up another replica for a shard and the current index will replicate to it.
>
> If you were willing to export the data to some portable format and then pull it back in, why not just store the original data and reindex?

This was actually one of the situations that motivated that jira issue - 
there are scenarios where reindexing, or keeping the original data, is 
very costly, in terms of space, time, I/O, pre-processing costs, 
curating, merging, etc, etc...

The good news is that once the recent work on the codecs is merged with 
the trunk then we can revisit this issue and implement it with much less 
effort than before - we could even start by modifying SimpleTextCodec to 
be more lenient, and proceed from there.

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Solr Lucene Index Version

Posted by Mark Miller <ma...@gmail.com>.
Replication just copies the index, so I'm not sure how this would help offhand?

With SolrCloud this is a breeze - just fire up another replica for a shard and the current index will replicate to it.

If you were willing to export the data to some portable format and then pull it back in, why not just store the original data and reindex?

On Dec 7, 2011, at 8:39 PM, Jamie Johnson wrote:

> Yeah, I was actually hoping that somehow I could use the replication
> handler to do this: fire up one shard, set another as a slave, and see if
> it would replicate the index to it.  But obviously I'm not sure that
> would work either.
> 
> Something like this would be great too
> https://issues.apache.org/jira/browse/LUCENE-3491
> 
> On Wed, Dec 7, 2011 at 7:48 PM, Mark Miller <ma...@gmail.com> wrote:
>> Unfortunately, I think the only silver bullet here, for pure Solr, is to build a system that makes it possible to reindex somehow.
>> 
>> On Dec 7, 2011, at 1:38 PM, Erik Hatcher wrote:
>> 
>>> 
>>> On Dec 7, 2011, at 13:20 , Shawn Heisey wrote:
>>> 
>>>> On 12/6/2011 2:06 PM, Erik Hatcher wrote:
>>>>> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
>>>> 
>>>> I have no weight with the Lucene project, especially because I know very little of its internals.
>>>> 
>>>> If the code that handles each new index format were also able to read the index format that preceded it, one could incrementally step forward from revision to revision within trunk, running an optimize (forcedMerge?) at each version to upgrade the index format.
>>> 
>>> Shawn - that is the case with Lucene.  The issue Jamie is bringing up is going from an *unreleased* snapshot of Lucene to a later *unreleased* snapshot of Lucene - and those types of guarantees aren't made across snapshots like this.
>>> 
>>> 
>> 
>> - Mark Miller
>> lucidimagination.com
>> 

- Mark Miller
lucidimagination.com

Re: Solr Lucene Index Version

Posted by Jamie Johnson <je...@gmail.com>.
Yeah, I was actually hoping that somehow I could use the replication
handler to do this: fire up one shard, set another as a slave, and see if
it would replicate the index to it.  But obviously I'm not sure that
would work either.

Something like this would be great too
https://issues.apache.org/jira/browse/LUCENE-3491

On Wed, Dec 7, 2011 at 7:48 PM, Mark Miller <ma...@gmail.com> wrote:
> Unfortunately, I think the only silver bullet here, for pure Solr, is to build a system that makes it possible to reindex somehow.
>
> On Dec 7, 2011, at 1:38 PM, Erik Hatcher wrote:
>
>>
>> On Dec 7, 2011, at 13:20 , Shawn Heisey wrote:
>>
>>> On 12/6/2011 2:06 PM, Erik Hatcher wrote:
>>>> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
>>>
>>> I have no weight with the Lucene project, especially because I know very little of its internals.
>>>
>>> If the code that handles each new index format were also able to read the index format that preceded it, one could incrementally step forward from revision to revision within trunk, running an optimize (forcedMerge?) at each version to upgrade the index format.
>>
>> Shawn - that is the case with Lucene.  The issue Jamie is bringing up is going from an *unreleased* snapshot of Lucene to a later *unreleased* snapshot of Lucene - and those types of guarantees aren't made across snapshots like this.
>>
>>
>
> - Mark Miller
> lucidimagination.com

Re: Solr Lucene Index Version

Posted by Mark Miller <ma...@gmail.com>.
Unfortunately, I think the only silver bullet here, for pure Solr, is to build a system that makes it possible to reindex somehow.

On Dec 7, 2011, at 1:38 PM, Erik Hatcher wrote:

> 
> On Dec 7, 2011, at 13:20 , Shawn Heisey wrote:
> 
>> On 12/6/2011 2:06 PM, Erik Hatcher wrote:
>>> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
>> 
>> I have no weight with the Lucene project, especially because I know very little of its internals.
>> 
>> If the code that handles each new index format were also able to read the index format that preceded it, one could incrementally step forward from revision to revision within trunk, running an optimize (forcedMerge?) at each version to upgrade the index format.
> 
> Shawn - that is the case with Lucene.  The issue Jamie is bringing up is going from an *unreleased* snapshot of Lucene to a later *unreleased* snapshot of Lucene - and those types of guarantees aren't made across snapshots like this.
> 
> 

- Mark Miller
lucidimagination.com

Re: Solr Lucene Index Version

Posted by Erik Hatcher <er...@gmail.com>.
On Dec 7, 2011, at 13:20 , Shawn Heisey wrote:

> On 12/6/2011 2:06 PM, Erik Hatcher wrote:
>> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
> 
> I have no weight with the Lucene project, especially because I know very little of its internals.
> 
> If the code that handles each new index format were also able to read the index format that preceded it, one could incrementally step forward from revision to revision within trunk, running an optimize (forcedMerge?) at each version to upgrade the index format.

Shawn - that is the case with Lucene.  The issue Jamie is bringing up is going from an *unreleased* snapshot of Lucene to a later *unreleased* snapshot of Lucene - and those types of guarantees aren't made across snapshots like this.



Re: Solr Lucene Index Version

Posted by Shawn Heisey <so...@elyograg.org>.
On 12/6/2011 2:06 PM, Erik Hatcher wrote:
> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.

I have no weight with the Lucene project, especially because I know very 
little of its internals.

If the code that handles each new index format were also able to read 
the index format that preceded it, one could incrementally step forward 
from revision to revision within trunk, running an optimize 
(forcedMerge?) at each version to upgrade the index format.
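
The stepping idea above can be sketched as a small search problem. This
is only a toy model under stated assumptions: the snapshot labels and the
READABLE table are hypothetical, nothing here calls Lucene, and "open the
index and run a full merge" is modeled as simply moving the index to that
version's own format.

```python
from collections import deque


def upgrade_path(readable, start, target):
    """Return the shortest list of versions to step through, or None.

    `readable` maps a code version to the set of index formats it can open.
    Opening an index with a version and running a full merge is modeled as
    rewriting the index into that version's own format.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == target:
            return path
        for version, formats in readable.items():
            if current in formats and version not in seen:
                seen.add(version)
                queue.append(path + [version])
    return None


# Hypothetical snapshot labels, not real Lucene revisions.
READABLE = {
    "snap-a": {"snap-a"},
    "snap-b": {"snap-a", "snap-b"},
    "snap-c": {"snap-b", "snap-c"},
    "snap-d": {"snap-d"},  # a snapshot that cannot read any earlier format
}

print(upgrade_path(READABLE, "snap-a", "snap-c"))
print(upgrade_path(READABLE, "snap-a", "snap-d"))
```

In this model the chain only works while every step can read its
predecessor; a single snapshot that drops that ability (snap-d here)
leaves no path, which is the case where a reindex is unavoidable.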

On the other hand, any reasonable production installation will consist 
of redundant hardware, and if someone is already willing to take the 
risk of running trunk, it can be argued that they should also be 
prepared to take down one of their redundant systems and use it to 
reindex on an upgraded version, then run parallel update programs on 
both versions.  This is what I did when upgrading from 1.4.1 to 3.2, 
because of the javabin difficulties.  Once I had that system in place, I 
also used it for the minor steps to 3.4 and 3.5.
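
A minimal sketch of the parallel-update approach described above, with
loud assumptions: InMemoryIndex is a hypothetical stand-in for a real
Solr instance, the records are made up, and a single function writes to
both indexes here where the setup above runs separate update programs.

```python
class InMemoryIndex:
    """Stand-in for a search index; a real system would call Solr instead."""

    def __init__(self, name):
        self.name = name
        self.docs = {}

    def add(self, doc_id, fields):
        self.docs[doc_id] = dict(fields)

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)


def apply_update(indexes, update):
    """Apply one update to every index so old and new stay in step."""
    op, doc_id, fields = update
    for index in indexes:
        if op == "add":
            index.add(doc_id, fields)
        elif op == "delete":
            index.delete(doc_id)
        else:
            raise ValueError(f"unknown operation: {op}")


old_index = InMemoryIndex("old-version")
new_index = InMemoryIndex("upgraded-version")

# Step 1: rebuild the upgraded index from the source data (hypothetical records).
source = {"1": {"title": "first"}, "2": {"title": "second"}}
for doc_id, fields in source.items():
    old_index.add(doc_id, fields)  # already present on the live system
    new_index.add(doc_id, fields)  # the reindex onto the upgraded version

# Step 2: send every later update to both until the switch-over.
updates = [("add", "3", {"title": "third"}), ("delete", "1", None)]
for update in updates:
    apply_update([old_index, new_index], update)
```

Once both indexes hold the same documents, traffic can be moved to the
upgraded one and the old one retired or upgraded in turn.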

I'd hope for the former, but if that's not going to happen, there is the 
latter.

Thanks,
Shawn


Re: Solr Lucene Index Version

Posted by Erik Hatcher <er...@gmail.com>.
Jamie -

The details would of course be entirely dependent on what changed, but with Lucene trunk/4.0 there is the flexible indexing API with codecs.  I imagine with a compatibility codec layer one could provide some insulation from changes.

You're at big scale, so I understand the "just reindex everything" answer isn't really satisfactory.  But locking into a version of Lucene may be a decent stop-gap solution, and if/when the format changes you can probably upgrade one node at a time (the Solr request/response won't change!) and reindex in a rolling manner.  Again, it's still risky: there may be changes to the index format needed for SolrCloud enhancements that you want, so you'd be stuck at a fixed place with SolrCloud until you could do some reindexing.
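
The rolling approach can be sketched roughly as follows. This is an
illustration under assumptions, not a SolrCloud procedure: the node
dicts, the `reindex` callable and the `min_serving` threshold are all
hypothetical stand-ins for real cluster operations.

```python
def rolling_reindex(nodes, reindex, min_serving=1):
    """Upgrade nodes one at a time, keeping at least `min_serving` in service.

    `nodes` is a list of dicts with "name", "serving" and "format" keys.
    `reindex` is a callable that rebuilds one node's index and returns the
    new format label. Both are hypothetical stand-ins for real operations.
    """
    log = []
    for node in nodes:
        serving = sum(1 for n in nodes if n["serving"])
        if node["serving"] and serving - 1 < min_serving:
            raise RuntimeError("not enough serving nodes to take one out")
        node["serving"] = False         # take the node out of rotation
        node["format"] = reindex(node)  # rebuild its index on the new version
        node["serving"] = True          # put it back before touching the next
        log.append(node["name"])
    return log


cluster = [
    {"name": "node1", "serving": True, "format": "old"},
    {"name": "node2", "serving": True, "format": "old"},
]
order = rolling_reindex(cluster, lambda node: "new")
```

The guard at the top of the loop is the point of the sketch: a node is
only taken out while enough others keep answering requests, so a
single-node setup fails fast instead of going dark.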

	Erik


On Dec 7, 2011, at 08:50 , Jamie Johnson wrote:

> Erik,
> 
> Do you have any details behind what would be required to write a tool
> to move from one index format to another?  Any examples/suggestions
> would be appreciated.
> 
> On Tue, Dec 6, 2011 at 5:19 PM, Jamie Johnson <je...@gmail.com> wrote:
>> What about modifying something like SolrIndexConfig.java to change the
>> lucene version that is used when creating the index?  (may not be the
>> right place, but is something like this possible?)
>> 
>> On Tue, Dec 6, 2011 at 5:13 PM, Erik Hatcher <er...@gmail.com> wrote:
>>> Right.  Not sure what to advise you.  We have worked on this problem with our LucidWorks platform and have some tools available to do this sort of thing, I think, but it's not generally something that you can do with Lucene going from a snapshot to a released version.  Perhaps others with deeper insight will chime in.
>>> 
>>>        Erik
>>> 
>>> 
>>> 
>>> On Dec 6, 2011, at 16:54 , Jamie Johnson wrote:
>>> 
>>>> Problem is that really doesn't help me.  We still have the same issue
>>>> that when the 4.0 becomes final there is no migration utility from
>>>> this pre 4.0 version to 4.0, right?
>>>> 
>>>> 
>>>> On Tue, Dec 6, 2011 at 4:36 PM, Erik Hatcher <er...@gmail.com> wrote:
>>>>> Oh geez... no... I didn't mean 3.x JARs... I meant the trunk/4.0 ones that are there now.
>>>>> 
>>>>>        Erik
>>>>> 
>>>>> On Dec 6, 2011, at 16:22 , Jamie Johnson wrote:
>>>>> 
>>>>>> So if I wanted to use Lucene index 3.5 with SolrCloud I "should" be
>>>>>> able to just move the 3.5 jars in and remove any of the snapshot jars
>>>>>> that are present when I build locally?
>>>>>> 
>>>>>> On Tue, Dec 6, 2011 at 4:06 PM, Erik Hatcher <er...@gmail.com> wrote:
>>>>>>> Jamie -
>>>>>>> 
>>>>>>> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
>>>>>>> 
>>>>>>>        Erik
>>>>>>> 
>>>>>>> On Dec 6, 2011, at 15:57 , Jamie Johnson wrote:
>>>>>>> 
>>>>>>>> Thanks, but I don't believe that will do it.  From my understanding
>>>>>>>> that does not control the index version written, it's used to control
>>>>>>>> the behavior of some analyzers (taken from some googling).  I'd love
>>>>>>>> if someone told me otherwise though.
>>>>>>>> 
>>>>>>>> On Tue, Dec 6, 2011 at 3:48 PM, Alireza Salimi <al...@gmail.com> wrote:
>>>>>>>>> Hi, I'm not sure if it would help.
>>>>>>>>> 
>>>>>>>>> in solrconfig.xml:
>>>>>>>>> 
>>>>>>>>>  <!-- Controls what version of Lucene various components of Solr
>>>>>>>>>       adhere to.  Generally, you want to use the latest version to
>>>>>>>>>       get all bug fixes and improvements. It is highly recommended
>>>>>>>>>       that you fully re-index after changing this setting as it can
>>>>>>>>>       affect both how text is indexed and queried.
>>>>>>>>>    -->
>>>>>>>>>  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Tue, Dec 6, 2011 at 3:14 PM, Jamie Johnson <je...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Is there a way to specify the index version solr uses?  We're
>>>>>>>>>> currently using SolrCloud but with the index format changing it would be
>>>>>>>>>> preferable to be able to specify a particular index format to avoid
>>>>>>>>>> having to do a complete reindex.  Is this possible?
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Alireza Salimi
>>>>>>>>> Java EE Developer
>>>>>>> 
>>>>> 
>>> 


Re: Solr Lucene Index Version

Posted by Jamie Johnson <je...@gmail.com>.
Erik,

Do you have any details behind what would be required to write a tool
to move from one index format to another?  Any examples/suggestions
would be appreciated.

On Tue, Dec 6, 2011 at 5:19 PM, Jamie Johnson <je...@gmail.com> wrote:
> What about modifying something like SolrIndexConfig.java to change the
> lucene version that is used when creating the index?  (may not be the
> right place, but is something like this possible?)
>
> On Tue, Dec 6, 2011 at 5:13 PM, Erik Hatcher <er...@gmail.com> wrote:
>> Right.  Not sure what to advise you.  We have worked on this problem with our LucidWorks platform and have some tools available to do this sort of thing, I think, but it's not generally something that you can do with Lucene going from a snapshot to a released version.  Perhaps others with deeper insight will chime in.
>>
>>        Erik
>>
>>
>>
>> On Dec 6, 2011, at 16:54 , Jamie Johnson wrote:
>>
>>> Problem is that really doesn't help me.  We still have the same issue
>>> that when the 4.0 becomes final there is no migration utility from
>>> this pre 4.0 version to 4.0, right?
>>>
>>>
>>> On Tue, Dec 6, 2011 at 4:36 PM, Erik Hatcher <er...@gmail.com> wrote:
>>>> Oh geez... no... I didn't mean 3.x JARs... I meant the trunk/4.0 ones that are there now.
>>>>
>>>>        Erik
>>>>
>>>> On Dec 6, 2011, at 16:22 , Jamie Johnson wrote:
>>>>
>>>>> So if I wanted to use Lucene index 3.5 with SolrCloud I "should" be
>>>>> able to just move the 3.5 jars in and remove any of the snapshot jars
>>>>> that are present when I build locally?
>>>>>
>>>>> On Tue, Dec 6, 2011 at 4:06 PM, Erik Hatcher <er...@gmail.com> wrote:
>>>>>> Jamie -
>>>>>>
>>>>>> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
>>>>>>
>>>>>>        Erik
>>>>>>
>>>>>> On Dec 6, 2011, at 15:57 , Jamie Johnson wrote:
>>>>>>
>>>>>>> Thanks, but I don't believe that will do it.  From my understanding
>>>>>>> that does not control the index version written, it's used to control
>>>>>>> the behavior of some analyzers (taken from some googling).  I'd love
>>>>>>> if someone told me otherwise though.
>>>>>>>
>>>>>>> On Tue, Dec 6, 2011 at 3:48 PM, Alireza Salimi <al...@gmail.com> wrote:
>>>>>>>> Hi, I'm not sure if it would help.
>>>>>>>>
>>>>>>>> in solrconfig.xml:
>>>>>>>>
>>>>>>>>  <!-- Controls what version of Lucene various components of Solr
>>>>>>>>       adhere to.  Generally, you want to use the latest version to
>>>>>>>>       get all bug fixes and improvements. It is highly recommended
>>>>>>>>       that you fully re-index after changing this setting as it can
>>>>>>>>       affect both how text is indexed and queried.
>>>>>>>>    -->
>>>>>>>>  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Dec 6, 2011 at 3:14 PM, Jamie Johnson <je...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Is there a way to specify the index version solr uses?  We're
>>>>>>>>> currently using SolrCloud but with the index format changing it would be
>>>>>>>>> preferable to be able to specify a particular index format to avoid
>>>>>>>>> having to do a complete reindex.  Is this possible?
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Alireza Salimi
>>>>>>>> Java EE Developer
>>>>>>
>>>>
>>

Re: Solr Lucene Index Version

Posted by Jamie Johnson <je...@gmail.com>.
What about modifying something like SolrIndexConfig.java to change the
lucene version that is used when creating the index?  (may not be the
right place, but is something like this possible?)

On Tue, Dec 6, 2011 at 5:13 PM, Erik Hatcher <er...@gmail.com> wrote:
> Right.  Not sure what to advise you.  We have worked on this problem with our LucidWorks platform and have some tools available to do this sort of thing, I think, but it's not generally something that you can do with Lucene going from a snapshot to a released version.  Perhaps others with deeper insight will chime in.
>
>        Erik
>
>
>
> On Dec 6, 2011, at 16:54 , Jamie Johnson wrote:
>
>> Problem is that really doesn't help me.  We still have the same issue
>> that when the 4.0 becomes final there is no migration utility from
>> this pre 4.0 version to 4.0, right?
>>
>>
>> On Tue, Dec 6, 2011 at 4:36 PM, Erik Hatcher <er...@gmail.com> wrote:
>>> Oh geez... no... I didn't mean 3.x JARs... I meant the trunk/4.0 ones that are there now.
>>>
>>>        Erik
>>>
>>> On Dec 6, 2011, at 16:22 , Jamie Johnson wrote:
>>>
>>>> So if I wanted to use Lucene index 3.5 with SolrCloud I "should" be
>>>> able to just move the 3.5 jars in and remove any of the snapshot jars
>>>> that are present when I build locally?
>>>>
>>>> On Tue, Dec 6, 2011 at 4:06 PM, Erik Hatcher <er...@gmail.com> wrote:
>>>>> Jamie -
>>>>>
>>>>> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
>>>>>
>>>>>        Erik
>>>>>
>>>>> On Dec 6, 2011, at 15:57 , Jamie Johnson wrote:
>>>>>
>>>>>> Thanks, but I don't believe that will do it.  From my understanding
>>>>>> that does not control the index version written, it's used to control
>>>>>> the behavior of some analyzers (taken from some googling).  I'd love
>>>>>> if someone told me otherwise though.
>>>>>>
>>>>>> On Tue, Dec 6, 2011 at 3:48 PM, Alireza Salimi <al...@gmail.com> wrote:
>>>>>>> Hi, I'm not sure if it would help.
>>>>>>>
>>>>>>> in solrconfig.xml:
>>>>>>>
>>>>>>>  <!-- Controls what version of Lucene various components of Solr
>>>>>>>       adhere to.  Generally, you want to use the latest version to
>>>>>>>       get all bug fixes and improvements. It is highly recommended
>>>>>>>       that you fully re-index after changing this setting as it can
>>>>>>>       affect both how text is indexed and queried.
>>>>>>>    -->
>>>>>>>  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Dec 6, 2011 at 3:14 PM, Jamie Johnson <je...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Is there a way to specify the index version solr uses?  We're
>>>>>>>> currently using SolrCloud but with the index format changing it would be
>>>>>>>> preferable to be able to specify a particular index format to avoid
>>>>>>>> having to do a complete reindex.  Is this possible?
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Alireza Salimi
>>>>>>> Java EE Developer
>>>>>
>>>
>

Re: Solr Lucene Index Version

Posted by Erik Hatcher <er...@gmail.com>.
Right.  Not sure what to advise you.  We have worked on this problem with our LucidWorks platform and have some tools available to do this sort of thing, I think, but it's not generally something that you can do with Lucene going from a snapshot to a released version.  Perhaps others with deeper insight will chime in.

	Erik



On Dec 6, 2011, at 16:54 , Jamie Johnson wrote:

> Problem is that really doesn't help me.  We still have the same issue
> that when the 4.0 becomes final there is no migration utility from
> this pre 4.0 version to 4.0, right?
> 
> 
> On Tue, Dec 6, 2011 at 4:36 PM, Erik Hatcher <er...@gmail.com> wrote:
>> Oh geez... no... I didn't mean 3.x JARs... I meant the trunk/4.0 ones that are there now.
>> 
>>        Erik
>> 
>> On Dec 6, 2011, at 16:22 , Jamie Johnson wrote:
>> 
>>> So if I wanted to use Lucene index 3.5 with SolrCloud I "should" be
>>> able to just move the 3.5 jars in and remove any of the snapshot jars
>>> that are present when I build locally?
>>> 
>>> On Tue, Dec 6, 2011 at 4:06 PM, Erik Hatcher <er...@gmail.com> wrote:
>>>> Jamie -
>>>> 
>>>> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
>>>> 
>>>>        Erik
>>>> 
>>>> On Dec 6, 2011, at 15:57 , Jamie Johnson wrote:
>>>> 
>>>>> Thanks, but I don't believe that will do it.  From my understanding
>>>>> that does not control the index version written, it's used to control
>>>>> the behavior of some analyzers (taken from some googling).  I'd love
>>>>> if someone told me otherwise though.
>>>>> 
>>>>> On Tue, Dec 6, 2011 at 3:48 PM, Alireza Salimi <al...@gmail.com> wrote:
>>>>>> Hi, I'm not sure if it would help.
>>>>>> 
>>>>>> in solrconfig.xml:
>>>>>> 
>>>>>>  <!-- Controls what version of Lucene various components of Solr
>>>>>>       adhere to.  Generally, you want to use the latest version to
>>>>>>       get all bug fixes and improvements. It is highly recommended
>>>>>>       that you fully re-index after changing this setting as it can
>>>>>>       affect both how text is indexed and queried.
>>>>>>    -->
>>>>>>  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>
>>>>>> 
>>>>>> 
>>>>>> On Tue, Dec 6, 2011 at 3:14 PM, Jamie Johnson <je...@gmail.com> wrote:
>>>>>> 
>>>>>>> Is there a way to specify the index version solr uses?  We're
>>>>>>> currently using SolrCloud but with the index format changing I'd be
>>>>>>> preferable to be able to specify a particular index format to avoid
>>>>>>> having to do a complete reindex.  Is this possible?
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Alireza Salimi
>>>>>> Java EE Developer
>>>> 
>> 


Re: Solr Lucene Index Version

Posted by Jamie Johnson <je...@gmail.com>.
The problem is that this doesn't really help me.  We still have the same
issue: when 4.0 becomes final, there is no migration utility from
this pre-4.0 version to 4.0, right?


On Tue, Dec 6, 2011 at 4:36 PM, Erik Hatcher <er...@gmail.com> wrote:
> Oh geez... no... I didn't mean 3.x JARs... I meant the trunk/4.0 ones that are there now.
>
>        Erik
>
> On Dec 6, 2011, at 16:22 , Jamie Johnson wrote:
>
>> So if I wanted to used lucene index 3.5 with SolrCloud I "should" be
>> able to just move the 3.5 jars in and remove any of the snapshot jars
>> that are present when I build locally?
>>
>> On Tue, Dec 6, 2011 at 4:06 PM, Erik Hatcher <er...@gmail.com> wrote:
>>> Jamie -
>>>
>>> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
>>>
>>>        Erik
>>>
>>> On Dec 6, 2011, at 15:57 , Jamie Johnson wrote:
>>>
>>>> Thanks, but I don't believe that will do it.  From my understanding
>>>> that does not control the index version written, it's used to control
>>>> the behavior of some analyzers (taken from some googling).  I'd love
>>>> if someone told me otherwise though.
>>>>
>>>> On Tue, Dec 6, 2011 at 3:48 PM, Alireza Salimi <al...@gmail.com> wrote:
>>>>> Hi, I'm not sure if it would help.
>>>>>
>>>>> in solrconfig.xml:
>>>>>
>>>>>  <!-- Controls what version of Lucene various components of Solr
>>>>>       adhere to.  Generally, you want to use the latest version to
>>>>>       get all bug fixes and improvements. It is highly recommended
>>>>>       that you fully re-index after changing this setting as it can
>>>>>       affect both how text is indexed and queried.
>>>>>    -->
>>>>>  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>
>>>>>
>>>>>
>>>>> On Tue, Dec 6, 2011 at 3:14 PM, Jamie Johnson <je...@gmail.com> wrote:
>>>>>
>>>>>> Is there a way to specify the index version solr uses?  We're
>>>>>> currently using SolrCloud but with the index format changing I'd be
>>>>>> preferable to be able to specify a particular index format to avoid
>>>>>> having to do a complete reindex.  Is this possible?
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Alireza Salimi
>>>>> Java EE Developer
>>>
>

Re: Solr Lucene Index Version

Posted by Erik Hatcher <er...@gmail.com>.
Oh geez... no... I didn't mean 3.x JARs... I meant the trunk/4.0 ones that are there now.

	Erik

On Dec 6, 2011, at 16:22 , Jamie Johnson wrote:

> So if I wanted to used lucene index 3.5 with SolrCloud I "should" be
> able to just move the 3.5 jars in and remove any of the snapshot jars
> that are present when I build locally?
> 
> On Tue, Dec 6, 2011 at 4:06 PM, Erik Hatcher <er...@gmail.com> wrote:
>> Jamie -
>> 
>> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
>> 
>>        Erik
>> 
>> On Dec 6, 2011, at 15:57 , Jamie Johnson wrote:
>> 
>>> Thanks, but I don't believe that will do it.  From my understanding
>>> that does not control the index version written, it's used to control
>>> the behavior of some analyzers (taken from some googling).  I'd love
>>> if someone told me otherwise though.
>>> 
>>> On Tue, Dec 6, 2011 at 3:48 PM, Alireza Salimi <al...@gmail.com> wrote:
>>>> Hi, I'm not sure if it would help.
>>>> 
>>>> in solrconfig.xml:
>>>> 
>>>>  <!-- Controls what version of Lucene various components of Solr
>>>>       adhere to.  Generally, you want to use the latest version to
>>>>       get all bug fixes and improvements. It is highly recommended
>>>>       that you fully re-index after changing this setting as it can
>>>>       affect both how text is indexed and queried.
>>>>    -->
>>>>  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>
>>>> 
>>>> 
>>>> On Tue, Dec 6, 2011 at 3:14 PM, Jamie Johnson <je...@gmail.com> wrote:
>>>> 
>>>>> Is there a way to specify the index version solr uses?  We're
>>>>> currently using SolrCloud but with the index format changing I'd be
>>>>> preferable to be able to specify a particular index format to avoid
>>>>> having to do a complete reindex.  Is this possible?
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Alireza Salimi
>>>> Java EE Developer
>> 


Re: Solr Lucene Index Version

Posted by Jamie Johnson <je...@gmail.com>.
So if I wanted to use the Lucene 3.5 index format with SolrCloud, I "should"
be able to just move the 3.5 jars in and remove any of the snapshot jars
that are present when I build locally?

On Tue, Dec 6, 2011 at 4:06 PM, Erik Hatcher <er...@gmail.com> wrote:
> Jamie -
>
> I think the best thing that you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  Certainly not out of the realm of possibilities of some upcoming SolrCloud capability that requires some upgrading of Lucene though, but you may be set for a little while at least.
>
>        Erik
>
> On Dec 6, 2011, at 15:57 , Jamie Johnson wrote:
>
>> Thanks, but I don't believe that will do it.  From my understanding
>> that does not control the index version written, it's used to control
>> the behavior of some analyzers (taken from some googling).  I'd love
>> if someone told me otherwise though.
>>
>> On Tue, Dec 6, 2011 at 3:48 PM, Alireza Salimi <al...@gmail.com> wrote:
>>> Hi, I'm not sure if it would help.
>>>
>>> in solrconfig.xml:
>>>
>>>  <!-- Controls what version of Lucene various components of Solr
>>>       adhere to.  Generally, you want to use the latest version to
>>>       get all bug fixes and improvements. It is highly recommended
>>>       that you fully re-index after changing this setting as it can
>>>       affect both how text is indexed and queried.
>>>    -->
>>>  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>
>>>
>>>
>>> On Tue, Dec 6, 2011 at 3:14 PM, Jamie Johnson <je...@gmail.com> wrote:
>>>
>>>> Is there a way to specify the index version solr uses?  We're
>>>> currently using SolrCloud but with the index format changing I'd be
>>>> preferable to be able to specify a particular index format to avoid
>>>> having to do a complete reindex.  Is this possible?
>>>>
>>>
>>>
>>>
>>> --
>>> Alireza Salimi
>>> Java EE Developer
>

Re: Solr Lucene Index Version

Posted by Erik Hatcher <er...@gmail.com>.
Jamie -

I think the best thing you could do here would be to lock in a version of Lucene (all the Lucene libraries) that you use with SolrCloud.  It's certainly not out of the realm of possibility that some upcoming SolrCloud capability will require upgrading Lucene, but you may be set for a little while at least.

	Erik
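
A minimal sketch of how one might check that a lib directory really is locked to a single Lucene version. This is only an illustration under stated assumptions: the jar names and the naming pattern (lucene-<module>-<version>.jar) are hypothetical examples, not taken from an actual SolrCloud build, and the helper names are made up for this sketch.

```python
import re
from collections import defaultdict

# Assumed naming convention: lucene-<module>-<version>.jar, for example
# lucene-core-3.5.0.jar or lucene-core-4.0-SNAPSHOT.jar (hypothetical names).
JAR_PATTERN = re.compile(r"^(lucene-[a-z0-9-]+?)-(\d[\w.]*(?:-SNAPSHOT)?)\.jar$")


def lucene_versions(jar_names):
    """Group Lucene module names by the version found in each jar file name."""
    versions = defaultdict(list)
    for name in jar_names:
        match = JAR_PATTERN.match(name)
        if match:  # non-Lucene jars (e.g. solr-*.jar) are ignored
            versions[match.group(2)].append(match.group(1))
    return dict(versions)


def is_locked(jar_names):
    """True when at most one Lucene version appears among the jars."""
    return len(lucene_versions(jar_names)) <= 1


if __name__ == "__main__":
    # Hypothetical directory listing; in practice this could come from os.listdir().
    jars = [
        "lucene-core-4.0-SNAPSHOT.jar",
        "lucene-analyzers-common-4.0-SNAPSHOT.jar",
        "solr-core-4.0-SNAPSHOT.jar",
    ]
    print(lucene_versions(jars))
    print(is_locked(jars))
```

A check like this only compares file names. Two snapshot jars with the same version string can still come from different trunk revisions, so it does not prove the index format is identical.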

On Dec 6, 2011, at 15:57 , Jamie Johnson wrote:

> Thanks, but I don't believe that will do it.  From my understanding
> that does not control the index version written, it's used to control
> the behavior of some analyzers (taken from some googling).  I'd love
> if someone told me otherwise though.
> 
> On Tue, Dec 6, 2011 at 3:48 PM, Alireza Salimi <al...@gmail.com> wrote:
>> Hi, I'm not sure if it would help.
>> 
>> in solrconfig.xml:
>> 
>>  <!-- Controls what version of Lucene various components of Solr
>>       adhere to.  Generally, you want to use the latest version to
>>       get all bug fixes and improvements. It is highly recommended
>>       that you fully re-index after changing this setting as it can
>>       affect both how text is indexed and queried.
>>    -->
>>  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>
>> 
>> 
>> On Tue, Dec 6, 2011 at 3:14 PM, Jamie Johnson <je...@gmail.com> wrote:
>> 
>>> Is there a way to specify the index version solr uses?  We're
>>> currently using SolrCloud but with the index format changing I'd be
>>> preferable to be able to specify a particular index format to avoid
>>> having to do a complete reindex.  Is this possible?
>>> 
>> 
>> 
>> 
>> --
>> Alireza Salimi
>> Java EE Developer


Re: Solr Lucene Index Version

Posted by Jamie Johnson <je...@gmail.com>.
Thanks, but I don't believe that will do it.  From my understanding,
that does not control the index version written; it's used to control
the behavior of some analyzers (based on some googling).  I'd love it
if someone told me otherwise, though.
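
For reference, a minimal sketch of reading the setting back out of a config file, assuming a stripped-down solrconfig.xml with luceneMatchVersion as a direct child of the root element. The sample document and the function name are hypothetical. The value it returns is the compatibility setting discussed above, not a report of the on-disk index format.

```python
import xml.etree.ElementTree as ET

# Hypothetical, stripped-down solrconfig.xml; a real file has many more elements.
SAMPLE_SOLRCONFIG = """
<config>
  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>
</config>
"""


def read_match_version(solrconfig_xml):
    """Return the luceneMatchVersion text, or None if the element is absent."""
    root = ET.fromstring(solrconfig_xml.strip())
    element = root.find("luceneMatchVersion")
    if element is None or element.text is None:
        return None
    return element.text.strip()


if __name__ == "__main__":
    print(read_match_version(SAMPLE_SOLRCONFIG))
```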

On Tue, Dec 6, 2011 at 3:48 PM, Alireza Salimi <al...@gmail.com> wrote:
> Hi, I'm not sure if it would help.
>
> in solrconfig.xml:
>
>  <!-- Controls what version of Lucene various components of Solr
>       adhere to.  Generally, you want to use the latest version to
>       get all bug fixes and improvements. It is highly recommended
>       that you fully re-index after changing this setting as it can
>       affect both how text is indexed and queried.
>    -->
>  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>
>
>
> On Tue, Dec 6, 2011 at 3:14 PM, Jamie Johnson <je...@gmail.com> wrote:
>
>> Is there a way to specify the index version solr uses?  We're
>> currently using SolrCloud but with the index format changing I'd be
>> preferable to be able to specify a particular index format to avoid
>> having to do a complete reindex.  Is this possible?
>>
>
>
>
> --
> Alireza Salimi
> Java EE Developer

Re: Solr Lucene Index Version

Posted by Alireza Salimi <al...@gmail.com>.
Hi, I'm not sure if this will help.

In solrconfig.xml:

 <!-- Controls what version of Lucene various components of Solr
       adhere to.  Generally, you want to use the latest version to
       get all bug fixes and improvements. It is highly recommended
       that you fully re-index after changing this setting as it can
       affect both how text is indexed and queried.
    -->
  <luceneMatchVersion>LUCENE_34</luceneMatchVersion>


On Tue, Dec 6, 2011 at 3:14 PM, Jamie Johnson <je...@gmail.com> wrote:

> Is there a way to specify the index version solr uses?  We're
> currently using SolrCloud but with the index format changing I'd be
> preferable to be able to specify a particular index format to avoid
> having to do a complete reindex.  Is this possible?
>



-- 
Alireza Salimi
Java EE Developer