Posted to users@solr.apache.org by Michael Conrad <mi...@newsrx.com> on 2021/10/08 12:47:42 UTC

Upgrade Solr Segments: UpgradeIndexMergePolicy

Would this help?

UpgradeIndexMergePolicy

This |MergePolicy|
<https://lucene.apache.org/core/8_2_0//core/org/apache/lucene/index/MergePolicy.html>
is used for upgrading all existing segments of an index when calling
|IndexWriter.forceMerge(int)|
<https://lucene.apache.org/core/8_2_0//core/org/apache/lucene/index/IndexWriter.html#forceMerge-int->.
All other methods delegate to the base |MergePolicy| given to the
constructor. This allows for an as-cheap-as-possible upgrade of an older
index by only upgrading segments that were created by previous Lucene
versions. forceMerge no longer really merges; it is just used to
"forceMerge" older segment versions away.

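On the earlier question of how to tell which Lucene version wrote each
segment, here is a minimal Python sketch. It assumes the JSON returned by
Solr's segments info handler (/solr/<collection>/admin/segments?wt=json)
contains a "segments" map whose entries carry a "version" string. That
response shape, the sample payload, and the helper name old_segments are
assumptions for illustration only, not output captured from a real cluster.

```python
import json


def old_segments(segments_response, current_major):
    """Return the sorted names of segments written by an older Lucene major.

    segments_response: parsed JSON, assumed to look like
        {"segments": {"_0": {"version": "6.6.2", ...}, ...}}
    current_major: the Lucene major version currently in use, e.g. 8.
    """
    old = []
    for name, info in segments_response.get("segments", {}).items():
        version = info.get("version")
        if version is None:
            # No version reported for this segment: it cannot be classified.
            continue
        major = int(version.split(".")[0])
        if major < current_major:
            old.append(name)
    return sorted(old)


# Hypothetical payload in the assumed shape; the values are made up.
SAMPLE_RESPONSE = json.loads("""
{
  "segments": {
    "_0": {"name": "_0", "version": "6.6.2"},
    "_1": {"name": "_1", "version": "7.7.3"},
    "_2": {"name": "_2", "version": "8.9.0"}
  }
}
""")

if __name__ == "__main__":
    # List the segments that an upgrade-only merge would need to rewrite.
    print(old_segments(SAMPLE_RESPONSE, 8))
```

In practice the dict would come from an HTTP GET against the handler rather
than from the inline sample. The function only reports which segments are
old; it does not rewrite anything.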

On 10/7/21 8:46 AM, Rahul Goswami wrote:
> Won’t work. I have tried optimize on 7.7.2 to 8.x where several segments
> were originally written in 5.x and 6.x.
> We are scratching our heads to achieve this seamlessly since reindexing
> will take several weeks given the size of indexes for many of our customers.
>
> -Rahul
>
> On Thu, Oct 7, 2021 at 8:35 AM Michael Conrad <mi...@newsrx.com> wrote:
>
>> No, worst case is it closes the index writer and leaves the drive full.
>> 20k free space remaining.
>>
>> On 10/6/21 1:56 PM, Dave wrote:
>>> It’s ok. Worst case it just fails and kills the temporary index after
>>> you run out of space. Really, optimize is almost not even supported (it
>>> still works), but a full reindex is always the best bet if you can destroy
>>> the original and it doesn’t affect anything.
>>>> On Oct 6, 2021, at 1:53 PM, Michael Conrad <mi...@newsrx.com> wrote:
>>>>
>>>> too late.... it's in progress.
>>>>
>>>>> On 10/6/21 9:11 AM, Dave wrote:
>>>>> Hold on that idea then. An optimize will use three times your index
>>>>> size possibly.
>>>>>>> On Oct 6, 2021, at 9:02 AM, Michael Conrad <mi...@newsrx.com> wrote:
>>>>>> Thanks,
>>>>>>
>>>>>> I think we'll try the full optimize route as we don't have storage to
>>>>>> spare for second copies, etc.
>>>>>> -Mike
>>>>>>
>>>>>>> On 10/6/21 8:54 AM, Dave wrote:
>>>>>>> Personally I always do a full reindex when going to a new version,
>>>>>>> just safer and you should always be able to do such at any point. However,
>>>>>>> if you have the time to spare you can do an optimize and it will force the
>>>>>>> segments all into the current version.
>>>>>>>>> On Oct 6, 2021, at 8:46 AM, Michael Conrad <mi...@newsrx.com> wrote:
>>>>>>>> Hello all,
>>>>>>>>
>>>>>>>> Is there an easy way to determine Lucene versions for segments?
>>>>>>>>
>>>>>>>> If we were to do a full reindex, rewriting all segments, would that
>>>>>>>> update the segment version to match the current Lucene version in use?
>>>>>>>> We are working on upgrading from Solr 7.7.3 to Solr 8.x but have
>>>>>>>> discovered that several of our collections have segments that are Lucene 6.
>>>>>>>> -Mike
>>


How to use? Upgrade Solr Segments: UpgradeIndexMergePolicy

Posted by Michael Conrad <mi...@newsrx.com>.
It seems to cause my merge requests to become no-ops?

I am trying this on a single smaller collection.

<mergePolicyFactory class="org.apache.solr.index.UpgradeIndexMergePolicyFactory">
  <str name="wrapped.prefix">mergePolicy</str>
  <str name="mergePolicy.class">org.apache.solr.index.TieredMergePolicyFactory</str>
  <double name="mergePolicy.noCFSRatio">0.1</double>
</mergePolicyFactory>



On 10/8/21 9:40 AM, Rahul Goswami wrote:
> Thanks. I will check this out. I remember going through the code a while
> back where there is an explicit check in one of the codec classes for
> versions older than 7.x and it throws an IndexFormatTooOldException. So I
> doubt this will help.
> But I will be glad to be proved wrong if this works.


Re: Upgrade Solr Segments: UpgradeIndexMergePolicy

Posted by Rahul Goswami <ra...@gmail.com>.
Thanks. I will check this out. I remember going through the code a while
back where there is an explicit check in one of the codec classes for
versions older than 7.x and it throws an IndexFormatTooOldException. So I
doubt this will help.
But I will be glad to be proved wrong if this works.

