Posted to user@lenya.apache.org by Andrew Golightly <A....@dcs.shef.ac.uk> on 2006/05/04 15:57:50 UTC

backing up lucene indexing in lenya with svn

Hey everyone,

As you all probably know, Lucene saves its indexes in the work 
directory. We store our publication in an svn repo, and I am wondering 
whether it is wise to store the index there too.

Advantages:
- If the publication is checked out elsewhere, searching is immediately 
available.

Disadvantages:
- If people are working on their own local copies, then the index 
directory constantly conflicts in svn. That includes binary 
files such as work/lucene/index/authoring/index/segments, so resolving 
the conflicts is near impossible.

From what I understand, Lenya 1.4 indexes its pages when 
submit/publish is called in the workflow. So if a site was checked out 
elsewhere and the index was not saved, then the whole site would have 
to be manually re-published. Correct?

Appreciate your input.
Andrew

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@lenya.apache.org
For additional commands, e-mail: user-help@lenya.apache.org


Re: backing up lucene indexing in lenya with svn

Posted by Josias Thöny <jo...@wyona.com>.
On Fri, 2006-05-26 at 13:51 +0100, Andrew Golightly wrote:
> 
> Doug Chestnut wrote:
> > Well, the pages that were published elsewhere would need to be 
> > republished I guess.  We really need a reindex-publication usecase for 
> > this.
> 
> Anyone thought more about this? I'm constantly losing my indexes 
> every time I do a "./build.sh clean".

You could try to store the indexes outside of the webapp by changing the
paths in lucene_index.xconf. This way they should not be deleted when
you do a build clean.
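
Until the paths are changed, a stopgap is to copy the index directory out of the build tree before a clean and put it back afterwards. The following is only a minimal sketch in Python under stated assumptions: the function names and the backup location are hypothetical, and it does not touch lucene_index.xconf or any Lenya API.

```python
import shutil
from pathlib import Path


def backup_index(index_dir: Path, backup_dir: Path) -> bool:
    """Copy index_dir to backup_dir, replacing any older backup.

    Returns False (and does nothing) if index_dir does not exist.
    """
    if not index_dir.is_dir():
        return False
    if backup_dir.exists():
        shutil.rmtree(backup_dir)
    shutil.copytree(index_dir, backup_dir)
    return True


def restore_index(backup_dir: Path, index_dir: Path) -> bool:
    """Copy backup_dir back to index_dir, but only if the index is missing.

    Returns False if there is no backup or an index is already in place.
    """
    if not backup_dir.is_dir() or index_dir.exists():
        return False
    index_dir.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(backup_dir, index_dir)
    return True
```

One would call backup_index() on the publication's work/lucene/index directory before the clean and restore_index() after the rebuild. The restored index is only as fresh as the last backup, so pages published in between would still need to be republished.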

hth,
Josias

> If someone is close to releasing a 
> script that re-indexes the entire publication, please let me know!
> 
> thank you :)
> Andrew


Re: backing up lucene indexing in lenya with svn

Posted by Andrew Golightly <A....@dcs.shef.ac.uk>.

Doug Chestnut wrote:

> Well, the pages that were published elsewhere would need to be 
> republished I guess.  We really need a reindex-publication usecase for 
> this.

Anyone thought more about this? I'm constantly losing my indexes 
every time I do a "./build.sh clean". If someone is close to releasing a 
script that re-indexes the entire publication, please let me know!

thank you :)
Andrew



Re: backing up lucene indexing in lenya with svn

Posted by Doug Chestnut <dh...@virginia.edu>.
I have not had the time to look into it too closely.  It looks like one 
would need to remove the existing index usecase's dependence on the 
request context (it uses the page envelope to get the publication and 
document id) and make a usecase that iterates through all the documents 
and indexes each one.
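
The "iterate through all the documents and index each one" idea can be sketched as follows. This is a Python illustration of the loop only, not Lenya code: it assumes the documents are XML files somewhere below one content directory, and find_documents, reindex_all and the index_document callback are hypothetical names. A real usecase would be written in Java against Lenya's own publication and document classes.

```python
from pathlib import Path
from typing import Callable, Iterator


def find_documents(content_dir: Path) -> Iterator[Path]:
    """Yield every XML file below content_dir, in a stable sorted order."""
    yield from sorted(p for p in content_dir.rglob("*.xml") if p.is_file())


def reindex_all(content_dir: Path,
                index_document: Callable[[Path], None]) -> int:
    """Call index_document once per document and return how many were indexed.

    index_document stands in for whatever adds a single document to the
    index; it receives the document path directly, so nothing depends on
    a current request or page envelope.
    """
    count = 0
    for doc in find_documents(content_dir):
        index_document(doc)
        count += 1
    return count
```

The point of the design is that the per-document step takes the document as an explicit argument, so the same step can serve both the workflow-triggered case and a whole-publication reindex.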

I would like to make a Solr module.  Erik Hatcher (Lucene guru) works 
down the hall from me and was showing some very cool Solr functionality 
to me last week.  Using a separate indexing/searching server would help 
with your problems, I think.

--Doug

Andrew Golightly wrote:
> 
> 
> Michael Wechner wrote:
> 
>> Doug Chestnut wrote:
>>> Well, the pages that were published elsewhere would need to be 
>>> republished I guess.  We really need a reindex-publication usecase 
>>> for this.
>>
>>
>>
>> Agreed, although something like this already exists in Lenya 1.2.x.
>>
>> But we should also make the location of the index configurable with 
>> local config files. Maybe it's already possible or easy to fix, but I 
>> have not found time yet to check.
> 
> Anyone made any progress on this idea? Does it already exist in some way 
> and I've just missed it?
> 
> Andrew


Re: backing up lucene indexing in lenya with svn

Posted by Andrew Golightly <A....@dcs.shef.ac.uk>.

Michael Wechner wrote:
> Doug Chestnut wrote:
>> Well, the pages that were published elsewhere would need to be 
>> republished I guess.  We really need a reindex-publication usecase 
>> for this.
>
>
> Agreed, although something like this already exists in Lenya 1.2.x.
>
> But we should also make the location of the index configurable with 
> local config files. Maybe it's already possible or easy to fix, but I 
> have not found time yet to check.

Anyone made any progress on this idea? Does it already exist in some way 
and I've just missed it?

Andrew



Re: backing up lucene indexing in lenya with svn

Posted by Michael Wechner <mi...@wyona.com>.
Doug Chestnut wrote:

> Well, the pages that were published elsewhere would need to be 
> republished I guess.  We really need a reindex-publication usecase for 
> this.


Agreed, although something like this already exists in Lenya 1.2.x.

But we should also make the location of the index configurable with 
local config files. Maybe it's already possible or easy to fix, but I 
have not found time yet to check.

Michi

>
>
> --Doug


-- 
Michael Wechner
Wyona      -   Open Source Content Management   -    Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
michael.wechner@wyona.com                        michi@apache.org
+41 44 272 91 61




Re: backing up lucene indexing in lenya with svn

Posted by Doug Chestnut <dh...@virginia.edu>.

Andrew Golightly wrote:
> From what I understand, Lenya 1.4 indexes its pages when 
> submit/publish is called in the workflow. So if a site was checked out 
> elsewhere and the index was not saved, then the whole site would have 
> to be manually re-published. Correct?

Well, the pages that were published elsewhere would need to be 
republished, I guess.  We really need a reindex-publication usecase for this.

--Doug