Posted to user@forrest.apache.org by "Szabo, Patrick (LNG-VIE)" <pa...@lexisnexis.at> on 2010/11/08 13:25:41 UTC

Lucene Search

Hi, 
 
I'm new to Forrest, and my boss in all their wisdom decided that I have to
administer our Forrest installation now.
I wasn't involved in our installation until now.
 
I've already read the Forrest documentation and it did help me, but
there are quite a few questions that I couldn't answer.
 
I understand that I can use fields to search with Lucene. Is there a
list somewhere where I can see which fields are available?
Can I somehow add fields?
We have an element <revised modified="23.06.2010"/> in our source XML
and I would like to be able to search by that date. E.g. I want to see
all the documents that were modified last week.

I've got a few more questions but I don't want to pack them all into just
one mail.

I hope you guys can help me!

Thanks in advance...

Kind regards


. . . . . . . . . . . . . . . . . . . . . . . . . .
Patrick Szabo
 XSLT Developer
LexisNexis
Marxergasse 25, 1030 Wien

mailto:patrick.szabo@lexisnexis.at
Tel.: +43 (1) 534 52 - 1573 
Fax: +43 (1) 534 52 - 146 

http://shop.lexisnexis.at/


Re: Lucene Search

Posted by "Szabo, Patrick (LNG-VIE)" <pa...@lexisnexis.at>.
Thanks a lot David - your tips have already helped me ^^

kind regards 



-----Original Message-----

From: David Crossley [mailto:crossley@apache.org] 
Sent: Wednesday, 10 November 2010 15:23
To: user@forrest.apache.org
Subject: Re: Lucene Search

Szabo, Patrick (LNG-VIE) wrote:
> 
> I thought it was going to have something to do with xdoc-to-lucene.xsl but I don't know how to implement a new field in that stylesheet. I can see how the other fields are stored, but I don't know what the xdoc version of my files looks like, so I can't extend the existing template. 
> 
> I could just take a look at what the xml2xdoc stylesheet does but it would be a lot easier if I could take a look at an actual xdoc file... Is there a way to store those intermediate xdoc files?

Do 'forrest run' then request
localhost:8888/index.xml

See some other tips in
http://forrest.apache.org/howto-dev.html

Also perhaps see the main/webapp/search.xmap
which will show some other internal requests
that might be useful such as
localhost:8888/index.lucene
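
To keep copies of those intermediate views on disk, the requests above
can be scripted. The following is only a minimal sketch: it assumes
'forrest run' is serving on localhost:8888 as described above, and the
helper names (intermediate_url, save_view) and the output directory are
made up for illustration, not part of Forrest.

```python
from pathlib import Path
from urllib.parse import urljoin
from urllib.request import urlopen

# Address used by 'forrest run' in this thread; adjust if yours differs.
BASE_URL = "http://localhost:8888/"


def intermediate_url(doc, view, base=BASE_URL):
    """Build the URL for one view of a document, e.g. ('index', 'xml')."""
    return urljoin(base, f"{doc}.{view}")


def save_view(doc, view, out_dir="intermediate", base=BASE_URL):
    """Fetch one view from the running Forrest instance and write it to disk.

    Hypothetical helper: needs a running 'forrest run' to succeed.
    """
    target = Path(out_dir) / f"{doc.replace('/', '_')}.{view}"
    target.parent.mkdir(parents=True, exist_ok=True)
    with urlopen(intermediate_url(doc, view, base)) as response:
        target.write_bytes(response.read())
    return target
```

With the server running, save_view("index", "xml") would store the
intermediate xdoc and save_view("index", "lucene") the document as it
is handed to the indexer.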

> I'm not very familiar with Cocoon yet so that might be a dumb question.
> 
> The pipeline goes like this: our XML (DITA) -> xdoc -> HTML, PDF, ... Right?

Yes.

-David



Re: Lucene Search

Posted by "Szabo, Patrick (LNG-VIE)" <pa...@lexisnexis.at>.
Hi, 

Thanks for your response.

I thought it was going to have something to do with xdoc-to-lucene.xsl but I don't know how to implement a new field in that stylesheet. I can see how the other fields are stored, but I don't know what the xdoc version of my files looks like, so I can't extend the existing template. 

I could just take a look at what the xml2xdoc stylesheet does but it would be a lot easier if I could take a look at an actual xdoc file... Is there a way to store those intermediate xdoc files?

I'm not very familiar with Cocoon yet so that might be a dumb question.

The pipeline goes like this: our XML (DITA) -> xdoc -> HTML, PDF, ... Right?

Thanks

kind regards 


-----Original Message-----

From: Tim Williams [mailto:williamstw@gmail.com] 
Sent: Tuesday, 09 November 2010 04:08
To: user@forrest.apache.org
Subject: Re: Lucene Search

On Mon, Nov 8, 2010 at 7:25 AM, Szabo, Patrick (LNG-VIE)
<pa...@lexisnexis.at> wrote:
> Hi,
>
> I'm new to Forrest, and my boss in all their wisdom decided that I have to
> administer our Forrest installation now.
> I wasn't involved in our installation until now.
>
> I've already read the Forrest documentation and it did help me, but
> there are quite a few questions that I couldn't answer.

Yeah, in trying to answer your question I realize that the
documentation in this area is weak.  After you understand this stuff,
it'd be great if you could contribute to it.

> I understand that I can use fields to search with Lucene. Is there a
> list somewhere where I can see which fields are available?

For any given document, you can request it with a .lucene extension
(e.g. index.lucene) and the element names in the result are the
searchable field names.  These are mostly title, subtitle, abstract,
version, author, and content.
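
As a rough illustration of reading the field names off such a
document, here is a small sketch. The sample XML is hypothetical: the
root element and namespace are assumptions about what the .lucene view
looks like, so check the real output of your own installation before
relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a .lucene view; the real structure may differ.
SAMPLE = """<lucene:document xmlns:lucene="http://apache.org/cocoon/lucene/1.0">
  <title>Welcome</title>
  <abstract>Short summary</abstract>
  <content>Body text</content>
</lucene:document>"""


def field_names(lucene_xml):
    """Return the local names of the root's child elements, in document order."""
    root = ET.fromstring(lucene_xml)
    names = []
    for child in root:
        # Strip a '{namespace}' prefix if ElementTree added one.
        local = child.tag.rsplit("}", 1)[-1]
        if local not in names:
            names.append(local)
    return names
```

Calling field_names() on the text fetched from a real .lucene URL
would list the fields you can search on for that document.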

> Can I somehow add fields?

I don't think there's a way to add per-project fields, but for your
Forrest installation, you can try adding one to
$FORREST_HOME/main/webapp/resources/stylesheets/xdoc-to-lucene.xsl

> We have an element <revised modified="23.06.2010"/> in our source XML
> and I would like to be able to search by that date. E.g. I want to see
> all the documents that were modified last week.

Hmm... I'm not sure, I reckon you'd have to index it based on its xdoc
equivalent.
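
One thing to keep in mind if the date does get indexed: Lucene range
queries on plain text terms compare lexicographically, so a value like
23.06.2010 would need to be stored in a sortable form such as 20100623.
The sketch below shows that conversion and a "last week" range query
in the classic Lucene query syntax. The field name "modified" is an
assumption, and whether the Forrest search form passes such a query
through unchanged is not confirmed here.

```python
from datetime import date, datetime, timedelta


def to_sortable(modified):
    """Convert 'DD.MM.YYYY' (as in the modified attribute) to 'YYYYMMDD'."""
    return datetime.strptime(modified, "%d.%m.%Y").strftime("%Y%m%d")


def last_week_range_query(today, field="modified"):
    """Build an inclusive range query from seven days before `today` through `today`.

    `field` is a hypothetical field name, not one Forrest defines.
    """
    start = today - timedelta(days=7)
    return f"{field}:[{start:%Y%m%d} TO {today:%Y%m%d}]"
```

For example, last_week_range_query(date(2010, 6, 28)) yields
modified:[20100621 TO 20100628], and to_sortable("23.06.2010") falls
inside that range when compared as a string.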

> I've got a few more questions but I don't want to pack them all into just
> one mail.

Yeah, one per thread is always preferred - mail threads are cheap though :)

I'm not sure what version of Forrest you are on but I had to do some
hacking just to get the indexing to work - I reckon it's been a
long-standing bug.  I'll take that up on the dev@ list though.  Good
luck!

--tim
