You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by "Viparthi, Kiran (AFIS)" <Ki...@fao.org> on 2004/05/17 11:56:40 UTC
RE: SELECTIVE Indexing
Try using Tidy.
Creates a Document of the html and allows you to apply xpath.
Hope this helps.
Kiran.
-----Original Message-----
From: Karthik N S [mailto:karthik@controlnet.co.in]
Sent: 17 May 2004 11:59
To: Lucene Users List
Subject: SELECTIVE Indexing
Hi all
Can Some Body tell me How to Index CERTAIN PORTION OF THE HTML FILE Only
ex:-
<table .....>
....
</table>
with regards
Karthik
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
RE: SELECTIVE Indexing
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Lucene has no plug-in architecture, and does not assume you are
indexing web pages, so your use of JTidy is all up to you, and
independent of Lucene. Just feed Lucene the resulting text that you
want to index and search.
Otis
--- Karthik N S <ka...@controlnet.co.in> wrote:
> Hi
>
> Can I Use TIDY [as plug in ] with Lucene ...
>
>
> with regards
> Karthik
>
> -----Original Message-----
> From: Viparthi, Kiran (AFIS) [mailto:Kiran.Viparthi@fao.org]
> Sent: Monday, May 17, 2004 3:27 PM
> To: 'Lucene Users List'
> Subject: RE: SELECTIVE Indexing
>
>
>
> Try using Tidy.
> Creates a Document of the html and allows you to apply xpath.
> Hope this helps.
>
> Kiran.
>
> -----Original Message-----
> From: Karthik N S [mailto:karthik@controlnet.co.in]
> Sent: 17 May 2004 11:59
> To: Lucene Users List
> Subject: SELECTIVE Indexing
>
>
>
> Hi all
>
> Can Some Body tell me How to Index CERTAIN PORTION OF THE HTML
> FILE Only
>
> ex:-
> <table .....>
> ....
>
> </table>
>
>
> with regards
> Karthik
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
RE: SELECTIVE Indexing
Posted by Karthik N S <ka...@controlnet.co.in>.
Hi
Can I Use TIDY [as plug in ] with Lucene ...
with regards
Karthik
-----Original Message-----
From: Viparthi, Kiran (AFIS) [mailto:Kiran.Viparthi@fao.org]
Sent: Monday, May 17, 2004 3:27 PM
To: 'Lucene Users List'
Subject: RE: SELECTIVE Indexing
Try using Tidy.
Creates a Document of the html and allows you to apply xpath.
Hope this helps.
Kiran.
-----Original Message-----
From: Karthik N S [mailto:karthik@controlnet.co.in]
Sent: 17 May 2004 11:59
To: Lucene Users List
Subject: SELECTIVE Indexing
Hi all
Can Some Body tell me How to Index CERTAIN PORTION OF THE HTML FILE Only
ex:-
<table .....>
....
</table>
with regards
Karthik
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org