Posted to java-user@lucene.apache.org by "Viparthi, Kiran (AFIS)" <Ki...@fao.org> on 2004/05/17 11:56:40 UTC

RE: SELECTIVE Indexing

Try using Tidy.
It creates a DOM Document from the HTML, which lets you apply XPath to select the part you want.
Hope this helps.

Kiran.
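
A minimal sketch of the XPath step suggested above, under some loud assumptions: the HTML has already been cleaned into well-formed XHTML (for example by Tidy; that step is not shown here), the input carries no DOCTYPE or namespace declaration, and only the JDK's own XML and XPath classes are used. The class name TableTextExtractor and the method extractTableText are hypothetical, made up for this illustration. The returned string is the text you would then hand to Lucene for indexing.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: pulls only the text found inside <table> elements.
final class TableTextExtractor {

    // Assumes xhtml is well-formed (already tidied), without DOCTYPE or xmlns.
    static String extractTableText(String xhtml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document dom = builder.parse(new InputSource(new StringReader(xhtml)));

            // Select every text node that sits anywhere below a <table>.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList textNodes = (NodeList) xpath.evaluate(
                "//table//text()", dom, XPathConstants.NODESET);

            // Join the pieces with spaces so adjacent cells do not run together.
            StringBuilder text = new StringBuilder();
            for (int i = 0; i < textNodes.getLength(); i++) {
                text.append(textNodes.item(i).getNodeValue()).append(' ');
            }
            return text.toString().replaceAll("\\s+", " ").trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XHTML", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body><p>skip this</p>"
            + "<table><tr><td>index</td><td>this</td></tr></table>"
            + "</body></html>";
        // Only the table text is returned; this is what would go to Lucene.
        System.out.println(extractTableText(xhtml)); // prints: index this
    }
}
```

Changing the XPath expression (for instance to pick one table by an attribute) changes which portion of the page ends up in the index; nothing in this step depends on Lucene itself.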

-----Original Message-----
From: Karthik N S [mailto:karthik@controlnet.co.in] 
Sent: 17 May 2004 11:59
To: Lucene Users List
Subject: SELECTIVE Indexing



Hi all

   Can somebody tell me how to index only a certain portion of an HTML file?

   e.g.:
        <table .....>
               ....

        </table>


with regards
Karthik




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org



RE: SELECTIVE Indexing

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Lucene has no plug-in architecture, and does not assume you are
indexing web pages, so your use of JTidy is all up to you, and
independent of Lucene.  Just feed Lucene the resulting text that you
want to index and search.

Otis

--- Karthik N S <ka...@controlnet.co.in> wrote:
> Hi
> 
> Can I use Tidy as a plug-in with Lucene?
> 
> 
> with regards
> Karthik
> 




RE: SELECTIVE Indexing

Posted by Karthik N S <ka...@controlnet.co.in>.
Hi

Can I use Tidy as a plug-in with Lucene?


with regards
Karthik


