Posted to java-user@lucene.apache.org by Samuel Alfonso Velázquez Díaz <sa...@yahoo.com> on 2003/03/04 18:33:18 UTC

Regarding Setup Lucine for my site

The documentation says:

Once you've gotten this far you're probably itching to go. Let's start by creating the index you'll need for the web examples. Since you've already set your classpath in the previous examples, all you need to do is type "java org.apache.lucene.demo.IndexHTML -create -index {index-dir} ..". You'll need to do this from a (any) subdirectory of your {tomcat}/webapps directory (make sure you didn't leave off the ".." or you'll get a null pointer exception). {index-dir} should be a directory that Tomcat has permission to read and write, but is outside of a web accessible context. By default the webapp is configured to look in /opt/lucene/index for this index. 

A copy of my site is in:

C:\CopiaSite20030228\

My web application runs on

http://mydomain.com/search/index.jsp

How can I make the Lucene index map the URLs of the indexed files to:

http://mydomain.com/

 

Please help!


Samuel Alfonso Velázquez Díaz
http://www.geocities.com/samuelvd
samuelvd@yahoo.com



Re: Regarding Setup Lucine for my site

Posted by Jeff Linwood <je...@greenninja.com>.
One point to note about Lucene is that it isn't a stand-alone search engine
like Inktomi.  It lets you build a search engine into your application.
You (as a developer) are responsible for writing the code that adds your
content to the index, and for writing the code that displays the search
results to the user. The demo code is great, but it's really just a start
for your applications.

One approach would be to store the path of each file (relative to
c:\myfiles\www) as a field on the document, and then use that path to build
up a link in the search results page. You could add a server name here if
you needed to.
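
A minimal sketch of the index-time half of that idea, in plain Java with no Lucene calls. The class and method names (RelativePathField, relativeTo) are hypothetical and not part of Lucene or its demo; the assumption is that you would store the returned string as a field on each document when you add it to the index.

```java
/** Hypothetical helper: derives the site-relative path to store on each indexed document. */
final class RelativePathField {

    /** Converts backslashes to forward slashes so Windows paths and URL paths line up. */
    static String normalize(String path) {
        return path.replace('\\', '/');
    }

    /**
     * Returns the part of filePath below rootDir, for example "some_html/my_doc.html".
     * The prefix check ignores case because Windows drive letters are case-insensitive.
     */
    static String relativeTo(String rootDir, String filePath) {
        String root = normalize(rootDir);
        if (!root.endsWith("/")) {
            root += "/";
        }
        String file = normalize(filePath);
        if (!file.regionMatches(true, 0, root, 0, root.length())) {
            throw new IllegalArgumentException(filePath + " is not under " + rootDir);
        }
        return file.substring(root.length());
    }

    public static void main(String[] args) {
        // Assumed example paths; substitute your own document root.
        String relative = relativeTo("c:\\myfiles\\www",
                                     "C:\\myfiles\\www\\some_html\\my_doc.html");
        System.out.println(relative); // prints some_html/my_doc.html
    }
}
```

Storing the relative path rather than the absolute one keeps the index independent of both the local directory and the server name, so the same index can be served from a test host or the production site.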

Hope this helps,
Jeff
----- Original Message -----
From: "Pinky Iyer" <pi...@yahoo.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Tuesday, March 04, 2003 2:24 PM
Subject: Re: Regarding Setup Lucine for my site


>
> I don't understand the explanation. When I index the documents as
> described in the examples, then run the app and do a sample search, the
> results point to the directory structure, say "c:/filesToIndex/www/",
> instead of "http://localhost:8080/www/". How can this be changed to
> reflect the website domain you mentioned? Could you explain again? Say my
> docs are under the directory c:/filesToIndex/www/ and the website is, as
> you said, http://localhost:8080/. How do I proceed?
> Thanks in advance!
>
> Samuel Alfonso Velázquez Díaz <sa...@yahoo.com> wrote:
> Oh OK, I thought it was going to be something like the Egothor search
> engine (a Java-based search engine). When you create the index, you issue
> a command like:
> java org.egothor.indexer.mirror.DoTanker /tmp/my_www Project/Egothor/var/www as http://localhost:8080
> /tmp/my_www: the path to the directory where the index is to be created.
> Project/Egothor/var/www: the path to the local file system files to be
> indexed.
> as http://localhost:8080: the prefix that the index will keep in the hit
> list. This way the index will be relative to http://localhost:8080, even
> if your production site is another site.
> Thanks for your comments; anyway, now I know that I have to modify code
> to do this.
> Regards!
>
> Jeff Linwood wrote:
> Hi,
>
> I'm not a hundred percent sure I understand what you are asking, but when
> you get the results back from Lucene (the hits) it's up to you to format
> them to display on a web page - you can always do the modification there
> when you display the links to the results.
>
> Jeff


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Regarding Setup Lucine for my site

Posted by Catalin <ca...@cyber.ro>.
hi there all !
the .zip is available (by request) 
at: 
http://dev.cabanova.ro/java/lucene/

have fun !

Catalin

  ----- Original Message ----- 
  From: maurits van wijland 
  To: Lucene Users List 
  Sent: Wednesday, March 05, 2003 6:17 PM
  Subject: Re: Regarding Setup Lucine for my site


  Catalin,
  could you send me a zip file with your implementation?

  Thanks,

  maurits
  ----- Original Message -----
  From: "Catalin" <ca...@cyber.ro>
  To: "Lucene Users List" <lu...@jakarta.apache.org>
  Sent: Wednesday, March 05, 2003 10:26 AM
  Subject: Re: Regarding Setup Lucine for my site


  hi there !
  we have almost the same configuration (site, index, paths, etc) as you.
  we took another approach for the search on our site:

  use a small crawler to index a list of URLs fed to it,
  build the lucene index, and have the web search app use that index.

  for crawling:
  http://cvs.cabanova.ro/viewcvs.cgi/indexer/

  for webapp:
  http://cvs.cabanova.ro/viewcvs.cgi/wsearch/

  running online:
  http://www.anet.ro/search?query=star+wars

  the code of the indexer is based on the i2a websearch application demo
  that is listed on the lucene jakarta site.

  take a look, you might find something useful !
  there is no .zip available for download,
  but if somebody requests the .zip
  we can put it online.

  have fun !

  Catalin

    ----- Original Message -----
    From: Samuel Alfonso Velázquez Díaz
    To: Lucene Users List
    Sent: Wednesday, March 05, 2003 3:16 AM
    Subject: Re: Regarding Setup Lucine for my site



    Yes, I have:
    1.- The directory with the files to index:
    C:/filesToIndex/www/

    2.- A path where the index files from the search engine will be created,
    let's say
    C:/index/
    3.- An internet domain whose name is: www.mysite.com
    4.- A web application context that runs at http://www.mysite.com/search

    Once all of the above is set up, I want to be able to use the search
    application:
    http://www.mysite.com/search/search.jsp
    I don't want the results that I get from the index (step 2) to look like
    Your file is at
    C:/filesToIndex/www/some_html/my_doc.html
    The results should be:
    Your file is at
    http://www.mysite.com/some_html/my_doc.html
    From the comments I have read (THANK YOU VERY MUCH) I conclude that there
    is no way to generate the index with a custom prefix (such as
    http://www.mysite.com/ for the documents at C:/filesToIndex/www/).
    It seems that I have to modify my web application
    (http://www.mysite.com/search/search.jsp) to include some logic that
    replaces "C:/filesToIndex/www/" with "http://www.mysite.com/".
    If you could point me to the place in the Lucene source code where I
    could include this logic, and so fix it once and for all, I would
    appreciate it a lot.
    The command I used to generate this index was:
    java org.apache.lucene.demo.IndexHTML -create -index index C:\index C:\filesToIndex\www\
    Now in the web application I have to modify
          IndexSearcher searcher;
          Query query;
          Hits hits;

          // some code after...
          hits = searcher.search(query);

          // walk through the hit list
          for (int i = 0; i < hits.length(); i++) {
              Document doc = hits.doc(i);
              String doctitle = doc.get("title");
              String url = doc.get("url");
          }

    I have to do something like:
          url = "http://www.mysite.com/" + url.substring("C:/filesToIndex/www/".length());
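
    The replacement described above could be sketched as a small standalone helper that the search page calls for each hit. This is only an illustration: the class name HitUrlRewriter and its method are hypothetical, it uses no Lucene API, and it assumes the stored "url" field holds the local file path exactly as indexed.

```java
/** Hypothetical helper for the search page: maps an indexed file path to a public URL. */
final class HitUrlRewriter {
    private final String localRoot; // e.g. "C:/filesToIndex/www/"
    private final String siteBase;  // e.g. "http://www.mysite.com/"

    HitUrlRewriter(String localRoot, String siteBase) {
        this.localRoot = withTrailingSlash(localRoot.replace('\\', '/'));
        this.siteBase = withTrailingSlash(siteBase);
    }

    private static String withTrailingSlash(String s) {
        return s.endsWith("/") ? s : s + "/";
    }

    /** Returns the public URL, or the stored value unchanged when it is not under localRoot. */
    String toPublicUrl(String storedPath) {
        String path = storedPath.replace('\\', '/');
        // Ignore case in the prefix check: Windows drive letters are case-insensitive.
        if (path.regionMatches(true, 0, localRoot, 0, localRoot.length())) {
            return siteBase + path.substring(localRoot.length());
        }
        return storedPath;
    }

    public static void main(String[] args) {
        HitUrlRewriter rewriter =
            new HitUrlRewriter("C:/filesToIndex/www/", "http://www.mysite.com/");
        System.out.println(rewriter.toPublicUrl("C:/filesToIndex/www/some_html/my_doc.html"));
        // prints http://www.mysite.com/some_html/my_doc.html
    }
}
```

    Inside the hit loop this would be used as url = rewriter.toPublicUrl(doc.get("url")). Checking the prefix first avoids the exception that a bare substring call throws when a stored path is shorter than the prefix. The sketch does not URL-encode characters such as spaces in file names.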

    Regards!!!
    And thanks again



Re: i2a websearch application demo ???

Posted by Samuel Alfonso Velázquez Díaz <sa...@yahoo.com>.
Yeah sorry!
Otis Gospodnetic <ot...@yahoo.com> wrote:
For all i2a questions please contact its author.
i2a websearch application just _uses_ Lucene, it is not a part of
Lucene.

Otis



Samuel Alfonso Velázquez Díaz
http://www.geocities.com/samuelvd
samuelvd@yahoo.com



Re: i2a websearch application demo ???

Posted by Otis Gospodnetic <ot...@yahoo.com>.
For all i2a questions please contact its author.
i2a websearch application just _uses_ Lucene, it is not a part of
Lucene.

Otis

--- Pinky Iyer <pi...@yahoo.com> wrote:
> 
> I am trying to set up the i2a websearch app. When I go to the admin
> section and choose the index with detail or any of the other options, I
> don't see any indexes being created under the main directory. Am I doing
> anything wrong?
> I did change websearch.xml to point to the appropriate site
> (http://localhost:8080/index.jsp), but still no index is being created.
> Any help?
> P Iyer
>
> Pinky Iyer <pi...@yahoo.com> wrote:
> A license for the application has not been determined yet as of now.
> It will most likely be BSD, ASL or GPL. Until then, there is a
> disclaimer.
>
> Thanks!
>
> Samuel Alfonso Velázquez Díaz wrote:
> Wow, the features of i2a Web Search are just what I need!
> I have just added it to my servlet engine, but so far I have only read
> the readme and could not find whether this application is GPL or LGPL.
> Is it?
>
> Pinky Iyer wrote:
> Thanks!
>
> Catalin wrote:
> http://jakarta.apache.org/lucene/docs/powered.html
> the 6th in the list is i2a Web Search
>
> Catalin
>
> ----- Original Message -----
> From: Pinky Iyer
> To: Lucene Users List
> Sent: Wednesday, March 05, 2003 6:26 PM
> Subject: i2a websearch application demo ???
>
> Could anybody tell me where on the Jakarta site this "i2a websearch
> application demo" is? Is it the demo under "getting started" under
> "lucene"? If so, I don't see it using any crawler.
> It would be nice if the Jakarta site itself had a search incorporated
> into the site.
> Thanks!
> P Iyer


Re: i2a websearch application demo ???

Posted by Pinky Iyer <pi...@yahoo.com>.
I am trying to set up the i2a websearch app. When I go to the admin section and choose the index with detail or any of the other options, I don't see any indexes being created under the main directory. Am I doing anything wrong?
I did change websearch.xml to point to the appropriate site (http://localhost:8080/index.jsp), but still no index is being created.
Any help?
P Iyer

Re: i2a websearch application demo ???

Posted by Pinky Iyer <pi...@yahoo.com>.
A license for the application has not been determined yet. It will most likely be BSD, ASL or GPL. Until then, there is a disclaimer.

Thanks!

Samuel Alfonso Velázquez Díaz <sa...@yahoo.com> wrote:
Wow, the features of i2a Web Search are just what I need!
I have just added it to my servlet engine and read the readme, but I could not find whether this application is GPL or LGPL. Is it?

Re: i2a websearch application demo ???

Posted by Samuel Alfonso Velázquez Díaz <sa...@yahoo.com>.
Wow, the features of i2a Web Search are just what I need!
I have just added it to my servlet engine and read the readme, but I could not find whether this application is GPL or LGPL. Is it?
Pinky Iyer <pi...@yahoo.com> wrote:
Thanks!
Catalin wrote:
http://jakarta.apache.org/lucene/docs/powered.html
the 6th in the list is i2a Web Search

Catalin


Re: i2a websearch application demo ???

Posted by Pinky Iyer <pi...@yahoo.com>.
Thanks!
Catalin <ca...@cyber.ro> wrote:
http://jakarta.apache.org/lucene/docs/powered.html
the 6th in the list is i2a Web Search

Catalin


Re: i2a websearch application demo ???

Posted by Catalin <ca...@cyber.ro>.
http://jakarta.apache.org/lucene/docs/powered.html
the 6th in the list is i2a Web Search

Catalin

----- Original Message -----
From: Pinky Iyer
To: Lucene Users List
Sent: Wednesday, March 05, 2003 6:26 PM
Subject: i2a websearch application demo ???



Could anybody tell me where on the Jakarta site this "i2a websearch
application demo" is? Is it the demo under "getting started" under "lucene"?
If so, I don't see it using any crawler.
It would be nice if the Jakarta site itself had a search incorporated into
the site.
Thanks!
P Iyer
maurits van wijland <m....@quicknet.nl> wrote:
Catalin,
could you send me a zip file with your implementation?

Thanks,

maurits
----- Original Message -----
From: "Catalin"
To: "Lucene Users List"
Sent: Wednesday, March 05, 2003 10:26 AM
Subject: Re: Regarding Setup Lucine for my site


Hi there!
We have almost the same configuration (site, index, paths, etc.) as you,
but we took another approach for the search on our site.

E.g.: use a small crawler to fetch and index a list of seed URLs,
build the Lucene index, and make the web search app use that index.

for crawling:
http://cvs.cabanova.ro/viewcvs.cgi/indexer/

for webapp:
http://cvs.cabanova.ro/viewcvs.cgi/wsearch/

running online:
http://www.anet.ro/search?query=star+wars

The code of the indexer is based on the i2a websearch application demo
that is listed on the Lucene Jakarta site.

Take a look, you might find something useful!
There is no .zip available for download,
but if somebody requests the .zip
we can put it online.

Have fun!

Catalin
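
The crawl-then-index approach described above can be sketched roughly as follows. This is not Catalin's indexer or the i2a code; `TinyCrawler`, `crawl` and the `fetcher` function are hypothetical names made up for illustration, the link extraction is a crude regular expression rather than a real HTML parser, and the pages are served from an in-memory map so the sketch runs without a network. The returned map (URL to HTML) is what one would hand to an indexer, so the URL stored with each document is already the public one.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TinyCrawler {
    // Crude href extractor; stops at the closing quote or at a '#' fragment.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    /**
     * Breadth-first crawl starting at the seed URLs, staying on the seeds' hosts.
     * fetcher returns the HTML for a URL, or null if the page cannot be read.
     * Returns visited URL -> HTML in the order the pages were fetched.
     */
    static Map<String, String> crawl(List<String> seeds,
                                     Function<String, String> fetcher,
                                     int maxPages) {
        Set<String> allowedHosts = new HashSet<>();
        for (String seed : seeds) {
            allowedHosts.add(URI.create(seed).getHost());
        }
        Map<String, String> pages = new LinkedHashMap<>();
        Set<String> seen = new HashSet<>(seeds);
        Deque<String> queue = new ArrayDeque<>(seeds);
        while (!queue.isEmpty() && pages.size() < maxPages) {
            String url = queue.removeFirst();
            String html = fetcher.apply(url);
            if (html == null) {
                continue; // unreachable page, nothing to index
            }
            pages.put(url, html);
            Matcher m = HREF.matcher(html);
            while (m.find()) {
                URI link;
                try {
                    // Resolve relative links against the page they were found on.
                    link = URI.create(url).resolve(m.group(1).trim());
                } catch (IllegalArgumentException e) {
                    continue; // skip malformed links
                }
                String host = link.getHost();
                String next = link.toString();
                if (host != null && allowedHosts.contains(host) && seen.add(next)) {
                    queue.addLast(next);
                }
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a web site (assumption: a real fetcher would do HTTP).
        Map<String, String> site = new HashMap<>();
        site.put("http://site.test/index.html", "<a href=\"a.html\">A</a>");
        site.put("http://site.test/a.html", "<a href=\"/index.html\">home</a>");
        Map<String, String> pages =
            crawl(Arrays.asList("http://site.test/index.html"), site::get, 100);
        pages.keySet().forEach(System.out::println);
    }
}
```

A crawler like this only sees pages that are reachable through plain HTML links from the seeds.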

----- Original Message -----
From: Samuel Alfonso Velázquez Díaz
To: Lucene Users List
Sent: Wednesday, March 05, 2003 3:16 AM
Subject: Re: Regarding Setup Lucine for my site



Yes, I have:
1.- The directory with the files to index:
C:/filesToIndex/www/

2.- A path where the index files from the search engine will be created,
let's say
C:/index/
3.- An internet domain whose name is: www.mysite.com
4.- A web application context that runs at http://www.mysite.com/search

Once all of the above is set up, I want to be able to use the search
application:
http://www.mysite.com/search/search.jsp
And I don't want the results that I get from the index (step 2) to look
like
Your file is at
C:/filesToIndex/www/some_html/my_doc.html
The results should be:
Your file is at
http://www.mysite.com/some_html/my_doc.html
From the comments I have read (THANK YOU VERY MUCH) I conclude that there
is no way to generate the index with some custom prefix (such as
http://www.mysite.com/ for the documents at C:/filesToIndex/www/).
It seems that I have to modify my web application
(http://www.mysite.com/search/search.jsp) to include some logic to replace
"C:/filesToIndex/www/" with "http://www.mysite.com/".
If you could point me to the place in the Lucene source code where I could
include this logic and fix it once and for all, I would appreciate it a lot.
The command I used to generate this index was:
java org.apache.lucene.demo.IndexHTML -create -index index C:\index
C:\filesToIndex\ www\
Now in the web application I have to modify
IndexSearcher searcher;
Query query;
Hits hits;

// some code after...
hits = searcher.search(query);

// walk through the hit list
for (int i = 0; i < hits.length(); i++) {
    Document doc = hits.doc(i);
    String doctitle = doc.get("title");
    String url = doc.get("url");
}

I have to do something like url = "http://www.mysite.com/" +
url.substring("C:/filesToIndex/www/".length());
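
A minimal sketch of that substring idea, written as a standalone helper so it can be tried without Lucene on the classpath. `HitUrlMapper` and `toPublicUrl` are hypothetical names, and the two roots are simply the example values from this message. It assumes the stored `url` value is the local file path; it normalises backslashes and compares the prefix ignoring case, but does not URL-encode spaces or other special characters.

```java
public class HitUrlMapper {
    // Example values from this thread; adjust to the real site layout.
    static final String LOCAL_ROOT = "C:/filesToIndex/www/";
    static final String SITE_ROOT = "http://www.mysite.com/";

    /**
     * Rewrites a stored file path into a public URL.
     * Returns the input unchanged if it does not start with localRoot.
     */
    static String toPublicUrl(String storedPath, String localRoot, String siteRoot) {
        // Normalise Windows separators so "C:\a\b" and "C:/a/b" compare equal.
        String path = storedPath.replace('\\', '/');
        String root = localRoot.replace('\\', '/');
        if (!root.endsWith("/")) {
            root = root + "/";
        }
        String base = siteRoot.endsWith("/") ? siteRoot : siteRoot + "/";
        // Windows paths are case-insensitive, so match the prefix ignoring case.
        if (!path.regionMatches(true, 0, root, 0, root.length())) {
            return storedPath;
        }
        return base + path.substring(root.length());
    }

    public static void main(String[] args) {
        // prints "http://www.mysite.com/some_html/my_doc.html"
        System.out.println(toPublicUrl(
            "C:/filesToIndex/www/some_html/my_doc.html", LOCAL_ROOT, SITE_ROOT));
    }
}
```

Inside the hit loop, the call would be something like `url = HitUrlMapper.toPublicUrl(doc.get("url"), LOCAL_ROOT, SITE_ROOT);`, so the index itself never has to change when the site moves to another host.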

Regards!!!
And thanks again
Pinky Iyer wrote:
I don't understand the explanation. When I index the documents as
described in the examples, and then run the app and do a sample
search, the results point to the directory structure, say "c:/filesToIndex/www/",
instead of "http://localhost:8080/www/". So how can this be changed to
reflect the website domain, as you mentioned? Could you explain again? Say
my docs are under a directory c:/filesToIndex/www/ and the website is, as you
said, http://localhost:8080/ ; how do I proceed then?
Thanks in advance!
Samuel Alfonso Velázquez Díaz wrote:
Oh ok, I thought it was going to be something like the egothor search
engine (a Java-based search engine). When you create the index, you issue a
command like:
java org.egothor.indexer.mirror.DoTanker /tmp/my_www
Project/Egothor/var/www as http://localhost:8080
/tmp/my_www: is the path to the directory where the index is to be
created
Project/Egothor/var/www: is the path to the local file system files to be
indexed.
and "as http://localhost:8080" is the prefix that the index will keep in the
hit list. This way the index will be relative to http://localhost:8080, even
if your production site is a different site.
Thanks for your comments; anyway, now I know that I have to modify code to
do this.
Regards!
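
For comparison, an "index as <prefix>" behaviour like the one described above could be approximated at indexing time by computing each file's URL before it is stored. The sketch below is neither Egothor's nor the Lucene demo's code; `PrefixedUrlWalker` and `urlsFor` are hypothetical names, and the sketch only produces the file-to-URL mapping, leaving the actual indexing call out.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

public class PrefixedUrlWalker {
    /** Maps every regular file under docRoot to urlPrefix + its path relative to docRoot. */
    static Map<Path, String> urlsFor(Path docRoot, String urlPrefix) throws IOException {
        String prefix = urlPrefix.endsWith("/") ? urlPrefix : urlPrefix + "/";
        Map<Path, String> urls = new TreeMap<>();
        try (Stream<Path> files = Files.walk(docRoot)) {
            files.filter(Files::isRegularFile).forEach(file -> {
                // relativize() drops the local root; the platform separator
                // ('\' on Windows) is turned into '/' for the URL.
                String relative = docRoot.relativize(file).toString()
                                         .replace(File.separatorChar, '/');
                urls.put(file, prefix + relative);
            });
        }
        return urls;
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults for illustration: current directory, local Tomcat.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        String prefix = args.length > 1 ? args[1] : "http://localhost:8080";
        urlsFor(root, prefix).forEach((file, url) -> System.out.println(file + " -> " + url));
    }
}
```

The trade-off against rewriting at display time is that the prefix gets baked into the index, so moving the site to another host means re-indexing.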
Jeff Linwood wrote:
Hi,

I'm not a hundred percent sure I understand what you are asking, but when
you get the results back from Lucene (the hits) it's up to you to format
them to display on a web page - you can always do the modification there
when you display the links to the results.

Jeff


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org



Re: i2a websearch application demo ???

Posted by Samuel Alfonso Velázquez Díaz <sa...@yahoo.com>.
I downloaded and installed the i2a websearch application. It looks fine, but I have a problem: my site contains a lot of Macromedia Flash objects, and many of my site's links are inside these Flash objects. Clearly these links won't be crawled easily. Is there a way to create an index for i2a websearch, or to adapt the code, so that it parses a directory structure instead?
On the other hand, I have some PDF files and they do not seem to get indexed. I looked at the servlet container log and found:
java.util.zip.ZipException: unknown compression method
        at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:140)
        at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:105)
        at com.i2a.websearch.PDFHandler.parseDataStream(PDFHandler.java:467)
        at com.i2a.websearch.PDFHandler.parseContent(PDFHandler.java:339)
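
The trace shows the exception coming out of java.util.zip.InflaterInputStream, which throws ZipException when the bytes it is given are not valid zlib/deflate data. One possible reading (an assumption, not something the log proves) is that some streams in these PDFs are encoded with a different filter or are encrypted. Whatever the cause, an indexer can skip such a stream instead of failing on the whole file. The sketch below is not i2a code; SafeInflate, inflateOrNull and deflate are hypothetical names used only to show the pattern.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical helper: inflate a stream if possible, otherwise report failure quietly. */
class SafeInflate {

    /** Returns the inflated bytes, or null when raw is not a readable zlib stream. */
    static byte[] inflateOrNull(byte[] raw) {
        try (InflaterInputStream in = new InflaterInputStream(new ByteArrayInputStream(raw));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            // ZipException (bad data) and EOFException (truncated data) both end up here.
            return null;
        }
    }

    /** Compresses data with zlib, only so the example has valid input to work with. */
    static byte[] deflate(byte[] data) {
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(packed)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return packed.toByteArray();
    }

    public static void main(String[] args) {
        byte[] good = inflateOrNull(deflate("hello pdf".getBytes(StandardCharsets.UTF_8)));
        System.out.println(good == null ? "skipped" : new String(good, StandardCharsets.UTF_8));

        byte[] bad = inflateOrNull("not zlib data".getBytes(StandardCharsets.UTF_8));
        System.out.println(bad == null ? "skipped" : "inflated");
    }
}
```

A caller would index the text from streams that inflate cleanly and log the ones that return null, so that one unreadable stream does not keep the rest of the document out of the index.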


Samuel Alfonso Velázquez Díaz
http://www.geocities.com/samuelvd
samuelvd@yahoo.com



i2a websearch application demo ???

Posted by Pinky Iyer <pi...@yahoo.com>.
Could anybody tell me where on the Jakarta site this "i2a websearch application demo" is? Is it the demo under "Getting Started" under "Lucene"? If so, I don't see it using any crawler.
It would be nice if the Jakarta site itself had a search incorporated into the site.
Thanks!
P Iyer

Re: Regarding Setup Lucine for my site

Posted by maurits van wijland <m....@quicknet.nl>.
Catalin,
could you send me a zip file with your implementation?

Thanks,

maurits

Re: Regarding Setup Lucine for my site

Posted by Pinky Iyer <pi...@yahoo.com>.
Thanks for the info; I would also be interested to see the zip of the code, especially for the indexer. This discussion has been a wonderful source of info, especially for us starters. Thanks to one and all. I guess once in a while such a discussion helps us, too, to get up to the level the discussion is usually at!
I would appreciate it if anybody could point me to the documentation mentioned earlier that sheds light on the complete understanding of Lucene.
Thanks again!

Re: Regarding Setup Lucine for my site

Posted by Samuel Alfonso Velázquez Díaz <sa...@yahoo.com>.
Hi, I'd like to take a look at the webapp war file or zip tarball for wsearch and for the indexer crawler.
 Catalin <ca...@cyber.ro> wrote:......
for crawling:
http://cvs.cabanova.ro/viewcvs.cgi/indexer/

for webapp:
http://cvs.cabanova.ro/viewcvs.cgi/wsearch/

running online:
http://www.anet.ro/search?query=star+wars


Samuel Alfonso Velázquez Díaz
http://www.geocities.com/samuelvd
samuelvd@yahoo.com



Re: Regarding Setup Lucine for my site

Posted by Catalin <ca...@cyber.ro>.
hi there !
we have almost the same configuration (site, index, paths, etc.) as you.
we used another approach for the search on our site.

i.e.: use a small crawler to index some urls fed to it,
make the lucene index, make the web search app use that index.

for crawling:
http://cvs.cabanova.ro/viewcvs.cgi/indexer/

for webapp:
http://cvs.cabanova.ro/viewcvs.cgi/wsearch/

running online:
http://www.anet.ro/search?query=star+wars

the code of the indexer is based on the i2a websearch application demo
that is listed on the lucene jakarta site.

take a look, maybe you'll find something useful !
there is no .zip available for download.
but if somebody requests the .zip
we can put it online.

have fun !

Catalin

  ----- Original Message ----- 
  From: Samuel Alfonso Velázquez Díaz 
  To: Lucene Users List 
  Sent: Wednesday, March 05, 2003 3:16 AM
  Subject: Re: Regarding Setup Lucine for my site



  Yes, I have:
  1.- The directory with the files to index:
  C:/filesToIndex/www/

  2.- A path where the index files from the search engine will be created, let's say
  C:/index/
  3.- An internet domain whose name is www.mysite.com
  4.- A web application context that runs at http://www.mysite.com/search

  Once I have set up all of the above, I want to be able to use the search application:
  http://www.mysite.com/search/search.jsp
  And I don't want the results that I get from the index (step 2) to look like
  Your file is at
  C:/filesToIndex/www/some_html/my_doc.html
  The results should be:
  Your file is at
  http://www.mysite.com/some_html/my_doc.html
  From the comments I have read (THANK YOU VERY MUCH) I conclude that there is no way to generate the index with some custom prefix (such as http://www.mysite.com/ for the documents at C:/filesToIndex/www/).
  It seems that I have to modify my web application (http://www.mysite.com/search/search.jsp) to include some logic to replace "C:/filesToIndex/www/" with "http://www.mysite.com/".
  If you could point me to the place in the Lucene source code where I could include this logic, and this way fix it once and for all, I would appreciate it a lot.
  The command I used to generate this index was:
  java org.apache.lucene.demo.IndexHTML -create -index index C:\index C:\filesToIndex\ www\
  Now in the web application I have to modify
        IndexSearcher searcher;
        Query query;
        Hits hits;

        // some code after...
        hits = searcher.search(query);

        // walk through the hit list
        for (int i = 0; i < hits.length(); i++) {
            Document doc = hits.doc(i);
            String doctitle = doc.get("title");
            String url = doc.get("url");
        }

  I have to do something like
        url = "http://www.mysite.com/" + url.substring("C:/filesToIndex/www/".length());
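
The substring idea above works as long as every stored path starts with exactly that prefix. A slightly more defensive version of the same display-time rewrite is sketched below; it tolerates backslashes and a differently cased drive letter, and it leaves a value alone when it is not under the expected root. HitUrlMapper and toUrl are hypothetical names, and the sketch assumes the "url" field holds a plain file path like the ones shown in this thread.

```java
/** Hypothetical helper: rewrites stored file paths into site URLs when results are displayed. */
class HitUrlMapper {
    private final String fileRoot;  // normalized to forward slashes, with a trailing slash
    private final String urlPrefix; // with a trailing slash

    HitUrlMapper(String fileRoot, String urlPrefix) {
        this.fileRoot = withTrailingSlash(fileRoot.replace('\\', '/'));
        this.urlPrefix = withTrailingSlash(urlPrefix);
    }

    private static String withTrailingSlash(String s) {
        return s.endsWith("/") ? s : s + "/";
    }

    /** Returns the URL for a stored path, or the stored value unchanged if it is not under the root. */
    String toUrl(String storedPath) {
        String normalized = storedPath.replace('\\', '/');
        // Windows paths are case-insensitive, so compare the prefix ignoring case.
        if (normalized.regionMatches(true, 0, fileRoot, 0, fileRoot.length())) {
            return urlPrefix + normalized.substring(fileRoot.length());
        }
        return storedPath;
    }

    public static void main(String[] args) {
        HitUrlMapper mapper = new HitUrlMapper("C:/filesToIndex/www/", "http://www.mysite.com/");
        System.out.println(mapper.toUrl("C:/filesToIndex/www/some_html/my_doc.html"));
    }
}
```

In the results page, the value read with doc.get("url") would be passed through toUrl before the link is written out. Keeping the two prefixes in one place also makes it easier to move them into configuration later.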

  Regards!!!
  And thanks again
   Pinky Iyer <pi...@yahoo.com> wrote:
  I don't understand the explanation. When I index the documents as mentioned in the examples, and then run the app and do a sample search, it points to the directory structure, say "c:/filesToIndex/www/", instead of "http://localhost:8080/www/". So how can this be changed to reflect the website domain, as you mentioned? Could you explain again? Say my docs are under a directory c:/filesToIndex/www/ and the website is, as you said, http://localhost:8080/; then how do I proceed?
  Thanks in advance!

Re: Regarding Setup Lucine for my site

Posted by Eric Anderson <Er...@LanRx.com>.
Samuel-

I'm basically using the software in a similar fashion to how you are. However,
one thing to remember is that the documents you're indexing need to be in
a location that is published by your web server. What I did was use the Tomcat
connectors and mount my document repository inside my Tomcat webapps
directory. That way, it indexes the paths when I run the demo IndexHTML command
from a child of webapps. Then I created JkMounts for the children of the
webapps directory.

I'm not a developer (to say the least), and it's probably a somewhat half-baked
way around the problem, but my instance works, and all indexed documents are
available via the links displayed on the results page.


Quoting Otis Gospodnetic <ot...@yahoo.com>:

> Samuel,
> 
> What is missing here is some basic understanding of what Lucene is.
> Lucene does not index web pages.
> Lucene indexes text.
> Lucene is not automatically aware of your web site nor your domain.
> Lucene is aware only of what you 'feed it' at index time.
> If you index files, which IndexDemo does, Lucene index will have only
> information about files (information such as file path).  Lucene has no
> clue that you really want to index your web site.
> Even if you could replace C:\..... with http://.... it wouldn't be a
> good solution, as directory structures and file paths do not always map
> directly to URLs.
> 
> In short, you have a bit more reading to do :)
> The information is all there, it just has to be read :(
> Good luck!
> 
> Otis
> 
> 
> 
> --- Samuel Alfonso Velázquez Díaz <sa...@yahoo.com> wrote:
> > 
> > Yes I have
> > 1.- The directory with the files to index:
> > C:/filesToIndex/www/
> >  
> > 2.- A path where the index files from the search engine will be
> > created, lets say
> > C:/index/
> > 3.- I have an internet domain whose name is: www.mysite.com
> > 4.- A web application context that runs at
> > http://www.mysite.com/search
> >  
> > Once I have set all the above things I want to be able to use the
> > search application:
> > http://www.mysite.com/search/search.jsp
> > And I don't want the results that I get from the index (step 2)
> > to look like
> > Your file is at
> > C:/filesToIndex/www/some_html/my_doc.html
> > The results should be:
> > Your file is at
> > http://www.mysite.com/some_html/my_doc.html
> > From the comments I have read (THANK YOU VERY MUCH) I conclude that
> > there is no way to generate the index with some custom prefix (such as
> > http://www.mysite.com/ for the documents at C:/filesToIndex/www/).
> > It seems that I have to modify my web application
> > (http://www.mysite.com/search/search.jsp) to include some logic to
> > replace "C:/filesToIndex/www/" with "http://www.mysite.com/".
> > If you could point me to the place in the Lucene source code where I
> > could include this logic and fix it once and for all, I would
> > appreciate it a lot.
> > The command I used to generate this index was:
> > java org.apache.lucene.demo.IndexHTML -create -index index C:\index
> > C:\filesToIndex\ www\
> > Now in the web application I have to modify
> >       IndexSearcher searcher;
> >       Query query;
> >       Hits hits;
> > 
> >       // some code after...
> >       hits = searcher.search(query);
> > 
> >       // walk through the hit list
> >       for (int i = 0; i < hits.length(); i++) {
> >           Document doc = hits.doc(i);
> >           String doctitle = doc.get("title");
> >           String url = doc.get("url");
> >       }
> > 
> > I have to do something like url = "http://www.mysite.com/" +
> > url.substring("C:/filesToIndex/www/".length());
> > 
> > Regards!!!
> > And thanks again
> >  Pinky Iyer <pi...@yahoo.com> wrote:
> > I don't understand the explanation. When I index the
> > documents as described in the examples, then run the app
> > and do a sample search, the results point to the directory structure,
> > say "c:/filesToIndex/www/", instead of "http://localhost:8080/www/". So
> > how can this be changed to reflect the website domain as you
> > mentioned? Could you explain again? Say my docs are under the directory
> > c:/filesToIndex/www/ and the website is, as you said,
> > http://localhost:8080/ ; how do I proceed?
> > Thanks in advance!
> > Samuel Alfonso Velázquez Díaz wrote:
> > Oh ok, I thought it was going to be something like the egothor
> > search engine (a Java-based search engine). When you create the
> > index, you issue a command like:
> > java org.egothor.indexer.mirror.DoTanker /tmp/my_www
> > Project/Egothor/var/www as http://localhost:8080
> > /tmp/my_www: is the path to the directory where the index is to be
> > created
> > Project/Egothor/var/www: is the path to the local file system files
> > to be indexed.
> > and "as http://localhost:8080" is the prefix that the index will keep
> > in the hit list. This way the index will be relative to
> > http://localhost:8080, even if your production site is another
> > site.
> > Thanks for your comments; anyway, now I know that I have to modify
> > code to do this.
> > Regards!
> > Jeff Linwood wrote:Hi,
> > 
> > I'm not a hundred percent sure I understand what you are asking, but
> > when
> > you get the results back from Lucene (the hits) it's up to you to
> > format
> > them to display on a web page - you can always do the modification
> > there
> > when you display the links to the results.
> > 
> > Jeff
> > ----- Original Message -----
> > From: "Samuel Alfonso Velázquez Díaz" 
> > To: "Lucene Users List" 
> > Sent: Tuesday, March 04, 2003 11:33 AM
> > Subject: Regarding Setup Lucine for my site
> > 
> > 
> > >
> > > The documentation says:
> > >
> > > Once you've gotten this far you're probably itching to go. Let's
> > start by
> > creating the index you'll need for the web examples. Since you've
> > already
> > set your classpath in the previous examples, all you need to do is
> > type
> > "java org.apache.lucene.demo.IndexHTML -create -index {index-dir}
> > ..".
> > You'll need to do this from a (any) subdirectory of your
> > {tomcat}/webapps
> > directory (make sure you didn't leave off the ".." or you'll get a
> > null
> > pointer exception). {index-dir} should be a directory that Tomcat has
> > permission to read and write, but is outside of a web accessible
> > context. By
> > default the webapp is configured to look in /opt/lucene/index for
> > this
> > index.
> > >
> > > A copy of my site is in:
> > >
> > > C:\CopiaSite20030228\
> > >
> > > My web application runs on
> > >
> > > http://mydomain.com/search/index.jsp
> > >
> > > how can I make the lucene index map the URLs of the indexed files
> > to:
> > >
> > > http://mydomain.com/
> > >
> > >
> > >
> > > Please help!
> > >
> > >
> > > Samuel Alfonso Velázquez Díaz
> > > http://www.geocities.com/samuelvd
> > > samuelvd@yahoo.com
> > >
> > >
> 
> 
> 

LanRx Network Solutions, Inc.
Providing Enterprise Level Solutions...On A Small Business Budget



Re: Regarding Setup Lucine for my site

Posted by Leo Galambos <ga...@com-os2.ms.mff.cuni.cz>.
On Tue, 4 Mar 2003, Otis Gospodnetic wrote:

> Even if you could replace C:\..... with http://.... it wouldn't be a
> good solution, as directory structures and file paths do not always map
> directly to URLs.

Yes, but that is not the case for Samuel's configuration, nor for 99.99% of 
others.

The fact is that Lucene is only a library, plus sandbox utilities of
varying quality. :-)
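
For the simple case where the indexed directory does map one-to-one onto
the site root, the prefix swap Samuel describes can be sketched in plain
Java roughly as follows. This is only an illustration under stated
assumptions: UrlMapper and its methods are hypothetical names, not part of
Lucene or the demo webapp, and the stored field is assumed to hold an
absolute file path under C:/filesToIndex/www/. The sketch does not escape
spaces or other characters that are not valid in a URL.

```java
/** Hypothetical helper: turns an indexed file path into a public URL. */
final class UrlMapper {
    private final String docRoot;   // forward slashes, with trailing slash
    private final String urlPrefix; // with trailing slash

    UrlMapper(String docRoot, String urlPrefix) {
        this.docRoot = withTrailingSlash(docRoot.replace('\\', '/'));
        this.urlPrefix = withTrailingSlash(urlPrefix);
    }

    private static String withTrailingSlash(String s) {
        return s.endsWith("/") ? s : s + "/";
    }

    /** Returns the URL for a stored path, or the path unchanged if it lies outside docRoot. */
    String toUrl(String storedPath) {
        String path = storedPath.replace('\\', '/');
        // Windows paths are case-insensitive, so compare the prefix ignoring case.
        if (path.regionMatches(true, 0, docRoot, 0, docRoot.length())) {
            return urlPrefix + path.substring(docRoot.length());
        }
        return storedPath;
    }

    public static void main(String[] args) {
        UrlMapper mapper = new UrlMapper("C:/filesToIndex/www/", "http://www.mysite.com/");
        // prints http://www.mysite.com/some_html/my_doc.html
        System.out.println(mapper.toUrl("C:\\filesToIndex\\www\\some_html\\my_doc.html"));
    }
}
```

In a results page, the value read from the stored path field would be
passed through toUrl() just before the link is written out, so the index
itself never needs to know the site's domain.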

-g-





Re: Potential Lucene drawbacks

Posted by Andrzej Bialecki <ab...@getopt.org>.
Otis Gospodnetic wrote:
>>>>Egothor can do that, so why not Lucene?
>>>
>>>Yes, Lucene can do more than I think it can, why not.
>>>Maybe this is being done already...with Lucene... ;)
>>
>>...and that is why I would like to see the object model (UML+notes).
>>In
>>the model we can find the answer if Lucene can do more than we think
>>:).
> 
> 
> I believe there are tools out there that will analyze Java sources and
> create UML class diagrams from that.  I believe TogetherJ or one of
> those 'all in one' tools can do that.

I can do it for you, if you want - it takes ~10 minutes.


-- 
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)





Re: Potential Lucene drawbacks

Posted by Eric Jain <Er...@isb-sib.ch>.
> It is not a good way, because such diagrams contain a lot of
> dependencies which are not in the ``original'' diagrams. Moreover the
> tool cannot recognize what objects are important and what objects
> would be excluded from the diagrams.

+1

I only use hand-built UML [1], and only for illustrating certain key
aspects or behaviors of the system I am documenting. In my opinion,
automatic generated diagrams offer no advantage whatsoever over Javadoc.
Or perhaps I just haven't found the proper tool yet :-)

[1] http://www.navision.com/hq/view.asp?categoryid=368&documentid=428,4


--
Eric Jain




Re: Potential Lucene drawbacks

Posted by Leo Galambos <ga...@com-os2.ms.mff.cuni.cz>.
On Fri, 7 Mar 2003, Andrzej Bialecki wrote:

> In my experience, for creating class diagrams tools like TogetherJ do 
> acceptable job when used to automatically reverse-engineer existing 
> source code. But in the case of sequence diagrams they are just 
> pathetic... You'll have a chance to see two of them in the package I 
> sent to Otis. :-)

I did not see your diagrams yet (I missed the URL IMHO), but I think that
collaboration, activity and sequence diagrams would be better. Can they be
produced by the tool you use?

Thank you.

-g-




Re: Potential Lucene drawbacks

Posted by Andrzej Bialecki <ab...@getopt.org>.
Leo Galambos wrote:
>>I believe there are tools out there that will analyze Java sources and
>>create UML class diagrams from that.  I believe TogetherJ or one of
>>those 'all in one' tools can do that.
> 
> 
> It is not a good way, because such diagrams contain a lot of dependencies
> which are not in the ``original'' diagrams. Moreover the tool cannot
> recognize what objects are important and what objects would be excluded
> from the diagrams.

I fully agree that nothing can beat a handcrafted UML diagram, made by 
someone who knows which details are irrelevant for the key concepts. 
That's what the models are for - to present a complex reality in an 
abstract, simplified way, by disregarding things that are not important 
for the purpose of explaining the concept...

However, automated tools still can go a long way, especially if you 
don't have time or expertise to create diagrams on your own, with 
unknown code base...

In my experience, for creating class diagrams tools like TogetherJ do 
acceptable job when used to automatically reverse-engineer existing 
source code. But in the case of sequence diagrams they are just 
pathetic... You'll have a chance to see two of them in the package I 
sent to Otis. :-)

-- 
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)





Re: Potential Lucene drawbacks

Posted by Leo Galambos <ga...@com-os2.ms.mff.cuni.cz>.
> I believe there are tools out there that will analyze Java sources and
> create UML class diagrams from that.  I believe TogetherJ or one of
> those 'all in one' tools can do that.

It is not a good way, because such diagrams contain a lot of dependencies
which are not in the ``original'' diagrams. Moreover the tool cannot
recognize what objects are important and what objects would be excluded
from the diagrams.

-g-





Re: Potential Lucene drawbacks

Posted by Otis Gospodnetic <ot...@yahoo.com>.
> > > Egothor can do that, so why not Lucene?
> > Yes, Lucene can do more than I think it can, why not.
> > Maybe this is being done already...with Lucene... ;)
> 
> ...and that is why I would like to see the object model (UML+notes).
> In
> the model we can find the answer if Lucene can do more than we think
> :).

I believe there are tools out there that will analyze Java sources and
create UML class diagrams from that.  I believe TogetherJ or one of
those 'all in one' tools can do that.

> The point, where I am lost, is Searchable (and subclasses). Have you
> not already written a paper about it?

Moi?  No.

Otis




Re: Potential Lucene drawbacks

Posted by Leo Galambos <ga...@com-os2.ms.mff.cuni.cz>.
> > That class cannot be used in Merger. RemoteSearchable is a class that
> > allows you to pass a query to another node, nothing less and nothing
> > more
> > AFAIK.
> 
> What is Merger?  Verb, noun, an IR concept, a name of the product or
> project?  Merging of results from multiple searchers from multiple
> indices?

Ooops. SegmentMerger, the central class in org.apache.lucene.index.

> That is the difference between a simple library and a targeted
> application.

Right. On the other hand, when you want to use the library for such an
application, it must allow you to do these things.

> > Moreover, I think that Lucene can do much more than you think Otis
> > :). 
> > Egothor can do that, so why not Lucene?
> Yes, Lucene can do more than I think it can, why not.
> Maybe this is being done already...with Lucene... ;)

...and that is why I would like to see the object model (UML+notes). In
the model we can find the answer if Lucene can do more than we think :).
The point, where I am lost, is Searchable (and subclasses). Have you not
already written a paper about it?

-g-





Re: Potential Lucene drawbacks

Posted by Otis Gospodnetic <ot...@yahoo.com>.
--- Leo Galambos <ga...@com-os2.ms.mff.cuni.cz> wrote:
> > If I understand you correctly, then maybe you are not aware of
> > RemoteSearchable in Lucene.
> 
> That class cannot be used in Merger. RemoteSearchable is a class that
> allows you to pass a query to another node, nothing less and nothing
> more
> AFAIK.

What is Merger?  Verb, noun, an IR concept, a name of the product or
project?  Merging of results from multiple searchers from multiple
indices?


> > This is the point that's more clear to me now.  There is confusion
> > about what Lucene is and what it is not.  Lucene does not even try
> to
> > be what those services you mentioned are.  Their goals are
> different,
> > they are a different set of tools.  Lucene's focus is on indexing
> text
> > and searching it.  It is not a tool to query other existing search
> 
> I do not think so. It is all about the object model you use. If you
> are
> not able to solve the simplest case, how can you distribute the
> engine
> across the network? I do not mean the simple RMI gateways which
> marshall
> parameters and send them through a network pipe, I mean the true
> system that could beat google (and it is another topic...).

That is the difference between a simple library and a targeted
application.

> Moreover, I think that Lucene can do much more than you think Otis
> :). 
> Egothor can do that, so why not Lucene?

Yes, Lucene can do more than I think it can, why not.
Maybe this is being done already...with Lucene... ;)

Otis




Re: Potential Lucene drawbacks

Posted by Leo Galambos <ga...@com-os2.ms.mff.cuni.cz>.
> If I understand you correctly, then maybe you are not aware of
> RemoteSearchable in Lucene.

That class cannot be used in Merger. RemoteSearchable is a class that
allows you to pass a query to another node, nothing less and nothing more
AFAIK.

> This is the point that's more clear to me now.  There is confusion
> about what Lucene is and what it is not.  Lucene does not even try to
> be what those services you mentioned are.  Their goals are different,
> they are a different set of tools.  Lucene's focus is on indexing text
> and searching it.  It is not a tool to query other existing search

I do not think so. It is all about the object model you use. If you are
not able to solve the simplest case, how can you distribute the engine
across the network? I do not mean the simple RMI gateways which marshall
parameters and send them through a network pipe, I mean the true system
that could beat google (and it is another topic...).

Moreover, I think that Lucene can do much more than you think Otis :). 
Egothor can do that, so why not Lucene?

-g-




Re: Potential Lucene drawbacks

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hello,

I changed the Subject line for obvious reasons.

I'm in the same boat as Tatu, as I didn't understand all your
points...and I still don't :(  But some other things became more clear
from this email, so I'll comment on those.


--- Leo Galambos <ga...@com-os2.ms.mff.cuni.cz> wrote:
> > > 1. 2 threads per request may improve speed up to 50%
> > Hmm? Could you clarify? During indexing, multithreading may speed
> things
> > up (splitting docs to index in 2 or more sets, indexing separately,
> combining
> > indexing). But... isn't that a good thing? Or are you saying that
> it'd be good 
> > to have multi-threaded search functionality for single search? (in
> my 
> > experience searching is seldom the slow part)
> 
> you may improve indexing and searching. Indexing, because the merge
> operation will lock just one thread and smaller part of an index
> while
> other threads are still working;  searching, because you can
> distribute
> the query to more barrels. In both cases you save up to 50% of time
> (I
> assume mergefactor=2).

I don't follow the indexing part, but you can certainly perform
distributed searches.  They are not parallelized currently, so searches
will run one after the other, but...
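
To make the idea of a parallel fan-out concrete, here is a rough sketch of
sending one query to several indexes at once and merging the hits. It is
not Lucene code: Barrel and ScoredHit are hypothetical stand-ins invented
for this example, and it uses present-day Java (lambdas, java.util.concurrent)
rather than anything available when this thread was written. It also
assumes that scores from different indexes are comparable, which is not
guaranteed in general.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelSearch {
    /** Hypothetical stand-in for one searchable index ("barrel"). */
    interface Barrel {
        List<ScoredHit> search(String query);
    }

    static final class ScoredHit {
        final String id;
        final double score;

        ScoredHit(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    /** Sends the query to every barrel concurrently, then merges hits by descending score. */
    static List<ScoredHit> searchAll(List<Barrel> barrels, String query) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, barrels.size()));
        try {
            List<Callable<List<ScoredHit>>> tasks = new ArrayList<>();
            for (Barrel b : barrels) {
                tasks.add(() -> b.search(query));
            }
            List<ScoredHit> merged = new ArrayList<>();
            // invokeAll blocks until every barrel has answered (or failed).
            for (Future<List<ScoredHit>> f : pool.invokeAll(tasks)) {
                merged.addAll(f.get());
            }
            merged.sort(Comparator.comparingDouble((ScoredHit h) -> h.score).reversed());
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("search interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a barrel failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Barrel a = q -> Arrays.asList(new ScoredHit("a1", 0.9), new ScoredHit("a2", 0.2));
        Barrel b = q -> Arrays.asList(new ScoredHit("b1", 0.5));
        for (ScoredHit h : searchAll(Arrays.asList(a, b), "lucene")) {
            System.out.println(h.id + " " + h.score);
        }
    }
}
```

The total wait is then roughly that of the slowest barrel rather than the
sum of all of them, which is the kind of saving being argued about above.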

> > > 2. Merger is hard coded
> > 
> > In a way that is bad because... ?
> > (ie. what is the specific problem... I assume you mean index
> merging
> > functionality?)
> 
> Because you cannot process local and/or remote barrels -- all must be
> local in Lucene object model. That is the serious bug IMHO.

If I understand you correctly, then maybe you are not aware of
RemoteSearchable in Lucene.
This is from CHANGES.txt:

   9. Added class RemoteSearchable, providing support for remote
      searching via RMI.  The test class RemoteSearchableTest.java
      provides an example of how this can be used.  (cutting)


> > > 4. you cannot implement dissemination + wrappers for internet
> servers
> > > which would serve as static barrels.
> > Could you explain this bit more thoroughly (or pointers on longer 
> > explanation)?
> 
> Read more about dissemination, metasearch engines (i.e. Savvysearch),
> dDIRs (i.e. Harvest). BTW, let's go to a pub and we can talk til
> morning
> :) (it is a serious offer, because I would like to know more about
> IR).
>
> This example is about metasearch (the simplest case of dDIRs): Can
> Lucene
> allow that a barrel (index segment?) is static and a query is solved
> via
> wrapper, that sends the query ${QUERY} to
> http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=${QUERY} and
> then
> reads the HTML output as a result?

This is the point that's more clear to me now.  There is confusion
about what Lucene is and what it is not.  Lucene does not even try to
be what those services you mentioned are.  Their goals are different,
they are a different set of tools.  Lucene's focus is on indexing text
and searching it.  It is not a tool to query other existing search
engines and parse returned HTML, etc.  It is also not a tool that wants
to have a built-in web crawler, and so on.  It's small and simple on
purpose, and comparing it with SavvySearch (still exists??), Harvest,
Dogpile, etc. would be like comparing apples and oranges.

> > > 5. Document metadata cannot be stored as a programmer wants, he
> must
> > > translate the object to a set of fields
> > Yes? I'd think that possibility of doing separate fields is a good
> thing; 
> > after all, all a plain text search engine needs to provide (to be
> considered 
> > one) is indexing of plain text data, right?
> 
> I talked about metadata. When metadata object knows how to achieve
> its 
> persistence, why would one translate anything to fields and then
> back?
> Why would you touch the users metadata at all? You need flat fields
> for
> indexing, and what's around -- it is not your problem :). Lucene is
> something between CMS and CIS, you say that it's closer to CIS, but
> when
> you need metadata in fields, you are closer to CMS IMHO.

Not sure I follow.  I certainly don't think of Lucene as a CMS.  Just a
text indexing and searching library.

> > > 6. Lucene cannot implement your own dynamization
> > 
> > (sorry, I must sound real thick here).
> > Could you elaborate on this... what do you mean by dynamization?
> 
> Read more about "Dynamization of Decomposable Searching Problems".

Otis




Re: Regarding Setup Lucine for my site

Posted by Leo Galambos <ga...@com-os2.ms.mff.cuni.cz>.
> > 1. 2 threads per request may improve speed up to 50%
> Hmm? Could you clarify? During indexing, multithreading may speed things
> up (splitting docs to index in 2 or more sets, indexing separately, combining
> indexing). But... isn't that a good thing? Or are you saying that it'd be good 
> to have multi-threaded search functionality for single search? (in my 
> experience searching is seldom the slow part)

you may improve indexing and searching. Indexing, because the merge
operation will lock just one thread and smaller part of an index while
other threads are still working;  searching, because you can distribute
the query to more barrels. In both cases you save up to 50% of time (I
assume mergefactor=2).

> > 2. Merger is hard coded
> 
> In a way that is bad because... ?
> (ie. what is the specific problem... I assume you mean index merging
> functionality?)

Because you cannot process local and/or remote barrels -- all must be
local in Lucene object model. That is the serious bug IMHO.

> > 4. you cannot implement dissemination + wrappers for internet servers
> > which would serve as static barrels.
> Could you explain this bit more thoroughly (or pointers on longer 
> explanation)?

Read more about dissemination, metasearch engines (e.g. SavvySearch),
dDIRs (e.g. Harvest). BTW, let's go to a pub and we can talk till morning
:) (it is a serious offer, because I would like to know more about IR).

This example is about metasearch (the simplest case of dDIRs): Can Lucene
allow that a barrel (index segment?) is static and a query is solved via
wrapper, that sends the query ${QUERY} to
http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=${QUERY} and then
reads the HTML output as a result?
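
As a small illustration of the wrapper idea, the first step (turning a
user query into a request URL from a ${QUERY} template) might look like
the sketch below. QueryUrlTemplate is a hypothetical class made up for
this example; it is not part of Lucene or Egothor. Fetching the page and
parsing the returned HTML are deliberately left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper: fills a ${QUERY} placeholder with a URL-encoded query. */
final class QueryUrlTemplate {
    private static final String PLACEHOLDER = "${QUERY}";
    private final String template;

    QueryUrlTemplate(String template) {
        if (!template.contains(PLACEHOLDER)) {
            throw new IllegalArgumentException("template has no " + PLACEHOLDER);
        }
        this.template = template;
    }

    /** Returns the request URL for the given raw query text. */
    String expand(String query) {
        try {
            // String.replace is a literal replacement, so "${" needs no escaping.
            return template.replace(PLACEHOLDER, URLEncoder.encode(query, "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        QueryUrlTemplate t = new QueryUrlTemplate(
                "http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=${QUERY}");
        // prints http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=lucene+index
        System.out.println(t.expand("lucene index"));
    }
}
```

Encoding the query matters because characters such as "&" or spaces would
otherwise change the meaning of the request.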

> > 5. Document metadata cannot be stored as a programmer wants, he must
> > translate the object to a set of fields
> Yes? I'd think that possibility of doing separate fields is a good thing; 
> after all, all a plain text search engine needs to provide (to be considered 
> one) is indexing of plain text data, right?

I talked about metadata. When metadata object knows how to achieve its 
persistence, why would one translate anything to fields and then back?
Why would you touch the users metadata at all? You need flat fields for
indexing, and what's around -- it is not your problem :). Lucene is
something between CMS and CIS, you say that it's closer to CIS, but when
you need metadata in fields, you are closer to CMS IMHO.

> > 6. Lucene cannot implement your own dynamization
> 
> (sorry, I must sound real thick here).
> Could you elaborate on this... what do you mean by dynamization?

Read more about "Dynamization of Decomposable Searching Problems".
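
For readers who do not know the term: the title refers to a classic
technique, usually credited to Bentley and Saxe, for turning a static
search structure into one that accepts insertions. A minimal sketch of the
so-called logarithmic method is given below, assuming the simplest
possible static structure (a sorted int array with binary search).
LogarithmicIndex is a name invented for this example and has nothing to do
with Lucene's or Egothor's actual classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of the logarithmic method over static sorted arrays. */
final class LogarithmicIndex {
    // levels.get(i) is either null or a sorted array of exactly 2^i keys.
    private final List<int[]> levels = new ArrayList<>();

    void insert(int key) {
        int[] carry = { key };
        int i = 0;
        // Like binary addition: merge equal-sized blocks and carry upward.
        while (i < levels.size() && levels.get(i) != null) {
            carry = merge(levels.get(i), carry);
            levels.set(i, null);
            i++;
        }
        if (i == levels.size()) {
            levels.add(carry);
        } else {
            levels.set(i, carry);
        }
    }

    /** Membership is decomposable: ask every block and OR the answers. */
    boolean contains(int key) {
        for (int[] block : levels) {
            if (block != null && Arrays.binarySearch(block, key) >= 0) {
                return true;
            }
        }
        return false;
    }

    /** Number of non-empty blocks currently held. */
    int blockCount() {
        int n = 0;
        for (int[] block : levels) {
            if (block != null) {
                n++;
            }
        }
        return n;
    }

    private static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }

    public static void main(String[] args) {
        LogarithmicIndex index = new LogarithmicIndex();
        for (int key : new int[] { 7, 3, 9, 1, 5 }) {
            index.insert(key);
        }
        System.out.println(index.contains(9) + " " + index.contains(4));
        System.out.println(index.blockCount());
    }
}
```

Each insertion only rebuilds the small blocks, and a query touches at most
about log2(n) blocks. Merging index segments with a merge factor of 2
follows a similar pattern, though that comparison is an observation added
here and not a claim made in the thread.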

-g-




Re: Regarding Setup Lucine for my site

Posted by Samuel Alfonso Velázquez Díaz <sa...@yahoo.com>.
For all beginners (as far as I can tell), I found this URL and I thought you may want to check it out:
http://www.onjava.com/pub/a/onjava/2003/03/05/lucene.html
Regards!


Samuel Alfonso Velázquez Díaz
http://www.geocities.com/samuelvd
samuelvd@yahoo.com



Re: Regarding Setup Lucine for my site

Posted by Tatu Saloranta <ta...@hypermall.net>.
On Wednesday 05 March 2003 13:35, Leo Galambos wrote:
> > I'm all eyes and I'm a serious grown-up with good manners :)
> > Constructive suggestions for improvement are always welcome.
>

First a disclaimer: I don't mean to sound too negative. I'm genuinely curious 
about many of the issues you mention. But I'm not sure I really understand 
them. :-)

> 1. 2 threads per request may improve speed up to 50%

Hmm? Could you clarify? During indexing, multithreading may speed things
up (splitting docs to index in 2 or more sets, indexing separately, combining
indexing). But... isn't that a good thing? Or are you saying that it'd be good 
to have multi-threaded search functionality for single search? (in my 
experience searching is seldom the slow part)

> 2. Merger is hard coded

In a way that is bad because... ?
(ie. what is the specific problem... I assume you mean index merging
functionality?)

...
> 4. you cannot implement dissemination + wrappers for internet servers
> which would serve as static barrels.

Could you explain this bit more thoroughly (or pointers on longer 
explanation)?

> 5. Document metadata cannot be stored as a programmer wants, he must
> translate the object to a set of fields

Yes? I'd think that possibility of doing separate fields is a good thing; 
after all, all a plain text search engine needs to provide (to be considered 
one) is indexing of plain text data, right?
Plus, Lucene is not a Content Management System (or database), but
content indexing system. As such I'm not sure why storage should not be 
optimized to allow for fast searches (which means flattening contents, 
amongst other things).

That is not to say that things couldn't be improved; it might be a good idea 
to define small set of base interfaces / classes to make it easier to convert 
from 'objectified' textual data to straight-forward indexing.
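
One possible shape for such a small base interface is sketched below, as
an illustration only. FieldMappable, ArticleMeta and the field names are
all hypothetical; nothing like this is claimed to exist in Lucene. The
idea is simply that a metadata object knows how to flatten itself into
name/value pairs (which an indexer could then copy into document fields)
and how to rebuild itself from them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MetadataFields {
    /** Hypothetical contract: an object that can flatten itself into name/value pairs. */
    interface FieldMappable {
        Map<String, String> toFields();
    }

    /** Example metadata object; the field names are made up for this sketch. */
    static final class ArticleMeta implements FieldMappable {
        final String title;
        final String author;
        final int year;

        ArticleMeta(String title, String author, int year) {
            this.title = title;
            this.author = author;
            this.year = year;
        }

        @Override
        public Map<String, String> toFields() {
            // LinkedHashMap keeps the fields in insertion order.
            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("title", title);
            fields.put("author", author);
            fields.put("year", Integer.toString(year));
            return fields;
        }

        /** Rebuilds the object from the flat fields produced by toFields(). */
        static ArticleMeta fromFields(Map<String, String> fields) {
            return new ArticleMeta(fields.get("title"), fields.get("author"),
                    Integer.parseInt(fields.get("year")));
        }
    }

    public static void main(String[] args) {
        ArticleMeta meta = new ArticleMeta("Indexing notes", "someone", 2003);
        Map<String, String> fields = meta.toFields();
        System.out.println(fields);
        System.out.println(ArticleMeta.fromFields(fields).year);
    }
}
```

The conversion code then lives in one place per metadata type instead of
being scattered through the indexing and display code.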

FWIW I am actually using Lucene for storing documents that have extensive 
metadata associated, and I don't find restrictions too bad... but that's 
certainly matter of taste. :-)

> 6. Lucene cannot implement your own dynamization

(sorry, I must sound real thick here).
Could you elaborate on this... what do you mean by dynamization?

-+ Tatu +-





Re: Regarding Setup Lucine for my site

Posted by Leo Galambos <ga...@com-os2.ms.mff.cuni.cz>.
> > On the other hand, if you extend Lucene with your hacks, you will
> > find out
> > that the model of Lucene is unknown and many parts are hard-coded. It
> > boosts speed, but it disallows future enhancements (I could name the
> > parts, I hope we do not start flamewar here).
> 
> I'm all eyes and I'm a serious grown-up with good manners :)
> Constructive suggestions for improvement are always welcome.

1. 2 threads per request may improve speed up to 50%

2. Merger is hard coded

3. you cannot use different inverted lists in one index (i.e. pagerank and
doc_id instead of doc_id/prox_handle/freq/...), inverted lists do not
support multilevel skips (see MoZo papers about this topic)

4. you cannot implement dissemination + wrappers for internet servers 
which would serve as static barrels.

5. Document metadata cannot be stored as a programmer wants, he must
translate the object to a set of fields

6. Lucene cannot implement your own dynamization

....etc.

-g-




Re: Regarding Setup Lucine for my site

Posted by Otis Gospodnetic <ot...@yahoo.com>.
> On the other hand, if you extend Lucene with your hacks, you will
> find out
> that the model of Lucene is unknown and many parts are hard-coded. It
> boosts speed, but it disallows future enhancements (I could name the
> parts, I hope we do not start flamewar here).

I'm all eyes and I'm a serious grown-up with good manners :)
Constructive suggestions for improvement are always welcome.

Thanks,
Otis




Re: Regarding Setup Lucine for my site

Posted by Leo Galambos <ga...@com-os2.ms.mff.cuni.cz>.
> org.apache.lucene.demo.IndexHTML which was provided with the
> documentation. Is there any problem using this demo class for a web
> production site? I'm an application developer and it would be hard to
> understand the whole lucene code to use it. It would be almost impossible

You can use it, but if you need something special (snippets, coloring,
different URL mapping, handling of your local charset, etc.) you must
include code from the sandbox or write it from scratch, AFAIK.

> for my develop phase timings to try to do this. * Regarding your comment:
> Lucene does not index web pages. I thought lucene main goal was to index
> web pages ¿? and as an afterthought it should be able to index text
> files or some other information (for example mail databases). Regards

Lucene *can* index HTML pages if you use programs which build a Lucene
index from HTML documents. Such programs exist.

On the other hand, if you extend Lucene with your hacks, you will find out
that the model of Lucene is unknown and many parts are hard-coded. It
boosts speed, but it disallows future enhancements (I could name the
parts, I hope we do not start flamewar here).

> and thanks for your comments!!!!!!! I'm considering egothor search
> engine. I successfully set a web application for searching my web site
> but I didn't see a mailing list or a forum with the level of

I had a PhD exam, and many questions went through ICQ; you know, it is
faster for me than e-mail...

-g-




Re: Regarding Setup Lucine for my site

Posted by Samuel Alfonso Velázquez Díaz <sa...@yahoo.com>.
Please point me to a web link where I can read more about Lucene. I have read all the documentation that comes with the distribution (which is almost the same as the lucene.apache.org site).
About the problem you mentioned with URL-to-file mappings, what if I add a line of code like
  myurl = URLEncoder.encode(myurl);
wouldn't that fix possibly malformed URLs at the web app level?
* On the other hand, I'm using org.apache.lucene.demo.IndexHTML, which was provided with the documentation. Is there any problem with using this demo class for a production web site?
I'm an application developer and it would be hard to understand the whole Lucene code in order to use it. It would be almost impossible within my development schedule to try to do this.
* Regarding your comment "Lucene does not index web pages": I thought Lucene's main goal was to index web pages ¿? and that, as an afterthought, it should be able to index text files or other information (for example mail databases).
Regards and thanks for your comments!
I'm considering the egothor search engine. I successfully set up a web application for searching my web site, but I didn't see a mailing list or a forum with the level of participation that Lucene has. :D
Greetings to everyone!!
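
On the URLEncoder question above, a minimal sketch that uses only JDK classes (nothing here is Lucene API). java.net.URLEncoder is meant for form and query values, so applied to a whole URL it also escapes the ':' and '/' characters and the result is no longer a usable link. The multi-argument java.net.URI constructor escapes only what is illegal in the path. The class name LinkEncoding, the method names, and the sample path are illustrative assumptions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

final class LinkEncoding {

    /** Encodes the whole string as a form value; ':' and '/' are escaped too. */
    static String encodeWhole(String url) {
        try {
            return URLEncoder.encode(url, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Builds an http link from host and path; only illegal path characters are escaped. */
    static String buildLink(String host, String path) {
        try {
            return new URI("http", host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http%3A%2F%2Fwww.mysite.com%2Fsome+html%2Fmy+doc.html
        System.out.println(encodeWhole("http://www.mysite.com/some html/my doc.html"));
        // prints http://www.mysite.com/some%20html/my%20doc.html
        System.out.println(buildLink("www.mysite.com", "/some html/my doc.html"));
    }
}
```

Encoding therefore seems better applied to the path part only, after the file path has already been mapped to a site-relative path.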

Re: Regarding Setup Lucine for my site

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Samuel,

What is missing here is some basic understanding of what Lucene is.
Lucene does not index web pages.
Lucene indexes text.
Lucene is not automatically aware of your web site or your domain.
Lucene is aware only of what you 'feed it' at index time.
If you index files, which IndexDemo does, the Lucene index will have only
information about files (information such as file path).  Lucene has no
clue that you really want to index your web site.
Even if you could replace C:\..... with http://.... it wouldn't be a
good solution, as directory structures and file paths do not always map
directly to URLs.
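
To illustrate that last point, here is a rough sketch (plain JDK, not Lucene API) of a lookup table in which several directories are published under different URL prefixes and some files are not published at all. The class RootTable, its methods, and the docs.mysite.com host are hypothetical names made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class RootTable {
    // directory prefix -> URL prefix, both written with a trailing '/'
    private final Map<String, String> roots = new LinkedHashMap<>();

    RootTable map(String directory, String urlPrefix) {
        roots.put(directory, urlPrefix);
        return this;
    }

    /** Uses the longest matching directory prefix; empty if the file is not published. */
    Optional<String> urlFor(String path) {
        String best = null;
        for (String dir : roots.keySet()) {
            if (path.startsWith(dir) && (best == null || dir.length() > best.length())) {
                best = dir;
            }
        }
        if (best == null) {
            return Optional.empty();
        }
        return Optional.of(roots.get(best) + path.substring(best.length()));
    }

    public static void main(String[] args) {
        RootTable table = new RootTable()
                .map("C:/filesToIndex/www/", "http://www.mysite.com/")
                .map("C:/filesToIndex/www/manuals/", "http://docs.mysite.com/");
        // prints Optional[http://docs.mysite.com/setup.html]
        System.out.println(table.urlFor("C:/filesToIndex/www/manuals/setup.html"));
        // prints Optional.empty
        System.out.println(table.urlFor("C:/private/notes.html"));
    }
}
```

A single string replacement cannot express the second mapping or the unpublished file, which is why the mapping belongs in application code rather than in the index.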

In short, you have a bit more reading to do :)
The information is all there, it just has to be read :(
Good luck!

Otis





Re: Regarding Setup Lucine for my site

Posted by Samuel Alfonso Velázquez Díaz <sa...@yahoo.com>.
Yes, I have:
1.- The directory with the files to index:
C:/filesToIndex/www/
 
2.- A path where the search engine's index files will be created, let's say
C:/index/
3.- An internet domain whose name is: www.mysite.com
4.- A web application context that runs at http://www.mysite.com/search
 
Once all of the above is set up, I want to be able to use the search application:
http://www.mysite.com/search/search.jsp
And I don't want the results that I get from the index (step 2) to look like
Your file is at
C:/filesToIndex/www/some_html/my_doc.html
The results should be:
Your file is at
http://www.mysite.com/some_html/my_doc.html
From the comments I have read (THANK YOU VERY MUCH) I conclude that there is no way to generate the index with a custom prefix (such as http://www.mysite.com/ for the documents at C:/filesToIndex/www/).
It seems that I have to modify my web application (http://www.mysite.com/search/search.jsp) to include some logic to replace "C:/filesToIndex/www/" with "http://www.mysite.com/".
If you could point me to the place in the Lucene source code where I could add this logic, and so fix it once and for all, I would appreciate it a lot.
The command I used to generate this index was:
java org.apache.lucene.demo.IndexHTML -create -index index C:\index C:\filesToIndex\www\
Now in the web application I have to modify
      IndexSearcher searcher;
      Query query;
      Hits hits;

      // some code after...
      hits = searcher.search(query);

      for (int i = 0; i < hits.length(); i++) {   // walk through the hit list
          Document doc = hits.doc(i);
          String doctitle = doc.get("title");
          String url = doc.get("url");
      }

I have to do something like url = "http://www.mysite.com/" + url.substring("C:/filesToIndex/www/".length());

Regards!!!
And thanks again
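
The prefix replacement described above could be sketched roughly as follows. This is plain JDK code, not part of Lucene, and it assumes the stored "url" value is a file path under one known root. The class name HitUrlMapper and its method names are made up for illustration.

```java
final class HitUrlMapper {
    private final String fileRoot; // forward slashes, trailing '/'
    private final String urlBase;  // trailing '/'

    HitUrlMapper(String fileRoot, String urlBase) {
        this.fileRoot = withTrailingSlash(fileRoot.replace('\\', '/'));
        this.urlBase = withTrailingSlash(urlBase);
    }

    private static String withTrailingSlash(String s) {
        return s.endsWith("/") ? s : s + "/";
    }

    /** Returns the public URL for a stored path, or the path unchanged if it is outside the root. */
    String toUrl(String storedPath) {
        String p = storedPath.replace('\\', '/');
        // Windows paths are case-insensitive, so compare the prefix ignoring case.
        if (p.regionMatches(true, 0, fileRoot, 0, fileRoot.length())) {
            return urlBase + p.substring(fileRoot.length());
        }
        return storedPath;
    }

    public static void main(String[] args) {
        HitUrlMapper mapper =
                new HitUrlMapper("C:/filesToIndex/www/", "http://www.mysite.com/");
        // prints http://www.mysite.com/some_html/my_doc.html
        System.out.println(mapper.toUrl("C:\\filesToIndex\\www\\some_html\\my_doc.html"));
    }
}
```

In search.jsp the value read with doc.get("url") would be passed through toUrl before the link is written out.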

Re: Regarding Setup Lucine for my site

Posted by Pinky Iyer <pi...@yahoo.com>.
I don't understand the explanation. When I index the documents as described in the examples and then run the app and do a sample search, the results point to the directory structure, say "c:/filesToIndex/www/", instead of "http://localhost:8080/www/". So how can this be changed to reflect the website domain, as you mentioned? Could you explain again? Say my docs are under the directory c:/filesToIndex/www/ and the website is, as you said, http://localhost:8080/; how do I proceed then?
Thanks in advance!

Re: Regarding Setup Lucine for my site

Posted by Samuel Alfonso Velázquez Díaz <sa...@yahoo.com>.
Oh OK, I thought it was going to be something like the egothor search engine (a Java-based search engine). When you create the index, you issue a command like:
java org.egothor.indexer.mirror.DoTanker /tmp/my_www Project/Egothor/var/www as http://localhost:8080
/tmp/my_www: the path to the directory where the index is to be created
Project/Egothor/var/www: the path to the local file system files to be indexed
and "as http://localhost:8080" is the prefix that the index will keep in the hit list. This way the index will be relative to http://localhost:8080, even if your production site is a different site.
Thanks for your comments; anyway, now I know that I have to modify code to do this.
Regards!

Re: Regarding Setup Lucine for my site

Posted by Jeff Linwood <je...@greenninja.com>.
Hi,

I'm not a hundred percent sure I understand what you are asking, but when
you get the results back from Lucene (the hits) it's up to you to format
them to display on a web page - you can always do the modification there
when you display the links to the results.

Jeff
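
As a rough illustration of formatting hits for display, the sketch below turns a title and an already mapped URL into an HTML list item, escaping both values. It is plain JDK code under stated assumptions: the class HitRenderer and its methods are hypothetical, and the title and URL are assumed to have been read from the hit beforehand.

```java
final class HitRenderer {

    /** Escapes the characters that are special in HTML text and attribute values. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Formats one search hit as a list item linking to its public URL. */
    static String renderHit(String title, String url) {
        return "<li><a href=\"" + escape(url) + "\">" + escape(title) + "</a></li>";
    }

    public static void main(String[] args) {
        // prints <li><a href="http://www.mysite.com/some_html/my_doc.html">Q&amp;A &lt;draft&gt;</a></li>
        System.out.println(renderHit("Q&A <draft>", "http://www.mysite.com/some_html/my_doc.html"));
    }
}
```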