Posted to user@lenya.apache.org by Frank Ruebe <fr...@hotmail.com> on 2005/06/05 16:25:52 UTC

lucene problems in lenya 1.2.2 / Servlet/ WinXP

Hi,

I'm trying to include Lucene in my newly made pub.
As far as I understand (not very far), you first need to crawl your site and 
then index it.
I've read http://lenya.apache.org/1_2_x/components/search/lucene.html but I'm 
obviously too dumb to get it.

I've inserted crawler-live.xconf in mypub/config/search/
but when I start crawling
ant -f ../../build/lenya/webapp/lenya/bin/crawl_and_index.xml -Dcrawler.xconf=../../build/lenya/webapp/lenya/pubs/mypub/config/search/crawler-live.xconf crawl
[echo] INFO: Show configuration
[java] http://localhost:8888/mypub/live/index.html
[java] http://localhost:8888/mypub/live/
[java] lenya
[java] work/search/lucene/uris.txt
[java] ../../build/lenya/webapp/lenya/pubs/mypub/config/search/work/search/lucene/uris.txt
[java] htdocs-dump-dir/@src: work/search/lucene/htdocs_dump/live
[java] ../../build/lenya/webapp/lenya/pubs/mypub/config/search/work/search/lucene/htdocs_dump/live
[echo] INFO: Start crawling ...
[java] java.lang.StringIndexOutOfBoundsException: String index out of range: -1
[java]     at org.apache.tools.ant.taskdefs.ExecuteJava.execute(ExecuteJava.java:172)
[java]     at org.apache.tools.ant.taskdefs.Java.run(Java.java:705)
[java]     at org.apache.tools.ant.taskdefs.Java.executeJava(Java.java:177)
[java]     at org.apache.tools.ant.taskdefs.Java.execute(Java.java:83)
[java]     at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:275)
[java]     at org.apache.tools.ant.Task.perform(Task.java:364)
[java]     at org.apache.tools.ant.Target.execute(Target.java:341)
[java]     at org.apache.tools.ant.Target.performTasks(Target.java:369)
[java]     at org.apache.tools.ant.Project.executeTarget(Project.java:1214)
[java]     at org.apache.tools.ant.Project.executeTargets(Project.java:1062)
[java]     at org.apache.tools.ant.Main.runBuild(Main.java:673)
[...]

What's going wrong?

This is my crawler-live.xconf

<?xml version="1.0"?>
<crawler>
  <user-agent>lenya</user-agent>

  <base-url href="http://localhost:8888/mypub/live/index.html"/>
  <scope-url href="http://localhost:8888/mypub/live/"/>

  <uri-list src="work/search/lucene/uris.txt"/>
  <htdocs-dump-dir src="work/search/lucene/htdocs_dump/live/"/>

  <!-- <robots src="robots.txt" domain="lenya.apache.org"/> -->
</crawler>

I didn't edit search.properties (but webapp.dir=../../ is right anyway, isn't it?)

Thanks
Franz



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@lenya.apache.org
For additional commands, e-mail: user-help@lenya.apache.org


Re: lucene problems in lenya 1.2.2 / Servlet/ WinXP

Posted by so...@gmail.com.
On 6/7/05, Franz Ruebe <fr...@hotmail.com> wrote:
> Hi solprovider,
> thanks for your interest and help. It's really good for a newbie to
> experience a living community while having problems.
I am still rather new here, but glad to help.

> solprovider@gmail.com wrote:
> >http://excalibur.apache.org/
> >Good Code.  Smart Developers.
> >Maybe.  I could not find this code on the Excalibur website, so it was
> >probably discarded.  The code below was decompiled from the classes
> >shipped in Lenya1.2.2.
> >org.apache.avalon.excalibur.io.FileUtil [CODE REMOVED]
> >Twice they use lastIndexOf(), then immediately use the return value in
> >substring without checking for -1 (not found).  In C, this would cause
> >weird memory errors.  In Java, it throws an Exception.  It would be
> >caught if the programmers had a clue.  Instead, the JRE complains and
> >exits.
> This means I'm not the only one who has problems with clean programming?
You are in good company; I seem to be the lonely exception.  I have the
(good) habit of assuming no function will return what I expect, and
users will never give good input, so I check everything all the time,
even values returned from my own functions.  Comments and
input-checking double the length of my code, but it is very difficult
to break.  Much experience has taught me not to expect that level of
programming from anybody else.  You might read the articles on my site
in the programming section:
http://solprovider.com/articles/&cat=Software+Development
Some give good advice; some are about improving existing UIs; some are
just rants.  Enjoy.

> >This code is expecting slashes.  I did not see any backslashes in your
> >config to confuse it.  Does your ant directory have 2 levels to go up?
> This seems to be an answer to one of my questions. But which one?
This was about crawling crashing.

> >Franz,  Thanks for providing evidence that making Lenya easy-to-use for
> >non-gurus will help its popularity.
> You're welcome. But being serious: You knew that before, didn't you?
I know it, but I was responding to a post on the other thread:
"Lenya's target audience should know what a gz archive and a MD5 sum is."

solprovider



Re: lucene problems in lenya 1.2.2 / Servlet/ WinXP

Posted by so...@gmail.com.
I thought BSE was "Being So E___", but I could not think of a word
starting with E that fit the context.

Bad School English?  You write better than most of what I read here,
on Slashdot, or even from ZDNet's "professionals".  Many cannot spell
or capitalize, and I have wasted much time figuring out what people
were trying to say.  I have no issues with your writing.

---
I looked at:
http://lenya.apache.org/1_2_x/components/search/lucene.html

It is inconsistent.  It is likely nobody sees the end of the ant
commands because they extend past the screen.  (It needs to be
formatted better.)

crawler-live.xconf vs. crawler.xconf: 
Use either filename, but be consistent with the ant command. 

I cannot think why someone would bother indexing the authoring area. 
"Oh, I just made a change, didn't publish it, and cannot remember
where it is! (But I remember enough text for a successful search.)"  I
cannot imagine that happening often enough (or ever) to waste time
setting up indexing of the authoring area.

---
(I lied about not helping.)

http://excalibur.apache.org/
Good Code.  Smart Developers.
Maybe.  I could not find this code on the Excalibur website, so it was
probably discarded.  The code below was decompiled from the classes
shipped in Lenya1.2.2.

org.apache.avalon.excalibur.io.FileUtil
    public static String catPath(String lookupPath, String path) {
        int index = lookupPath.lastIndexOf("/");
        String lookup = lookupPath.substring(0, index);
        String pth;
        for(pth = path; pth.startsWith("../"); pth = pth.substring(index)) {
            if(lookup.length() > 0) {
                index = lookup.lastIndexOf("/");
                lookup = lookup.substring(0, index);
            } else {
                return null;
            }
            index = pth.indexOf("../") + 3;
        }
        return lookup + "/" + pth;
    }

Twice they use lastIndexOf(), then immediately use the return value in
substring without checking for -1 (not found).  In C, this would cause
weird memory errors.  In Java, it throws an Exception.  It would be
caught if the programmers had a clue.  Instead, the JRE complains and
exits.

English translation:
Given lookupPath and path
Remove after last "/" in lookupPath
For every "../" at the beginning of path, remove it and after last "/"
in lookupPath
Return what remains of lookupPath and path.
So ("a/b/c/d", "../../x") returns "a/x".
If lookupPath does not have a slash, or there are more "../" in path
than one less than the number of slashes in lookupPath, it crashes.
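
For illustration, here is a minimal sketch of how the same logic could check each lastIndexOf() result before calling substring(). It is not the Excalibur code and not an official fix; the class name SafeCatPath is hypothetical, and returning null for unresolvable input is an assumption that mirrors the one "return null" branch already present in the decompiled method.

```java
public final class SafeCatPath {
    private SafeCatPath() {}

    /**
     * Resolves path against the directory part of lookupPath.
     * Returns null (instead of throwing) when lookupPath contains no "/"
     * or when path has more leading "../" than lookupPath has directories.
     */
    public static String catPath(String lookupPath, String path) {
        int index = lookupPath.lastIndexOf('/');
        if (index < 0) {
            return null; // no directory part to resolve against
        }
        String lookup = lookupPath.substring(0, index);
        String pth = path;
        while (pth.startsWith("../")) {
            int slash = lookup.lastIndexOf('/');
            if (slash < 0) {
                return null; // more "../" than directories left
            }
            lookup = lookup.substring(0, slash);
            pth = pth.substring(3); // drop one leading "../"
        }
        return lookup + "/" + pth;
    }

    public static void main(String[] args) {
        System.out.println(catPath("a/b/c/d", "../../x"));  // the example from the text
        System.out.println(catPath("config.xconf", "x"));   // no slash in lookupPath
        System.out.println(catPath("a/b", "../../x"));      // climbs too far
    }
}
```

A caller would still have to handle the null result, for example by reporting which configuration path could not be resolved.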

This code is expecting slashes.  I did not see any backslashes in your
config to confuse it.  Does your ant directory have 2 levels to go up?

solprovider



Re: lucene problems in lenya 1.2.2 / Servlet/ WinXP

Posted by so...@gmail.com.
On 6/6/05, Franz Ruebe <fr...@hotmail.com> wrote:
> I set my CLASSPATH in Windows and voilà, there's a lucene.log. Setting my
> classpath was not necessary before, because I had an Ant 1.5.1 in my
> PATH variable, which I deleted after the new installation. Does this sound
> reasonable?
> >If it is in the correct directory, the errors go away and "lucene.log"
> >is created.  (Like the filename?  I invented it myself!)
> The logging errors went away, and I like the name very muchacho (mirrors your
> genius ;-)) .
> But is the location sensible? Shouldn't it be a log dir? (There are still not
> enough dirs in Lenya.)

Good.  The logging is working.  Now if it was only useful...

Since I was starting the indexing manually from the ant directory, it
made sense to have the log appear in the same directory, and that is
where it defaults.  You should be able to specify a full filepath in
the log4j.properties.

> But the log is empty (its creation date is the date of my crawl attempt) and
> I still got a bunch of Java errors. So just the old problem...

Bad.  It sounds like your Lucene has a bug in the crawler, but since
nobody else noticed it, it is more likely your configuration is
incorrect.

> So once again some questions:
> What should I do when I just want to use the 'normal way of searching' (I
> agree, now that I understand it, that your way of indexing the XML is much
> smarter, but one thing after the other...)?

Then my instructions are not what you want.  I thought about combining
results from indexing crawled HTML, and indexing the XML.  Combining
them would require many xsl:choose in the XSL for results because the
URLs do not need to be rewritten for HTML entries.  Also, crawled HTML
does not have a language field, and the description is usually rather
messy.  I decided the easiest method would be separate indexes and
search pages.  I only need the XML indexed for my project, so I
stopped working on it when that was working.

> I made a crawler-live.xconf (there is a mistake in
> http://lenya.apache.org/1_2_x/components/search/lucene.html, I
> think)
> "Note that there is a search.properties file in build/lenya/webapp/lenya/bin
> that you may have to change. crawler.xconf needs to have the following
> elements:"
> Shouldn't that be crawler-live.xconf as written in the ant command?
> Here is my crawler-live.xconf
> <?xml version="1.0"?>
> <crawler>
>   <user-agent>lenya</user-agent>
>   <base-url href="http://localhost:8888/mypub/live/index.html"/>
>   <scope-url href="http://localhost:8888/mypub/live/"/>
>   <uri-list src="work/search/lucene/uris.txt"/>
>   <htdocs-dump-dir
> src="work/search/lucene/htdocs_dump/live/lenya.apache.org"/>
> </crawler>
> "Note that there is a search.properties..."
> OK, in there is just webapp.dir=../../ ; seems to be OK...

Can't open lenya.apache.org now.  (Is it down, or is it Comcast?) 
I'll look at it later, but I have no authority on that site.

> >org.apache.avalon.excalibur.io.FileUtil.catPath(FileUtil.java:509)
> >forgot to check the bounds before using substring.  Might work if you
> >wrap the code:
> >if(lowerbound >= 0){   newString = oldString.substring(lowerbound [,
> >upperbound]);
> >}
> >but it would be better to read the code and figure out why the
> >lowerbound is -1.  Usually the -1 comes from searching with indexOf()
> >or lastIndexOf() for a substring that is not there .
> Wouldn't that mean that everybody would have the same problems using the
> crawler?
> Does anybody else use the crawler?
> I'm not (yet??) the right guy to change Java classes.

I am the right guy for changing Java classes, but I tried to avoid it
for this project.  The closest was rewriting search.xsp.

It is probably missing or incorrect configuration.  I do not write
software without reasonable defaults for missing/bad configuration, or
with errors that don't say how to fix them, but that's just me; most
programmers think more obscure errors are better.  (I do not have the
time to research it; it does not affect my current project.  Sorry.)

> > > Is there a way to get an index without crawling?
> >Yes.  That was the point of the How-To.  I wanted to filter the
> >results based on language and  filepath, and that is easier working
> >from Lenya's XML than from HTML.  It seems silly to crawl the website
> >and put a copy on the hard drive, when all the contents are already on
> >the hard drive in a much better format.
> How embarrassing. I read the stuff several times, but obviously didn't
> understand it. Were the red lines on your site
> http://solprovider.com/lenya/search already there yesterday? Or is
> this a tribute to my reading-over-without-understanding? This line in the
> Apache how-to would spare dudes like me a lot of time.

Most of the changes are a tribute to you!  The only other change was
the link to the Linux version.  Documentation should be an interactive
process (which is why Wikis are so popular.)  I will ask Gregor to
update the official How-To when we are done.

> Sorry for BSE
Please define "BSE" (if it is allowed in polite company.)

---
Why do you need the crawled HTML version of Search?  Should I add
reasons about when my version is not appropriate?

solprovider



Re: lucene problems in lenya 1.2.2 / Servlet/ WinXP

Posted by Franz Ruebe <fr...@hotmail.com>.
Hi solprovider,

> > >log4j.properties must be on the CLASSPATH.  The ANT Batch file has:
> > >SET CLASSPATH=.
> > >so having log4j.properties in the tools/bin directory works.
> > I have my log4j.properties in the CLASSPATH set by the ant.bat (in my case
> > d:\apache-lenya-1.2.3\tools\bin).
I set my CLASSPATH in Windows and voilà, there's a lucene.log. Setting my 
classpath was not necessary before, because I had an Ant 1.5.1 in my 
PATH variable, which I deleted after the new installation. Does this sound 
reasonable?

>If it is in the correct directory, the errors go away and "lucene.log"
>is created.  (Like the filename?  I invented it myself!)
The logging errors went away, and I like the name very muchacho (mirrors your 
genius ;-)) .
But is the location sensible? Shouldn't it be a log dir? (There are still not 
enough dirs in Lenya.)

But the log is empty (its creation date is the date of my crawl attempt) and 
I still got a bunch of Java errors. So just the old problem...

So once again some questions:
What should I do when I just want to use the 'normal way of searching' (I 
agree, now that I understand it, that your way of indexing the XML is much 
smarter, but one thing after the other...)?
I made a crawler-live.xconf (there is a mistake in 
http://lenya.apache.org/1_2_x/components/search/lucene.html, I 
think)

"Note that there is a search.properties file in build/lenya/webapp/lenya/bin 
that you may have to change. crawler.xconf needs to have the following 
elements:"
Shouldn't that be crawler-live.xconf as written in the ant command?

Here is my crawler-live.xconf
<?xml version="1.0"?>
<crawler>
  <user-agent>lenya</user-agent>
  <base-url href="http://localhost:8888/mypub/live/index.html"/>
  <scope-url href="http://localhost:8888/mypub/live/"/>
  <uri-list src="work/search/lucene/uris.txt"/>
  <htdocs-dump-dir 
src="work/search/lucene/htdocs_dump/live/lenya.apache.org"/>
</crawler>

"Note that there is a search.properties..."
OK, in there is just webapp.dir=../../ ; seems to be OK...

>org.apache.avalon.excalibur.io.FileUtil.catPath(FileUtil.java:509)
>forgot to check the bounds before using substring.  Might work if you
>wrap the code:
>if(lowerbound >= 0){   newString = oldString.substring(lowerbound [, 
>upperbound]);
>}
>but it would be better to read the code and figure out why the
>lowerbound is -1.  Usually the -1 comes from searching with indexOf()
>or lastIndexOf() for a substring that is not there .

Wouldn't that mean that everybody would have the same problems using the 
crawler?
Does anybody else use the crawler?
I'm not (yet??) the right guy to change Java classes.

> > Is there a way to get an index without crawling?
>
>Yes.  That was the point of the How-To.  I wanted to filter the
>results based on language and  filepath, and that is easier working
>from Lenya's XML than from HTML.  It seems silly to crawl the website
>and put a copy on the hard drive, when all the contents are already on
>the hard drive in a much better format.

How embarrassing. I read the stuff several times, but obviously didn't 
understand it. Were the red lines on your site 
http://solprovider.com/lenya/search already there yesterday? Or is 
this a tribute to my reading-over-without-understanding? This line in the 
Apache how-to would spare dudes like me a lot of time.

Thanx for your time!

"Programming is like war between engineers trying to make better software 
and the universe producing idiots. So far, universe wins"

Sorry for BSE
Franz





Re: lucene problems in lenya 1.2.2 / Servlet/ WinXP

Posted by so...@gmail.com.
On 6/5/05, Franz Ruebe <fr...@hotmail.com> wrote:
> It works in the same way for 1.2.3?
The developers have said there were no changes to Search in 1.2.3. 
Another thread here was about someone trying the instructions with
1.2.3, but they have not told us it worked yet.

> >log4j.properties must be on the CLASSPATH.  The ANT Batch file has:
> >SET CLASSPATH=.
> >so having log4j.properties in the tools/bin directory works.
> I have my log4j.properties in the CLASSPATH set by the ant.bat (In my case
> d:\apache-lenya-1.2.3\tools\bin).

If it is in the correct directory, the errors go away and "lucene.log"
is created.  (Like the filename?  I invented it myself!)

> >If you're not resetting the CLASSPATH, you must put log4j.properties on
> >your CLASSPATH.
> How to reset the CLASSPATH? Is this necessary if it is set correct by
> ant.bat?

On Windows, the CLASSPATH is typically set in AUTOEXEC.BAT.  Win2K/XP
can also set environment values in System Properties.  Open a command
prompt and type:
SET
Look for CLASSPATH={some directories}

My global CLASSPATH is set to work with Tomcat.  I reset it to the
tools/bin directory in my batch file because everything ant needs for
crawling and indexing is in that directory, and it made it easy to
place "log4j.properties", and easy to find "lucene.log".

To set it to the current directory, add:
SET CLASSPATH=.
(including the final period) in your batch file.
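
As a quick check of the CLASSPATH advice above, a small Java program can report whether "log4j.properties" is visible on the classpath it was started with. This is only a sketch: the class name ClasspathCheck is hypothetical, and it must be run with the same CLASSPATH and from the same directory that ant uses, otherwise the answer says nothing about the crawl.

```java
import java.net.URL;

public final class ClasspathCheck {
    private ClasspathCheck() {}

    /** Returns where the named resource was found on the classpath, or null. */
    public static URL locate(String resourceName) {
        return ClassLoader.getSystemResource(resourceName);
    }

    public static void main(String[] args) {
        // Show the classpath this JVM was actually started with.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));

        URL found = locate("log4j.properties");
        System.out.println(found == null
                ? "log4j.properties is NOT on the classpath"
                : "log4j.properties found at " + found);
    }
}
```

If the file is reported as missing, either copy it into a directory that is on the classpath or add its directory to CLASSPATH before starting ant.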

> Is there a way to get an index without crawling?

Yes.  That was the point of the How-To.  I wanted to filter the
results based on language and  filepath, and that is easier working
from Lenya's XML than from HTML.  It seems silly to crawl the website
and put a copy on the hard drive, when all the contents are already on
the hard drive in a much better format.

FYI, my copy of the How-To is at: http://solprovider.com/lenya/search

Please let us know if it works with Lenya 1.2.3, and I would be interested
in any comments to improve the instructions.

solprovider



Re: lucene problems in lenya 1.2.2 / Servlet/ WinXP

Posted by so...@gmail.com.
On 6/5/05, Franz Ruebe <fr...@hotmail.com> wrote:
> I re-inserted log4j.properties as described in
> http://lenya.apache.org/1_2_x/how-to/search.html, put mypub back in
> lenya/pubs (it works) and tried again.
> Result:
...
>      [java] log4j:WARN No appenders could be found for logger
> (org.apache.lenya.xml.DOMUtil).
>      [java] log4j:WARN Please initialize the log4j system properly.
...
>      [java] log4j:WARN No appenders could be found for logger
> (org.apache.lenya.xml.DOMUtil).
>      [java] log4j:WARN Please initialize the log4j system properly.
>      [java] java.lang.StringIndexOutOfBoundsException: String index out of
> range: -1
...
> 
> Have there been changes in log handling? There's no lucene.log (as there was in
> Lenya 1.2.2). Sigh...
> Hoping for help...
> 
> Franz

I wrote the How-To as a complete set.  I did not realize people would
try to use it in pieces.   Sorry.

log4j.properties must be on the CLASSPATH.  The ANT Batch file has:
SET CLASSPATH=.
so having log4j.properties in the tools/bin directory works.  If you
are not resetting the CLASSPATH, you must put log4j.properties on your
CLASSPATH.
 
As for the StringIndexOutOfBoundsException, my version does not use
the crawler.  So follow all of my instructions and this problem
disappears.
OR
org.apache.avalon.excalibur.io.FileUtil.catPath(FileUtil.java:509)
forgot to check the bounds before using substring.  Might work if you
wrap the code:
if(lowerbound >= 0){ 
   newString = oldString.substring(lowerbound [, upperbound]);
}
but it would be better to read the code and figure out why the
lowerbound is -1.  Usually the -1 comes from searching with indexOf()
or lastIndexOf() for a substring that is not there .
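
To make the failure mode concrete, the sketch below contrasts the unchecked pattern with the guarded one. The class and method names are hypothetical, and the Windows-style path is only an assumed example of input that contains no "/"; nothing here confirms what value the crawler actually passed.

```java
public final class SubstringBounds {
    private SubstringBounds() {}

    /** Guarded: returns the text before the last "/", or null when there is no "/". */
    public static String parentOf(String path) {
        int lowerbound = path.lastIndexOf('/');
        if (lowerbound >= 0) {
            return path.substring(0, lowerbound);
        }
        return null;
    }

    /** Unchecked: throws StringIndexOutOfBoundsException when "/" is absent. */
    public static String parentOfUnchecked(String path) {
        return path.substring(0, path.lastIndexOf('/'));
    }

    public static void main(String[] args) {
        // Hypothetical path using backslashes only, so lastIndexOf('/') is -1.
        String windowsStyle = "D:\\pubs\\mypub\\crawler-live.xconf";

        System.out.println(parentOf("pubs/mypub/crawler-live.xconf"));
        System.out.println(parentOf(windowsStyle));
        try {
            parentOfUnchecked(windowsStyle);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unchecked version failed: " + e.getMessage());
        }
    }
}
```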

Enjoy,
solprovider



Re: lucene problems in lenya 1.2.2 / Servlet/ WinXP

Posted by Franz Ruebe <fr...@hotmail.com>.
Hi Gregor,

thanks for answering - sorry for the double post.
After reading the installation doc again, I installed j2sdk 1.4.2_08 (rtfm...) 
and Lenya 1.2.3 and rebuilt it.
I re-inserted log4j.properties as described in 
http://lenya.apache.org/1_2_x/how-to/search.html, put mypub back in 
lenya/pubs (it works) and tried again.
Result:
D:\apache-lenya-1.2.3\tools\lib
Buildfile: ..\..\build\lenya\webapp\lenya\bin\crawl_and_index.xml
init:
     [echo] INFO: Init

crawl:
     [echo] INFO: Crawl and dump hypertext documents (../../build/lenya/webapp/lenya/pubs/bcd/config/search/crawler-live.xconf)
     [echo] INFO: Show configuration
     [java] log4j:WARN No appenders could be found for logger (org.apache.lenya.xml.DOMUtil).
     [java] log4j:WARN Please initialize the log4j system properly.
     [java] Crawler Config: Base URL: http://localhost:8888/bcd/live/index.html
     [java] Crawler Config: Scope URL: http://localhost:8888/bcd/live/
     [java] Crawler Config: User Agent: lenya
     [java] Crawler Config: URI List: ../../build/lenya/webapp/lenya/pubs/bcd/config/search/work/search/lucene/uris.txt (work/search/lucene/uris.txt)
     [java] Crawler Config: HTDocs Dump Dir: ../../build/lenya/webapp/lenya/pubs/bcd/config/search/work/search/lucene/htdocs_dump/live/ (work/search/lucene/htdocs_dump/live/)
     [echo] INFO: START crawling ...
     [java] log4j:WARN No appenders could be found for logger (org.apache.lenya.xml.DOMUtil).
     [java] log4j:WARN Please initialize the log4j system properly.
     [java] java.lang.StringIndexOutOfBoundsException: String index out of range: -1
     [java]     at org.apache.tools.ant.taskdefs.ExecuteJava.execute(ExecuteJava.java:178)
     [java]     at org.apache.tools.ant.taskdefs.Java.run(Java.java:710)
     [java]     at org.apache.tools.ant.taskdefs.Java.executeJava(Java.java:178)
     [java]     at org.apache.tools.ant.taskdefs.Java.execute(Java.java:84)
     [java]     at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:275)
     [java]     at org.apache.tools.ant.Task.perform(Task.java:364)
     [java]     at org.apache.tools.ant.Target.execute(Target.java:341)
     [java]     at org.apache.tools.ant.Target.performTasks(Target.java:369)
     [java]     at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1216)
     [java]     at org.apache.tools.ant.Project.executeTarget(Project.java:1185)
     [java]     at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:40)
     [java]     at org.apache.tools.ant.Project.executeTargets(Project.java:1068)
     [java]     at org.apache.tools.ant.Main.runBuild(Main.java:668)
     [java]     at org.apache.tools.ant.Main.startAnt(Main.java:187)
     [java]     at org.apache.tools.ant.launch.Launcher.run(Launcher.java:246)
     [java]     at org.apache.tools.ant.launch.Launcher.main(Launcher.java:67)
     [java] Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
     [java]     at java.lang.String.substring(String.java:1444)
     [java]     at org.apache.avalon.excalibur.io.FileUtil.catPath(FileUtil.java:509)
     [java]     at org.apache.lenya.search.crawler.CrawlerConfiguration.resolvePath(CrawlerConfiguration.java:268)
     [java]     at org.apache.lenya.search.crawler.CrawlerConfiguration.getURIListResolved(CrawlerConfiguration.java:199)
     [java]     at org.apache.lenya.search.crawler.IterativeHTMLCrawler.<init>(IterativeHTMLCrawler.java:94)
     [java]     at org.apache.lenya.search.crawler.IterativeHTMLCrawler.main(IterativeHTMLCrawler.java:63)
     [java]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     [java]     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     [java]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     [java]     at java.lang.reflect.Method.invoke(Method.java:324)
     [java]     at org.apache.tools.ant.taskdefs.ExecuteJava.run(ExecuteJava.java:200)
     [java]     at org.apache.tools.ant.taskdefs.ExecuteJava.execute(ExecuteJava.java:134)
     [java]     ... 15 more
     [java] --- Nested Exception ---
     [java] java.lang.StringIndexOutOfBoundsException: String index out of range: -1
     [java]     at java.lang.String.substring(String.java:1444)
     [java]     at org.apache.avalon.excalibur.io.FileUtil.catPath(FileUtil.java:509)
     [java]     at org.apache.lenya.search.crawler.CrawlerConfiguration.resolvePath(CrawlerConfiguration.java:268)
     [java]     at org.apache.lenya.search.crawler.CrawlerConfiguration.getURIListResolved(CrawlerConfiguration.java:199)
     [java]     at org.apache.lenya.search.crawler.IterativeHTMLCrawler.<init>(IterativeHTMLCrawler.java:94)
     [java]     at org.apache.lenya.search.crawler.IterativeHTMLCrawler.main(IterativeHTMLCrawler.java:63)
     [java]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     [java]     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     [java]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     [java]     at java.lang.reflect.Method.invoke(Method.java:324)
     [java]     at org.apache.tools.ant.taskdefs.ExecuteJava.run(ExecuteJava.java:200)
     [java]     at org.apache.tools.ant.taskdefs.ExecuteJava.execute(ExecuteJava.java:134)
     [java]     at org.apache.tools.ant.taskdefs.Java.run(Java.java:710)
     [java]     at org.apache.tools.ant.taskdefs.Java.executeJava(Java.java:178)
     [java]     at org.apache.tools.ant.taskdefs.Java.execute(Java.java:84)
     [java]     at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:275)
     [java]     at org.apache.tools.ant.Task.perform(Task.java:364)
     [java]     at org.apache.tools.ant.Target.execute(Target.java:341)
     [java]     at org.apache.tools.ant.Target.performTasks(Target.java:369)
     [java]     at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1216)
     [java]     at org.apache.tools.ant.Project.executeTarget(Project.java:1185)
     [java]     at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:40)
     [java]     at org.apache.tools.ant.Project.executeTargets(Project.java:1068)
     [java]     at org.apache.tools.ant.Main.runBuild(Main.java:668)
     [java]     at org.apache.tools.ant.Main.startAnt(Main.java:187)
     [java]     at org.apache.tools.ant.launch.Launcher.run(Launcher.java:246)
     [java]     at org.apache.tools.ant.launch.Launcher.main(Launcher.java:67)
     [echo] INFO: Crawling DONE

Have there been changes in log handling? There's no lucene.log (as there was in 
Lenya 1.2.2). Sigh...
Hoping for help...

Franz





Re: lucene problems in lenya 1.2.2 / Servlet/ WinXP

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Frank Ruebe wrote:

> I'm trying to include lucene in my new made pub.

you may also want to consult 
http://lenya.apache.org/1_2_x/how-to/search.html

