Posted to dev@lucene.apache.org by Grant Ingersoll <gs...@apache.org> on 2007/12/28 17:54:20 UTC

EnwikiDocMaker

I am running EnwikiDocMaker against trunk with the algorithm shown at
the bottom of this message.  After the first round completes, I get:
java.lang.RuntimeException: java.io.IOException: Bad file descriptor
	at org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker$Parser.run(EnwikiDocMaker.java:75)
	at java.lang.Thread.run(Thread.java:552)
Caused by: java.io.IOException: Bad file descriptor
	at java.io.FileInputStream.readBytes(Native Method)
	at java.io.FileInputStream.read(FileInputStream.java:194)
	at org.apache.xerces.impl.XMLEntityManager$RewindableInputStream.read(Unknown Source)
	at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
	at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
	at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
	at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
	at org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker$Parser.run(EnwikiDocMaker.java:59)

The strange thing is that, while trying to debug this, I put a
breakpoint at line 59.  It gets hit on the first round, but it is not
hit when this exception is thrown, even though line 59 appears in the
stack trace.  The program then just seems to hang.

Anyone seen this before?  Can you reproduce it or is it just me?

-Grant

-------
Algorithm:
merge.factor=mrg:10:100:10:100
max.field.length=2147483647
max.buffered=buf:10:10:100:100
compound=true

analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
directory=FSDirectory

doc.stored=true
doc.tokenized=true
doc.term.vector=false
doc.add.log.step=5000

docs.file=temp/enwiki-20070527-pages-articles.xml

doc.maker=org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker

query.maker=org.apache.lucene.benchmark.byTask.feeds.ReutersQueryMaker

# task at this depth or less would print when they start
task.max.depth.log=2

log.queries=false
# -------------------------------------------------------------------------------------

{ "Rounds"

     ResetSystemErase

     { "Populate"
         CreateIndex
         { "MAddDocs" AddDoc > : 2000
         CloseIndex
     }

     NewRound

} : 4

RepSumByName
RepSumByPrefRound MAddDocs
----------------------
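
A note on the colon-separated values in merge.factor and max.buffered
above: in the benchmark package these appear to be per-round
properties, where the leading token (mrg, buf) is a short name used in
the reports and the remaining numbers are the values taken in
successive rounds, advanced by NewRound.  The sketch below illustrates
that reading only.  RoundValueSketch and its methods are hypothetical
names made up for this example, not part of Lucene, and the wrap-around
when rounds outnumber values is an assumption.

```java
// Hypothetical illustration only; not the benchmark package's own code.
class RoundValueSketch {

    /** Returns the short name in front of the first colon, e.g. "mrg". */
    static String shortName(String spec) {
        return spec.substring(0, spec.indexOf(':'));
    }

    /** Parses the numbers that follow the short name. */
    static int[] values(String spec) {
        String[] parts = spec.split(":");
        int[] out = new int[parts.length - 1];
        for (int i = 1; i < parts.length; i++) {
            out[i - 1] = Integer.parseInt(parts[i].trim());
        }
        return out;
    }

    /** Picks the value for a zero-based round (wrap-around is assumed). */
    static int valueForRound(String spec, int round) {
        int[] v = values(spec);
        return v[round % v.length];
    }

    public static void main(String[] args) {
        String mergeFactor = "mrg:10:100:10:100";
        String maxBuffered = "buf:10:10:100:100";
        // The algorithm above repeats "Rounds" four times.
        for (int round = 0; round < 4; round++) {
            System.out.println("round " + round
                + ": " + shortName(mergeFactor) + "=" + valueForRound(mergeFactor, round)
                + " " + shortName(maxBuffered) + "=" + valueForRound(maxBuffered, round));
        }
    }
}
```

Under that reading, the four rounds in this algorithm run with
mrg=10,100,10,100 and buf=10,10,100,100, so each round uses a
different combination of the two settings.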

