Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2008/01/05 02:53:34 UTC
[jira] Resolved: (LUCENE-1117) Intermittent thread safety issue with EnwikiDocMaker
[ https://issues.apache.org/jira/browse/LUCENE-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless resolved LUCENE-1117.
----------------------------------------
Resolution: Fixed
> Intermittent thread safety issue with EnwikiDocMaker
> ----------------------------------------------------
>
> Key: LUCENE-1117
> URL: https://issues.apache.org/jira/browse/LUCENE-1117
> Project: Lucene - Java
> Issue Type: Bug
> Components: contrib/benchmark
> Affects Versions: 2.2, 2.3
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Priority: Minor
> Fix For: 2.3
>
> Attachments: LUCENE-1117.patch
>
>
> Intermittent thread safety issue with EnwikiDocMaker
> When I run conf/wikipediaOneRound.alg, sometimes it starts OK;
> other times (about a third of the time) I see this:
> Exception in thread "Thread-0" java.lang.RuntimeException: java.io.IOException: Bad file descriptor
> at org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker$Parser.run(EnwikiDocMaker.java:76)
> at java.lang.Thread.run(Thread.java:595)
> Caused by: java.io.IOException: Bad file descriptor
> at java.io.FileInputStream.readBytes(Native Method)
> at java.io.FileInputStream.read(FileInputStream.java:194)
> at org.apache.xerces.impl.XMLEntityManager$RewindableInputStream.read(Unknown Source)
> at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
> at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
> at org.apache.xerces.impl.XMLEntityScanner.scanQName(Unknown Source)
> at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
> at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
> at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
> at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
> at org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker$Parser.run(EnwikiDocMaker.java:60)
> ... 1 more
> The problem is that the thread that pulls the XML docs is started as
> soon as the EnwikiDocMaker class is instantiated. Once started, it
> uses fileIS (a FileInputStream) to feed the XML parser. However, if
> you use any task deriving from ResetInputsTask, openFile is actually
> called twice when the alg starts, and the second call closes the
> original fileIS that the XML parser may still be reading.
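
The failure mode described above can be reproduced in isolation. The following is only a minimal sketch, not code from the benchmark: the class ClosedStreamDemo, its method name, and the temp file are hypothetical stand-ins for EnwikiDocMaker's fileIS handling. It shows that a reader still holding the first stream gets an IOException once that stream has been closed by a second open. The exact exception message depends on the JVM and platform and may differ from the one in the trace above.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class; not part of contrib/benchmark.
class ClosedStreamDemo {

    // Returns true if reading from the first stream fails after it was closed.
    static boolean readAfterCloseFails() {
        try {
            File f = File.createTempFile("enwiki-demo", ".xml");
            f.deleteOnExit();
            Files.write(f.toPath(), "<page/>".getBytes(StandardCharsets.UTF_8));

            // First open: this is the stream the parser thread would be using.
            FileInputStream first = new FileInputStream(f);

            // Second open (assumed to mimic a repeated openFile call):
            // the old stream is closed and a new one replaces it.
            first.close();
            try (FileInputStream second = new FileInputStream(f)) {
                try {
                    first.read();   // the "parser" still reads the old stream
                    return false;   // not expected to be reached
                } catch (IOException expected) {
                    return true;    // read on a closed stream fails
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read after close failed: " + readAfterCloseFails());
    }
}
```

In the real issue the close happens on another thread, which is why the failure is intermittent; the sketch makes the ordering deterministic to show the effect.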
> I changed the thread to start on demand, the first time next() is
> called. I also removed a redundant resetInputs() call (which was
> opening the file more often than needed). Finally, I added logic
> in the thread to detect that the input stream was closed (because
> LineDocMaker.resetInputs() was called, e.g. if we are not running the
> doc maker to exhaustion).
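
The approach described above (start the producer thread lazily on the first next(), and let it exit quietly when the inputs are reset) might look roughly like the sketch below. This is not the attached patch: LazyFeed, isStarted(), the in-memory list standing in for the XML file, and the "closed" flag standing in for a closed input stream are all assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a doc maker fed by a background parser thread.
class LazyFeed {
    // Sentinel marking end of input; compared by identity, never by value.
    private static final String EOF = new String("<eof>");

    private final List<String> source;          // stands in for the XML file
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
    private Thread worker;                      // null until next() is first called
    private volatile boolean closed;            // stands in for "input stream was closed"
    private boolean exhausted;

    LazyFeed(List<String> source) {
        this.source = source;                   // note: no thread is started here
    }

    synchronized boolean isStarted() {
        return worker != null;
    }

    // Returns the next doc, or null once the source is exhausted.
    synchronized String next() {
        if (exhausted) return null;
        if (worker == null) {                   // start on demand
            closed = false;
            worker = new Thread(this::produce);
            worker.setDaemon(true);
            worker.start();
        }
        try {
            String doc = queue.take();
            if (doc == EOF) {
                exhausted = true;
                return null;
            }
            return doc;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Stops the current worker (if any) so the next call to next() starts over.
    synchronized void resetInputs() {
        if (worker != null) {
            closed = true;
            worker.interrupt();                 // wake it if blocked in put()
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            worker = null;
        }
        queue.clear();
        exhausted = false;
    }

    private void produce() {
        try {
            for (String doc : source) {
                if (closed) return;             // input was reset: exit quietly
                queue.put(doc);
            }
            queue.put(EOF);
        } catch (InterruptedException e) {
            // Interrupted by resetInputs(); treat as a normal shutdown.
        }
    }
}
```

Because nothing runs until next() is called, a reset that happens between construction and the first next() has no thread to race with, and a reset in mid-stream ends the worker without raising an exception.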
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.