Posted to dev@lucene.apache.org by Grant Ingersoll <gs...@apache.org> on 2009/06/03 01:48:22 UTC

EnwikiDocMaker

Is there a reason the EnwikiDocMaker assumes Xerces for the SAX  
parser?  Line 96.

Thanks,
Grant

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: EnwikiDocMaker

Posted by Grant Ingersoll <gs...@apache.org>.
On Jun 4, 2009, at 2:49 PM, Grant Ingersoll wrote:

> Looking more, I think my problem stems from using EnwikiDocMaker
> independently of the benchmarking tool.  The weird thing is, it used
> to work, but I don't know when it broke.  I suspect I'm not
> initializing things right.
>
> Anyone else doing that?

Answering my own question: calling resetInputs() first is the key.

For the record, I was seeing the following exception when calling
EnwikiDocMaker standalone:
Exception in thread "Thread-0" Exception in thread "main" org.apache.lucene.benchmark.byTask.feeds.NoMoreDataException
[INFO]  at org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker$Parser.next(EnwikiDocMaker.java:167)
[INFO]  at org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker.makeDocument(EnwikiDocMaker.java:300)
[INFO]  at com.lucidimagination.wikipedia.indexing.Indexer.index(Indexer.java:66)
[INFO]  at com.lucidimagination.wikipedia.indexing.Indexer.main(Indexer.java:115)
[INFO] java.lang.RuntimeException: java.net.MalformedURLException
[INFO]  at org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker$Parser.run(EnwikiDocMaker.java:129)
[INFO]  at java.lang.Thread.run(Thread.java:637)
[INFO] Caused by: java.net.MalformedURLException
[INFO]  at java.net.URL.<init>(URL.java:601)
[INFO]  at java.net.URL.<init>(URL.java:464)
[INFO]  at java.net.URL.<init>(URL.java:413)
[INFO]  at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
[INFO]  at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
[INFO]  at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
[INFO]  at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
[INFO]  at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
[INFO]  at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
[INFO]  at org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker$Parser.run(EnwikiDocMaker.java:103)
[INFO]  ... 1 more


And here's the code:
       EnwikiDocMaker docMaker = new EnwikiDocMaker();
       Properties properties = new Properties();
       //fileName = config.get("docs.file", null);
       String filePath = wikipediaXML.getAbsolutePath();
       properties.setProperty("docs.file", filePath);
       docMaker.setConfig(new Config(properties));
       // resetInputs() has to run before the first makeDocument() call
       docMaker.resetInputs();
       //docMaker.openFile();
       Document doc = null;
       List<SolrInputDocument> docs = new ArrayList<SolrInputDocument>(200);
       int i = 0;
       SolrInputDocument sDoc = null;
       long start = System.currentTimeMillis();
       while ((doc = docMaker.makeDocument()) != null && i < numDocs) {
         ...
       }


-Grant



Re: EnwikiDocMaker

Posted by Grant Ingersoll <gs...@apache.org>.
Looking more, I think my problem stems from using EnwikiDocMaker
independently of the benchmarking tool.  The weird thing is, it used
to work, but I don't know when it broke.  I suspect I'm not
initializing things right.

Anyone else doing that?

-Grant

On Jun 3, 2009, at 1:26 AM, Shai Erera wrote:

> Grant, note that I'm changing the DocMakers in LUCENE-1595 including  
> this one. So whatever the decision is following your question, I can  
> do it as part of this issue, since that code will no longer be in  
> EnwikiDocMaker.
>
> Regarding your question, I don't know why it should depend on
> Xerces (rather than the default Java XML parser, I assume?)




Re: EnwikiDocMaker

Posted by Grant Ingersoll <gs...@apache.org>.
I think I see what might be my problem.  I'm pulling in the
dependencies via Maven, and the benchmark POM does not publish the
Xerces dependency (among others).

-Grant

On Jun 3, 2009, at 11:53 AM, Jason Rutherglen wrote:

> I saw a weird error related to Xerces; I think it was a class
> version problem.  I'll try it again though to make sure.

Re: EnwikiDocMaker

Posted by Jason Rutherglen <ja...@gmail.com>.
I saw a weird error related to Xerces; I think it was a class version
problem.  I'll try it again though to make sure.

On Wed, Jun 3, 2009 at 5:58 AM, Shai Erera <se...@gmail.com> wrote:

> Then perhaps as part of 1595 I can change it to use Java's XML parser, and
> test the Enwiki file. If all goes well, we may not need the XERCES jar in
> benchmark? Anyway, I'll check that too

Re: EnwikiDocMaker

Posted by Grant Ingersoll <gs...@apache.org>.
Doh! Not sure how I missed that!  Sure enough, I see it now.  I'll try
my stuff using those libs and make sure they are at the front of the
classpath.
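
One way to confirm which parser actually wins on the classpath is to ask the JVM where the implementation class was loaded from. The sketch below is illustrative only: the class name WhichSaxParser and the helper locationOf are made up for this example, and it inspects whatever parser JAXP returns rather than anything specific to the benchmark code.

```java
import java.security.CodeSource;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic helper; not part of Lucene or the benchmark package.
class WhichSaxParser {

    /** Describes where a class was loaded from, e.g. the URL of a jar file. */
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Classes shipped with the runtime often report no code source at all.
        if (src == null || src.getLocation() == null) {
            return "(no code source, typically bundled with the JRE)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Ask JAXP for its default SAX parser and report the implementation class.
        Class<?> impl = SAXParserFactory.newInstance()
                .newSAXParser()
                .getXMLReader()
                .getClass();
        System.out.println(impl.getName() + " loaded from " + locationOf(impl));
    }
}
```

If the printed location is not the jar you expected, the classpath order is the first thing to look at.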


On Jun 3, 2009, at 11:13 AM, Shai Erera wrote:

> The current benchmark contains xerces-2.9.1-patched-XERCESJ-1257.jar,
> and its build.xml sets the classpath to include all .jar under the
> lib folder. So it looks like it is part of Benchmark.
>
> Maybe you fail to run it outside benchmark because you don't include  
> it in your classpath?
>
> Anyway, I'll move to use Java's SAX parser and if all pass, remove  
> the Xerces from benchmark as part of LUCENE-1595
>
> Shai

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


Re: EnwikiDocMaker

Posted by Michael McCandless <lu...@mikemccandless.com>.
Shai, make sure you're able to process the full Wikipedia export, i.e.,
that you don't hit that weird issue (with Xerces) from LUCENE-1591 that
caused us to switch to the patched version of Xerces.

Mike
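
A cheap way to run this kind of check is to stream the export through the JDK's own SAX parser and count the pages it gets through. This is a minimal sketch, not the benchmark's code: PageCounter and countPages are hypothetical names, and it assumes an uncompressed XML export in which each article is wrapped in a <page> element.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical smoke test: counts <page> elements using the JAXP default parser.
class PageCounter extends DefaultHandler {
    long pages = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The default factory is not namespace-aware, so qName is the plain tag name.
        if ("page".equals(qName)) {
            pages++;
        }
    }

    /** Parses the whole stream and returns the number of <page> elements seen. */
    static long countPages(InputStream in) throws Exception {
        PageCounter handler = new PageCounter();
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.pages;
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            // Path to an uncompressed export, supplied by the caller.
            try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
                System.out.println(countPages(in) + " pages");
            }
        } else {
            // Tiny inline sample so the class can be run without a dump file.
            String sample = "<mediawiki><page/><page/></mediawiki>";
            InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
            System.out.println(countPages(in) + " pages");
        }
    }
}
```

Any parser problem on the full export would surface as an exception from countPages before a count is printed.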

On Wed, Jun 3, 2009 at 2:13 PM, Shai Erera <se...@gmail.com> wrote:
> The current benchmark contains xerces-2.9.1-patched-XERCESJ-1257.jar, and
> its build.xml sets the classpath to include all .jar under the lib folder.
> So it looks like it is part of Benchmark.
>
> Maybe you fail to run it outside benchmark because you don't include it in
> your classpath?
>
> Anyway, I'll move to use Java's SAX parser and if all pass, remove the
> Xerces from benchmark as part of LUCENE-1595
>
> Shai


Re: EnwikiDocMaker

Posted by Shai Erera <se...@gmail.com>.
The current benchmark contains xerces-2.9.1-patched-XERCESJ-1257.jar
<http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/lib/xerces-2.9.1-patched-XERCESJ-1257.jar?view=log>,
and its build.xml sets the classpath to include all .jar under the lib
folder. So it looks like it is part of Benchmark.

Maybe you fail to run it outside benchmark because you don't include it in
your classpath?

Anyway, I'll move to use Java's SAX parser and if all pass, remove the
Xerces from benchmark as part of LUCENE-1595

Shai
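
The difference between the two approaches can be sketched as follows. The thread does not show line 96 of EnwikiDocMaker, so this assumes it names the Xerces class explicitly; ReaderChoice and its two methods are hypothetical names used only for illustration.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical comparison of a pinned SAX implementation vs. the JAXP default.
class ReaderChoice {

    /**
     * Requests one specific SAX implementation by class name. This fails with a
     * SAXException when that class is not on the classpath.
     * XMLReaderFactory is deprecated in recent JDKs, hence the suppression.
     */
    @SuppressWarnings("deprecation")
    static XMLReader byClassName(String className) throws SAXException {
        return XMLReaderFactory.createXMLReader(className);
    }

    /** Lets JAXP pick whichever parser the runtime provides. */
    static XMLReader platformDefault() throws Exception {
        return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        try {
            XMLReader pinned = byClassName("org.apache.xerces.parsers.SAXParser");
            System.out.println("Pinned parser: " + pinned.getClass().getName());
        } catch (SAXException e) {
            // Expected when the Xerces jar is missing from the classpath.
            System.out.println("Xerces not available: " + e.getMessage());
        }
        System.out.println("Default parser: " + platformDefault().getClass().getName());
    }
}
```

With the second form, code that runs outside the benchmark build no longer needs the Xerces jar on its classpath, though it also no longer gets the patched parser.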

On Wed, Jun 3, 2009 at 7:09 PM, Grant Ingersoll <gs...@apache.org> wrote:

> +1
> Note, Xerces Jar is not in benchmark, AFAICT.  It relies on the fact that
> Java uses it under the hood.
>
> I'm having this really weird situation where I'm using EnwikiDocMaker
> outside the context of the benchmarker and I'm grasping at straws as to why
> it is not working.  It seems to be a classpath issue, but is not Lucene
> related so I'll spare the details.
>
> -Grant

Re: EnwikiDocMaker

Posted by Grant Ingersoll <gs...@apache.org>.
+1

Note, Xerces Jar is not in benchmark, AFAICT.  It relies on the fact  
that Java uses it under the hood.

I'm having this really weird situation where I'm using EnwikiDocMaker  
outside the context of the benchmarker and I'm grasping at straws as  
to why it is not working.  It seems to be a classpath issue, but is  
not Lucene related so I'll spare the details.

-Grant
On Jun 3, 2009, at 5:58 AM, Shai Erera wrote:

> Then perhaps as part of 1595 I can change it to use Java's XML  
> parser, and test the Enwiki file. If all goes well, we may not need  
> the XERCES jar in benchmark? Anyway, I'll check that too



Re: EnwikiDocMaker

Posted by Shai Erera <se...@gmail.com>.
Then perhaps as part of 1595 I can change it to use Java's XML parser, and
test the Enwiki file. If all goes well, we may not need the XERCES jar in
benchmark? Anyway, I'll check that too.

On Wed, Jun 3, 2009 at 1:59 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> I also don't know why it's specifically using Xerces...
>
> Mike

Re: EnwikiDocMaker

Posted by Michael McCandless <lu...@mikemccandless.com>.
I also don't know why it's specifically using Xerces...

Mike

On Wed, Jun 3, 2009 at 4:26 AM, Shai Erera <se...@gmail.com> wrote:
> Grant, note that I'm changing the DocMakers in LUCENE-1595 including this
> one. So whatever the decision is following your question, I can do it as
> part of this issue, since that code will no longer be in EnwikiDocMaker.
>
> Regarding your question, I don't know why it should depend on Xerces
> (rather than the default Java XML parser, I assume?)
>
> Shai


Re: EnwikiDocMaker

Posted by Shai Erera <se...@gmail.com>.
Grant, note that I'm changing the DocMakers in LUCENE-1595 including this
one. So whatever the decision is following your question, I can do it as
part of this issue, since that code will no longer be in EnwikiDocMaker.

Regarding your question, I don't know why it should depend on Xerces
(rather than the default Java XML parser, I assume?)

Shai

On Wed, Jun 3, 2009 at 2:48 AM, Grant Ingersoll <gs...@apache.org> wrote:

> Is there a reason the EnwikiDocMaker assumes Xerces for the SAX parser?
>  Line 96.
>
> Thanks,
> Grant