Posted to solr-user@lucene.apache.org by Scott Schneider <Sc...@symantec.com> on 2013/09/25 02:32:52 UTC

Problem loading my codec sometimes

Hello,

I created my own codec and Solr can find it sometimes and not other times.  When I start fresh (delete the data folder and run Solr), it all works fine.  I can add data and query it.  When I stop Solr and start it again, I get:

Caused by: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.Codec with name 'MyCodec' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath.The current classpath supports the following names: [SimpleText, Appending, Lucene40, Lucene3x, Lucene41, Lucene42]

I added the JAR to the path and I'm pretty sure Java sees it, or else it would not be using my codec when I start fresh.  (I've looked at the index files and verified that it's using my codec.)  I suppose Solr is asking SPI for my codec based on the codec class name stored in the index files, but I don't see why this would fail when a fresh start works.

Any thoughts?

Thanks,
Scott


RE: Problem loading my codec sometimes

Posted by Scott Schneider <Sc...@symantec.com>.
Ok, I created SOLR-5278.  Thanks again!

Scott


> -----Original Message-----
> From: Chris Hostetter [mailto:hossman_lucene@fucit.org]
> Sent: Wednesday, September 25, 2013 10:15 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Problem loading my codec sometimes

RE: Problem loading my codec sometimes

Posted by Chris Hostetter <ho...@fucit.org>.
: Ah, I fixed it.  I wasn't properly including the 
: org.apache.lucene.codecs.Codec file in my jar.  I wasn't sure if it was 
: necessary in Solr, since I specify my factory in solrconfig.xml.  I 
: think that's why I could create a new index, but not load an existing 
: one.

Ah.... interesting.  

yes, you definitely need the SPI registration in the jar file so that it 
can resolve codec files found on disk when opening them -- the 
configuration in solrconfig.xml tells solr which codec to use when writing 
new segments, but it must respect the codec information in segments found 
on disk when opening them (that's how the index backcompat works), and 
those are looked up via SPI.
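
A minimal, self-contained sketch of that distinction, using plain java.util.ServiceLoader rather than Lucene itself. SpiLookupDemo, NamedCodec, and this MyCodec are hypothetical names invented for the demo, and the code is only an analogy for how lookup by name behaves, not Lucene's actual implementation. It shows that a class can be perfectly loadable (which is all a class-name-based factory needs) while still being invisible to SPI until a META-INF/services file names it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ServiceLoader;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo of name-based SPI lookup; this is not Lucene's code.
final class SpiLookupDemo {

    // Stand-in for a codec base type (assumption, invented for this sketch).
    public interface NamedCodec {
        String name();
    }

    // Stand-in provider: always loadable as a class, but only discoverable
    // through SPI when a META-INF/services file names it.
    public static final class MyCodec implements NamedCodec {
        public MyCodec() {}

        @Override
        public String name() {
            return "MyCodec";
        }
    }

    // Names of every provider that ServiceLoader can discover via the loader.
    static Set<String> availableNames(ClassLoader loader) {
        Set<String> names = new TreeSet<>();
        for (NamedCodec codec : ServiceLoader.load(NamedCodec.class, loader)) {
            names.add(codec.name());
        }
        return names;
    }

    // Same lookup, after adding a directory that holds the registration file.
    // The temp directory is left behind; a real tool would clean it up.
    static Set<String> namesWithRegistration() {
        try {
            Path root = Files.createTempDirectory("spi-demo");
            Path services = Files.createDirectories(
                    root.resolve("META-INF").resolve("services"));
            // File name = service type, file content = provider class name.
            Files.write(services.resolve(NamedCodec.class.getName()),
                    (MyCodec.class.getName() + "\n").getBytes(StandardCharsets.UTF_8));
            URL[] urls = {root.toUri().toURL()};
            ClassLoader parent = SpiLookupDemo.class.getClassLoader();
            try (URLClassLoader loader = new URLClassLoader(urls, parent)) {
                return availableNames(loader);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Instantiating by class works without any registration.
        System.out.println("direct: " + new MyCodec().name());
        // Lookup by name sees nothing until the services file exists.
        System.out.println("unregistered: "
                + availableNames(SpiLookupDemo.class.getClassLoader()));
        System.out.println("registered: " + namesWithRegistration());
    }
}
```

In the real setup the registration file would be META-INF/services/org.apache.lucene.codecs.Codec inside the plugin jar, listing the fully qualified class name of the codec.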

Can you do me a favor please and still file an issue with these details?  
the attachments i asked about before would still be handy, but probably 
not necessary -- at a minimum could you show us the "jar tf" output of 
your plugin jar when you were having the problem.

Even if the codec factory code can find the configured codec on startup, 
we should probably throw a very loud error right away if that same codec 
can't be found by name using SPI to prevent people from running into 
confusing problems when making mistakes like this.



-Hoss

RE: Problem loading my codec sometimes

Posted by Scott Schneider <Sc...@symantec.com>.
Ah, I fixed it.  I wasn't properly including the org.apache.lucene.codecs.Codec file in my jar.  I wasn't sure if it was necessary in Solr, since I specify my factory in solrconfig.xml.  I think that's why I could create a new index, but not load an existing one.
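
For anyone hitting the same problem, here is a small sketch of how the jar could be checked for that file. It uses only the JDK's java.util.jar API; CodecRegistrationCheck and its method names are hypothetical, and the entry path assumes the standard Java SPI location under META-INF/services. It is roughly what one would look for by eye in "jar tf" output, plus the contents of the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

// Hypothetical helper: lists the codec classes a jar registers via SPI.
final class CodecRegistrationCheck {

    // Standard SPI location for providers of org.apache.lucene.codecs.Codec.
    static final String SERVICE_ENTRY =
            "META-INF/services/org.apache.lucene.codecs.Codec";

    // Returns the provider class names listed in the service file,
    // or an empty list when the jar has no such entry.
    static List<String> registeredCodecClasses(String jarPath) {
        List<String> classes = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            ZipEntry entry = jar.getEntry(SERVICE_ENTRY);
            if (entry == null) {
                return classes;
            }
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                    jar.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Service files allow '#' comments and blank lines.
                    int hash = line.indexOf('#');
                    if (hash >= 0) {
                        line = line.substring(0, hash);
                    }
                    line = line.trim();
                    if (!line.isEmpty()) {
                        classes.add(line);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return classes;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: CodecRegistrationCheck <plugin.jar>");
            return;
        }
        List<String> classes = registeredCodecClasses(args[0]);
        if (classes.isEmpty()) {
            System.out.println("No codec registration found at " + SERVICE_ENTRY);
        } else {
            System.out.println("Registered codec classes: " + classes);
        }
    }
}
```

An empty result for the plugin jar would match the symptom in this thread: the codec class is present, but nothing tells SPI about it.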

Scott


> -----Original Message-----
> From: Chris Hostetter [mailto:hossman_lucene@fucit.org]
> Sent: Wednesday, September 25, 2013 9:49 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Problem loading my codec sometimes

RE: Problem loading my codec sometimes

Posted by Chris Hostetter <ho...@fucit.org>.
: I still wonder why it can create a new index using my codec, but not 
: load an index previously created with my codec.  In solrconfig.xml, I 
: specify the CodecFactory along with the package name, whereas the codec 
: name that is read from the index file has no package name.  Could that 
: be the problem?  I think that's the way it's supposed to be.  Could it 
: be that Solr has my jar in the classpath, but SPI is not registering my 
: codec class from the jar?  I'm not familiar with SPI.

it's very possible that there is a classloader / SPI runtime race 
condition in looking up the codec names found in segment files.  This sort 
of classpath related runtime issue is extremely hard to write tests for.

Could you please file a bug and include...

 * the source of your codec (or a simple sample codec that you can 
   also use to reproduce the problem)
 * a zipped up copy of your entire solr home directory, including 
   the jar file containing your codec so we can verify the SPI files 
   are in there properly
    - no need to include an actual index here
 * some simple sample documents in xml or json that we can index 
   with the schema you are using 



-Hoss

Re: Problem loading my codec sometimes

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
On Wed, Sep 25, 2013 at 2:10 PM, Scott Schneider <Scott_Schneider@symantec.com> wrote:

> I still wonder why it can create a new index using my codec, but not load
> an index previously created with my codec.


Could be the order of classpath initialization. For example, the
write/read functions happen after the full classpath is set up. But when
you open (not read/write) an index after restart, it may need to check
something (e.g. your codec) before the full classpath is available.
If so, putting the jar on the system-level classpath should resolve the
issue.

Pure conjecture here, but that's exactly the kind of thing that triggers
classloader bugs.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)

RE: Problem loading my codec sometimes

Posted by Scott Schneider <Sc...@symantec.com>.
Thanks for your quick response!  My jar was in solr/lib.  I removed all the <lib> directives from solrconfig.xml, but I still get the error.  My solr.xml doesn't have sharedLib.

By the way, I am running Solr 4.4.0 with most of the default example files (including solr.xml).  My schema.xml and solrconfig.xml are from another project using Solr 3.6.  I modified them a bit to fix any obvious errors.

I still wonder why it can create a new index using my codec, but not load an index previously created with my codec.  In solrconfig.xml, I specify the CodecFactory along with the package name, whereas the codec name that is read from the index file has no package name.  Could that be the problem?  I think that's the way it's supposed to be.  Could it be that Solr has my jar in the classpath, but SPI is not registering my codec class from the jar?  I'm not familiar with SPI.

What else can I try?

Thanks,
Scott


> -----Original Message-----
> From: Shawn Heisey [mailto:solr@elyograg.org]
> Sent: Tuesday, September 24, 2013 5:51 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Problem loading my codec sometimes


Re: Problem loading my codec sometimes

Posted by Shawn Heisey <so...@elyograg.org>.
On 9/24/2013 6:32 PM, Scott Schneider wrote:
> I created my own codec and Solr can find it sometimes and not other times.  When I start fresh (delete the data folder and run Solr), it all works fine.  I can add data and query it.  When I stop Solr and start it again, I get:
> 
> Caused by: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.Codec with name 'MyCodec' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath.The current classpath supports the following names: [SimpleText, Appending, Lucene40, Lucene3x, Lucene41, Lucene42]
> 
> I added the JAR to the path and I'm pretty sure Java sees it, or else it would not be using my codec when I start fresh.  (I've looked at the index files and verified that it's using my codec.)  I suppose Solr is asking SPI for my codec based on the codec class name stored in the index files, but I don't see why this would fail when a fresh start works.

What I always recommend for those who want to use custom and contrib
jars is that they put all such jars (and their dependencies) into
${solr.solr.home}/lib, don't use any <lib> directives in solrconfig.xml,
and don't put the sharedLib attribute into solr.xml.  Doing it in any
other way has a tendency to trigger bugs or cause jars to get loaded
more than once.

The ${solr.solr.home} property defaults to $CWD/solr (CWD is current
working directory for those who don't already know) and is the location
of the solr.xml file.  Note that depending on the exact version of Solr
and which servlet container you are using, there may actually be two
solr.xml files, one which loads solr into your container and one that
configures Solr.  I am referring to the latter.

If you are using the solr example and its directory layout, the
directory you would need to put all jars into is example/solr/lib ...
which is a directory that doesn't exist and has to be created.

http://wiki.apache.org/solr/Solr.xml%20%28supported%20through%204.x%29
http://wiki.apache.org/solr/Solr.xml%204.4%20and%20beyond

Thanks,
Shawn