You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-dev@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2007/10/21 07:41:12 UTC

SolrConfig.Initializable

The SolrConfig.Initializable interface was introduced as part of SOLR-215 
to give Token(izer|Filter)Factories access to the SolrConfig (i believe 
just because StopFilter and SynonymFilter needed getLines)

I can't help but wonder if it would make more sense to replace this with 
something like "SolrCore.Initializable" ... you can get a SolrConfig from 
a SolrCore, but you can't get a SolrCore from a SolrConfig.

This would be a big help for things like SOLR-319.

Is there any particular reason why it would be bad to pass the SolrCore to 
the init method instead of the SolrConfig?



-Hoss


Re: SolrConfig.Initializable

Posted by Henrib <hb...@gmail.com>.

Updated the patch which is now functional (solr-399).
-- 
View this message in context: http://www.nabble.com/SolrConfig.Initializable-tf4665036.html#a13470794
Sent from the Solr - Dev mailing list archive at Nabble.com.


Re: SolrConfig.Initializable

Posted by Henrib <hb...@gmail.com>.
I took the liberty of creating solr-399 to track this (and allow posting a
patch).
 

hossman wrote:
> 
> 
> The SolrConfig.Initializable interface was introduced as part of SOLR-215 
> to give Token(izer|Filter)Factories access to the SolrConfig (i believe 
> just because StopFilter and SynonymFilter needed getLines)
> 
> I can't help but wonder if it would make more sense to replace this with 
> something like "SolrCore.Initializable" ... you can get a SolrConfig from 
> a SolrCore, but you can't get a SolrCore from a SolrConfig.
> 
> This would be a big help for things like SOLR-319.
> 
> Is there any particular reason why it would be bad to pass the SolrCore to 
> the init method instead of the SolrConfig?
> 
> 
> 
> -Hoss
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/SolrConfig.Initializable-tf4665036.html#a13425931
Sent from the Solr - Dev mailing list archive at Nabble.com.


Re: SolrConfig.Initializable

Posted by Henrib <hb...@gmail.com>.

ryantxu wrote:
> 
> but I fear we are turning the initialization into a jumbled spaghetti of
> interface looparounds, callbacks and pending lists 
> of soon-to-be-initialized stuff.
> (not just this patch - but the combination of requirements we have had for
> all the plugin stuff)
> 
I agree this is "messy"; a common/simple void init(SolrCore core, ...) would
be much better.
I'm no fan of a thread local solution and I believe we can smoothly
transition the API the same way we did when we introduced the
SolrConfig.Initializable (deprecation plus default implementation).


ryantxu wrote:
> 
> One key design choice is if we can pass a not fully initialized SolrCore
> to an init method.
> 
I think we should pass the core even if not fully initialized since there
are already ways today to access the core during init phase anyway
(SolrCore.getSolrCore).
We should document the config/schema/core initialization order so users know
what they can access from a core/schema/config during init (before the core
really starts).
Plus, if we are able to keep the solr-399 feature, besides cyclic or
unsolvable dependencies, the initialization order might just adapt.


ryantxu wrote:
> 
> The solrconfig stuff definitely needs access to the current core. 
> I'm not sure if schema related things need access (it currently does not). 
> Both cases need access to the 'config' in order to access the ClassLoader.
> 
One of the starting point was to allow a tokenizer to use a fieldType;
schema related objects thus need access to others and passing the core would
allow that (getSchema).
Just the same way the core let's us access the solrconfig (getSolrConfig).


ryantxu wrote:
> 
> alternatively, we have to do something like SOLR-399 that queues up
> everything until the core is completely initialized.
> 
I guess solr-399 algorithm can be adapted & kept; if nothing prevents
initialization do it, otherwise postpone it.
Instead of a boolean init(...), we might try to use an exception to indicate
initialization is to be postponed & retried (considering this case as an
exceptional but recoverable workflow).
And we could capture the init continuation in the exception (ala
PluginInitInfo for instance).
Let me know if I should investigate feasibility more thoroughly.
-- 
View this message in context: http://www.nabble.com/SolrConfig.Initializable-tf4665036.html#a13743457
Sent from the Solr - Dev mailing list archive at Nabble.com.


Re: SolrConfig.Initializable

Posted by Ryan McKinley <ry...@gmail.com>.
Henrib wrote:
> Just pinging on the subject and the related solr-399 patch; I understand the
> priority for this is low, just seeking some comment to determine whether
> this is the expected track (and how low the priority is).
> 

I started working on adding configuration to the search components in 
SOLR-281.  This is a case where we need access to the core for 
initialization.

I checked out the SOLR-399 patch - it looks good considering all our 
discussion... but I fear we are turning the initialization into a 
jumbled spaghetti of interface looparounds, callbacks and pending lists 
of soon-to-be-initialized stuff.  (not just this patch - but the 
combination of requirements we have had for all the plugin stuff)

I think we need to have a clear initialization strategy and make sure 
everything conforms to that before releasing 1.3.

Looking over our plugin space, we have two main categories:
1. stuff defined in schema.xml (fieldTypes fields ...)
2. stuff defined in solrconfig.xml (requestHandlers, processors, 
components, writers, ...)

The solrconfig stuff definitely needs access to the current core.  I'm 
not sure if schema related things need access (it currently does not). 
Both cases need access to the 'config' in order to access the ClassLoader.

One key design choice is if we can pass a not fully initialized SolrCore 
to an init method.  alternatively, we have to do something like SOLR-399 
that queues up everything until the core is completely initialized.


Rather then jumble things up with more hooks, queues and callbacks, I 
vote we migrate the NamedList/Map/NodeInitializedPlugin from:
   void init( final NamedList args )
to:
   void init( final SolrCore core, final NamedList args )


I'm not sure what the best approach for the schema family is.  As is, 
IndexSchema can be fully initialized independent of SolrCore -- for 
consistency, it may be nice to use:
   void init( final SolrCore core, final Map<String,String> args )
rather then:
   void init(SolrConfig solrConfig, Map<String,String> args)
If fields/fieldTypes will never need access to the current SolrCore, it 
is fine to leave as is.

- - - -

Another possibility is to perhaps use ThreadLocal to register the 
current core/config while initializing.  That seems a bit dangerous as 
it would assume preconditions that are not required by the API.  But 
this would make it possible to pass things around and not break/migrate 
the existing API.


ryan



Re: SolrConfig.Initializable

Posted by Henrib <hb...@gmail.com>.


ryantxu wrote:
> 
> Is the patch ready for a serious review? or are you still working on it?
> 

Not sure if this I'm supposed to answer but anyway, I was expecting comments
before 'generalizing' the patch. Right now, it only applies to filter &
tokenizer factories. The foundation should be solid though but since this
introduces new interfaces, some review is needed.

Let me know if there is anything you'd like me to change.

-- 
View this message in context: http://www.nabble.com/SolrConfig.Initializable-tf4665036.html#a13704330
Sent from the Solr - Dev mailing list archive at Nabble.com.


Re: SolrConfig.Initializable

Posted by Chris Hostetter <ho...@fucit.org>.
: dooh - I was hoping Chris may take this one up as he started the thread on
: changing the init method :)
: 
: If this is not an issue for Chris, I can take a look sometime in the next few

sorry ... i didn't mean to suggest that i had a specific need to change 
the method ... i was approaching it more from the "because of multi-core, 
the init signature and semantics will change from 1.2 to 1.3; so what do 
we *really* want init methods to look like moving forward?" (as opposed to 
the init signature currently in the trunk that was made to be the minimum 
neccessary changes to support both multi-cores and the existing factories)


-Hoss


Re: SolrConfig.Initializable

Posted by Ryan McKinley <ry...@gmail.com>.
Chris Hostetter wrote:
> : Just pinging on the subject and the related solr-399 patch; I understand the
> : priority for this is low, just seeking some comment to determine whether
> : this is the expected track (and how low the priority is).
> 
> It's in my mailbox of things to look at .... but there are things on that 
> list from september -- when my mail server died and i had to restarted my 
> list from scratch.
> 
> As i recall, the ideas you mentioned in email made sense to me ... I was 
> kind of hopeing Ryan would take the ball and run with it since he's really 
> become the "multi-core warrior"  :)
> 

dooh - I was hoping Chris may take this one up as he started the thread 
on changing the init method :)

If this is not an issue for Chris, I can take a look sometime in the 
next few days.  A quick look at the patch and it looks ok (i have not 
gone into detail yet)

Is the patch ready for a serious review? or are you still working on it?

ryan




Re: SolrConfig.Initializable

Posted by Chris Hostetter <ho...@fucit.org>.
: Just pinging on the subject and the related solr-399 patch; I understand the
: priority for this is low, just seeking some comment to determine whether
: this is the expected track (and how low the priority is).

It's in my mailbox of things to look at .... but there are things on that 
list from september -- when my mail server died and i had to restarted my 
list from scratch.

As i recall, the ideas you mentioned in email made sense to me ... I was 
kind of hopeing Ryan would take the ball and run with it since he's really 
become the "multi-core warrior"  :)


-Hoss


Re: SolrConfig.Initializable

Posted by Henrib <hb...@gmail.com>.
Just pinging on the subject and the related solr-399 patch; I understand the
priority for this is low, just seeking some comment to determine whether
this is the expected track (and how low the priority is).

-- 
View this message in context: http://www.nabble.com/SolrConfig.Initializable-tf4665036.html#a13586685
Sent from the Solr - Dev mailing list archive at Nabble.com.


Re: SolrConfig.Initializable

Posted by Chris Hostetter <ho...@fucit.org>.
: On the functional side, it might be beneficial to keep the newInstance as is
: but introduce an initInstance.
: We might want instances to be initialized with some of their current
: arguments before we set the core.

that was kind of my main point ... i just got sidetracked by the sexiness 
of an approach that violated the contract i orriginally proposed.  

i don't have any strong opinions about naming conventions, or specific 
approaches ... just that while it makes sense to init() objects as we 
construct them (when we are parsing the config that identifies them and 
has their args), we need to delay telling them about the core until the 
core is ready.  

any special "init" type actions they want to take once they have the core 
can be done in the "tell" function -- as long as we garuntee "init" is 
called before "tell" is called before any work is requested.


-Hoss


Re: SolrConfig.Initializable

Posted by Henrib <hb...@gmail.com>.
On the functional side, it might be beneficial to keep the newInstance as is
but introduce an initInstance.
We might want instances to be initialized with some of their current
arguments before we set the core.
This way, we may unify all plugin & factory like functionality under one
hood; the sequence would allways be something like:
Object inst = solrConfig.newInstance(....);
//... do something...
solrConfig.initInstance(instance, ...arguments);
//... do more....

For the sake of nitpicking on names, I'd advocate to use
'SolrCore.Initializable' instead of SolrCoreAware.
The sole method being void init(SolrCore core); this makes the SolrConfig a
SolrCore.Initializable.
SolrCore.java:
public interface Initializable {
    void init(SorlCore core);
}


SolrConfig.java: 

  public interface Initializable {
    void init(SolrConfig config, Map<String, String> args);
  }
  
  public interface InitializableBy<T> {
    void init(T arg);
  }
  
  public  interface InitializablePlugin<T>  {
    void init(T plugin, Node node) throws Exception;
  }
 /** The core this config was initiallized with. */
  private SolrCore core = null;
  /** The lit of instances pending core initialization. */
  private java.util.List<SolrCore.Initializable> initList = new
java.util.ArrayList<SolrCore.Initializable>();
  
  public void init(SolrCore core) {
    if (core == null) {
      this.core = core;
      if (initList != null) {
        for(SolrCore.Initializable initItem : initList) {
          initItem.init(core);
        }
        initList = null;
      }
    } else {
      String error =  "Attempting to initialize SolrConfig " + this + "
twice, already bound to " + core ;
      if (core != this.core)
        throw new SolrException( SolrException.ErrorCode.SERVER_ERROR,
error);
      Config.log.warning(error);
    }
  }
  
  public void initInstance(SolrConfig.Initializable instance, Map<String,
String> args) {
    instance.init(this, args);
    coreInitialize(instance);
  }
  
  public <T> void initInstance(SolrConfig.InitializableBy<T> instance, T
arg) {
    instance.init(arg);
    coreInitialize(instance);
  }
  
  public <T> void initInstance(SolrConfig.InitializablePlugin<T> instance, T
plugin, Node node) throws Exception {
    instance.init(plugin, node);
    coreInitialize(instance);
  }
    
 private void coreInitialize(Object instance) {
    if (instance instanceof SolrCore.Initializable) {
      SolrCore.Initializable initItem = (SolrCore.Initializable) instance;
      // delay call to SolrCore.Initializable.init if the config has not
been initialized yet
      if (null == core) {
        initList.add(initItem);
      } else {
        initItem.init(core);
      }
    }
  }

-- 
View this message in context: http://www.nabble.com/SolrConfig.Initializable-tf4665036.html#a13409366
Sent from the Solr - Dev mailing list archive at Nabble.com.


Re: SolrConfig.Initializable

Posted by Chris Hostetter <ho...@fucit.org>.
: Interesting, instances of SolrCoreAware would be queued up until the config is
: told what the core is - afterwards it tells them immediately. I like that.

that's what i was thinking, but i'm actually not fond of that idea ... it 
seemed sexy to keep things that simple, so that the same code would 
work for plugins created at startup, and (hypothetical) plugins created 
later ... except that only SolrCore and SolrConfig would be simple, all of 
the plugins would be complicated because we wouldn't be able to garuntee 
that tellAboutCore() would be called before init(), so a lot of plugins 
might have to look like this...

     public class Plugin implements SolrCoreAware {
        SolrCore core = null;
        boolean inited = false;
        public void init(...) {
           ...
           inited = true;
           if (null != core) finishInit();
        }
        public void tellAboutCore(...) {
           ...
           if (inited) finishInit();
        }
      }

...that seems like an obnoxious burden to put on people.  it would 
probably be better to make sure we can live up to a simple contract like 
"tellAboutCore() will always be called before init() even if it means we 
have to jump through more hoops in SolrConfig and SolrCore ... i'm just 
not sure what those hoops should look like.

: The one concern is what about the potential to call config.tellAboutCore(
: core0 ) then call config.tellAboutCore( core1 );

oh sweet mother of all that is holy ... please let us never speak of this 
again ...

: I don't see a good use case where you would want to change the core out from

agreed ... i don't even think it's neccessary to make it throw an 
exception, just treat it as completely undefined (no different then if you 
try to call init() on a plugin more then once right now ... why go out of 
our way to test for something inherently silly?)




-Hoss


Re: SolrConfig.Initializable

Posted by Ryan McKinley <ry...@gmail.com>.
> 
> OOOOHHHHHH... even better, SolrConfig could be changed to implement 
> SolrCoreAware, and all of the hard work could be implmented with the 
> addition below to SolrConfig (and all SolrCore has to do is call 
> solrConfig.tellAboutCore(this)) ...
> 
>    private List<SolrCoreAware> needToTell = new List()
>    private SolrCore core = null;
>    public tellAboutCore(SolrCore c) {
>      core = c;
>      foreach (SolrCoreAware plugin : neeedToTell) {
>         plugin.tellAboutCore(core);
>      }
>      needToTell = null;
>    }
>    public Object newInstance(String cname, String... subpackages) {
>      Object that = super.newInstance(cname, subpackages);
>      if (that instanceof SolrCoreAware) {
>         if (null == core) {
>            needToTell.add(that);
>         } else {
>            that.tellAboutCore(core);
>         }
>      }
>      return that;
>    }
> 

Interesting, instances of SolrCoreAware would be queued up until the 
config is told what the core is - afterwards it tells them immediately. 
I like that.

The one concern is what about the potential to call 
config.tellAboutCore( core0 ) then call config.tellAboutCore( core1 );

In my view, that should either
1. throw an error
or
2. tell all the SolrCoreAware components about the new core

I don't see a good use case where you would want to change the core out 
from under everything it initialized.  Rather, you would want to 
reinitalize everything.  So I would vote for #1

I like the name 'SolrCoreAware', but not wild about 'tellAboutCore' 
perhaps setSolrCore( core )?  maybe strange since it may not be a bean.


> 
> i would prefer we avoid any work towards expecting plugin constructors 
> to take in a SolrCore as a param...
> 

agreed.


Re: SolrConfig.Initializable

Posted by Chris Hostetter <ho...@fucit.org>.
Ah, yes the concern is about ... the use use of SolrCore in plugin init 
methods before the SolrCore itself is fully initialized.

: We could either add a status to the core and have each high-level core
: method check this status before running (which would also allow to have
: cores logically stopped -or marked under reload/backup/etc) or heavily
: insist that using the core before 'init' has been called would lead to
: unpredictable results.

Let me propose a completely new appraoch then ...

Currently, the usage of SolrConfig.Initializable works like this...

    factory = solrConfig.newInstance(className)
    if (factory instanceof SolrConfig.Initializable) {
      ((SolrConfig.Initializable)factory).init(solrConfig, args);
    } else {
      log.warning("DEPRECATE");
      factory.init(args);
    }

what if we eliminate the SolrConfig.Initializable interface completely, 
and introduce a new interface...

  public interface SolrCoreAware {
    public void tellAboutCore(SolrCore c);
  }

...the semantics being that once a SolrCore is finished initializing 
itself, and is completely ready to be used, the last phase of 
it's initialization is to loop over all of the SolrCoreAware instances it 
knows about, and call that.tellAboutCore(this).  From a plugin's 
perspective, init(...) will still always be called before it's every asked 
to to any work, but in addition it would be garunteed that if it 
implements SolrCoreAware, tellAboutCore would be called after init(...) 
but before it's asked to do any real work.

Keeping track of all the plugins to "tell" about the SolrCore later would 
be trivial ... SolrConfig.newInstance could do it, and SOlrCore could get 
the info later.

OOOOHHHHHH... even better, SolrConfig could be changed to implement 
SolrCoreAware, and all of the hard work could be implmented with the 
addition below to SolrConfig (and all SolrCore has to do is call 
solrConfig.tellAboutCore(this)) ...

   private List<SolrCoreAware> needToTell = new List()
   private SolrCore core = null;
   public tellAboutCore(SolrCore c) {
     core = c;
     foreach (SolrCoreAware plugin : neeedToTell) {
        plugin.tellAboutCore(core);
     }
     needToTell = null;
   }
   public Object newInstance(String cname, String... subpackages) {
     Object that = super.newInstance(cname, subpackages);
     if (that instanceof SolrCoreAware) {
        if (null == core) {
           needToTell.add(that);
        } else {
           that.tellAboutCore(core);
        }
     }
     return that;
   }

: >  > Any reason to (re)introduce SolrCore.Initializable instead of 
: > constructors
: >  > that can take a SolrCore as a parameter?

: > If we can manage legacy code effectively the constructor approach seems 
: > cleaner in the long run.  But i'm not sure how that would work.  Also, 

i would prefer we avoid any work towards expecting plugin constructors 
to take in a SolrCore as a param...

1) It still wouldn't address the issue of plugins attempting to use the 
core they have access to before it's fully initialized

2) interfaces and abstract classes can't enforce any contract on 
constructors, that's why we've always use default constructors and had 
init(...) methods in the APIs ... it allows for compile time checking that 
the Plugins will work (they have to go out of their way to write a plugin 
without a default constructor to break things at runtime)


-Hoss


Re: SolrConfig.Initializable

Posted by Henrib <hb...@gmail.com>.
I agree that the core being accessible whilst not being fully initialized is
not very "comfortable".
We could either add a status to the core and have each high-level core
method check this status before running (which would also allow to have
cores logically stopped -or marked under reload/backup/etc) or heavily
insist that using the core before 'init' has been called would lead to
unpredictable results.
The former could also close the potential SolrCore.getSolrCore()  issue
(which exists today anyway) since the status could be checked there as well.
Could such a modification be folded in solr-281 or is this a bad idea?
I can give it a shot if you want.
Henri


ryantxu wrote:
> 
>>> I can't help but wonder if it would make more sense to replace this with 
>>> something like "SolrCore.Initializable" ... you can get a SolrConfig
>>> from 
>>> a SolrCore, but you can't get a SolrCore from a SolrConfig.
> 
> yes, that sounds like a good idea.
> 
> The one (potential) issue is that the core may not be fully initialized 
> when it is passed to an SolrCore.Initializable()
> 
> that makes for an awkward contract since some calls have undefined 
> results depending on where in the initialization process it is.  Maybe 
> this is not a big problem, but something to consider.
> 
> 
>  >
>  > Any reason to (re)introduce SolrCore.Initializable instead of 
> constructors
>  > that can take a SolrCore as a parameter?
>  >
> 
> If we can manage legacy code effectively the constructor approach seems 
> cleaner in the long run.  But i'm not sure how that would work.  Also, 
> we need to be careful not to call SolrCore.getSolrCore() because that 
> could send things into an initialization loop.
> 
> ryan
> 
> 

-- 
View this message in context: http://www.nabble.com/SolrConfig.Initializable-tf4665036.html#a13339931
Sent from the Solr - Dev mailing list archive at Nabble.com.


Re: SolrConfig.Initializable

Posted by Ryan McKinley <ry...@gmail.com>.
>> I can't help but wonder if it would make more sense to replace this with 
>> something like "SolrCore.Initializable" ... you can get a SolrConfig from 
>> a SolrCore, but you can't get a SolrCore from a SolrConfig.

yes, that sounds like a good idea.

The one (potential) issue is that the core may not be fully initialized 
when it is passed to an SolrCore.Initializable()

that makes for an awkward contract since some calls have undefined 
results depending on where in the initialization process it is.  Maybe 
this is not a big problem, but something to consider.


 >
 > Any reason to (re)introduce SolrCore.Initializable instead of 
constructors
 > that can take a SolrCore as a parameter?
 >

If we can manage legacy code effectively the constructor approach seems 
cleaner in the long run.  But i'm not sure how that would work.  Also, 
we need to be careful not to call SolrCore.getSolrCore() because that 
could send things into an initialization loop.

ryan

Re: SolrConfig.Initializable

Posted by Henrib <hb...@gmail.com>.

I dont see any wrong or bad reason for the change since I believe we never
share an object created by one core (and dependant on it) with another core
- and we probably shouldn't.

At some point, solr-215 was passing the SolrCore to dynamically created
objects instead of the SolrConfig in init. The remnant shows in
SolrCore.createInstance.
The SolrConfig.Initializable was introduced to reduce the change scope of
the patch.

Any reason to (re)introduce SolrCore.Initializable instead of constructors
that can take a SolrCore as a parameter?

Henrib


hossman wrote:
> 
> 
> The SolrConfig.Initializable interface was introduced as part of SOLR-215 
> to give Token(izer|Filter)Factories access to the SolrConfig (i believe 
> just because StopFilter and SynonymFilter needed getLines)
> 
> I can't help but wonder if it would make more sense to replace this with 
> something like "SolrCore.Initializable" ... you can get a SolrConfig from 
> a SolrCore, but you can't get a SolrCore from a SolrConfig.
> 
> This would be a big help for things like SOLR-319.
> 
> Is there any particular reason why it would be bad to pass the SolrCore to 
> the init method instead of the SolrConfig?
> 
> 
> 
> -Hoss
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/SolrConfig.Initializable-tf4665036.html#a13329383
Sent from the Solr - Dev mailing list archive at Nabble.com.