Posted to dev@lucene.apache.org by Gus Heck <gu...@gmail.com> on 2015/11/12 00:11:54 UTC

Sharing a class across cores

I have a case where a component loads up a large CSV file (2.5 million
lines) to build a map. This worked ok in a case where we had a single core,
but it isn't working so well with 40 cores because each core loads a new
copy of the component in a new classloader and I get 40 new versions of the
same class, each holding its own private static final map (one for each
core). Each line is small, but a billion of anything gets kinda heavy. Is
this the intended class loading behavior?
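
To make the duplication concrete, here is a minimal plain-Java sketch (not Solr code). The names PerLoaderStatics, Lookup and IsolatingLoader are hypothetical, and the child-first loader is only an assumption standing in for a per-core loader that defines the component class itself. It assumes Java 9+ (for readAllBytes) and compiled classes on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

class PerLoaderStatics {

    /** Hypothetical stand-in for the component class that owns the big map. */
    public static class Lookup {
        public static final Map<String, String> MAP = new HashMap<>();
    }

    /** Child-first loader that defines its own copy of Lookup, imitating one loader per core. */
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(Lookup.class.getName())) {
                return super.loadClass(name, resolve); // everything else: normal delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Re-read the class bytes and define a fresh class in this loader.
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                return c;
            }
        }
    }

    /** True when two loaders yield two distinct classes, each with its own static map. */
    static boolean eachLoaderGetsItsOwnMap() {
        ClassLoader parent = PerLoaderStatics.class.getClassLoader();
        String name = Lookup.class.getName();
        try {
            Class<?> a = new IsolatingLoader(parent).loadClass(name);
            Class<?> b = new IsolatingLoader(parent).loadClass(name);
            Object mapA = a.getField("MAP").get(null);
            Object mapB = b.getField("MAP").get(null);
            return a != b && mapA != mapB;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With 40 such loaders there would be 40 Lookup classes and 40 separate maps, which is the footprint described above.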

Is there some way to cause a class to be loaded in a parent
classloader above the core so that it's loaded just once? I want to load it
in some way that leverages standard solr resource loading, so that I'm not
hard coding or setting sysprops just to be able to find it.

This is in a copy of trunk from about a month ago... so 6.x stuff is mostly
available.

-Gus

Re: Sharing a class across cores

Posted by Gus Heck <gu...@gmail.com>.
Thought of that but I'm trying to avoid additional infrastructure... and it
will be accessed 0 or 1 times per query, so speed matters a bit.

On Wed, Nov 11, 2015 at 7:05 PM, Walter Underwood <wu...@wunderwood.org>
wrote:

> Depending on how fast the access needs to be, you could put that big map
> in memcache.
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> On Nov 11, 2015, at 4:04 PM, Gus Heck <gu...@gmail.com> wrote:
>
> P.S. I believe I posted the original message while the chat session was
> still going on, and certainly before I had read it, so no, I haven't
> actually tried what you suggest yet.
>
> On Wed, Nov 11, 2015 at 7:02 PM, Gus Heck <gu...@gmail.com> wrote:
>
>> Yes asked by a colleague :). The chat session is now in our jira ticket
>> :).
>>
>> However, my take on it is that this seems like a pretty broad brush to
>> paint with to move *all* our classes up and out of the normal core loading
>> process. I assume there are good reasons for segregating this stuff into
>> separate class loaders to begin with. It would also be fairly burdensome to
>> make a separate jar file to break out this one component...
>>
>> I really just want a way to stash the map in a place where other cores
>> can see it (and thus I can appropriately synchronize things so that the
>> loading only happens once). I'm asking because it seems like surely this
>> must be a solved problem... if not, it might be easiest to just solve it by
>> adding some sort of shared resources facility to CoreContainer?
>>
>> -Gus
>>
>> On Wed, Nov 11, 2015 at 6:54 PM, Shawn Heisey <ap...@elyograg.org>
>> wrote:
>>
>>> On 11/11/2015 4:11 PM, Gus Heck wrote:
>>> > I have a case where a component loads up a large CSV file (2.5 million
>>> > lines) to build a map. This worked ok in a case where we had a single
>>> > core, but it isn't working so well with 40 cores because each core loads
>>> > a new copy of the component in a new classloader and I get 40 new
>>> > versions of the same class, each holding its own private static final
>>> > map (one for each core). Each line is small, but a billion of anything
>>> > gets kinda heavy. Is this the intended class loading behavior?
>>> >
>>> > Is there some way to cause a class to be loaded in a parent
>>> > classloader above the core so that it's loaded just once? I want to load
>>> > it in some way that leverages standard solr resource loading, so that
>>> > I'm not hard coding or setting sysprops just to be able to find it.
>>> >
>>> > This is in a copy of trunk from about a month ago... so 6.x stuff is
>>> > mostly available.
>>>
>>> This sounds like a question that I just recently answered on IRC.
>>>
>>> If you remove all <lib> elements from your solrconfig.xml files and
>>> place all extra jars for Solr into ${solr.solr.home}/lib ... Solr will
>>> load those jars before any cores are created and they will be available
>>> to all cores.
>>>
>>> There is a minor bug with this that will be fixed in Solr 5.4.0.  It is
>>> unlikely that this will affect third-party components, but be aware that
>>> until 5.4, jars in that lib directory will be loaded twice by older 5.x
>>> versions.
>>>
>>> https://issues.apache.org/jira/browse/SOLR-6188
>>>
>>> Thanks,
>>> Shawn
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>>
>>
>>
>> --
>> http://www.the111shift.com
>>
>
>
>
> --
> http://www.the111shift.com
>
>
>


-- 
http://www.the111shift.com

Re: Sharing a class across cores

Posted by Gus Heck <gu...@gmail.com>.
This is for use in code in a search component. While I could fire up an
HTTP client or Solr client to get stuff out of a blob store, that would make
it hard to ensure that concurrent core startups don't duplicate each
other's work and all process the same file into the blob store. Also, the
blob is meant to be a map in which the component will look things up, so can
one query into the blob in the blob store? I suspect not.

@Erik, This has been discussed as a possible fallback, though it also
involves creating a client that makes requests inside the search component
while it processes queries. It also creates a need for manual steps to
create the index that stands in place of a map and then that has to be
scripted for our continuous deployment...

I have a patch that I am testing locally, and my unit tests like it so far.
Today I should get it into a running instance for testing. We will at least
use it locally. If it goes well, I'll probably contribute it back as a patch
to SOLR-3443.

On Thu, Nov 12, 2015 at 5:07 AM, Ishan Chattopadhyaya <
ichattopadhyaya@gmail.com> wrote:

> > Or in a separate Solr core/collection.  :)
> That support is available with this, I think:
> https://cwiki.apache.org/confluence/display/solr/Blob+Store+API


-- 
http://www.the111shift.com

Re: Sharing a class across cores

Posted by Ishan Chattopadhyaya <ic...@gmail.com>.
> Or in a separate Solr core/collection.  :)
That support is available with this, I think:
https://cwiki.apache.org/confluence/display/solr/Blob+Store+API

On Thu, Nov 12, 2015 at 2:31 PM, Erik Hatcher <er...@gmail.com>
wrote:

> Or in a separate Solr core/collection.  :)

Re: Sharing a class across cores

Posted by Erik Hatcher <er...@gmail.com>.
Or in a separate Solr core/collection.  :)

> On Nov 11, 2015, at 19:05, Walter Underwood <wu...@wunderwood.org> wrote:
> 
> Depending on how fast the access needs to be, you could put that big map in memcache.

Re: Sharing a class across cores

Posted by Walter Underwood <wu...@wunderwood.org>.
Depending on how fast the access needs to be, you could put that big map in memcache.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Nov 11, 2015, at 4:04 PM, Gus Heck <gu...@gmail.com> wrote:
> 
> P.S. I believe I posted the original message while the chat session was still going on, and certainly before I had read it, so no, I haven't actually tried what you suggest yet.

Re: Sharing a class across cores

Posted by Gus Heck <gu...@gmail.com>.
P.S. I believe I posted the original message while the chat session was
still going on, and certainly before I had read it, so no, I haven't
actually tried what you suggest yet.

On Wed, Nov 11, 2015 at 7:02 PM, Gus Heck <gu...@gmail.com> wrote:

> Yes asked by a colleague :). The chat session is now in our jira ticket
> :).
>
> However, my take on it is that this seems like a pretty broad brush to
> paint with to move *all* our classes up and out of the normal core loading
> process. I assume there are good reasons for segregating this stuff into
> separate class loaders to begin with. It would also be fairly burdensome to
> make a separate jar file to break out this one component...
>
> I really just want a way to stash the map in a place where other cores can
> see it (and thus I can appropriately synchronize things so that the loading
> only happens once). I'm asking because it seems like surely this must be a
> solved problem... if not, it might be easiest to just solve it by adding
> some sort of shared resources facility to CoreContainer?
>
> -Gus



-- 
http://www.the111shift.com

Re: Sharing a class across cores

Posted by Gus Heck <gu...@gmail.com>.
Thanks for the links and perspective, Hoss. SOLR-3443 is exactly the same type
of problem, only for spellchecking. My initial thought was also a map; perhaps
I'll adopt that ticket. The basic idea would be a Map<String,Object> on
CoreContainer, accessors, and some very, very strongly worded javadoc about
keeping things simple to avoid classloader memory leaks.
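
As a rough illustration of that idea, here is a minimal sketch of a shared registry. SharedResources and getOrLoad are hypothetical names, not an existing Solr or CoreContainer API, and the sketch assumes the stored values are plain JDK types (for example a Map<String,String>) so that no core-specific class is pinned by the shared map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical container-level registry; not an existing Solr class. */
final class SharedResources {
    private final Map<String, Object> resources = new ConcurrentHashMap<>();

    /** Returns the value for key, running loader at most once even under concurrent calls. */
    <T> T getOrLoad(String key, Class<T> type, Supplier<? extends T> loader) {
        // computeIfAbsent is atomic on ConcurrentHashMap, so racing cores share one load.
        Object value = resources.computeIfAbsent(key, k -> loader.get());
        return type.cast(value);
    }

    /** Drops the entry so the value can be garbage collected. */
    void remove(String key) {
        resources.remove(key);
    }
}
```

Every core that asks for the same key gets the same instance, and the expensive CSV parse would run only for the first caller.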

A slightly safer approach that would be much harder to implement would
involve counting references to the shared stuff so that if all cores using
it were removed, it could be cleaned up... Not sure if that's worth it.
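
A reference-counted variant might look roughly like the sketch below. RefCountedResources, acquire and release are hypothetical names, and the sketch assumes each core calls release exactly once for every acquire (for instance when the core closes).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical reference-counted registry; not an existing Solr class. */
final class RefCountedResources {
    private static final class Entry {
        final Object value;
        int refs;

        Entry(Object value) {
            this.value = value;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** Loads on first use, then hands out the same value and bumps the count. */
    synchronized Object acquire(String key, Supplier<?> loader) {
        Entry e = entries.get(key);
        if (e == null) {
            e = new Entry(loader.get());
            entries.put(key, e);
        }
        e.refs++;
        return e.value;
    }

    /** Call once per acquire; the last release drops the value. */
    synchronized void release(String key) {
        Entry e = entries.get(key);
        if (e != null && --e.refs == 0) {
            entries.remove(key);
        }
    }

    /** True while at least one holder still references the value. */
    synchronized boolean isLoaded(String key) {
        return entries.containsKey(key);
    }
}
```

The extra bookkeeping is the cost mentioned above: a missed release keeps the value alive indefinitely, which is the leak the simpler map also risks.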

-Gus

On Wed, Nov 11, 2015 at 7:17 PM, Chris Hostetter <ho...@fucit.org>
wrote:

>
> : However, my take on it is that this seems like a pretty broad brush to
> : paint with to move *all* our classes up and out of the normal core loading
> : process. I assume there are good reasons for segregating this stuff into
> : separate class loaders to begin with. It would also be fairly burdensome to
>
> There are, but those reasons don't really apply if the whole point is you
> want to share resources between cores.
>
> : I really just want a way to stash the map in a place where other cores can
> : see it (and thus I can appropriately synchronize things so that the loading
> : only happens once). I'm asking because it seems like surely this must be a
> : solved problem... if not, it might be easiest to just solve it by adding
> : some sort of shared resources facility to CoreContainer?
>
> There has been some discussion about it in the past (ie: multiple
> instances of StopwordFilterFactory configured to point at the same
> stopwords.txt file on disk can/should share the same Map in RAM (in most
> cases) even if those instances exist in completely diff cores)
>
> There have been a few jiras where this concept of "sharing" heavy objects
> has come up, but I don't think anyone has made any attempts at a general
> solution...
>
>
> https://issues.apache.org/jira/browse/SOLR-7282
>
> https://issues.apache.org/jira/browse/SOLR-4872?focusedCommentId=13682471&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13682471
> https://issues.apache.org/jira/browse/SOLR-3443
>
>
> -Hoss
> http://www.lucidworks.com/


-- 
http://www.the111shift.com

Re: Sharing a class across cores

Posted by Chris Hostetter <ho...@fucit.org>.
: However, my take on it is that this seems like a pretty broad brush to
: paint with to move *all* our classes up and out of the normal core loading
: process. I assume there are good reasons for segregating this stuff into
: separate class loaders to begin with. It would also be fairly burdensom to

There are, but those reasons don't really apply if the whole point is you 
want to share resources between cores.

: I really just want a way to stash the map in a place where other cores can
: see it (and thus I can appropriately synchronize things so that the loading
: only happens once). I'm asking because it seems like surely this must be a
: solved problem... if not, it might be easiest to just solve it by adding
: some sort of shared resources facility to CoreContainer?

There has been some discussion about it in the past (ie: multiple 
instances of StopwordFilterFactory configured to point at the same 
stopwords.txt file on disk can/should share the same Map in RAM (in most 
cases) even if those instances exist in completely diff cores)

There have been a few jiras where this concept of "sharing" heavy objects
has come up, but I don't think anyone has made any attempts at a general
solution...


https://issues.apache.org/jira/browse/SOLR-7282
https://issues.apache.org/jira/browse/SOLR-4872?focusedCommentId=13682471&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13682471
https://issues.apache.org/jira/browse/SOLR-3443


-Hoss
http://www.lucidworks.com/



Re: Sharing a class across cores

Posted by Gus Heck <gu...@gmail.com>.
I'm trying to stay within the realm of solr's resource loader, so that I
don't need to tweak startup parameters (e.g. classpath, sysprops) or rely
on hardcoded stuff. The data must be usable by a query SearchComponent, and
those are loaded by the core using the resource loader...

One question I have about Shawn's solution, after a little thought, is: if I
do move the jar, how do I get the class to load before I get to my
SearchComponent... if it gets loaded by reference there it will still be
under the Core's classloader, I think...

On Wed, Nov 11, 2015 at 7:08 PM, Benson Margulies <bi...@gmail.com>
wrote:

> What is the connection of a blob of data and a class in a class
> loader? Is it a class of your own that you're using to store the data?
>
> Solr can't change fundamental facts about class loaders; if an object
> of a class needs to be shared across class loaders, it has to be
> loaded into a common parent. If you don't want to do that broadly,
> you will indeed need to factor out a jar for the job.
>
> If it isn't a special class, but rather just an instance of some
> boring ordinary class and your problem is sharing the _reference_,
> consider JNDI.
>
>
>
> On Wed, Nov 11, 2015 at 7:02 PM, Gus Heck <gu...@gmail.com> wrote:
> > Yes asked by a colleague :). The chat session is now in our jira ticket
> :).
> >
> > However, my take on it is that this seems like a pretty broad brush to
> paint
> > with to move *all* our classes up and out of the normal core loading
> > process. I assume there are good reasons for segregating this stuff into
> > separate class loaders to begin with. It would also be fairly burdensom
> to
> > make a separate jar file to break out this one component...
> >
> > I really just want a way to stash the map in a place where other cores
> can
> > see it (and thus I can appropriately synchronize things so that the
> loading
> > only happens once). I'm asking because it seems like surely this must be
> a
> > solved problem... if not, it might be easiest to just solve it by adding
> > some sort of shared resources facility to CoreContainer?
> >
> > -Gus
> >
> > On Wed, Nov 11, 2015 at 6:54 PM, Shawn Heisey <ap...@elyograg.org>
> wrote:
> >>
> >> On 11/11/2015 4:11 PM, Gus Heck wrote:
> >> > I have a case where a component loads up a large CSV file (2.5 million
> >> > lines) to build a map. This worked ok in a case where we had a single
> >> > core, but it isn't working so well with 40 cores because each core
> loads
> >> > a new copy of the component in a new classloader and I get 40 new
> >> > versions of the same class each holding it's own private static final
> >> > map (one for each core). Each line is small, but a billion of anything
> >> > gets kinda heavy. Is this the intended class loading behavior?
> >> >
> >> > Is there some where that one can cause a class to be loaded in a
> parent
> >> > classloader above the core so that it's loaded just once? I want to
> load
> >> > it in some way that leverages standard solr resource loading, so that
> >> > I'm not hard coding or setting sysprops just to be able to find it.
> >> >
> >> > This is in a copy of trunk from about a month ago... so 6.x stuff is
> >> > mostly available.
> >>
> >> This sounds like a question that I just recently answered on IRC.
> >>
> >> If you remove all <lib> elements from your solrconfig.xml files and
> >> place all extra jars for Solr into ${solr.solr.home}/lib ... Solr will
> >> load those jars before any cores are created and they will be available
> >> to all cores.
> >>
> >> There is a minor bug with this that will be fixed in Solr 5.4.0.  It is
> >> unlikely that this will affect third-party components, but be aware that
> >> until 5.4, jars in that lib directory will be loaded twice by older 5.x
> >> versions.
> >>
> >> https://issues.apache.org/jira/browse/SOLR-6188
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: dev-help@lucene.apache.org
> >>
> >
> >
> >
> > --
> > http://www.the111shift.com
>


-- 
http://www.the111shift.com

Re: Sharing a class across cores

Posted by Benson Margulies <bi...@gmail.com>.
What is the connection between a blob of data and a class in a class
loader? Is it a class of your own that you're using to store the data?

Solr can't change fundamental facts about class loaders; if an object
of a class needs to be shared across class loaders, the class has to be
loaded into a common parent. If you don't want to do that broadly,
you will indeed need to factor out a jar for the job.
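
One way to check which loader actually defined a class is to print its
loader chain. The sketch below uses only standard java.lang.ClassLoader
calls; the name LoaderChain is made up for illustration and is not part
of Solr. If two cores report different loader instances for the same
component class, each core holds its own copy of that class and of its
statics.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not a Solr class.
final class LoaderChain {
    private LoaderChain() {}

    // Returns the defining loader of c followed by each parent loader.
    // The bootstrap loader is represented by null in the JDK, so the
    // chain always ends with the literal marker "bootstrap".
    static List<String> of(Class<?> c) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = c.getClassLoader(); cl != null; cl = cl.getParent()) {
            // The identity hash distinguishes two instances of the same loader type.
            chain.add(cl.getClass().getName() + "@"
                    + Integer.toHexString(System.identityHashCode(cl)));
        }
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        // A class from the application: at least one loader above bootstrap.
        System.out.println(of(LoaderChain.class));
        // A core JDK class: defined by the bootstrap loader itself.
        System.out.println(of(String.class));
    }
}
```

Calling LoaderChain.of(...) on the class that holds the map, from inside
each core, would show whether its defining loader is per-core or shared.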

If it isn't a special class, but rather just an instance of some
boring ordinary class and your problem is sharing the _reference_,
consider JNDI.



On Wed, Nov 11, 2015 at 7:02 PM, Gus Heck <gu...@gmail.com> wrote:
> Yes, asked by a colleague :). The chat session is now in our jira ticket :).
>
> However, my take on it is that this seems like a pretty broad brush to paint
> with to move *all* our classes up and out of the normal core loading
> process. I assume there are good reasons for segregating this stuff into
> separate class loaders to begin with. It would also be fairly burdensome to
> make a separate jar file to break out this one component...
>
> I really just want a way to stash the map in a place where other cores can
> see it (and thus I can appropriately synchronize things so that the loading
> only happens once). I'm asking because it seems like surely this must be a
> solved problem... if not, it might be easiest to just solve it by adding
> some sort of shared resources facility to CoreContainer?
>
> -Gus
>
> On Wed, Nov 11, 2015 at 6:54 PM, Shawn Heisey <ap...@elyograg.org> wrote:
>>
>> On 11/11/2015 4:11 PM, Gus Heck wrote:
>> > I have a case where a component loads up a large CSV file (2.5 million
>> > lines) to build a map. This worked ok in a case where we had a single
>> > core, but it isn't working so well with 40 cores because each core loads
>> > a new copy of the component in a new classloader and I get 40 new
>> > versions of the same class each holding its own private static final
>> > map (one for each core). Each line is small, but a billion of anything
>> > gets kinda heavy. Is this the intended class loading behavior?
>> >
>> > Is there somewhere that one can cause a class to be loaded in a parent
>> > classloader above the core so that it's loaded just once? I want to load
>> > it in some way that leverages standard solr resource loading, so that
>> > I'm not hard coding or setting sysprops just to be able to find it.
>> >
>> > This is in a copy of trunk from about a month ago... so 6.x stuff is
>> > mostly available.
>>
>> This sounds like a question that I just recently answered on IRC.
>>
>> If you remove all <lib> elements from your solrconfig.xml files and
>> place all extra jars for Solr into ${solr.solr.home}/lib ... Solr will
>> load those jars before any cores are created and they will be available
>> to all cores.
>>
>> There is a minor bug with this that will be fixed in Solr 5.4.0.  It is
>> unlikely that this will affect third-party components, but be aware that
>> until 5.4, jars in that lib directory will be loaded twice by older 5.x
>> versions.
>>
>> https://issues.apache.org/jira/browse/SOLR-6188
>>
>> Thanks,
>> Shawn
>>
>>
>>
>
>
>
> --
> http://www.the111shift.com



Re: Sharing a class across cores

Posted by Gus Heck <gu...@gmail.com>.
Yes, asked by a colleague :). The chat session is now in our jira ticket :).

However, my take on it is that this seems like a pretty broad brush to
paint with to move *all* our classes up and out of the normal core loading
process. I assume there are good reasons for segregating this stuff into
separate class loaders to begin with. It would also be fairly burdensome to
make a separate jar file to break out this one component...

I really just want a way to stash the map in a place where other cores can
see it (and thus I can appropriately synchronize things so that the loading
only happens once). I'm asking because it seems like surely this must be a
solved problem... if not, it might be easiest to just solve it by adding
some sort of shared resources facility to CoreContainer?
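
A minimal sketch of that load-once idea, assuming a holder class that is
visible to every core (for example, one loaded by a shared parent
classloader). SharedResources and getOrLoad are hypothetical names, not
an existing Solr or CoreContainer API; the sketch relies only on
ConcurrentHashMap.computeIfAbsent, which applies the loader at most once
per key.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical holder, not a Solr API. It only helps if this class itself
// is loaded by a classloader that all cores share.
final class SharedResources {
    private static final ConcurrentHashMap<String, Object> RESOURCES =
            new ConcurrentHashMap<>();

    private SharedResources() {}

    // Returns the resource stored under key, invoking loader only when no
    // value is present yet. Concurrent callers for the same key block until
    // the first load completes and then all see the same instance.
    @SuppressWarnings("unchecked")
    static <T> T getOrLoad(String key, Supplier<T> loader) {
        return (T) RESOURCES.computeIfAbsent(key, k -> loader.get());
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger loads = new AtomicInteger();
        // Stand-in for parsing the large CSV file.
        Supplier<Map<String, String>> csvLoader = () -> {
            loads.incrementAndGet();
            Map<String, String> m = new HashMap<>();
            m.put("key", "value");
            return Collections.unmodifiableMap(m);
        };

        // Simulate 40 cores asking for the map concurrently.
        Thread[] cores = new Thread[40];
        for (int i = 0; i < cores.length; i++) {
            cores[i] = new Thread(() -> SharedResources.getOrLoad("bigMap", csvLoader));
            cores[i].start();
        }
        for (Thread t : cores) {
            t.join();
        }
        System.out.println("loader invocations: " + loads.get());
    }
}
```

The caveat is the one already raised in this thread: if each core loads
its own copy of SharedResources, each copy gets its own RESOURCES map and
nothing is shared.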

-Gus

On Wed, Nov 11, 2015 at 6:54 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 11/11/2015 4:11 PM, Gus Heck wrote:
> > I have a case where a component loads up a large CSV file (2.5 million
> > lines) to build a map. This worked ok in a case where we had a single
> > core, but it isn't working so well with 40 cores because each core loads
> > a new copy of the component in a new classloader and I get 40 new
> > versions of the same class each holding its own private static final
> > map (one for each core). Each line is small, but a billion of anything
> > gets kinda heavy. Is this the intended class loading behavior?
> >
> > Is there somewhere that one can cause a class to be loaded in a parent
> > classloader above the core so that it's loaded just once? I want to load
> > it in some way that leverages standard solr resource loading, so that
> > I'm not hard coding or setting sysprops just to be able to find it.
> >
> > This is in a copy of trunk from about a month ago... so 6.x stuff is
> > mostly available.
>
> This sounds like a question that I just recently answered on IRC.
>
> If you remove all <lib> elements from your solrconfig.xml files and
> place all extra jars for Solr into ${solr.solr.home}/lib ... Solr will
> load those jars before any cores are created and they will be available
> to all cores.
>
> There is a minor bug with this that will be fixed in Solr 5.4.0.  It is
> unlikely that this will affect third-party components, but be aware that
> until 5.4, jars in that lib directory will be loaded twice by older 5.x
> versions.
>
> https://issues.apache.org/jira/browse/SOLR-6188
>
> Thanks,
> Shawn
>
>
>
>


-- 
http://www.the111shift.com

Re: Sharing a class across cores

Posted by Shawn Heisey <ap...@elyograg.org>.
On 11/11/2015 4:11 PM, Gus Heck wrote:
> I have a case where a component loads up a large CSV file (2.5 million
> lines) to build a map. This worked ok in a case where we had a single
> core, but it isn't working so well with 40 cores because each core loads
> a new copy of the component in a new classloader and I get 40 new
> versions of the same class each holding its own private static final
> map (one for each core). Each line is small, but a billion of anything
> gets kinda heavy. Is this the intended class loading behavior?
> 
> Is there somewhere that one can cause a class to be loaded in a parent
> classloader above the core so that it's loaded just once? I want to load
> it in some way that leverages standard solr resource loading, so that
> I'm not hard coding or setting sysprops just to be able to find it.
> 
> This is in a copy of trunk from about a month ago... so 6.x stuff is
> mostly available.

This sounds like a question that I just recently answered on IRC.

If you remove all <lib> elements from your solrconfig.xml files and
place all extra jars for Solr into ${solr.solr.home}/lib ... Solr will
load those jars before any cores are created and they will be available
to all cores.

There is a minor bug with this that will be fixed in Solr 5.4.0.  It is
unlikely that this will affect third-party components, but be aware that
until 5.4, jars in that lib directory will be loaded twice by older 5.x
versions.

https://issues.apache.org/jira/browse/SOLR-6188

Thanks,
Shawn


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org