Posted to solr-user@lucene.apache.org by didier deshommes <df...@gmail.com> on 2013/05/02 22:27:33 UTC

transientCacheSize doesn't seem to have any effect, except on startup

Hi,
I've been very interested in the transient core feature of Solr for managing a
large number of cores. I'm especially interested in this use case, which the
wiki lists at http://wiki.apache.org/solr/LotsOfCores (looks to be down
now):

> loadOnStartup=false transient=true: This is really the use-case. There are
> a large number of cores in your system that are short-duration use. You
> want Solr to load them as necessary, but unload them when the cache gets
> full on an LRU basis.

I'm creating 10 transient cores via the core admin API like so:

$ curl "http://localhost:8983/solr/admin/cores?wt=json&action=CREATE&name=new_core2&instanceDir=collection1/&dataDir=new_core2&transient=true&loadOnStartup=false"
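A minimal sketch of how the ten cores could be created in a loop, assuming the same parameters as the command above. The core names new_core1 through new_core10, the SOLR_URL and DRY_RUN variables, and the two function names are assumptions made for illustration. By default the script only prints the URLs it would request.

```shell
#!/bin/sh
# Hypothetical setting: base URL of the Solr instance (assumed default).
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Print the CoreAdmin CREATE URL for one transient core named $1.
create_url() {
  printf '%s/admin/cores?wt=json&action=CREATE&name=%s&instanceDir=collection1/&dataDir=%s&transient=true&loadOnStartup=false\n' \
    "$SOLR_URL" "$1" "$1"
}

# Create cores new_core1 .. new_core$1. With DRY_RUN=1 (the default)
# the URLs are only printed; set DRY_RUN=0 to actually call Solr.
create_cores() {
  i=1
  while [ "$i" -le "$1" ]; do
    url=$(create_url "new_core$i")
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$url"
    else
      curl -s "$url"
    fi
    i=$((i + 1))
  done
}
```

Calling "create_cores 10" prints the ten request URLs; setting DRY_RUN=0 beforehand sends them with curl instead.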

and have "transientCacheSize=2" in my solr.xml file, which I take to mean I
should have at most 2 transient cores loaded at any time. The problem is
that all of these cores are still loaded when I ask Solr to list cores:

$ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status"

From the explanation in the wiki, it looks like solr would manage loading
and unloading transient cores for me without having to worry about them,
but this is not what's happening.
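
As a rough illustration of the behavior expected from the wiki's description, here is a toy model of an LRU-bounded set of loaded cores. It is not Solr code and says nothing about how Solr implements its transient cache; MAX_LOADED stands in for transientCacheSize, and the function and variable names are hypothetical.

```shell
#!/bin/sh
# Toy model of an LRU-bounded set of loaded cores. Not Solr code.
MAX_LOADED=2   # plays the role of transientCacheSize
LOADED=""      # space-separated core names, least recently used first

# Mark core $1 as used: load it if needed and unload the least
# recently used cores until at most MAX_LOADED remain.
use_core() {
  name=$1
  kept=""
  for c in $LOADED; do
    [ "$c" = "$name" ] || kept="$kept $c"
  done
  # Re-append the core so it becomes the most recently used entry.
  set -- $kept "$name"
  while [ "$#" -gt "$MAX_LOADED" ]; do
    echo "unloading $1"
    shift
  done
  LOADED="$*"
}

for core in core1 core2 core3 core1; do
  use_core "$core"
  echo "loaded: $LOADED"
done
```

In this model, touching a third core unloads the least recently used one, so the list never holds more than two names; the last line printed is "loaded: core3 core1".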

The situation is different when I restart solr; it does the "right thing"
by loading the maximum cores set by transientCacheSize. When I add more
cores, the old behavior happens again, where all created transient cores
are loaded in solr.

I'm using the development branch lucene_solr_4_3 to run my example. I can
open a JIRA issue if need be.

Re: transientCacheSize doesn't seem to have any effect, except on startup

Posted by Erick Erickson <er...@gmail.com>.
I'm slammed with stuff and have to leave for vacation Saturday morning
so I'll be going silent for a while, sorry....

Best
Erick

Re: transientCacheSize doesn't seem to have any effect, except on startup

Posted by didier deshommes <df...@gmail.com>.
Any idea on this? I still cannot get the combination of transient cores and
transientCacheSize to work as I think it should: give me the ability to
create a large number of cores and automatically load and unload them for me
based on a limit that I set.

If anyone else is using this feature and it is working for you, let me know
how you got it working!


Re: transientCacheSize doesn't seem to have any effect, except on startup

Posted by didier deshommes <df...@gmail.com>.
On Fri, May 3, 2013 at 11:18 AM, Erick Erickson <er...@gmail.com> wrote:

> The cores aren't loaded (or at least shouldn't be) for getting the status.
> The _names_ of the cores should be returned, but those are (supposed) to be
> retrieved from a list rather than loaded cores. So are you sure that's not
> what you are seeing? How are you determining whether the cores are actually
> loaded or not?
>
I'm looking at the output of:

$ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status"

Cores that are loaded have a "startTime" and "upTime" value. Cores that are
unloaded don't appear in the output at all. For example, I created 3
transient cores with "transientCacheSize=2". When I asked for a list of
all cores, all 3 cores were returned. I explicitly unloaded 1 core and got
back 2 cores when I asked for the list again.
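
The check described above can be sketched as a small shell helper that counts how many cores in the status response carry a "startTime" field. It relies on the observation above that only loaded cores report that field, and it uses a plain text match rather than a JSON parser, so treat it as a rough check. The function name and the sample response are hypothetical.

```shell
#!/bin/sh
# Count occurrences of the "startTime" key in a CoreAdmin status
# response read from standard input. Assumes (per the observation
# above) that only loaded cores report this key.
count_loaded_cores() {
  grep -o '"startTime"' | wc -l | tr -d ' '
}

# Example with a hand-written response fragment (not real Solr output).
sample='{"status":{"new_core1":{"name":"new_core1","startTime":"t1"},"new_core2":{"name":"new_core2"}}}'
printf '%s\n' "$sample" | count_loaded_cores
```

Against a running instance the same helper could be fed from the status request, for example: curl -s "http://localhost:8983/solr/admin/cores?wt=json&action=status" | count_loaded_cores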

It would be nice if cores had an "isTransient" and an "isCurrentlyLoaded"
value so that one could see exactly which cores are loaded.




Re: transientCacheSize doesn't seem to have any effect, except on startup

Posted by Erick Erickson <er...@gmail.com>.
The cores aren't loaded (or at least shouldn't be) for getting the status.
The _names_ of the cores should be returned, but those are (supposed) to be
retrieved from a list rather than loaded cores. So are you sure that's not what
you are seeing? How are you determining whether the cores are actually loaded
or not?

That said, it's perfectly possible that the status command is doing something we
didn't anticipate, but I took a quick look at the code (got to rush to a plane)
and CoreAdminHandler _appears_ to be just returning whatever info it can
about an unloaded core for status. I _think_ you'll get more info if the
core has ever been loaded though, even if it's been removed from
the transient cache. Ditto for the create action.

So let's figure out whether you're really seeing loaded cores or not, and then
raise a JIRA if so...

Thanks for reporting!
Erick
