You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2009/07/16 21:38:14 UTC

[jira] Created: (LUCENE-1749) FieldCache introspection API

FieldCache introspection API
----------------------------

                 Key: LUCENE-1749
                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Search
            Reporter: Hoss Man
            Priority: Minor


FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
   * entries for the same reader/field with different types/parsers
   * entries for the same field/type/parser in a reader and it's subreader(s)
   * etc...




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: LUCENE-1749.patch

More progress building on Mark's patch.

added some sanity checking that reader/fieldname combos aren't reused in odd ways -- i made it ignore cases where different parsers ultimately resolve to identical cache objects (ie null vs DEFAULT_LONG_PARSER) and it ignores any CreationPlaceholder objects (not sure about that one)

some tests were modified to make their pathological behavior more "sane" and hooks were addded so that future tests can bypass the sanity testing in the tearDown method if they really need to.

Still need sanity testing of the Reader/subreader variety.  also lots of docs and code cleanup. 

BTW: i was focused on test-core ... still waiting on test-contrib to finish running, so i'm not yet sure if i broke anything there.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737626#action_12737626 ] 

Yonik Seeley commented on LUCENE-1749:
--------------------------------------

I believe that ConstantScoreQuery will need it's explain() fixed too?

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: LUCENE-1749.patch

updated patch to trunk (QueryWeight->Weight) and tweaked some FieldCacheImpl methods to use the non-deprecated Entry constructors (forgot that part before)

I'll commit as soon as my test run is finished.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736320#action_12736320 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

No worries, the updates are to the ram estimator and other minor things (eg if something fails the sanity check, the error output comes out twice because of the double check in teardown ) - nothing feature wise at the moment. I'll see what you add first.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739122#action_12739122 ] 

Michael McCandless commented on LUCENE-1749:
--------------------------------------------

{quote}
bq. Hoss/Mark do you want to fold it in to the patch, here? Or I can open a new issue?

as i alluded to above, i'm in favor of individual issues for each "bug" uncovered by this issue so they can be tracked separately.
{quote}

OK I'll open a new issue for this one (deprecate Scorer.explain).

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1749:
--------------------------------

    Attachment: LUCENE-1749.patch

I still havn't looked at this in the detail that I want to, but time is my enemy at the moment, so take it back and at least you can finish up what you have planned.

Hopefully its all in good working order after all the extracting and to trunking - let me know if you see anything off and I'll spin another. 

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739093#action_12739093 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

bq. Hoss/Mark do you want to fold it in to the patch, here? Or I can open a new issue?

as i alluded to above, i'm in favor of individual issues for each "bug" uncovered by this issue so they can be tracked separately.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732123#action_12732123 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

nice - would be great if it could estimate ram usage as well.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739097#action_12739097 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

bq. Mark: would you mind splitting up the latest patch you have (you mentioned some additional minor tweaks) and opening new issues for these peripheral changes and then attaching back what's left for this patch. 

I've already got separate issues and patches up were it makes sense (not the last one Mike mentions) - I wanted to keep them here too though until the insanity tests were complete - the tests that the fixes are somewhat correct are in this patch, and I don't like to manage layers of patches. if we don't plan on doing anymore with the insanity tests here though, I'll spin them out of this patch now.

I'll put up one more version and then you can have it back.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737871#action_12737871 ] 

Michael McCandless commented on LUCENE-1749:
--------------------------------------------

This was an excellent idea, and it's great that it uncovered some
dangerous and very unexpected places where we are passing top-level
reader to the FieldCache (eg that explain() could suddenly populate
the FieldCache w/ top-level values is quite shocking!).

ReaderUtil.subSearcher is doing the same thing as
DirectoryReader.readerIndex.

I love the RAMUsageEstimator... we have other places that estimate RAM
(eg IndexWriter does so for added & deleted docs) that we should
eventually cutover to this new API.

I particularly love the new class named Insanity:

{code}
  public static Insanity[] checkSanity(FieldCache cache)
{code}

MultiDocIdSet/Iterator makes me a bit nervous, because it's further
"propogating" a non-segment-based iterator deeper into Lucene than I
think we want to.  It's similar to eg using
DirectoryReader.MultiTermDocs (what Lucene used to do), instead of
stepping through the segments yourself.

Also, shouldn't explain most closely match what was done during
searching (ie, run "per segment")?  So simply pushing explain down to
the sub-reader that has the doc seems appropriate?  Ie we want it to
share as much of the code path as possible with how searching was in
fact done?

EG for ConstantScoreQuery.explain, it seems like we should 1) locate
the sub-reader that this doc falls in, and 2) get a scorer against
that reader, then 3) build up the explanation from that?  Likewise for
CustomScoreQuery? 

In fact.... maybe we should simply fix IndexSearcher.explain to do
this for all queries?  Ie, get the top-level weight, locate sub-reader
that has the doc, un-base the doc, and then invoke QueryWeight.explain
with that sub-reader and un-based doc?  Then we don't have to do
anything special for each query.  I think QueryWeight.scorer()
shouldn't be expected to handle a "top level reader" being passed in.
Ie, higher up in Lucene we should do that switch, so that we don't
have to do it (this "valuesFromSubReaders" arg) for every scorer.

Hmm: why do we even have explain at both the QueryWeight and Scorer
"levels"?  It seems like we should pick one level and do it there,
consistently.  Most queries seem to only implement the QueryWeight one
and often simply throw UOE in the Scorer's explain, but eg PhraseQuery
implements in both places.

(BTW: I'll be offline for approx the next 36 hours or so!)


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732157#action_12732157 ] 

Michael McCandless commented on LUCENE-1749:
--------------------------------------------

+1 -- this'd be great to get into 2.9.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732110#action_12732110 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------


The motivation for this issue is all of the changes coming in 2.9 in how Lucene internally uses the FieldCache API -- the biggest change being per Segment sorting, but there may be others not immediately obvious.

While these changes are backwards compatible from an API and functionality perspective, they could have some pretty serious performance impacts for existing apps that also use the FieldCache directly and after upgrading the apps suddenly seem slower to start (because of redundant FieldCache initialization) and require 2X as much RAM as they did before.  This could lead people people to assume Lucene has suddenly became a major memory hog.  SOLR-1111 and SOLR-1247 are some quick examples of the types of problems that apps could encounter.

Currently the only way for a User to even notice the problem is to do memory profiling, and the FieldCache data structure isn't the easiest to understand.  It would be a lot nicer to have some methods for doing this inspection programaticly, so users could write automated tests for incorrect/redundent usage.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Paul Smith (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735242#action_12735242 ] 

Paul Smith commented on LUCENE-1749:
------------------------------------

You know what would be absolute icing on the cake here would be some way during the introspection by some code looking for large sort fields that perhaps can be discarded/unloaded as needed (programmatically).

What I'm thinking here is a use case we've come into where we have had to sort by subject.  Well the unique # subjects gets pretty large, and while we still need to support the use case, it'd be nice to be able to periodically 'toss' sort fields like this so they don't hog memory permanently while the IndexReader is still in memory.  (sorting by subject is used, just not often so a good candidate for tossing)

Because we have multiple large IndexReaders open concurrently, it'd be nice to be able to scan periodically and kick out any unneeded ones.

It's nice to be able to inspect and print out these, but even better if one can make changes based on what one finds.



> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740256#action_12740256 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

Mark: I'll start working on improving the docs (and other things from my previous todo list)

bq. P.S. I'm not sure we want to go with the way I have changed the tests here.

Are you talking about TestOrdValues and TestFieldScoreQuery ?

if we expect OrdValues and FieldScoreQuery to use subReader based field caches, then the test seems to be doing the correct thing (in your patch) .. inspecting the fieldcaches per subreader.  Is there a code path where we expect those methods to get called on a MultiReader?

(Actually: that seems like a wroth while improvement to make to these classes: a MultiDocValues impl that all of the getValues(IndexReader) methods use when passed a MultiReader ... it uses getSequentialSubReaders to construct DocValue instances for each so you don't get FieldCache expolsions if code inadvertenly passes the wrong reader to getValues.  What do you think? ... new issue?)

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1749:
--------------------------------

    Attachment: LUCENE-1749.patch

* merged patches (and little tweaks to explain fix code)
* added a MultiDocIdSet for handling constantscorequery explain - first draft - needs some thought. I just banged it out. Its got a couple simple tests.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by Mark Miller <ma...@gmail.com>.
Yeah, I have no problem breaking them out in the end (and am trying to 
also track with separate JIRA issues). For right now its easier to keep 
it here though - since this is the code thats catching the bugs, and has 
the code to keep checking without having to juggle multiple patches. And 
the checks could change or be added to. I'll be happy to break the bug 
fix code pieces out to their respective issues when we are done here 
(it's really only one issue at the moment I think - though you could 
argue two I guess), or before when it makes sense or if it ends up 
confusing the main issue here. Right now, the code for the fixes is very 
minor and easy to manage.

- Mark

Chris Hostetter wrote:
> : changes to just go per reader for each doc - and a couple other unrelated tiny tweaks.
>
> FWIW: now that this issues has uncovered a few genuine "bugs" in code (as 
> opposed to justs tests being odd) it would probably be better to track 
> those bugs and their patches in seperate issues that can be individually 
> refered to in CHANGES.txt (and reopened as needed)
>
> committing those bug fixes can be done independently of commiting the 
> sanity checker.
>
> (PS: i'm making this suggestiong based purely on skiming the jiraemail 
> stream from the last day or so ... i haven't looked at the patches but the 
> decriptions seem to suggest they contain actual bug fixes, not just test 
> modifications)
>
> : 
> : > FieldCache introspection API
> : > ----------------------------
> : >
> : >                 Key: LUCENE-1749
> : >                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
> : >             Project: Lucene - Java
> : >          Issue Type: Improvement
> : >          Components: Search
> : >            Reporter: Hoss Man
> : >            Priority: Minor
> : >             Fix For: 2.9
> : >
> : >         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
> : >
> : >
> : > FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
> : >    * entries for the same reader/field with different types/parsers
> : >    * entries for the same field/type/parser in a reader and it's subreader(s)
> : >    * etc...
> : 
> : -- 
> : This message is automatically generated by JIRA.
> : -
> : You can reply to this email to add a comment to the issue online.
> : 
> : 
> : ---------------------------------------------------------------------
> : To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> : For additional commands, e-mail: java-dev-help@lucene.apache.org
> : 
>
>
>
> -Hoss
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>   


-- 
- Mark

http://www.lucidimagination.com




---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by Chris Hostetter <ho...@fucit.org>.
: changes to just go per reader for each doc - and a couple other unrelated tiny tweaks.

FWIW: now that this issues has uncovered a few genuine "bugs" in code (as 
opposed to justs tests being odd) it would probably be better to track 
those bugs and their patches in seperate issues that can be individually 
refered to in CHANGES.txt (and reopened as needed)

committing those bug fixes can be done independently of commiting the 
sanity checker.

(PS: i'm making this suggestiong based purely on skiming the jiraemail 
stream from the last day or so ... i haven't looked at the patches but the 
decriptions seem to suggest they contain actual bug fixes, not just test 
modifications)

: 
: > FieldCache introspection API
: > ----------------------------
: >
: >                 Key: LUCENE-1749
: >                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
: >             Project: Lucene - Java
: >          Issue Type: Improvement
: >          Components: Search
: >            Reporter: Hoss Man
: >            Priority: Minor
: >             Fix For: 2.9
: >
: >         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
: >
: >
: > FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
: >    * entries for the same reader/field with different types/parsers
: >    * entries for the same field/type/parser in a reader and it's subreader(s)
: >    * etc...
: 
: -- 
: This message is automatically generated by JIRA.
: -
: You can reply to this email to add a comment to the issue online.
: 
: 
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
: For additional commands, e-mail: java-dev-help@lucene.apache.org
: 



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1749:
--------------------------------

    Attachment: LUCENE-1749.patch

changes to just go per reader for each doc - and a couple other unrelated tiny tweaks.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740265#action_12740265 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

Hmmmm...

actually mark, testing our your latest patch against hte trunk i'm seeing (FieldCache sanity) failures from TestCustomScoreQuery, TestFieldScoreQuery, TestOrdValues, and TestSort ... have you seen these?  did some other recent change on the trunk trigger these?

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740272#action_12740272 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

I think that TestCustomScoreQuery, TestFieldScoreQuery, and TestOrdValues all fail because the fix for them is now in another issue.

TestSort I didn't notice. It looks like its considering String[] and StringIndex the same for the two multi and parallel sort tests - merged to trunk, so perhaps something has gone awry there? I've looked over the patch and I don't see any obvious mistake - I don't know that I have time to dig more now, but since you are more familiar with that code anyway, perhaps you can tell me why its now considering them the same anyway? Otherwise I will look more before too long.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736750#action_12736750 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

bq. And then there is explain - IndexSearcher passes the top level reader to the weight explain, and valuesourcequery will get a fieldcache based on that reader. I guess that one is a bug.

I don't even know what to do about this one. All I can think is that you pump out an explain for each sub reader - but thats pretty unhelpful.

Perhaps the best we can do is javadoc the extra requirements that may be needed when you use explain?

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740155#action_12740155 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

P.S. I'm not sure we want to go with the way I have changed the tests here.

I switched things to go per subreader rather than use the overall reader - this is how things happen in IndexSearcher now. But we lose the top level reader test. We might want to do it both ways, and when doing it by top reader, ignore the triggered insanity check?

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739089#action_12739089 ] 

Yonik Seeley commented on LUCENE-1749:
--------------------------------------

{quote}
How about, for queries that need to go and look up docFreq, their QueryWeight impls simply hold onto the [top-level] IndexSearcher that
had been passed to their ctor, and then do the docFreq call against
that, if explain is invoked?
{quote}

Asking the searcher for the docFreq is the right thing to do... but people who rely on Weight being serializable might be in for a nasty surprise.
Of course... one might wonder if we should bother supporting serializable in Lucene longer term at all - anyone dealing with distributed systems has found it to have too many shortcomings anyway.


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736304#action_12736304 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

Mark: i have a little time to work on this today ... do you have any updates that youv'e been working on locally (i noticed some patch add/retract from you in hte history)

Paul: over in LUCENE-831 there was a lot of discussion and work done towards making the entire FieldCAche internals pluggable so you could customize the cache behavior all sorts of ways ... i feel out of the loop on that issue, but my understanding is that it was pushed back to 3.1 at the earliest because it wasn't clear how the APIs should be setup given the work being done with reopen and with moving FieldCache usage down to the subreaders.

for now my goal with this issue (LUCENE-1749) is purely to provide an experimental  (ie: no back compat expectations) API for app developers to use to sanity check that the changes in 2.9 havne't blown their RAM usage sky high.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737792#action_12737792 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

In the case that is a caching filter? I hadn't actually looked to see if there are any other FieldCache ones either - just what tripped the tests.

I guess it could be dealt with the same way? A DocIdSet that distributes to sub readers ...

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737517#action_12737517 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

bq. the explain method wants to get hte FieldCache value for a doc, it should fetch the FieldCache for the SegmentReader the doc is in

One more note to try and be a bit more clear:

First, I wasn't sure how easy this was to do because I don't know explain code or the function package very well at all (eg I've never used it). And the explain method itself did not grab values from the field cache, it loaded up a scorer that did so. So I just wasn't sure how doable this fix was. Thats why I was saying pushing down explain to the subreader wasn't great, but I wasn't sure what else you could do. The fix turned out to be fairly easy though - the scorer for valuesource just needed two modes - one for normal scoring and one for explain (that breaks up the requests for a fieldcache val per sub reader) - the explain method would work for both ways, but no reason to try and break down per reader when its going to score per reader anyway, so I have both. Standard scorer for valuesource works as it did, and explain trips a setting to break out subreaders and distrib fieldcache requests. And then the custom query needed a tweak to work right (flip that setting) with its underlying valuesource queries.



> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by Chris Hostetter <ho...@fucit.org>.
: In the insanity check, when you drop into the sequential subreaders - I 
: think its got to be recursive - you might have a multi at the top with 
: other subs, or any combo thereof. I can add to next patch.

i don't have the code in front of me, but i thought i was adding the sub 
readers to the list it's iterating over, so it will eventually recurse all 
the way to the bottom.


: 
: > FieldCache introspection API
: > ----------------------------
: >
: >                 Key: LUCENE-1749
: >                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
: >             Project: Lucene - Java
: >          Issue Type: Improvement
: >          Components: Search
: >            Reporter: Hoss Man
: >            Priority: Minor
: >             Fix For: 2.9
: >
: >         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
: >
: >
: > FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
: >    * entries for the same reader/field with different types/parsers
: >    * entries for the same field/type/parser in a reader and it's subreader(s)
: >    * etc...
: 
: -- 
: This message is automatically generated by JIRA.
: -
: You can reply to this email to add a comment to the issue online.
: 
: 
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
: For additional commands, e-mail: java-dev-help@lucene.apache.org
: 



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737834#action_12737834 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

In the insanity check, when you drop into the sequential subreaders - I think its got to be recursive - you might have a multi at the top with other subs, or any combo thereof. I can add to next patch.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739092#action_12739092 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

General Comments on mark's latest patch...
* the changes that i understand all seem good ... some of the details in reader/searcher/query internals elude me but it sounds Yonik & McCandless have their eyes on them so i trust the three of you have it covered.
* we still need to fill in some empty/sparse javadocs, but that can be done after an initial commit.
* is it a bug that AverageGuessMemoryModel.getSize() will NPE on a non primitive class ... or should/will the docs for that API say it only works on primitives?

Big Questions I Still Have....
* does anyone have any reservations about the new APIs introduced?
  * FieldCache.CreationPlaceholder (promoted from FieldCacheImpl)
  * FieldCache.CacheEntry
  * FieldCache.getCacheEntries()
  * FieldCache.purgeAllCaches()
  * FieldCacheSanityChecker
  * RamUsageEstimator
* does anyone have any reservations about the refactoring done in FieldCacheImpl to make this new API possible? (ie: did i break the thread safety in a way i'm not noticing?)
* is the FieldCacheImpl.Entry.type (the "SortField" int type) still needed by FieldCacheImpl.Entry? ... nothing seems to use it so it would be nice to eliminate it and simplify the CacheEntry API as well.  (i suspect it got refactored into obsolescence when the Sorting got moved into the subreaders)
* The sanity checking ignores CreationPlaceholder -- largely because of the way the numeric caches first try one parser, and then if they get an NFE try a different parser -- but this leaves the CreationPlaceholder in the cache.  It's not a big object, so i assume it was implemented this way on purpose and the sanity checker is doing the correct thing by ignoring it, but i wanted to make sure people are aware/ok with this behavior.
 
Lastly: This patch feels unnecessarily large at this point.  Several of the bugs/improvements we've uncovered here don't seem like belong in this patch, and should be tracked in separate Jira issues, which can be committed independently and enumerated in CHANGES.txt....
  * new ReaderUtil and the usage in DirectoryReader, MultiReader, MultiSearcher & IndexSearcher
  * explain subreader bug fixes in ConstantScoreQuery, QueryWeight, ValueSourceQuery, CustomScoreQuery, etc...
...i think this issue (and this patch) should be reduced to just the new sanity checkig API, and *tests* that have been changed to be more sane (where the underlying code was already fine)

Mark: would you mind splitting up the latest patch you have (you mentioned some additional minor tweaks) and opening new issues for these peripheral changes and then attaching back what's left for this patch.  Then I'll take the conch back and work on the missing javadocs.

(I'll happily commit once i get at least one thumbs up from someone on the "Big Questions" above ... we can always tweak the javadocs further in subsequent commits)



> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737247#action_12737247 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

bq. (BTW: random thought that occurred to me last night: wouldn't the simplest way to implement the RamEstimator just be to use vanilla java serialization to a custom OutputStream that just counted the bytes and sent them to /dev/null) ?

That's one way to go. Its got its own little issues though - some bookkeeping stuff is not serialized, and extra info about class, fields is serialzied. Transient fields (niche issue for sure) would also not be serialized. Its def another way to get an estimate. I chose a different route after considering both (googled the topic for a bit and looked at some examples before choosing). I'd be open to another route, but I thought this method was fairly fast, accurate, and generic.


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738140#action_12738140 ] 

Yonik Seeley commented on LUCENE-1749:
--------------------------------------

bq. It's interesting that we didn't (I think?) have a similar problem w/ scorer when we switched to passing it the sub-reader.

Right - that code was well tested and exercised via MultiSearcher in the past (all idf values had to come from Weight to avoid getting idfs per sub-searcher).
One thing that's missing for explain() is that there is no way to get "df" as opposed to "idf" from the Weight.

bq. So the only odd thing is why explain is part of Scorer base class
Right.. it doesn't belong there.  Perhaps deprecate and remove from the Scorer base in 3.0? (since one can't reliably call it now anyway).


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: LUCENE-1749.patch

checkpoint: refactored the sanity checking code into a utility class and wrote tests specifically for it to prove it finds insane stuff.

TODO:
* clean up the api, make it less clunky (and not static)
** return structured data showing exactly which combinations in FieldCache are insane
* javadocs
* figure out why previously mentioned tests are breaking (need help with this one ... don't know enough about the code these tests excercise)

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738698#action_12738698 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

I've got one more draft here with the smallest of tweaks - javadoc spelling errors, and one perhaps one or two other tiny things - stuff I just would toss out rather than merge - but are you doing anything here right now Hoss? I think not at the moment, so if thats the case I'll put up one more patch before you grab the conch back. Otherwise I'll hold off on anything till you put something up.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: fieldcache-introspection.patch

Here's the start of a patch to provide this functionality -- it just provides a new method/datastructure for inspecting the cache; the sanity checking utility methods should be straightforward assuming people think this is a good idea.

The new method itself is fairly simple, but quite a bit of refactoring to how the caches are managed was necessary to make it possible to implement the method sanely.  These changes to the FieldCache internals seem like they are generally a good idea from a maintenance standpoint even if people don't like the new method.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>         Attachments: fieldcache-introspection.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740311#action_12740311 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

bq. I think that TestCustomScoreQuery, TestFieldScoreQuery, and TestOrdValues all fail because the fix for them is now in another issue.

ah ... are you talking about LUCENE-1771 ? (the jira dependency sems backwards in that case)

bq. In TestSort you had moved the local sorting to the bottom in the multi sort test - I kept that, but I also kept them higher up. So thats the fail - they just have to be removed.

yeah .. i just caught that and was starting to reply ... the interestingthing is that the CacheEntry.toString() doesn't show the Local.US was used when getting the Strings[] FieldCache. .. i'm currently trying to figure out why (because that could confuse people as well)

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1749:
--------------------------------

    Attachment: LUCENE-1749.patch

Updates:

* merged in updated ram usage estimator code
* updated most failing tests to work without creating top level FieldCaches
* removed offending calls to explain - I left nocommit comments here - depending on what we decide, we could turn off the subreader check for these
* Turned off the subreader check for stress sort test - it sorts in back compat mode and compares to the new mode - so it loads both on purpose.
* I don't remember if I touched anything else.

tests pass now

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737254#action_12737254 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

bq. If the CustomScoreQuery class(es) push the FieldCache sage into the subReaders during scoring, then shouldn't the explain methods do the same thing? it definitely seems like a bug if getting score explanation from a query causes your memory footprint to double.

It *should* do the same thing - but thats sticky. If you push explain to the sub readers, you will get why it scored as it did for each subreader - not one top level explain. I won't deny its kind of bug - but I'm not sure at the moment what the best way to address it is. I'll look into the possibility of pushing the fieldcache access to the subreaders while leaving everything else at the top reader - I have no thoughts about the feasibility of that at the moment though. I guess it might be doable.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: LUCENE-1749.patch

minor checkpoint: improved assert messages, and massaged TestRemoteSort so that it appearers more sane.

problem with PatternAnalyzerTest was unrelated (see LUCENE-1756)

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Issue Comment Edited: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737792#action_12737792 ] 

Mark Miller edited comment on LUCENE-1749 at 7/31/09 5:43 PM:
--------------------------------------------------------------

In the case that it is a caching filter? I hadn't actually looked to see if there are any other FieldCache ones either - just what tripped the tests.

I guess it could be dealt with the same way? A DocIdSet that distributes to sub readers ...


      was (Author: markrmiller@gmail.com):
    In the case that is a caching filter? I hadn't actually looked to see if there are any other FieldCache ones either - just what tripped the tests.

I guess it could be dealt with the same way? A DocIdSet that distributes to sub readers ...
  
> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740278#action_12740278 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

{quote}(Actually: that seems like a wroth while improvement to make to these classes: a MultiDocValues impl that all of the getValues(IndexReader) methods use when passed a MultiReader ... it uses getSequentialSubReaders to construct DocValue instances for each so you don't get FieldCache expolsions if code inadvertenly passes the wrong reader to getValues. What do you think? ... new issue?){quote}

Very interesting idea - def a new issue I think. Not sure its worth it if you can't protect general fieldcache access as well though ...


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737832#action_12737832 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

Already finding some corner cases with the multi docidset stuff - I'll keep working along those lines a bit, then maybe look at some of the code you have been working on and post another patch later this weekend.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Issue Comment Edited: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732297#action_12732297 ] 

Mark Miller edited comment on LUCENE-1749 at 7/16/09 6:35 PM:
--------------------------------------------------------------

We prob would want to provide an alternate toString that includes the ram guess and the default that skips it - i havn't tested performance, but it might take a while to check a gigantic string array.

Also, JavaImpl should probably actually be JavaMemoryModel or MemoryModel.

      was (Author: markrmiller@gmail.com):
    We prob would want to provide an alternate toString that includes the ram guess and the default that skips it - i havn't tested performance, but it might take a while to check a gigantic string array.
  
> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733955#action_12733955 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

note to self: of the contribs, TestRemoteSort had two failed tests (not horribly surprising) and PatternAnalyzerTest generated an Error (?!?!) ...

{code}
java.lang.IllegalStateException: termText
    at org.apache.lucene.index.memory.PatternAnalyzerTest.assertEquals(PatternAnalyzerTest.java:213)
    at org.apache.lucene.index.memory.PatternAnalyzerTest.run(PatternAnalyzerTest.java:148)
    at org.apache.lucene.index.memory.PatternAnalyzerTest.testMany(PatternAnalyzerTest.java:87)
{code}

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: LUCENE-1749.patch

one last updated: the Locale.US asserts in TestRemoteSort had the same problem as TestSort, they were suppose to be moved, but instead they were just copied (not sure how i missed that before)

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734433#action_12734433 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

Hmmm... somehow i overlooked the fact that even after i "fixed" TestRemoteSort in the last patch, it's still failing. Here's the assertion failure...
{code}
junit.framework.AssertionFailedError: testRemoteCustomSort Comparator: multi FieldCaches for same reader/fieldname with diff types
   at org.apache.lucene.util.LuceneTestCase.assertSaneFieldCaches(LuceneTestCase.java:110)
   at org.apache.lucene.search.TestRemoteSort.testRemoteCustomSort(TestRemoteSort.java:261)
   at org.apache.lucene.util.LuceneTestCase.runTest(LuceneTestCase.java:265)
{code}

...and here's the debugging dump of the FieldCache...

{code}
*** BEGIN testRemoteCustomSort Comparator: FieldCache Losers ***
'org.apache.lucene.index.DirectoryReader@1108727'=>'custom',interface java.lang.Comparable,9,org.apache.lucene.search.SampleComparable$2@651e95,null=>[Ljava.lang.Comparable;#22056753 size guess:2 KB
'org.apache.lucene.index.DirectoryReader@1108727'=>'custom',interface java.lang.Comparable,9,org.apache.lucene.search.SampleComparable$2@5b78cf,null=>[Ljava.lang.Comparable;#32045680 size guess:2 KB
*** END testRemoteCustomSort Comparator: FieldCache Losers ***
*** BEGIN org.apache.lucene.search.TestRemoteSort.testRemoteCustomSort: FieldCache Losers ***
'org.apache.lucene.index.DirectoryReader@1108727'=>'custom',interface java.lang.Comparable,9,org.apache.lucene.search.SampleComparable$2@651e95,null=>[Ljava.lang.Comparable;#22056753 size guess:2 KB
'org.apache.lucene.index.DirectoryReader@1108727'=>'custom',interface java.lang.Comparable,9,org.apache.lucene.search.SampleComparable$2@5b78cf,null=>[Ljava.lang.Comparable;#32045680 size guess:2 KB
*** END org.apache.lucene.search.TestRemoteSort.testRemoteCustomSort: FieldCache Losers ***
{code}

What really confuses me about this is that the same SampleComparable instance is being used with two different queries -- once with reverse=true and once with reverse=false, yet two different SampleComparable instances are showing up in cache keys -- the probably only happens when SampleComparable is used to get a SortComparator -- not when it uses a ComparatorSource earlier in the test. 

Is this a real bug in remote sorting?


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732166#action_12732166 ] 

Uwe Schindler commented on LUCENE-1749:
---------------------------------------

Looks good as a start, one question about a comment:

What do you mean with:
 * :TODO: is the "int" sort type still needed? ... doesn't seem to be used anywhere, code just tests "custom" for SortComparator vs Parser.

I do not understand, do you want to remove the IntCache? What is different with it in comparison with the other ones?

Uwe

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736309#action_12736309 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

I do - I removed that last patch because I just realized that it was missing everything but one class.

Go ahead though - I'll merge with what you have.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739108#action_12739108 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

bq. is it a bug that AverageGuessMemoryModel.getSize() will NPE on a non primitive class ... or should/will the docs for that API say it only works on primitives?

Its only meant to work with primitives. I'll change the name to getPrimitiveSize - on my last pass through, I'll also review the javadoc for the classes I added.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737889#action_12737889 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

bq . Ie we want it to share as much of the code path as possible with how searching was in fact done

Well of course ;) I was a bit hazy on explain, so for some reason I had it in my head that you would have to combine the explanations from multiple subreaders - but of course its a doc at a time, so the doc will only come from one subreader, and the sim/weight will be top level. So easy peasy fix. That boolean valuesFromSubReaders def had some code smell - just didn't have an alternative at the moment - fix then improve !

I'll leave the 'explain at multiple levels' for another issue - I havn't even started thinking about this issue yet - I prefer to code :) Which is kind of an oxymoron.

{quote}
i don't have the code in front of me, but i thought i was adding the sub
readers to the list it's iterating over, so it will eventually recurse all
the way to the bottom.{quote}

Ah right, sorry about the false alarm. One of the few times I've seen .size() in a for loop where its actually needed ;)

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1749:
--------------------------------

    Attachment: LUCENE-1749.patch

Here is a rough draft for an explain fix.

Explain for custom and valuesource now drop to per segment to retrieve fieldcache values. Resolving this issue will also resolve LUCENE-1771.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Lucene Fields: [New, Patch Available]  (was: [New])
    Fix Version/s: 2.9

Technically this isn't a bug, so i probably shouldn't add it to the 2.9 blocker list, but i really think it would be a good idea to have something like this in the 2.9 release.

At the very least: i'd like to put it on the list until/unless there is consensus that it's not needed.


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1749:
--------------------------------

    Attachment: LUCENE-1749.patch

Here is a start towards guessing the fieldcache ram usage.

It probably works fairly well, though it will be limited by stack space on a very heavily nested object graph.

I've added the size guess for getValue in the introspection output.

Its a start anyway.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: LUCENE-1749.patch


bq. the interestingthing is that the CacheEntry.toString() doesn't show the Local.US was used when getting the Strings[] FieldCache

I'm an idiot ... the Locale isn't used like a FieldCache Parser ... the same String[] is used regardless of the Localed, so it's never part of the CacheKey.  the output is correct.

revised patch fixes TestSort as mark pointed out, and updates some javadocs where i missleading suggested different Locales might trigger InsanityType.VALUEMISMATCH

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737984#action_12737984 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

bq. I was a bit hazy on explain, so for some reason I had it in my head that you would have to combine the explanations from multiple subreaders

bq. but it would probably break more than it helped right now (by exposing more limitations) - for example idf in TermWeight.explain()

To be a little more clear - this was originally why I went the direction I did - I assumed the reader was being used for stats that needed to come from the top level reader. Gut reaction seeing it go into scorer. I hadn't really checked that, at least for these queries, that wasn't the case - they just use it for the filter/fieldcache.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739080#action_12739080 ] 

Michael McCandless commented on LUCENE-1749:
--------------------------------------------

bq. Right - that code was well tested and exercised via MultiSearcher in the past (all idf values had to come from Weight to avoid getting idfs per sub-searcher).

Ahh right.

bq. One thing that's missing for explain() is that there is no way to get "df" as opposed to "idf" from the Weight.

But this only affects the "atomic" queries, right?  So eg TermWeight
could simply hold onto this value and then use it during explain.
Hmm... though TermQuery's ctor doesn't get the df directly, because it
calls similarity.idf(term, searcher).  I don't really like making a
separate additional call to docFreq.

How about, for queries that need to go and look up docFreq, their
QueryWeight impls simply hold onto the [top-level] IndexSearcher that
had been passed to their ctor, and then do the docFreq call against
that, if explain is invoked?

bq. Right.. it doesn't belong there. Perhaps deprecate and remove from the Scorer base in 3.0? (since one can't reliably call it now anyway).

+1

Hoss/Mark do you want to fold it in to the patch, here?  Or I can open
a new issue?


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737294#action_12737294 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

This issue was a fantastic idea by the way!

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740275#action_12740275 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

Here is the output - it appears to think String[] and StringIndex are both string:

VALUEMISMATCH: Multiple distinct value objects for org.apache.lucene.index.CompoundFileReader$CSIndexInput@56d73c7a+string
	'org.apache.lucene.index.CompoundFileReader$CSIndexInput@56d73c7a'=>'string',class org.apache.lucene.search.FieldCache$StringIndex,3,null,null=>org.apache.lucene.search.FieldCache$StringIndex#279807577 (size =~ 152 bytes)
	'org.apache.lucene.index.CompoundFileReader$CSIndexInput@56d73c7a'=>'string',class java.lang.String,11,null,null=>[Ljava.lang.String;#647057258 (size =~ 108 bytes)

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741479#action_12741479 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

bq. Maybe we should simply print a warning, eg to System.err, on detecting that 2X RAM usage has occurred, pointing people to the sanity checker? We could eg do it once only so we don't spam the stderr logs

I'm not really comfortable dumping anything to System.err without user requesting it ... but this is a really interesting idea.  (I suppose we could add an infoStream type idea to FieldCache to expose this)

FieldCacheImpl.Cache.get could use the FieldCacheSanityChecker to inspect itself immediately after calling createValue, and could even test if any of the Insanity instances returned are related to the current call (by comparing the CacheEntry with the Entry it's using) ... it could even log a useful stack trace since the sanity check would be happening in the same call stack as at least one of the CacheEntries in the Insanity object.

I've opened LUCENE-1798 to track implmenting somehting like this once the FieldCacheSanityChecker gets committed.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: LUCENE-1749-hossfork.patch

This is a complete overhaul of the internals of FieldCacheSanityChecker, and it's API so that it's a lot cleaner and easier to use programaticly.  

This also makes it easier for tests to run an analsis, and then ignore the _types_ of errors they "expect" without ignoring whole cache entries (so a test that expects to have subreader problems can ignore those, even if one of those cache entires also fails a sanity check with another entry for a different reason (ie: different parser on same reader)

And in the long run: this should make it easier for us to add new types of sanity checks (which i'm guessing we'll think of when the internals get overhauled) without changing the API too much.

*NOTE:* This is based on Miller's attachment #12414960 (29/Jul/09) ... i haven't merged in or looked at any of the changes he made after that.  i suspect the only overlap (if any) is how CacheEntry uses the Ram Estimation code ... i switched to having estimateSize(RamUsageEstimator) cache the value and then toString uses it if it's there ... the FieldCacheSanityChecker takes care of calling it if the client code asks for it.

Mark: viv's got all weekend off, so i'm probably not going to have time to look at this again for 4 or 5 days, if you want to take a stab at merging the patches thta would be seriously awesome.



> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739743#action_12739743 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

patch is coming soon - I've merged to trunk and pulled the separate issues - just want to look over some a bit later. Would have had this sooner, but Eclipse decided to start crashing every 5 minutes this morning because firefox brought in a new xulrunner and ... ugg - at least its not Windows ... coming though.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736316#action_12736316 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

uh ... ok .. what kind of updates do you have locally?  (no point in merging later if i'm just going to write stuff you've already written)

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740308#action_12740308 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

Okay, sorry - I messed up when merging with trunk.

In TestSort you had moved the local sorting to the bottom in the multi sort test - I kept that, but I also kept them higher up. So thats the fail - they just have to be removed.

Line 953-957 it looks - sorry bout that - just didn't notice it fail with the other 3.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737509#action_12737509 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

{quote}I don't really follow you on this (i need to take a look at your proposed fix) .. i'm not suggesting we push the whole explain down to the subreader, just that when the explain method wants to get hte FieldCache value for a doc, it should fetch the FieldCache for the SegmentReader the doc is in - so it gets the exact same value (and same FieldCache entry) as the scoring code did when it scores the document. (or maybe i'm completley missunderstanding how these classes were reimplimented to use segment based field caches){quote}

The way the per segment stuff went, we don't push down to the sub readers for the fieldcache per say - we just search each sub reader separately - so per reader fieldcache is just kind of a side effect. Then the top level reader is still used for things like stats and explain. 

I switched the explain for the offending stuff (custom/valuesource) to use a DocValues class that does push down to each subreader for the fieldcache though (while everything else still uses the top reader) - its in the scorer, so I added a switch to push down to subreaders for fieldcache access or not - only explain pushes  down, while regular scoring doesn't (regular scoring will be working per sub reader anyway, because they are searched one at a time). 

I can merge up the patches.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1749:
--------------------------------

    Attachment: LUCENE-1749.patch

Here is a patch that just updates the ram usage estimator code.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735190#action_12735190 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

bq. Is this a real bug in remote sorting

I think so.

SortField is constructed on the client side and passed as a param to the remote Searchable. It seems you get a new factory every time this happens, not the client factory on the SortField (of course, because its constructed on the other side of the wire).

So every search call adds a new factory to the mix. If you just make the call once, the test will pass.

Its a nice find.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732297#action_12732297 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

We prob would want to provide an alternate toString that includes the ram guess and the default that skips it - i havn't tested performance, but it might take a while to check a gigantic string array.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: LUCENE-1749.patch

slightly revised patch based on java-dev@lucene discussion...

the sortFieldTYpe and Locale portions of Cache.Entry are never used by FieldCache -- just a deprecated class that abuses the Entry api out of lazyiness... so the CacheEntry debugging abstraction shouldn't expose them (but i left in code to manifest them in the toString() if they are atypical just in case).  Also added some deprecation notices so we remember to remove them once they are no longer needed.



> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: LUCENE-1749.patch

checkpoint: no functional change from mark's previous patch, just improved all the javadocs, including explanation of SanityCheckers purpose and experimental/expert warnings where appropriate.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737424#action_12737424 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

Quick responses to some other comments...

bq. I chose a different route after considering both

i trust you to make the right call, just thought i'd point it out in case you hadn't though of it.

bq. If you push explain to the sub readers, you will get why it scored as it did for each subreader - not one top level explain

I don't really follow you on this (i need to take a look at your proposed fix) .. i'm not suggesting we push the whole explain down to the subreader, just that when the explain method wants to get hte FieldCache value for a doc, it should fetch the FieldCache for the SegmentReader the doc is in -- so it gets the exact same value (and same FieldCache entry) as the scoring code did when it scores the document.  (or maybe i'm completley missunderstanding how these classes were reimplimented to use segment based field caches)

bq. This issue was a fantastic idea by the way!

yeah ... i was pretty out of the loop on all the "push sorting down into the segment" discussion, but when i noticed yonik pointing out all the ways solr's fieldcache usage was going to explode if we didn't change it it occured to me that this would probably be a big problem for anyone doing non-trivial stuff with Lucene, so it would be nice to have a way to toruble shoot it (i also had very little faith in Lucene-Java's test coverage since we only have unit tests that verify "correct" behavior when we make changes -- but nothing sanity checks how that behavior happened (at least: not untill now)


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739120#action_12739120 ] 

Michael McCandless commented on LUCENE-1749:
--------------------------------------------

{quote}
Asking the searcher for the docFreq is the right thing to do... but people who rely on Weight being serializable might be in for a nasty surprise.
{quote}

Argh, right.

bq. If we decide not to ask weight to hang onto it's searcher, then the other way to do it right is to change explain() to accept a Searcher as well as a IndexReader.

+1

bq. Of course... one might wonder if we should bother supporting serializable in Lucene longer term at all - anyone dealing with distributed systems has found it to have too many shortcomings anyway.

Yeah this was never really "settled" in LUCENE-1473.  Lucene currently
supports live serialization, but not cross-version
serialization... and we have moved RemoteSearchable to contrib and
removed RMI from Searchable.

Does Solr ever rely on Lucene's "implements Serializable"?


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732190#action_12732190 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

bq. :TODO: is the "int" sort type still needed? ... doesn't seem to be used anywhere, code just tests "custom" for SortComparator vs Parser.

sorry ... badly placed quotes ... that was in referent to Entry.type. 

Until i changed getStrings, getStringIndex, and getAuto to construct Entry objects as part of my refactoring the "type" attribute (and the constructor that takes a type argument) didnt' seem to be used anywhere (as far as i could tell)

My guess: maybe some previous changes refactored logic that switched on type up into the SortFields?, so the FieldCache no longer needs to care about it?


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738100#action_12738100 ] 

Michael McCandless commented on LUCENE-1749:
--------------------------------------------

bq. That was my first thought... but it would probably break more than it helped right now (by exposing more limitations) - for example idf in TermWeight.explain()

Ugh, you're right.

I think It shouldn't be doing that?  Ie, a Weight instance should
"capture" all stats needed from the top-level searcher, on creation,
and then when we ask for a scorer or an explain (or other future
things that take an IndexReader) we should always pass in a single
segment reader.  This way we don't have to duplicate the "go find the
right sub-reader" in many places.

It's interesting that we didn't (I think?) have a similar problem w/
scorer when we switched to passing it the sub-reader.

bq. I'll leave the 'explain at multiple levels' for another issue

It looks like it's up to each query, which level does what.
IndexSearcher's explain always calls Weight.explain, but then some
Query impls (eg BooleanQuery) do everything in Weight.explain, while
others (eg TermQuery, PhraseQuery) do some work in Weight.explain and
some in the scorer.

I guess this makes sense: "atomic" Queries (TermQuery, PhraseQuery)
will need to fire up a scorer since there's real work to be done to
see the specifics of how that doc was matched.  Whereas BooleanQuery simply
"glues" together other queries so it doesn't need to forward to its
[many] scorers.

So the only odd thing is why explain is part of Scorer base
class... seems like the method could/should live "privately" to only
those queries that need it.

But I agree let's leave this be for now...


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737877#action_12737877 ] 

Yonik Seeley commented on LUCENE-1749:
--------------------------------------

bq. In fact.... maybe we should simply fix IndexSearcher.explain to do this for all queries?

That was my first thought... but it would probably break more than it helped right now (by exposing more limitations) - for example idf in TermWeight.explain()


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Chris Hostetter (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738721#action_12738721 ] 

Chris Hostetter commented on LUCENE-1749:
-----------------------------------------



: I've got one more draft here with the smallest of tweaks - javadoc 
: spelling errors, and one perhaps one or two other tiny things - stuff I 
: just would toss out rather than merge - but are you doing anything here 
: right now Hoss? I think not at the moment, so if thats the case I'll put 
: up one more patch before you grab the conch back. Otherwise I'll hold 
: off on anything till you put something up.


you have the conch ... i haven't worked on anything related to this issue 
since my last patch.

i'll try to look at it again tomorow.



-Hoss



> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736732#action_12736732 ] 

Mark Miller commented on LUCENE-1749:
-------------------------------------

bq. figure out why previously mentioned tests are breaking (need help with this one ... don't know enough about the code these tests excercise

Eh - its yucky. There are parts where the tests are passing the top level reader (say to a collector) when it should be using the sub readers. I fixed one :)
But then there is more - looked at a couple more difficult ones that also pass the top level reader for the test.

And then there is explain - IndexSearcher passes the top level reader to the weight explain, and valuesourcequery will get a fieldcache based on that reader. I guess that one is a bug.

And there are prob a few other similar type things...

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated LUCENE-1749:
-----------------------------

    Attachment: LUCENE-1749.patch

checkpoint: really hack implementation of checkFieldCacheSubReaderSanity that tells you when a FieldCache contains entries on the same field in a both wrapper/inner readers ... but it doesn't tell you which entries (and unlike checkFieldCacheTypeSanity it's no obvious just looking at the toString())

This patch causes an error in TestStressSort and *LOTS* of errors in TestCustomScoreQuery, TestFieldScoreQuery, and TestOrdValues.

I'd like to think these errors are just from tests doing abnormal things, and in the case of TesStressSort that may be true (it looks like it has some hinky reuse of readers in a MultiReader) but in the other test cases where it's a little easier to see at a glance what's going on, there's really nothing odd here -- simple use of a single IndexSearcher to execute a CustomScoreQuery, ValueSourceQuery, etc... these are each causing multiple FieldCache instances to pop up for a single field (one keyed on the DirectoryReader and another keyed on a CompoundFileReader$CSIndexInput)

i'm try to work on refactoring the sanity checking methods so they are easier to use (and write some tests for them to prove the work as expected on both sane/insane) but it would be helpful if someone who understands more about how FieldCaches should look (pushed into the subreaders) could tyr out this patch and let me know if these failures look legit.


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1749:
--------------------------------

    Attachment: LUCENE-1749.patch

Patch cleanup - more suitable for browsing.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man resolved LUCENE-1749.
------------------------------

    Resolution: Fixed
      Assignee: Hoss Man

Committed revision 803676.


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1749) FieldCache introspection API

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated LUCENE-1749:
--------------------------------

    Attachment:     (was: LUCENE-1749.patch)

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737217#action_12737217 ] 

Hoss Man commented on LUCENE-1749:
----------------------------------

Mark: thanks for looking into the tests.

If the CustomScoreQuery class(es) push the FieldCache sage into the subReaders during scoring, then shouldn't the explain methods do the same thing?  it definitely seems like a bug if getting score explanation from a query causes your memory footprint to double.

Last night i thought over what a more useful API for hte sanity checker would like like ... 
My power is getting turned off for a few hours this afternoon so i'll work on it them and should have a much cleaner looking patch to post this evening.

(BTW: random thought that occurred to me last night: wouldn't the simplest way to implement the RamEstimator just be to use vanilla java serialization to a custom OutputStream that just counted the bytes and sent them to /dev/null) ?

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739098#action_12739098 ] 

Yonik Seeley commented on LUCENE-1749:
--------------------------------------

bq. Asking the searcher for the docFreq is the right thing to do... but people who rely on Weight being serializable might be in for a nasty surprise.

If we decide not to ask weight to hang onto it's searcher, then the other way to do it right is to change explain() to accept a Searcher as well as a IndexReader.

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740624#action_12740624 ] 

Michael McCandless commented on LUCENE-1749:
--------------------------------------------

Maybe we should simply print a warning, eg to System.err, on detecting that 2X RAM usage has occurred, pointing people to the sanity checker?  We could eg do it once only so we don't spam the stderr logs...

> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1749) FieldCache introspection API

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739163#action_12739163 ] 

Yonik Seeley commented on LUCENE-1749:
--------------------------------------

bq. Does Solr ever rely on Lucene's "implements Serializable"?

Nope - itra-node communications use the same mechanism as clients... a generic data structures (Map,List,Document,etc) that has custom serialization to XML,JSON,Python,Ruby or Binary (and binary is now used by default).


> FieldCache introspection API
> ----------------------------
>
>                 Key: LUCENE-1749
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1749
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: fieldcache-introspection.patch, LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the FieldCache to provide info about what is in the FieldCache at any given moment.  We should also provide utility methods for sanity checking that the FieldCache doesn't contain anything "odd"...
>    * entries for the same reader/field with different types/parsers
>    * entries for the same field/type/parser in a reader and it's subreader(s)
>    * etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org