Posted to dev@lucene.apache.org by "Bill Bell (JIRA)" <ji...@apache.org> on 2011/08/01 05:01:14 UTC

[jira] [Created] (SOLR-2686) Extend FieldCache architecture to multiple Values

Extend FieldCache architecture to multiple Values
-------------------------------------------------

                 Key: SOLR-2686
                 URL: https://issues.apache.org/jira/browse/SOLR-2686
             Project: Solr
          Issue Type: Bug
            Reporter: Bill Bell


I would consider this a bug. It appears lots of people are working around this limitation.
Why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?

Then functions() will work properly, and we can easily do things like geodist() on a multiValued field.

Thoughts?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073538#comment-13073538 ] 

Uwe Schindler commented on LUCENE-3354:
---------------------------------------

bq. If that ability is removed from Lucene, I guess we could always move some of the old FieldCache logic to Solr though.

Solr can always use SlowMultiReaderWrapper (see above)

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Updated] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated LUCENE-3354:
------------------------------------------

    Fix Version/s: 4.0

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>             Fix For: 4.0
>
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073530#comment-13073530 ] 

Robert Muir commented on LUCENE-3354:
-------------------------------------

+1, die insanity, die.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Updated] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated LUCENE-3354:
------------------------------------------

    Attachment: LUCENE-3354.patch

Oops, uploaded the wrong patch.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>         Attachments: LUCENE-3354.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (SOLR-2686) Extend FieldCache architecture to multiple Values

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073515#comment-13073515 ] 

Michael McCandless commented on SOLR-2686:
------------------------------------------

+1, though really this should be a Lucene issue (FieldCache is in Lucene).

We actually have a start at this: the core part of UnInvertedField was factored into Lucene as oal.index.DocTermOrds.  I think all we need to do is make this accessible through FieldCache.
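To make the idea concrete, here is a minimal sketch of what "uninverting" a multi-valued field could look like. It is not the actual oal.index.DocTermOrds code; the class name MultiValuedOrdsSketch and its methods are hypothetical, and it only assumes that the index can be viewed as a sorted term -> doc IDs map. Each document ends up with the list of term ordinals it contains.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;

// Hypothetical illustration only; this is NOT Lucene's DocTermOrds API.
final class MultiValuedOrdsSketch {
    // ord -> term text, in sorted term order
    private final List<String> termsByOrd = new ArrayList<>();
    // doc ID -> ordinals of every term that document contains
    private final int[][] ordsByDoc;

    // postings: term -> doc IDs containing it (assumed input shape).
    MultiValuedOrdsSketch(SortedMap<String, int[]> postings, int maxDoc) {
        List<List<Integer>> perDoc = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            perDoc.add(new ArrayList<Integer>());
        }
        int ord = 0;
        // Walk terms in sorted order so ordinals compare like the terms do.
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            termsByOrd.add(entry.getKey());
            for (int doc : entry.getValue()) {
                perDoc.get(doc).add(ord);
            }
            ord++;
        }
        // Flatten into primitive arrays for cheap per-document lookup.
        ordsByDoc = new int[maxDoc][];
        for (int doc = 0; doc < maxDoc; doc++) {
            List<Integer> ords = perDoc.get(doc);
            ordsByDoc[doc] = new int[ords.size()];
            for (int i = 0; i < ords.size(); i++) {
                ordsByDoc[doc][i] = ords.get(i);
            }
        }
    }

    // All term ordinals for one document (empty if the doc has no value).
    int[] ords(int doc) {
        return ordsByDoc[doc];
    }

    // Resolve an ordinal back to its term.
    String term(int ord) {
        return termsByOrd.get(ord);
    }
}
```

A function query or sort over a multiValued field would then iterate ords(doc) rather than reading a single cached value per document.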

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: SOLR-2686
>                 URL: https://issues.apache.org/jira/browse/SOLR-2686
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Updated] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3354:
--------------------------------

    Attachment: LUCENE-3354_testspeed.patch

Attached is a patch that seems to help for me; it doesn't create such long Unicode strings in the test.

Is there some reason why the test would want very long strings?

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>             Fix For: 4.0
>
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch, LUCENE-3354_testspeed.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073532#comment-13073532 ] 

Martijn van Groningen commented on LUCENE-3354:
-----------------------------------------------

bq. What are thoughts on using DocValues rather than FieldCache?
Maybe both should be available. Not all fields have indexed docvalues.

bq. We should start with this in 4.0! For backwards compatibility we could still have the FieldCache class, but just delegating.
Changing the architecture seems like a big task to me. Maybe that should be done in a different issue. This issue will then depend on it.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073533#comment-13073533 ] 

Michael McCandless commented on LUCENE-3354:
--------------------------------------------

+1 to moving FC to atomic readers only, and let SlowMultiReaderWrapper absorb the insanity.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079284#comment-13079284 ] 

Martijn van Groningen commented on LUCENE-3354:
-----------------------------------------------

I opened LUCENE-3360 for moving FieldCache to IndexReader. This issue should be concerned with adding getDocTermOrds() to FieldCache.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Bill Bell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079203#comment-13079203 ] 

Bill Bell commented on LUCENE-3354:
-----------------------------------

Lots of activity... Can someone lead this?

Bill


> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Updated] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated LUCENE-3354:
------------------------------------------

    Attachment: LUCENE-3354.patch

Updated the patch. Added a test to TestFieldCache. I think this is ready to be committed. New issues should be concerned with integrating DocTermOrds into function queries, sorting, grouping and more.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086271#comment-13086271 ] 

Robert Muir commented on LUCENE-3354:
-------------------------------------

OK, thanks. I bet this was probably slowing things down for simpletext or something stupid :)

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>             Fix For: 4.0
>
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch, LUCENE-3354_testspeed.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086197#comment-13086197 ] 

Martijn van Groningen commented on LUCENE-3354:
-----------------------------------------------

I committed a fix. Tests pass now on my local box with -Dtests.multiplier=3.
If the build is successful on Jenkins we can close this issue.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>             Fix For: 4.0
>
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073536#comment-13073536 ] 

Yonik Seeley commented on LUCENE-3354:
--------------------------------------

bq. (including the broken Solr parts still using TopLevel FieldCache entries).

Some top-level field cache uses are very much by design in Solr.
If that ability is removed from Lucene, I guess we could always move some of the old FieldCache logic to Solr though.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Resolved] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen resolved LUCENE-3354.
-------------------------------------------

    Resolution: Fixed

Committed in revision 1158393 (trunk).

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>             Fix For: 4.0
>
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086251#comment-13086251 ] 

Robert Muir commented on LUCENE-3354:
-------------------------------------

Thanks Martijn: any idea how we can speed this test up? For our 'ant test' runs with multiplier=3, this takes a significant amount of time (over 15 minutes!), more than all the other tests combined.

Before the commit my builds were taking about 9 minutes, log here: http://sierranevada.servebeer.com/

{noformat}
    [junit] Testsuite: org.apache.lucene.search.TestFieldCache
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1,062.362 sec
{noformat}

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>             Fix For: 4.0
>
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073522#comment-13073522 ] 

Ryan McKinley commented on LUCENE-3354:
---------------------------------------

What are thoughts on using DocValues rather than FieldCache?

If we do choose to extend the FieldCache architecture, it would be so much cleaner if it were a simple Map<K,V> directly on the Reader rather than a static thing holding a WeakHashMap<Reader,Cache>.
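The contrast between the two designs can be sketched roughly as follows. This uses only JDK collections; StaticCacheSketch and ReaderWithCacheSketch are hypothetical stand-ins, not Lucene classes, and the loader function is an assumed placeholder for whatever actually uninverts a field.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-ins; none of these types are Lucene classes.

// Roughly the current shape: one global registry, weakly keyed by reader.
final class StaticCacheSketch {
    private static final Map<Object, Map<String, Object>> CACHES =
        new WeakHashMap<Object, Map<String, Object>>();

    // Entries vanish only once the reader key is garbage collected.
    static synchronized Object get(Object reader, String field,
                                   Function<String, Object> loader) {
        Map<String, Object> perReader = CACHES.get(reader);
        if (perReader == null) {
            perReader = new HashMap<String, Object>();
            CACHES.put(reader, perReader);
        }
        Object value = perReader.get(field);
        if (value == null) {
            value = loader.apply(field);
            perReader.put(field, value);
        }
        return value;
    }
}

// Roughly the suggested shape: the cache is a plain map owned by the reader.
final class ReaderWithCacheSketch {
    private final Map<String, Object> cache = new ConcurrentHashMap<String, Object>();

    // The cached entries live and die with this reader instance.
    Object cached(String field, Function<String, Object> loader) {
        return cache.computeIfAbsent(field, loader);
    }
}
```

With the second shape there is no global state to purge and no reliance on garbage collection timing; closing or dropping the reader releases its cache entries.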


> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073528#comment-13073528 ] 

Uwe Schindler commented on LUCENE-3354:
---------------------------------------

In general the FieldCache should come from the reader (and non-atomic readers should throw UOE) and not from a static method of a random abstract class somewhere in the search package. The original FieldCache design was broken and there are many issues around this. This would also remove the insanity issues. We can of course make SlowMultiReaderWrapper behave correctly, but then all users know that they do something wrong (including the broken Solr parts still using TopLevel FieldCache entries).

We should start with this in 4.0! For backwards compatibility we could still have the FieldCache class, but just delegating.
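As a rough sketch of that shape (all type and method names here are hypothetical, not the Lucene reader hierarchy): the atomic reader owns its cache, the composite reader throws UnsupportedOperationException, and a legacy static entry point merely delegates.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical types for illustration; not the Lucene reader hierarchy.
abstract class SketchReader {
    abstract int[] cachedInts(String field);
}

// An atomic (single-segment) reader owns its own cache.
final class AtomicSketchReader extends SketchReader {
    private final Map<String, int[]> source; // stands in for the segment's data
    private final Map<String, int[]> cache = new ConcurrentHashMap<String, int[]>();

    AtomicSketchReader(Map<String, int[]> source) {
        this.source = source;
    }

    @Override
    int[] cachedInts(String field) {
        return cache.computeIfAbsent(field, f -> {
            int[] values = source.get(f);
            if (values == null) {
                throw new IllegalArgumentException("unknown field: " + f);
            }
            return values.clone(); // "uninvert" once, then reuse
        });
    }
}

// A composite reader refuses, so callers must go per segment.
final class CompositeSketchReader extends SketchReader {
    final SketchReader[] subReaders;

    CompositeSketchReader(SketchReader... subReaders) {
        this.subReaders = subReaders;
    }

    @Override
    int[] cachedInts(String field) {
        throw new UnsupportedOperationException("ask the atomic sub-readers instead");
    }
}

// Backwards-compatible facade: the old static entry point just delegates.
final class LegacyFieldCacheSketch {
    static int[] getInts(SketchReader reader, String field) {
        return reader.cachedInts(field);
    }
}
```

Code that insists on a top-level view would have to wrap the composite reader explicitly, which is the role suggested for SlowMultiReaderWrapper in this discussion.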

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Moved] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller moved SOLR-2686 to LUCENE-3354:
-------------------------------------------

    Issue Type: Improvement  (was: Bug)
           Key: LUCENE-3354  (was: SOLR-2686)
       Project: Lucene - Java  (was: Solr)

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086266#comment-13086266 ] 

Martijn van Groningen commented on LUCENE-3354:
-----------------------------------------------

I don't think there is any reason to generate long Unicode strings. Only the cache behavior needs to be tested.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>             Fix For: 4.0
>
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch, LUCENE-3354_testspeed.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085032#comment-13085032 ] 

Michael McCandless commented on LUCENE-3354:
--------------------------------------------

Patch looks good, Martijn!

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073518#comment-13073518 ] 

Martijn van Groningen commented on LUCENE-3354:
-----------------------------------------------

+1. If DocTermOrds is available in FieldCache, then Grouping (the term-based implementation) can also use DocTermOrds.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073828#comment-13073828 ] 

Hoss Man commented on LUCENE-3354:
----------------------------------

bq. This would also remove the insanity issues. 

FWIW: the WeakHashMap isn't the sole source of "insanity". It can also come about from inconsistent usage for a single field (i.e., asking for both string and int caches for the same field).
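
To make the second source of "insanity" concrete, here is a minimal sketch of a cache keyed by (field, value type). It is a toy model under stated assumptions, not Lucene's FieldCache: the class ToyFieldCacheInsanity and its methods are hypothetical names invented for this illustration. It only shows how asking for the same field as two different types leaves two separate entries behind, which a checker can then report.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: NOT Lucene's FieldCache, just an illustration of the idea. */
public class ToyFieldCacheInsanity {
    // One cache entry per (field, value type) pair, keyed as "field/type".
    private final Map<String, Object> entries = new HashMap<>();
    // For each field, the value types it has been requested as.
    private final Map<String, List<String>> typesByField = new HashMap<>();

    /** Returns the cached entry for (field, type), creating it on first use. */
    public Object get(String field, String type) {
        List<String> types = typesByField.computeIfAbsent(field, f -> new ArrayList<>());
        if (!types.contains(type)) {
            types.add(type);
        }
        return entries.computeIfAbsent(field + "/" + type, k -> new Object());
    }

    /** Number of entries currently held by the toy cache. */
    public int size() {
        return entries.size();
    }

    /** Fields that were requested under more than one value type. */
    public List<String> inconsistentFields() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : typesByField.entrySet()) {
            if (e.getValue().size() > 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyFieldCacheInsanity cache = new ToyFieldCacheInsanity();
        cache.get("price", "int");
        cache.get("price", "string"); // same field, different type: a second entry
        cache.get("title", "string");
        System.out.println("entries: " + cache.size());
        System.out.println("inconsistent: " + cache.inconsistentFields());
    }
}
```

In this model the duplicate entry has nothing to do with how entries are held or evicted, which is the point made above: changing the map implementation alone would not remove it.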

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Reopened] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir reopened LUCENE-3354:
---------------------------------


Reopening: there is a problem in the test.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>             Fix For: 4.0
>
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086088#comment-13086088 ] 

Robert Muir commented on LUCENE-3354:
-------------------------------------

The new multivalued test in TestFieldCache exhibits some problems if NUM_ORDS > 2.

This is the case if you, e.g., use -Dtests.multiplier=3 (like Hudson does). I temporarily disabled it and put in a loud System.out.println:
{noformat}
-    NUM_ORDS = atLeast(2);
+    System.out.println("WARNING: NUM_ORDS is wired to 2, test fails otherwise!!!!!!!!!!!!!!!!!!!!!");
+    NUM_ORDS = 2; //atLeast(2);
{noformat}
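
As a rough illustration of why a multiplier of 3 trips the test, here is a simplified model of a helper that scales a minimum by a test multiplier. This is a sketch under assumptions: ToyAtLeast is a hypothetical class, the multiplier is passed in explicitly rather than read from -Dtests.multiplier, and the upper bound (1.5 times the scaled minimum) is an assumption of this sketch, not a statement about the real test framework.

```java
import java.util.Random;

/** Toy model only: a simplified stand-in for a multiplier-scaled "at least N" helper. */
public class ToyAtLeast {
    /**
     * Returns a random value in [min * multiplier, 1.5 * min * multiplier].
     * The upper bound is an assumption made for this sketch.
     */
    public static int atLeast(Random random, int min, int multiplier) {
        int scaledMin = min * multiplier;
        int scaledMax = scaledMin + scaledMin / 2;
        return scaledMin + random.nextInt(scaledMax - scaledMin + 1);
    }

    public static void main(String[] args) {
        Random random = new Random(42L);
        // With multiplier 3, atLeast(2) can never be as small as 2.
        System.out.println("multiplier=3: " + atLeast(random, 2, 3));
    }
}
```

Under this model, any multiplier above 1 pushes the value past 2 on every run, so a test that only works for exactly 2 ordinals would fail consistently on such a build.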

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>             Fix For: 4.0
>
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Updated] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated LUCENE-3354:
------------------------------------------

    Attachment:     (was: LUCENE-3360.patch)

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Resolved] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen resolved LUCENE-3354.
-------------------------------------------

    Resolution: Fixed

Test passes on Jenkins.

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>             Fix For: 4.0
>
>         Attachments: LUCENE-3354.patch, LUCENE-3354.patch, LUCENE-3354_testspeed.patch
>
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?



[jira] [Updated] (LUCENE-3354) Extend FieldCache architecture to multiple Values

Posted by "Martijn van Groningen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated LUCENE-3354:
------------------------------------------

    Attachment: LUCENE-3360.patch

Attached initial patch.
FieldCache has a new method:
{code}
FieldCache#getDocTermOrds(reader, field)
{code}

The DocTermOrdsCreator currently doesn't validate anything. I'm not sure what it should validate (DocTermsIndex doesn't validate either...).

This patch does *not* rely on the patch in LUCENE-3360. Implementing LUCENE-3360 properly might take some time. I think this issue can be implemented much more quickly.
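
For readers unfamiliar with the idea of per-document term ordinals for a multiValued field, here is a minimal sketch of such a structure. It is a toy model only: ToyDocTermOrds, its constructor, and its methods are hypothetical names, and the in-memory layout is an assumption chosen for clarity. It does not reproduce the DocTermOrds class or the patch attached here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Toy model only: names and layout are hypothetical, not Lucene's DocTermOrds. */
public class ToyDocTermOrds {
    private final String[] terms;    // sorted unique terms; array index == ordinal
    private final int[][] ordsByDoc; // per document, ordinals in ascending order

    /** Builds the structure from each document's (possibly empty) list of values. */
    public ToyDocTermOrds(List<List<String>> docs) {
        TreeSet<String> unique = new TreeSet<>();
        for (List<String> values : docs) {
            unique.addAll(values);
        }
        terms = unique.toArray(new String[0]);

        ordsByDoc = new int[docs.size()][];
        for (int doc = 0; doc < docs.size(); doc++) {
            // De-duplicate and sort this document's values, then map them to ordinals.
            TreeSet<String> docTerms = new TreeSet<>(docs.get(doc));
            int[] ords = new int[docTerms.size()];
            int i = 0;
            for (String term : docTerms) {
                ords[i++] = Arrays.binarySearch(terms, term);
            }
            ordsByDoc[doc] = ords;
        }
    }

    /** Ordinals of all distinct values of the given document. */
    public int[] ords(int doc) {
        return ordsByDoc[doc].clone();
    }

    /** Resolves an ordinal back to its term. */
    public String term(int ord) {
        return terms[ord];
    }

    public static void main(String[] args) {
        ToyDocTermOrds ords = new ToyDocTermOrds(List.of(
                List.of("b", "a"),
                List.of(),
                List.of("c", "a", "c")));
        for (int doc = 0; doc < 3; doc++) {
            System.out.println("doc " + doc + " -> " + Arrays.toString(ords.ords(doc)));
        }
    }
}
```

The point of the layout is that a consumer such as a function or a grouping collector can iterate a document's ordinals and resolve terms only when needed, instead of assuming exactly one value per document.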

> Extend FieldCache architecture to multiple Values
> -------------------------------------------------
>
>                 Key: LUCENE-3354
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3354
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Bill Bell
>
> I would consider this a bug. It appears lots of people are working around this limitation, 
> why don't we just change the underlying data structures to natively support multiValued fields in the FieldCache architecture?
> Then functions() will work properly, and we can do things like easily geodist() on a multiValued field.
> Thoughts?
