Posted to dev@lucene.apache.org by "Smiley, David W." <ds...@mitre.org> on 2013/11/20 16:44:36 UTC

Should Postings codec be Solr's default for 'id' ?

The Postings codec seems ideal for primary key fields.  Shouldn't Solr use this for its 'id' field by default?

~ David

Re: Should Postings codec be Solr's default for 'id' ?

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Gentlemen,
By the way, Solr now has one more use case for the Pulsing codec: block
indexing, where a _root_ (id) value spans the whole block and mostly occurs
only once per segment. Ideally it would be encoded as a range of doc numbers
without gaps. WDYT? Is it worth raising an issue?
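
To make the range idea concrete, here is a minimal sketch, assuming the doc numbers of one block are contiguous and sorted. The class and method names (DocRangeSketch, encodeAsRange, decodeRange) are hypothetical and are not part of Lucene or Solr; this only illustrates storing a gap-free run as a (first doc, count) pair instead of listing every doc number.

```java
import java.util.Arrays;

/** Toy illustration only: not Lucene code. All names here are hypothetical. */
final class DocRangeSketch {

    /**
     * Returns {firstDoc, count} when the sorted doc numbers form a gap-free run,
     * or null when there is a gap (or no docs) and a normal encoding is needed.
     */
    static int[] encodeAsRange(int[] sortedDocs) {
        if (sortedDocs.length == 0) {
            return null;
        }
        for (int i = 1; i < sortedDocs.length; i++) {
            if (sortedDocs[i] != sortedDocs[i - 1] + 1) {
                return null; // gap found, so this is not a single range
            }
        }
        return new int[] {sortedDocs[0], sortedDocs.length};
    }

    /** Expands a {firstDoc, count} pair back into the full list of doc numbers. */
    static int[] decodeRange(int[] range) {
        int[] docs = new int[range[1]];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = range[0] + i;
        }
        return docs;
    }

    public static void main(String[] args) {
        // Assumed layout: a block of three children followed by their parent.
        int[] block = {10, 11, 12, 13};
        int[] range = encodeAsRange(block);
        System.out.println(Arrays.toString(range));                      // [10, 4]
        System.out.println(Arrays.equals(decodeRange(range), block));    // true
    }
}
```

Whatever the block size, the pair stays two integers, which is the saving the suggestion is after; whether a real postings format could exploit this is exactly the open question.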


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
 <mk...@griddynamics.com>

Re: Should Postings codec be Solr's default for 'id' ?

Posted by "Smiley, David W." <ds...@mitre.org>.
LOL, yeah, I meant the Pulsing codec.  Thanks for the pointer to that
issue; I vaguely recall it.  It wouldn't hurt to have a javadoc comment in
PulsingPostingsFormat explaining that it's not needed on a unique key field
because the default codec already does it.

~ David



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Should Postings codec be Solr's default for 'id' ?

Posted by Robert Muir <rc...@gmail.com>.
Did you mean Pulsing codec?

This has been happening automatically with the default codec for
"unique id" fields since 4.1
(https://issues.apache.org/jira/browse/LUCENE-4498)

Pulsing can still be used for other advanced use cases (e.g. specifying
that terms with docfreq < 3 get pulsed, or something like that), but it's
redundant and probably actually less efficient for the "id field" case
than just using the default codec.
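
For readers unfamiliar with the term, "pulsing" can be sketched as follows. This is a toy model, not the Lucene implementation: the class PulsingSketch, its TermEntry type, and the maxInlineDocFreq cutoff are all hypothetical names chosen for illustration. The assumption is simply that terms at or below a doc-frequency cutoff keep their doc numbers inline in the term dictionary, while more frequent terms store a pointer into a separate postings list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of pulsing. Not Lucene code; every name here is hypothetical. */
final class PulsingSketch {

    /** Term-dictionary entry: either inline doc numbers or an offset into the postings list. */
    static final class TermEntry {
        final int[] inlineDocs;   // non-null when the term was pulsed (stored inline)
        final int postingsOffset; // -1 when the term was pulsed

        TermEntry(int[] inlineDocs, int postingsOffset) {
            this.inlineDocs = inlineDocs;
            this.postingsOffset = postingsOffset;
        }

        boolean isPulsed() {
            return inlineDocs != null;
        }
    }

    final Map<String, TermEntry> termDict = new TreeMap<>();
    final List<int[]> postingsFile = new ArrayList<>(); // stands in for a separate postings file
    final int maxInlineDocFreq;

    PulsingSketch(int maxInlineDocFreq) {
        this.maxInlineDocFreq = maxInlineDocFreq;
    }

    /** Stores rare terms inline and frequent terms in the separate postings list. */
    void addTerm(String term, int[] docs) {
        if (docs.length <= maxInlineDocFreq) {
            termDict.put(term, new TermEntry(docs.clone(), -1));
        } else {
            postingsFile.add(docs.clone());
            termDict.put(term, new TermEntry(null, postingsFile.size() - 1));
        }
    }

    /** Looks up doc numbers; a pulsed term needs no second lookup in the postings list. */
    int[] docsFor(String term) {
        TermEntry entry = termDict.get(term);
        if (entry == null) {
            return new int[0];
        }
        return entry.isPulsed() ? entry.inlineDocs : postingsFile.get(entry.postingsOffset);
    }

    public static void main(String[] args) {
        PulsingSketch index = new PulsingSketch(1); // cutoff 1 models the unique-key case
        index.addTerm("id:doc-42", new int[] {7});
        index.addTerm("body:lucene", new int[] {1, 7, 9});
        System.out.println(index.termDict.get("id:doc-42").isPulsed());   // true
        System.out.println(index.termDict.get("body:lucene").isPulsed()); // false
    }
}
```

With a cutoff of 1, every unique-key term is stored inline, which is the case the default codec is said above to handle on its own; a higher cutoff corresponds to the "docfreq < 3" style of advanced use.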
