Posted to solr-user@lucene.apache.org by William Bell <bi...@gmail.com> on 2014/02/01 05:20:26 UTC

facet.prefix or separation?

Which should perform better for getting the facets that begin with A?

1.
facet=true&facet.field=conditions&facet.prefix=A

2.
At index time, create a new field conditions_A and facet on it:
facet=true&facet.field=conditions_A
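
For illustration, the two options above can be written as parameter builders. This is a minimal sketch: the helper names prefix_params and partitioned_params are hypothetical, and the output is only the facet portion of a request, to be appended to whatever select query is already in use.

```python
from urllib.parse import urlencode


def prefix_params(field, prefix):
    """Option 1: facet on the full field and restrict values with facet.prefix."""
    return urlencode([
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),
    ])


def partitioned_params(field, letter):
    """Option 2: facet on a per-letter field such as conditions_A."""
    return urlencode([
        ("facet", "true"),
        ("facet.field", f"{field}_{letter}"),
    ])


if __name__ == "__main__":
    print(prefix_params("conditions", "A"))
    print(partitioned_params("conditions", "A"))
```

Option 1 keeps a single field and varies only the prefix; option 2 varies the field name, so every letter needs its own field in the index.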

Thoughts?



-- 
Bill Bell
billnbell@gmail.com
cell 720-256-8076

Re: facet.prefix or separation?

Posted by William Bell <bi...@gmail.com>.
This is for a "words that begin with" feature, using an alpha-span on the
site:

A B C D E F G ...

The user clicks "A" and I would use conditions_A.



On Fri, Jan 31, 2014 at 9:42 PM, Alexandre Rafalovitch
<ar...@gmail.com> wrote:

> OK, so you are pre-partitioning the facet field by initial
> letter: all the texts that start with A go into conditions_A,
> and all the texts that start with C go into conditions_C.
> Interesting approach. Ignore what I said before.
>
> If this does not cause other issues, then the partitioned approach
> may be slightly more efficient, because it does not need to load the
> field cache for non-A content into memory. But that could be more a
> gain in memory footprint than in speed.
>
> Regards,
>    Alex.
> Personal website: http://www.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
> book)



-- 
Bill Bell
billnbell@gmail.com
cell 720-256-8076

Re: facet.prefix or separation?

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
OK, so you are pre-partitioning the facet field by initial
letter: all the texts that start with A go into conditions_A,
and all the texts that start with C go into conditions_C.
Interesting approach. Ignore what I said before.

If this does not cause other issues, then the partitioned approach
may be slightly more efficient, because it does not need to load the
field cache for non-A content into memory. But that could be more a
gain in memory footprint than in speed.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Sat, Feb 1, 2014 at 11:33 AM, William Bell <bi...@gmail.com> wrote:
> Just to be perfectly clear, it is not a binary field.
>
> conditions = "A west side story"
> conditions = "The edge of reason"
>
> I look for the strings beginning with A and put them in conditions_A:
>
> conditions_A = "A west side story"
>
> OK?

Re: facet.prefix or separation?

Posted by William Bell <bi...@gmail.com>.
Just to be perfectly clear, it is not a binary field.

conditions = "A west side story"
conditions = "The edge of reason"

I look for the strings beginning with A and put them in conditions_A:

conditions_A = "A west side story"

OK?
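
The index-time step described above could look roughly like this. It is a sketch only: the function name partition_fields is hypothetical, it assumes conditions is multi-valued, and it assumes the schema accepts fields named conditions_<letter> (for example through a dynamic field), none of which is stated in the thread.

```python
def partition_fields(conditions):
    """Copy each value into a per-letter field keyed on its first character.

    Values that do not start with an ASCII letter stay only in the
    original field (an assumption made for this sketch).
    """
    doc = {"conditions": list(conditions)}
    for value in conditions:
        first = value.strip()[:1]
        if first.isascii() and first.isalpha():
            doc.setdefault(f"conditions_{first.upper()}", []).append(value)
    return doc


if __name__ == "__main__":
    print(partition_fields(["A west side story", "The edge of reason"]))
```

With the two example values, "A west side story" ends up in conditions_A and "The edge of reason" in conditions_T, while conditions still holds both.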


On Fri, Jan 31, 2014 at 9:29 PM, Alexandre Rafalovitch
<ar...@gmail.com> wrote:

> I am quite sure that the binary flag will be faster, as you will just
> get a gigantic vector pre-loaded into memory. The problem starts if
> you are going to have lots of those prefixes: then the memory
> requirements may become an issue. In that case facet.prefix is more
> flexible, as it uses the same list for any arbitrary prefix.
>
> These are my thoughts (as requested). I haven't tested this in production.
>
> Regards,
>    Alex.
> Personal website: http://www.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
> book)



-- 
Bill Bell
billnbell@gmail.com
cell 720-256-8076

Re: facet.prefix or separation?

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
I am quite sure that the binary flag will be faster, as you will just
get a gigantic vector pre-loaded into memory. The problem starts if
you are going to have lots of those prefixes: then the memory
requirements may become an issue. In that case facet.prefix is more
flexible, as it uses the same list for any arbitrary prefix.

These are my thoughts (as requested). I haven't tested this in production.
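
To illustrate the "same list for any arbitrary prefix" point, here is a toy sketch of selecting a prefix range from one sorted term list. It is not Solr's actual implementation, only an assumption about the general idea; the function name terms_with_prefix and the sample terms "Arthritis" and "Asthma" are made up for the example.

```python
from bisect import bisect_left


def terms_with_prefix(sorted_terms, prefix):
    """Return the contiguous run of terms that start with prefix.

    Relies on sorted_terms being in lexicographic order, so all matching
    terms sit next to each other starting at the bisect position.
    """
    start = bisect_left(sorted_terms, prefix)
    end = start
    while end < len(sorted_terms) and sorted_terms[end].startswith(prefix):
        end += 1
    return sorted_terms[start:end]


if __name__ == "__main__":
    terms = sorted(["A west side story", "Asthma", "The edge of reason", "Arthritis"])
    print(terms_with_prefix(terms, "A"))
    print(terms_with_prefix(terms, "As"))
```

One list serves "A", "As", or any other prefix, whereas per-letter fields only answer the single-letter case they were built for.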

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Sat, Feb 1, 2014 at 11:20 AM, William Bell <bi...@gmail.com> wrote:
> Which should perform better for getting the facets that begin with A?
>
> 1.
> facet=true&facet.field=conditions&facet.prefix=A
>
> 2.
> At index time, create a new field conditions_A and facet on it:
> facet=true&facet.field=conditions_A
>
> Thoughts?
>
>
>
> --
> Bill Bell
> billnbell@gmail.com
> cell 720-256-8076