Posted to solr-user@lucene.apache.org by Pablo Anzorena <an...@gmail.com> on 2017/07/03 17:52:41 UTC

Solr dynamic "on the fly fields"

Hey,

I was wondering if there is some way to add fields "on the fly" based on
arithmetic operations on other fields. For example, add a new field
"custom_field" = log(field1) + field2 - 5.

Thanks.

Re: Solr dynamic "on the fly fields"

Posted by Erick Erickson <er...@gmail.com>.
Some aggregations are supported by combining stats with pivot facets. See:

https://lucidworks.com/2015/01/29/you-got-stats-in-my-facets/

I don't think that quite works for your use case, though.

The other thing that _might_ help is the Streaming
Expression/Streaming Aggregation work.

Best,
Erick


On Wed, Jul 5, 2017 at 6:23 AM, Pablo Anzorena <an...@gmail.com> wrote:
> Thanks Erick for the answer. Function Queries are great, but for my use
> case what I really do is making aggregations (using Json Facet for example)
> with this functions.
>
> I have tried using Function Queries with Json Facet but it does not support
> it.
>
> Any other idea you can imagine?

Re: Solr dynamic "on the fly fields"

Posted by Pablo Anzorena <an...@gmail.com>.
Thanks for the answer, Erick. Function Queries are great, but what my use
case really needs is aggregations (using JSON Facet, for example) over
these functions.

I have tried using Function Queries with JSON Facet, but it does not support
them.

Any other ideas?
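
For reference, a minimal sketch of the kind of request being discussed. As far as I know, the JSON Facet documentation describes numeric aggregations such as sum and avg as accepting a nested function of several numeric fields, but whether that works depends on the Solr version in use, so treat this as something to try rather than a confirmed answer. The filter on clientId and the facet names custom_sum and custom_avg are made up for illustration. The code only builds the request body; it does not contact Solr.

```python
import json


def facet_request(client_id=43):
    """Build a JSON Request API body that aggregates over a derived expression."""
    # Solr function syntax: sum(a,b) adds, sub(a,b) subtracts, log(x) is base 10.
    expr = "sub(sum(log(field1),field2),5)"
    return {
        "query": "*:*",
        "filter": [f"clientId:{client_id}"],
        "limit": 0,  # only the aggregations are wanted, not the documents
        "facet": {
            # Hypothetical facet names; the nested function is the point here.
            "custom_sum": f"sum({expr})",
            "custom_avg": f"avg({expr})",
        },
    }


body = json.dumps(facet_request())
print(body)
```

The body would be POSTed to the collection's query handler. If the version in use rejects the nested function, that confirms the limitation described above.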





2017-07-03 21:57 GMT-03:00 Erick Erickson <er...@gmail.com>:

> I don't know how one would do this. But I would ask what the use-case
> is. Creating such fields at index time just seems like it would be
> inviting abuse by creating a zillion fields as you have no control
> over what gets created. I'm assuming your tenants don't talk to each
> other....
>
> Have you thought about using function queries to pull this data out as
> needed at _query_ time? See:
> https://cwiki.apache.org/confluence/display/solr/Function+Queries
>
> Best,
> Erick

Re: Solr dynamic "on the fly fields"

Posted by Erick Erickson <er...@gmail.com>.
I don't know how one would do this. But I would ask what the use-case
is. Creating such fields at index time just seems like it would be
inviting abuse by creating a zillion fields as you have no control
over what gets created. I'm assuming your tenants don't talk to each
other....

Have you thought about using function queries to pull this data out as
needed at _query_ time? See:
https://cwiki.apache.org/confluence/display/solr/Function+Queries
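
A minimal sketch of what that could look like from a client, assuming the expression from the original question. The host, the collection name "mycollection" and the alias "custom_field" are placeholders. Note that Solr's log() function is base 10 (ln() is the natural log). The code only builds the request URL.

```python
from urllib.parse import urlencode


def derived_field_params(alias="custom_field", client_id=43):
    """Build select parameters that return a computed pseudo-field per document."""
    # Solr function syntax: sum(a,b) adds, sub(a,b) subtracts, log(x) is base 10.
    func = "sub(sum(log(field1),field2),5)"
    return {
        "q": "*:*",
        "fq": f"clientId:{client_id}",
        # "alias:function" in fl returns the function value under that name.
        "fl": f"id,{alias}:{func}",
    }


params = derived_field_params()
# Hypothetical host and collection name.
url = "http://localhost:8983/solr/mycollection/select?" + urlencode(params)
print(url)
```

Nothing is written to the index this way; the value is computed for each returned document at query time.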

Best,
Erick

On Mon, Jul 3, 2017 at 12:06 PM, Pablo Anzorena <an...@gmail.com> wrote:
> Thanks Erick,
>
> For my use case it's not possible any of those solutions. I have a
> multitenancy scheme in the most basic level, that is I have a single
> collection with fields (clientId, field1, field2, ..., field50) attending
> many clients.
>
> Clients can create custom fields based on arithmetic operations of any
> other field.
>
> So, is it possible to update let's say field49 with the follow operation:
> log(field39) + field25 on clientId=43?
>
> Do field39 and field25 need to be stored to accomplish this? Is there any
> other way to avoid storing them?
>
> Thanks!

Re: Solr dynamic "on the fly fields"

Posted by Pablo Anzorena <an...@gmail.com>.
Thanks Erick,

Neither of those solutions is possible for my use case. I have
multitenancy at the most basic level: a single collection with fields
(clientId, field1, field2, ..., field50) serving many clients.

Clients can create custom fields based on arithmetic operations on any
other fields.

So, is it possible to update, say, field49 with the following operation
for clientId=43: log(field39) + field25?

Do field39 and field25 need to be stored to accomplish this? Is there any
way to avoid storing them?
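
To make the question concrete, here is a rough sketch of one client-side way to do it, not something confirmed in this thread: read field39 and field25 back for the tenant's documents, compute the value, and send atomic updates that set field49. This assumes the source values can be retrieved at all (stored or docValues), and atomic updates have their own requirements on how the other fields are configured. The function name and the sample document are made up, and math.log10 is an assumption about which log is meant.

```python
import math


def atomic_updates(docs, target="field49"):
    """Build atomic-update documents that set `target` on each fetched doc."""
    updates = []
    for d in docs:
        # log(field39) + field25, using base 10 here as an assumption.
        value = math.log10(d["field39"]) + d["field25"]
        # {"set": value} replaces only the target field in an atomic update.
        updates.append({"id": d["id"], target: {"set": value}})
    return updates


# Hypothetical document, as it might come back from a query on clientId:43.
fetched = [{"id": "a", "field39": 10.0, "field25": 1.5}]
print(atomic_updates(fetched))
```

The resulting list would be posted as JSON to the collection's update handler.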

Thanks!


2017-07-03 15:00 GMT-03:00 Erick Erickson <er...@gmail.com>:

> There are two ways:
> 1> define a dynamic field pattern, i.e.
>
> <dynamicField name="*_sum" type="float" ...../>
>
> Now just add any field in the doc you want. If it ends in "_sum" and
> no other explicit field matches you have a new field.
>
> 2> Use the managed schema to add these on the fly. I don't recommend
> this from what I know of your use case, this is primarily intended for
> front-ends to be able to modify the schema and/or "field guessing".
>
> I do caution you though that either way don't go over-the-top. If
> you're thinking of thousands of different fields that can lead to
> performance issues.
>
> You can either put stuff in the field on your indexing client or
> create a custom update component, perhaps the simplest would be a
> "StatelessScriptUpdateProcessorFactory:
>
> see: https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors#UpdateRequestProcessors-UpdateRequestProcessorFactories
>
> Best,
> Erick

Re: Solr dynamic "on the fly fields"

Posted by Erick Erickson <er...@gmail.com>.
There are two ways:
1> Define a dynamic field pattern, e.g.

<dynamicField name="*_sum" type="float" ...../>

Now just add any field in the doc you want. If it ends in "_sum" and
no other explicit field matches you have a new field.

2> Use the managed schema to add these on the fly. I don't recommend
this from what I know of your use case; it is primarily intended for
front-ends to be able to modify the schema and/or "field guessing".

I do caution you, though: either way, don't go over the top.
Thousands of different fields can lead to performance issues.

You can either populate the field in your indexing client or
create a custom update component; perhaps the simplest would be a
StatelessScriptUpdateProcessorFactory:

see: https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors#UpdateRequestProcessors-UpdateRequestProcessorFactories
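
A minimal sketch of the indexing-client option, assuming the "*_sum" dynamic field pattern above and the expression from the question. The function name, the field name "custom_field_sum" and the sample document are illustrative only, and math.log10 is an assumption about which log is meant.

```python
import math


def add_derived_field(doc, name="custom_field_sum"):
    """Return a copy of doc with a derived value under a *_sum dynamic field."""
    enriched = dict(doc)
    # log(field1) + field2 - 5, as in the question. math.log10 raises
    # ValueError for values <= 0, so real data may need a guard here.
    enriched[name] = math.log10(doc["field1"]) + doc["field2"] - 5
    return enriched


# Hypothetical document about to be sent to Solr.
doc = {"id": "1", "clientId": 43, "field1": 100.0, "field2": 7.0}
print(add_derived_field(doc))
```

Because the name ends in "_sum", the dynamic field pattern picks it up without a schema change. The value is fixed at index time, so a changed formula means reindexing the affected documents.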

Best,
Erick

On Mon, Jul 3, 2017 at 10:52 AM, Pablo Anzorena <an...@gmail.com> wrote:
> Hey,
>
> I was wondering if there is some way to add fields "on the fly" based on
> arithmetic operations on other fields. For example add a new field
> "custom_field" = log(field1) + field2 -5.
>
> Thanks.