Posted to solr-user@lucene.apache.org by Christopher ARZUR <ch...@cognix-systems.com> on 2013/03/19 15:30:59 UTC

Bitwise operation

Hi,

Does Solr (4.1.0) support bitwise AND or bitwise OR operators, so 
that a field can be compared against the index using 
bitwise AND or OR?

Thanks,
-- 
Christopher

Re: Bitwise operation

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Christopher,

Would you mind if I ask you for a sample?
On 19.03.2013 at 19:31, "Christopher ARZUR" <
christopher.arzur@cognix-systems.com> wrote:

> Hi,
>
> Does Solr (4.1.0) support bitwise AND or bitwise OR operators, so that a
> field can be compared against the index using bitwise AND or OR?
>
> Thanks,
> --
> Christopher
>

Re: Bitwise operation

Posted by Upayavira <uv...@odoko.co.uk>.
Not to my knowledge. I guess the nearest might be regular expressions
but that would involve one character, rather than one bit per element,
so not nearly as efficient.

How many bits? Can you break them down into separate fields?

Upayavira

On Tue, Mar 19, 2013, at 02:30 PM, Christopher ARZUR wrote:
> Hi,
> 
> Does Solr (4.1.0) support bitwise AND or bitwise OR operators, so 
> that a field can be compared against the index using 
> bitwise AND or OR?
> 
> Thanks,
> -- 
> Christopher

Re: Bitwise operation

Posted by Walter Underwood <wu...@wunderwood.org>.
How often is "frequently"? If it is 1000/second, you have a problem, but you'd have a problem with most solutions.

Measure or estimate how many documents are affected, and how often. Then set a latency target: how long you can wait before a change is visible.

With those, you can evaluate solutions. Without those, you'll never know if it works, even after you build it.

Hundreds of thousands of documents is a small to moderate sized index. At Netflix, we reindexed a 250,000 doc index in 20 minutes. That was Solr 1.3 -- Solr is much faster now.
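
As a rough illustration of the estimate described above, the reindexing figure can be turned into a baseline rate and compared against a latency budget. This is only a sketch: the 250,000 documents in 20 minutes comes from this message, while the number of affected documents and the latency budget are hypothetical placeholders to be replaced with measured values.

```python
# Back-of-envelope reindex estimate. Only the 250,000-docs-in-20-minutes
# figure comes from the thread; the other numbers are hypothetical.

def reindex_seconds(affected_docs: int, docs_per_second: float) -> float:
    """Time needed to reindex `affected_docs` at a steady indexing rate."""
    return affected_docs / docs_per_second

baseline_rate = 250_000 / (20 * 60)  # docs/second, from the Solr 1.3 figure
affected_docs = 5_000                # hypothetical: docs touched by one ACL change
latency_budget = 60.0                # hypothetical: seconds until change must be visible

needed = reindex_seconds(affected_docs, baseline_rate)
print(f"baseline: {baseline_rate:.0f} docs/s, reindex takes about {needed:.0f} s")
print("fits the budget" if needed <= latency_budget else "too slow, measure further")
```

Since the baseline is an old Solr 1.3 number, it is probably a pessimistic starting point; a measurement on the real index should replace it.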

I would go ahead with the dynamic boolean fields solution and measure it. If the performance is close, use SSD for storage or machines with lots of RAM available for file buffers.

Atomic field-level updates may be helpful: http://wiki.apache.org/solr/Atomic_Updates
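
A minimal sketch of what an atomic update body might look like for an ACL change, using the "set" modifier described on that wiki page. The field name acl_groups and the document id are hypothetical, and the code only builds the JSON; sending it to the update handler is left out. As far as I understand, atomic updates also need the update log enabled and the document's other fields stored, so check that against your schema.

```python
import json

def acl_update(doc_id, groups, field="acl_groups"):
    """Build a JSON atomic-update body that replaces one document's ACL field.

    `field` is a hypothetical multi-valued field holding group ids.
    """
    return json.dumps([{"id": doc_id, field: {"set": sorted(groups)}}])

body = acl_update("doc42", {"g7", "g12"})
print(body)
```

Only the ACL field is sent, so the client does not have to resend the rest of the document when permissions change.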

Even if you need to use a different approach, you'll know Solr a lot better after using the straightforward design.

wunder

On Mar 21, 2013, at 1:52 AM, Christopher ARZUR wrote:

> @Jan Høydahl: do you mean a "custom filter"?
> @Walter Underwood: I agree with you, and I would prefer to use only native Solr features, but I do not know how to solve my problem. My ACLs are composed of thousands of groups (with inheritance) that carry deny/allow rights for users, who themselves have deny/allow rights, and these rights may change frequently.
> 
> I do not really see which solution to adopt. Thank you for your help.
> 




Re: Bitwise operation

Posted by Upayavira <uv...@odoko.co.uk>.
You could use the same approach for users as for groups - have a {!join}
filter query to select docs that a user is allowed to see, and another
to select groups they are allowed to see.

I've no idea how performant this would be for you, as it depends on how
many documents a single user is allowed to view.

Upayavira

On Thu, Mar 21, 2013, at 10:55 AM, Christopher ARZUR wrote:
> Users also have rights at the individual level (in addition to those 
> inherited from their groups). Does your solution imply that I would attach 
> potentially 1 million identifiers to each document? I do not know Solr's 
> limits, but I suspect I would be close to them.
> 

Re: Bitwise operation

Posted by Christopher ARZUR <ch...@cognix-systems.com>.
Users also have rights at the individual level (in addition to those 
inherited from their groups). Does your solution imply that I would attach 
potentially 1 million identifiers to each document? I do not know Solr's 
limits, but I suspect I would be close to them.

On 21/03/2013 at 10:47, Upayavira wrote:
> You could attach the doc rights to the document itself, and then index
> the group rights into a separate core, and then use pseudo-joins to
> filter them. Effectively, you would say, "find me all the groups that my
> user is allowed to see, then find me all documents that are in those
> groups, based upon the document's ACLs".
>
> Would that work?
>
> Upayavira
>


Re: Bitwise operation

Posted by Upayavira <uv...@odoko.co.uk>.
You could attach the doc rights to the document itself, and then index
the group rights into a separate core, and then use pseudo-joins to
filter them. Effectively, you would say, "find me all the groups that my
user is allowed to see, then find me all documents that are in those
groups, based upon the document's ACLs".
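
The pseudo-join described above could be expressed as a filter query along these lines. This is a sketch under assumptions: the core name (groups) and the field names (group_id, allowed_users, acl_groups) are invented for illustration; only the {!join from=... to=... fromIndex=...} local-params syntax is Solr's own.

```python
def join_filter(user, from_core="groups", from_field="group_id",
                to_field="acl_groups", user_field="allowed_users"):
    """Build a Solr join filter query.

    The inner query runs against `from_core` and finds the groups the user
    may see; their `from_field` values are then matched against `to_field`
    on the documents in the main core. All names here are hypothetical.
    """
    local_params = "{!join from=%s to=%s fromIndex=%s}" % (
        from_field, to_field, from_core)
    return local_params + "%s:%s" % (user_field, user)

fq = join_filter("alice")
print(fq)
```

With this layout, a change to a group's rights only touches the small document for that group in the groups core, not the content documents.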

Would that work?

Upayavira

On Thu, Mar 21, 2013, at 08:52 AM, Christopher ARZUR wrote:
> @Jan Høydahl: do you mean a "custom filter"?
> @Walter Underwood: I agree with you, and I would prefer to use only native 
> Solr features, but I do not know how to solve my problem. My ACLs are 
> composed of thousands of groups (with inheritance) that carry deny/allow 
> rights for users, who themselves have deny/allow rights, and these rights 
> may change frequently.
> 
> I do not really see which solution to adopt. Thank you for your help.
> 

Re: Bitwise operation

Posted by Christopher ARZUR <ch...@cognix-systems.com>.
@Jan Høydahl: do you mean a "custom filter"?
@Walter Underwood: I agree with you, and I would prefer to use only native 
Solr features, but I do not know how to solve my problem. My ACLs are 
composed of thousands of groups (with inheritance) that carry deny/allow 
rights for users, who themselves have deny/allow rights, and these rights 
may change frequently.

I do not really see which solution to adopt. Thank you for your help.

On 20/03/2013 at 15:48, Walter Underwood wrote:
> I agree. Your first step should not be trying to make Solr work the way you think it should. Try really hard to use the existing features; they are there because they solve a LOT of problems.
>
> Updates are pretty fast, really.
>
> wunder
>


Re: Bitwise operation

Posted by Walter Underwood <wu...@wunderwood.org>.
I agree. Your first step should not be trying to make Solr work the way you think it should. Try really hard to use the existing features; they are there because they solve a LOT of problems.

Updates are pretty fast, really.

wunder

On Mar 20, 2013, at 2:36 AM, Jan Høydahl wrote:

> Don't try to optimize something which is not a problem.
> 
> This is what "everyone" does - update documents when the ACLs for those documents change, even with multi-million-document indexes. It works like a charm. Or do you have a special use case where the permissions for an average document change several times a day? If not, you should be fine!
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
> 
> 

--
Walter Underwood
wunder@wunderwood.org




Re: Bitwise operation

Posted by Jan Høydahl <ja...@cominvent.com>.
Implement filtering at both the user and group levels. Record in the document's ACL fields which group(s) it belongs to, and when people search, look up which groups they are entitled to see and add that as a filter. Then, if the rights for a group change, you don't need to reindex the documents, since they still belong to the group; instead you change the search filter to match the new group ACL.
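
The query-time filtering described above might look like the following sketch. The field name acl_groups, the group ids, and the search term are hypothetical, and the lookup of a user's groups (for example from Zend Framework ACL) is assumed to happen in the application before the request is built.

```python
from urllib.parse import urlencode

def group_filter(groups, field="acl_groups"):
    """Build a filter query matching documents in any of the given groups.

    `groups` is the set of group ids the current user may see, resolved by
    the application; `field` is a hypothetical multi-valued document field.
    """
    if not groups:
        # A user with no visible groups should not reach Solr at all.
        raise ValueError("user belongs to no groups")
    return field + ":(" + " OR ".join(sorted(groups)) + ")"

# Request parameters for a search restricted to the user's groups.
params = urlencode({"q": "report", "fq": group_filter({"g7", "g12"})})
print(params)
```

When a group's rights change, only the set passed to group_filter changes; the indexed documents keep the same group ids.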

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 20 March 2013 at 11:38, Christopher ARZUR <ch...@cognix-systems.com> wrote:

> Actually, my goal is to integrate the Zend Framework ACL into Solr. My problem mainly concerns inheritance: if the rights of a group of documents change, I cannot go through all the documents in the group and update them.


Re: Bitwise operation

Posted by Christopher ARZUR <ch...@cognix-systems.com>.
Actually, my goal is to integrate the Zend Framework ACL into Solr. My problem 
mainly concerns inheritance: if the rights of a group of documents 
change, I cannot go through all the documents in the group and update 
them.

Re: Bitwise operation

Posted by Jan Høydahl <ja...@cominvent.com>.
Don't try to optimize something which is not a problem.

This is what "everyone" does - update documents when the ACLs for those documents change, even with multi-million-document indexes. It works like a charm. Or do you have a special use case where the permissions for an average document change several times a day? If not, you should be fine!

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 20 March 2013 at 10:01, Christopher ARZUR <ch...@cognix-systems.com> wrote:

> Hello and thank you for your answers.
> I'll try to explain my problem a little better:
> 
> The goal is to manage ACLs via Solr without reindexing the documents at each change of permission. I have hundreds of thousands of documents, users, and groups, with permissions (allow / deny) for each of these groups and users.
> 
> I have read a lot about this, and it seems that binary comparison is the best solution ... but I may be wrong :/
> 
> To do this I found a plugin here: https://issues.apache.org/jira/browse/SOLR-1913 with an example here: https://docs.google.com/document/d/10HuqHkYjaEm6Q2ZrRCI0QQMLbbqtRv_NXDHcTjfCRfU/edit?fold=1
> 
> Unfortunately I have not managed to install the plugin, although I believe I followed the steps.
> 
> For the moment I have settled on a solution that uses dynamic boolean fields, but I am afraid of running into performance issues with large volumes of documents and rights.
> 
> I hope I have been more specific. I await your ideas, because I am not sure I have chosen the right solution ... small clarification: I discovered Solr only a few months ago :s
> 
> Thanks,
> Christopher


Re: Bitwise operation

Posted by Christopher ARZUR <ch...@cognix-systems.com>.
Hello and thank you for your answers.
I'll try to explain my problem a little better:

The goal is to manage ACLs via Solr without reindexing the documents at 
each change of permission. I have hundreds of thousands of documents, 
users, and groups, with permissions (allow / deny) for each of these 
groups and users.

I have read a lot about this, and it seems that binary 
comparison is the best solution ... but I may be wrong :/

To do this I found a plugin here: 
https://issues.apache.org/jira/browse/SOLR-1913 with an example here: 
https://docs.google.com/document/d/10HuqHkYjaEm6Q2ZrRCI0QQMLbbqtRv_NXDHcTjfCRfU/edit?fold=1

Unfortunately I have not managed to install the plugin, 
although I believe I followed the steps.

For the moment I have settled on a solution that uses dynamic boolean fields, 
but I am afraid of running into performance issues with large volumes of 
documents and rights.
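
To make the dynamic boolean field approach concrete, here is a small sketch of how documents and filters might be built. It assumes the schema declares a dynamic field pattern such as *_b for booleans (as Solr's example schema does); the allow_ prefix, the group ids, and the document id are hypothetical.

```python
def acl_fields(allowed_groups, prefix="allow_"):
    """One boolean dynamic field per group that is allowed to see the document."""
    return {f"{prefix}{g}_b": True for g in sorted(allowed_groups)}

def acl_filter(user_groups, prefix="allow_"):
    """Filter query matching documents visible to any of the user's groups."""
    return " OR ".join(f"{prefix}{g}_b:true" for g in sorted(user_groups))

# Document as it would be sent for indexing, and a filter for one user.
doc = {"id": "doc42", **acl_fields({"g7", "g12"})}
fq = acl_filter({"g7"})
print(doc)
print(fq)
```

The number of distinct field names grows with the number of groups, which is the volume concern raised above and the part worth measuring.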

I hope I have been more specific. I await your ideas, because I am not 
sure I have chosen the right solution ... small clarification: I discovered 
Solr only a few months ago :s

Thanks,
Christopher

Re: Bitwise operation

Posted by Jack Krupansky <ja...@basetechnology.com>.
The simple answer is no. But the real question is what you are trying to 
accomplish. Lucene and Solr are built and optimized around the concept of a 
Boolean Query with AND, OR, and NOT terms/clauses - that should be 
sufficient to implement whatever it is that you are trying to implement. For 
example, you could have a tokenized (text) field with a list of tokens, each 
equivalent to one of your so-called "bits"; then you can use Boolean query 
to specify which tokens are required, optional, or forbidden for that field.
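
A minimal sketch of the tokenized-field idea above, building a Boolean query string with required (+), optional, and forbidden (-) clauses. The field name flags and the token names are hypothetical, and the tokens are assumed to be plain words that need no query-syntax escaping.

```python
def bits_query(field, required=(), optional=(), forbidden=()):
    """Build a Lucene/Solr Boolean query over one tokenized field.

    Each token stands for one "bit": '+' marks a token that must be present,
    '-' one that must be absent, and no prefix an optional token.
    """
    clauses = [f"+{field}:{t}" for t in required]
    clauses += [f"{field}:{t}" for t in optional]
    clauses += [f"-{field}:{t}" for t in forbidden]
    return " ".join(clauses)

query = bits_query("flags", required=["read"], optional=["write"],
                   forbidden=["banned"])
print(query)
```

This plays the role of a bit mask: the required tokens correspond to bits that must be set and the forbidden tokens to bits that must be clear.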

You gave us one proposed solution; now clue us in as to what problem you are 
actually trying to solve.

-- Jack Krupansky

-----Original Message----- 
From: Christopher ARZUR
Sent: Tuesday, March 19, 2013 10:43 AM
To: solr-user@lucene.apache.org
Subject: Bitwise operation

Hi,

Does Solr (4.1.0) support bitwise AND or bitwise OR operators, so
that a field can be compared against the index using
bitwise AND or OR?

Thanks,
Christopher 

