Posted to solr-user@lucene.apache.org by Kristian Rickert <kr...@gmail.com> on 2011/10/04 21:55:40 UTC

"Private" text fields

I'm trying to find a way to query a "private" field in solr while
using the text fields.

I want to allow private tags that are searchable only by their assigned
owner.  The private tags should also be queried alongside the regular
keyword tags.

Here's an example:

Company A (identified by idA) searches and finds company B (identified
by idB).  Company A would like to add some tags specific to Company B
and only searchable from company A.

So far I've found a few ways to implement this, and all of them feel
really sloppy as far as what it'll do to the index by either making
too many dynamic columns or manipulating the keywords in the value.
I'd like to know if solr can do this out of the box in any other way.
It looks like I can just create a custom field (PrivateString or
something like that) but before I head down that route, I'd like to
know what solr can do now.

Here are the ways I've figured out so far.  I don't like either of the first two:

Option 1: Use Dynamic Fields for the text field
I can make the dynamic fields in solr that specify the field name
followed by the company ID.  For example, if Company A (idA) above
adds the words "great client" to the field "tags" in the index, I can
make the field as follows:

tags_idA=great client

But I'm going to be working with over 10K clients, and don't think
this is a great idea.  It would allow for simple syntax and a fast
implementation, but I'd like to avoid crowding the index with 1000s of
extra columns if I can avoid it.
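
For reference, a minimal sketch of what Option 1 could look like from the
client side, assuming a dynamicField pattern such as tags_* is declared in
schema.xml.  The helper names are hypothetical; q, qf and defType are the
usual dismax request parameters.

```python
from urllib.parse import urlencode


def private_field_name(owner_id):
    # Hypothetical naming scheme: one dynamic field per owner, e.g. tags_idA.
    return f"tags_{owner_id}"


def build_select_params(user_query, owner_id):
    # Search the shared fields plus only the caller's own private field.
    qf = " ".join(["firstName", "lastName", private_field_name(owner_id)])
    return urlencode({"q": user_query, "defType": "dismax", "qf": qf})


print(build_select_params("client", "idA"))
```

The query side stays trivial, but every owner that ever tags a document adds
another field name to the index, which is the crowding concern above.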

Option 2: Store the private tags with a company ID prepended to the keyword
To make the private tag field searchable per owner, I can store each
keyword with the owner's identifier prepended to that word.

Although this would make the query easier, it'll hamper some other
text features that expect a word to be stored as it would be searched
on.
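
A rough sketch of the Option 2 idea, assuming whitespace tokenization and an
underscore as the separator (both are illustrative choices, and the function
name is hypothetical).  The same transformation would have to be applied to
the query terms as to the indexed terms.

```python
def prefix_tokens(owner_id, text, sep="_"):
    # Prepend the owner ID to every whitespace-separated keyword.
    return [f"{owner_id}{sep}{token.lower()}" for token in text.split()]


indexed = prefix_tokens("idA", "great client")
query_terms = prefix_tokens("idA", "client")

# A query term matches only when both the owner and the keyword agree.
print([term for term in query_terms if term in indexed])
print([term for term in prefix_tokens("idB", "client") if term in indexed])
```

Because the stored token is now "ida_client"-style text rather than the plain
word, anything that works on the raw word (stemming, synonyms, highlighting)
no longer sees what the user typed.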


Option 3: Customize dismax to handle a PrivateField type
This would be a custom multi-valued poly field that would hold the
private owner ID as well as the full text that we want to index.  I
haven't gone down this route yet, but it seems like the cleanest way as
far as resulting syntax and index storage.  If any of the specified
fields is a "PrivateField," then the query can automatically search only
the appropriate IDs.

So when indexing, I can have the PrivateFieldType look something like this:

Doc1:
privateTags=[{1111,great client},{3,bad client}]
Doc2:
privateTags=[{1111,scott tiger},{3,bad client}]


So when I perform a query:

http://localhost:8080/usercore/select/?q=client&qoid=1111&qf=firstName%20lastName%20privateTags&defType=dismax

So from the above, I'd want it to search the firstName, lastName and
privateTags fields.  However, I'd want solr to realize that
privateTags is a PrivateFieldType and look for the "qoid" parameter -
only returning matches for the ID given in "qoid".

So the above query will only return Doc1 because it matches the
private tag with an ID of "1111."
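
To pin down the matching behaviour I'm after, here is a small in-memory
model of it.  This is not Solr code: "qoid" and PrivateFieldType are the
proposed names from above, search_private is a hypothetical helper, and
only the privateTags field is modelled (firstName and lastName are left
out).

```python
# Each document holds (owner ID, tag text) pairs, as in the example above.
DOCS = {
    "Doc1": [("1111", "great client"), ("3", "bad client")],
    "Doc2": [("1111", "scott tiger"), ("3", "bad client")],
}


def search_private(docs, term, qoid):
    # Return IDs of documents where an entry owned by qoid contains the term.
    hits = []
    for doc_id, entries in docs.items():
        for owner, text in entries:
            if owner == qoid and term.lower() in text.lower().split():
                hits.append(doc_id)
                break
    return hits


print(search_private(DOCS, "client", "1111"))
```

With owner "1111" only Doc1 comes back; the "bad client" tags owned by "3"
are never consulted.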



Thoughts?  Ideas?

RE: "Private" text fields

Posted by "Jaeger, Jay - DOT" <Ja...@dot.wi.gov>.
My thought about this, based on some work we did when we considered using Solr to index our LAN files:

1) If it matters - if someone misusing the private tags is a real issue (and it sounds like it would be), then I think you need an application out in front to enforce this (a good idea with Solr anyway), because otherwise anyone can get at anything.  Once that is done then:

2) The front end application has a little table / database / solr index which has the list of private fields and who can search them, and enforces that.
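
A minimal sketch of that kind of check, assuming the per-owner field naming
from Option 1 and an in-memory table (all names here are hypothetical; a
real front end would load the table from its own store):

```python
# Fields anyone may search, plus the private fields each caller may search.
PUBLIC_FIELDS = {"firstName", "lastName"}
PRIVATE_FIELD_ACCESS = {
    "idA": {"tags_idA"},
    "idB": {"tags_idB"},
}


def allowed_qf(caller_id, requested_fields):
    # Drop any requested field the caller is not entitled to search.
    permitted = PUBLIC_FIELDS | PRIVATE_FIELD_ACCESS.get(caller_id, set())
    return [field for field in requested_fields if field in permitted]


print(allowed_qf("idA", ["firstName", "tags_idA", "tags_idB"]))
```

The front end would build the Solr request only from the filtered list, so a
caller can never name another owner's field directly.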
