Posted to solr-user@lucene.apache.org by Evan Smith <ev...@wingonwing.com> on 2014/04/28 17:17:11 UTC

how to write my first solr query

Hello,

I would like to find all documents that contain, say, "foo bar", with a filter to
remove any cases where "foo bar" is prefixed with things like "cat", "a",
...

I am OK with a document that has both "cat foo bar" and "foo bar". If it
only has "cat foo bar" then I don't want it, but if it also has a standalone
"foo bar" then I do want it.

I looked at span queries but was not able to come up with how to phrase
this.

Any pointers would be great!

Thank you in advance,
Evan




--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: how to write my first solr query

Posted by Ahmet Arslan <io...@yahoo.com>.

Hi Evan,

Confusing use case :)

You don't want "foo bar" when it is prefixed with "cat"?

But you are OK with a document that has "cat foo bar"?

Isn't this a contradiction?





RE: how to write my first solr query

Posted by Evan Smith <ev...@wingonwing.com>.
Hello,

Thank you!  I will try out what you suggested and post back once I know
more.

Yes, given things like
cat foo bar
house foo bar
foo bar

I want to know when the term "foo bar" (but not the prefixed cases I specify)
exists in my documents.

Thanks!
Evan





RE: how to write my first solr query

Posted by Jeroen Steggink <Je...@contentstrategy.nl>.
Hi Evan,

If I understand correctly, a document has to have at least one "foo bar" without having "cat" in front.

A solution would be to use the ShingleFilterFactory in combination with the termfreq function and query for occurrences of "foo bar".

https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-ShingleFilter
https://cwiki.apache.org/confluence/display/solr/Function+Queries

The number of shingles depends on how many terms are in the query and how many terms cannot be prefixed.
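
To make the shingle idea concrete, here is a rough Python sketch (not Solr code) that mimics what a shingled field would hold. It builds word shingles for each document, counts them the way termfreq would, and keeps a document when "foo bar" occurs more often than the blocked longer shingles. The field name text_shingles and all helper names are made up for illustration, and the {!frange} string built at the end is only an assumption about how this could be phrased in Solr; it has not been run against a real index.

```python
from collections import Counter


def shingles(text, min_size=2, max_size=3):
    """Return word n-grams (shingles) of min_size..max_size tokens, joined by spaces."""
    tokens = text.lower().split()
    out = []
    for size in range(min_size, max_size + 1):
        for start in range(len(tokens) - size + 1):
            out.append(" ".join(tokens[start:start + size]))
    return out


def has_free_occurrence(text, phrase, blocked):
    """True if `phrase` occurs more often than all `blocked` phrases together.

    Assumes every blocked phrase ends with `phrase` and that blocked
    phrases do not contain one another (otherwise matches are double counted).
    """
    max_size = max(len(p.split()) for p in [phrase, *blocked])
    freq = Counter(shingles(text, min_size=len(phrase.split()), max_size=max_size))
    return freq[phrase] - sum(freq[b] for b in blocked) > 0


def frange_query(field, phrase, blocked):
    """Build a Solr {!frange} filter string; `field` is a hypothetical shingled field."""
    expr = f"termfreq({field},'{phrase}')"
    for b in blocked:
        expr = f"sub({expr},termfreq({field},'{b}'))"
    return f"{{!frange l=1}}{expr}"


docs = {
    "A": "dear foo bar hello",
    "B": "dear cat foo bar hello",
    "C": "dear cat foo bar hello foo bar",
    "D": "dear car foo bar",
}

matches = [name for name, text in docs.items()
           if has_free_occurrence(text, "foo bar", ["cat foo bar"])]
print(matches)  # ['A', 'C', 'D']
print(frange_query("text_shingles", "foo bar", ["cat foo bar"]))
```

The maximum shingle size has to cover the longest blocked phrase (three words here), which is why it grows with the number of query terms and prefix terms.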

It might be easier to just retrieve all the documents which contain the phrase and process the results outside of Solr.
If you could shed some more light on what you are trying to accomplish, maybe we can help you find an even better solution to fit your problem.

Jeroen


Re: how to write my first solr query

Posted by Evan Smith <ev...@wingonwing.com>.
Hello,

Here is a better use case

Documents A, B, C, and D

A: "dear foo bar hello"
B: "dear cat foo bar hello"
C: "dear cat foo bar hello foo bar"
D: "dear car foo bar"

I have a dictionary of items outside of Solr:
"foo bar" and "cat foo bar"
Associated with each item is the set of "suffixes" of that item, meaning the
longer items that end with it.
So I know that "foo bar" has "cat foo bar" as a "suffix".

I would like to search my corpus of documents A, B, C and D
and get just the documents that contain a "foo bar" that is not part of
"cat foo bar".

So if I searched on "foo bar" but not "cat foo bar",
I want to get documents A, C and D,
but not B, which has no standalone "foo bar", only "cat foo bar".
I am OK with C, as it has a "foo bar" that is not prefixed with "cat".

Does this make sense?  I see that the query ("foo bar" AND NOT "cat foo bar")
would not work, as it would miss document C.  Or at least I think it would.
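
That suspicion can be checked with a small Python sketch, independent of Solr. It compares a document-level boolean test ("foo bar" AND NOT "cat foo bar") with a position-level test that discards only those "foo bar" matches overlapping a "cat foo bar" match, which is roughly the behaviour Lucene's SpanNotQuery is meant to provide. The function names and the plain whitespace tokenization are assumptions made for illustration.

```python
def find_spans(tokens, phrase):
    """Return (start, end) token positions of each occurrence of `phrase`; end is exclusive."""
    words = phrase.lower().split()
    n = len(words)
    return [(i, i + n) for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]


def boolean_not(text, include, exclude):
    """Document-level test: contains `include` and contains no `exclude` anywhere."""
    tokens = text.lower().split()
    return bool(find_spans(tokens, include)) and not find_spans(tokens, exclude)


def span_not(text, include, exclude):
    """Position-level test: at least one `include` match overlaps no `exclude` match."""
    tokens = text.lower().split()
    blocked = find_spans(tokens, exclude)
    return any(
        all(end <= b_start or start >= b_end for b_start, b_end in blocked)
        for start, end in find_spans(tokens, include)
    )


docs = {
    "A": "dear foo bar hello",
    "B": "dear cat foo bar hello",
    "C": "dear cat foo bar hello foo bar",
    "D": "dear car foo bar",
}

boolean_hits = [n for n, t in docs.items() if boolean_not(t, "foo bar", "cat foo bar")]
span_hits = [n for n, t in docs.items() if span_not(t, "foo bar", "cat foo bar")]
print(boolean_hits)  # ['A', 'D']
print(span_hits)     # ['A', 'C', 'D']
```

The boolean form drops C because it rejects any document containing "cat foo bar" at all, while the position-level form only rejects the individual occurrence that has the prefix.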

Evan


