Posted to solr-user@lucene.apache.org by Sandeep Khanzode <sa...@yahoo.com.INVALID> on 2016/11/10 19:40:42 UTC

Wildcard searches with space in TextField/StrField

Hi,
How does a search like abc* work on a StrField? Since the entire value is stored as a single token, is there some kind of trie structure that allows such wildcard matching?
How can searches containing a space, like 'a b*', be executed against text fields (tokenized on whitespace)? If we specify this type of query, it is broken down into two queries, field:a and field:b*. I would like the terms to be contiguous, something like a phrase search with a wildcard.
SRK
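
On the first question, here is a simplified model rather than Lucene's actual implementation. It assumes only that the terms of a field are kept in sorted order, so a prefix such as abc* can be answered by seeking to the first term at or after the prefix and scanning forward while terms still start with it. The function name and sample terms are made up for illustration.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Return every term that starts with `prefix` from a sorted term list."""
    # Binary search for the first term >= prefix.
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        # Terms sharing a prefix are contiguous in sorted order,
        # so the scan can stop at the first non-matching term.
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


# In a string field each value is one untokenized term, spaces included.
terms = sorted(["abc", "abcd", "abd", "a b", "a bc", "xyz"])
print(prefix_matches(terms, "abc"))
print(prefix_matches(terms, "a b"))
```

In this model the space is just another character of the single term, which is why a prefix containing a space can match a string field value.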

Re: Wildcard searches with space in TextField/StrField

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.
Hi,

You could try this:

Drop the wildcard approach altogether:
1) Apply an EdgeNGramFilter at index time.
2) Use plain searches at query time.
2) Use plain searches at query time.

Ahmet
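
A rough sketch of the idea behind this suggestion, written as a small Python simulation rather than real Solr configuration. It assumes whitespace tokenization plus lowercasing, and the helper names and the min_size/max_size parameters are hypothetical. Every leading substring of each token is indexed, so a plain (non-wildcard) query term such as "john" also finds "Johnson".

```python
def edge_ngrams(token, min_size=1, max_size=15):
    """Return the leading substrings of `token` between min_size and max_size."""
    longest = min(len(token), max_size)
    return [token[:n] for n in range(min_size, longest + 1)]


def analyze_for_index(text):
    """Index-time analysis: lowercase, split on whitespace, expand to edge n-grams."""
    grams = set()
    for token in text.lower().split():
        grams.update(edge_ngrams(token))
    return grams


def plain_search(index, query):
    """Query-time analysis: lowercase and split only, with no n-grams and no wildcards.

    All query terms must be present (AND semantics); positions are ignored.
    """
    terms = query.lower().split()
    return sorted(
        doc_id for doc_id, grams in index.items()
        if all(term in grams for term in terms)
    )


names = {1: "John Doe", 3: "Johnson Doe", 7: "Matt Doe", 11: "Matthew Smith"}
index = {doc_id: analyze_for_index(name) for doc_id, name in names.items()}

print(plain_search(index, "john"))
print(plain_search(index, "matt"))
```

The trade-off is a larger index in exchange for cheap lookups at query time. Note that this sketch does not enforce term order or adjacency the way a phrase query would.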



On Friday, November 25, 2016 4:59 PM, Sandeep Khanzode <sa...@yahoo.com.INVALID> wrote:
Hi All,

Can someone please assist with this query?

My data consists of:
1.] John Doe
2.] John V. Doe
3.] Johnson Doe
4.] Johnson V. Doe
5.] John Smith
6.] Johnson V. Smith
7.] Matt Doe
8.] Matt V. Doe
9.] Matt Doe
10.] Matthew V. Doe
11.] Matthew Smith
12.] Matthew V. Smith

Querying ...
(a) Matt/Matt* should return records 7-12
(b) John/John* should return records 1-6
(c) Doe/Doe* should return records 1-4, 7-10
(d) Smith/Smith* should return records 5,6,11,12
(e) V/V./V.*/V* should return records 2,4,6,8,10,12
(f) V. Doe/V. Doe* should return records 2,4,8,10
(g) John V/John V./John V*/John V.* should return record 2
(h) V. Smith/V. Smith* should return records 6,12

Any guidance would be appreciated!
I have tried ComplexPhraseQueryParser, but with a single token like Doe*, there is an error that indicates that the query is being identified as a prefix query. I may be missing something in the syntax.
 SRK 


    On Thursday, November 24, 2016 11:16 PM, Sandeep Khanzode <sa...@yahoo.com.INVALID> wrote:


Hi All, Erick,
Please suggest. Would like to use the ComplexPhraseQueryParser for searching text (with wildcard) that may contain special characters.
For example ...
John* should match John V. Doe
John* should match Johnson Smith
Bruce-Willis* should match Bruce-Willis
V.* should match John V. F. Doe
SRK 

    On Thursday, November 24, 2016 5:57 PM, Sandeep Khanzode <sa...@yahoo.com.INVALID> wrote:


Hi,
This is the typical TextField with ...
  <fieldType name="text123" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>



SRK 

    On Thursday, November 24, 2016 1:38 AM, Reth RM <re...@gmail.com> wrote:


what is the fieldType of those records?  
On Tue, Nov 22, 2016 at 4:18 AM, Sandeep Khanzode <sa...@yahoo.com.invalid> wrote:

Hi Erick,
I gave this a try. 
These are my results. There is a record with "John D. Smith", and another named "John Doe".

1.] {!complexphrase inOrder=true}name:"John D.*" ... does not fetch any results. 

2.] {!complexphrase inOrder=true}name:"John D*" ... fetches both results. 



Second observation: There is a record with "John D Smith"
1.] {!complexphrase inOrder=true}name:"John*" ... does not fetch any results. 

2.] {!complexphrase inOrder=true}name:"John D*" ... fetches that record. 

3.] {!complexphrase inOrder=true}name:"John D S*" ... fetches that record. 

SRK

    On Sunday, November 13, 2016 7:43 AM, Erick Erickson <er...@gmail.com> wrote:


 Right, for that kind of use case you want complexPhraseQueryParser,
see: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

Best,
Erick

On Sat, Nov 12, 2016 at 9:39 AM, Sandeep Khanzode
<sa...@yahoo.com> wrote:
> Thanks, Erick.
>
> I am actually not trying to use the String field (prefer a TextField here).
> But, in my comparisons with TextField, it seems that something like phrase
> matching with whitespace and wildcard (like, 'my do*' or say, 'my dog*', or
> say, 'my dog has*') can only be accomplished with a string type field,
> especially because, with a WhitespaceTokenizer in TextField, the space will
> be lost, and all tokens will be individually considered. Am I missing
> something?
>
> SRK
>
>
> On Friday, November 11, 2016 10:05 PM, Erick Erickson
> <er...@gmail.com> wrote:
>
>
> You have to query text and string fields differently, that's just the
> way it works. The problem is getting the query string through the
> parser as a _single_ token or as multiple tokens.
>
> Let's say you have a string field with the "a b" example. You have a
> single token
> a b that starts at offset 0.
>
> But with a text field, you have two tokens,
> a at position 0
> b at position 1
>
> But when the query parser sees "a b" (without quotes) it splits it
> into two tokens, and only the text field has both tokens so the string
> field won't match.
>
> OTOH, when the query parser sees "a\ b" it passes this through as a
> single token, which only matches the string field as there's no
> _single_ token "a b" in the text field.
>
> But a more interesting question is why you want to search this way.
> String fields are intended for keywords, machine-generated IDs and the
> like. They're pretty useless for searching anything except
> 1> exact tokens
> 2> prefixes
>
> While if you have "my dog has fleas" in a string field, you _can_
> search "*dog*" and get a hit but the performance is poor when you get
> a large corpus. Performance for "my*" will be pretty good though.
>
> All in all, this sounds like an XY problem. What's the use case you're
> trying to solve?
>
> Best,
> Erick
>
>
>
> On Thu, Nov 10, 2016 at 10:11 PM, Sandeep Khanzode
> <sandeep_khanzode@yahoo.com.invalid> wrote:
>> Hi Erick, Reth,
>>
>> The 'a\ b*' as well as the q.op=AND approach worked (successfully) only
>> for StrField for me.
>>
>> Any attempt at creating a 'a\ b*' for a TextField does not match any
>> documents. The parsedQuery in debug mode does show 'field:a b*'. I am sure
>> there are documents that should match.
>> Another (maybe unrelated) observation is if I have 'field:a\ b', then the
>> parsedQuery is field:a field:b. Which does not match as expected (matches
>> individually).
>>
>> Can you please provide an example that I can use in Solr Query dashboard?
>> That will be helpful.
>>
>> I have also seen that wildcard queries work irrespective of field type,
>> i.e. StrField as well as TextField. That makes sense because a
>> WhitespaceTokenizer only creates word boundaries when we do not use an
>> EdgeNGramFilter. If I am not wrong, that is.
>> SRK
>>
>>    On Friday, November 11, 2016 5:00 AM, Erick Erickson
>> <er...@gmail.com> wrote:
>>
>>
>>  You can escape the space with a backslash as  'a\ b*'
>>
>> Best,
>> Erick
>>
>> On Thu, Nov 10, 2016 at 2:37 PM, Reth RM <re...@gmail.com> wrote:
>>> I don't think you can do wildcard on StrField. For text field, if your
>>> query is "category:(test m*)" the parsed query will be
>>> "category:test OR category:m*"
>>> You can add q.op=AND to make an AND between those terms.
>>>
>>> For phrase type wild card query support, as per docs, it
>>> is ComplexPhraseQueryParser that supports it. (I haven't tested it
>>> myself)
>>>
>>>
>>> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser
>>>
>>> On Thu, Nov 10, 2016 at 11:40 AM, Sandeep Khanzode
>>> <sandeep_khanzode@yahoo.com.invalid> wrote:
>>>
>>>> Hi,
>>>> How does a search like abc* work in StrField. Since the entire thing is
>>>> stored as a single token, is it a type of a trie structure that allows
>>>> such wildcard matching?
>>>> How can searches with space like 'a b*' be executed for text fields
>>>> (tokenized on whitespace)? If we specify this type of query, it is
>>>> broken down into two queries with field:a and field:b*. I would like
>>>> them to be contiguous, sort of, like a phrase search with wild card.
>>>> SRK
>>
>>
>>
>
>
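
Erick's explanation of string versus text fields, quoted above, can be sketched as a toy model. This is not Solr's real query parser; it assumes whitespace tokenization for the text field, exact term lookup, and hypothetical function names. It only illustrates why 'a b' and 'a\ b' hit different field types.

```python
def index_string_field(value):
    """A string field keeps the whole value as one token."""
    return {value}


def index_text_field(value):
    """A text field (whitespace-tokenized here) keeps one token per word."""
    return set(value.split())


def parse_query(q):
    """Split a query on unescaped spaces; a backslash keeps the next character in the term."""
    terms, current, i = [], "", 0
    while i < len(q):
        if q[i] == "\\" and i + 1 < len(q):
            current += q[i + 1]  # escaped character stays inside the term
            i += 2
        elif q[i] == " ":
            if current:
                terms.append(current)
            current = ""
            i += 1
        else:
            current += q[i]
            i += 1
    if current:
        terms.append(current)
    return terms


def matches_all(indexed_tokens, query_terms):
    """True when every query term is present as an exact indexed token."""
    return all(term in indexed_tokens for term in query_terms)


string_tokens = index_string_field("a b")
text_tokens = index_text_field("a b")

unescaped = parse_query("a b")     # two terms
escaped = parse_query(r"a\ b")     # one term containing a space

print(unescaped, matches_all(string_tokens, unescaped), matches_all(text_tokens, unescaped))
print(escaped, matches_all(string_tokens, escaped), matches_all(text_tokens, escaped))
```

In this model the unescaped query matches only the text field and the escaped query matches only the string field, which is the asymmetry described in the thread.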

Re: Wildcard searches with space in TextField/StrField

Posted by Sandeep Khanzode <sa...@yahoo.com.INVALID>.
Hi All,

Can someone please assist with this query?

My data consists of:
1.] John Doe
2.] John V. Doe
3.] Johnson Doe
4.] Johnson V. Doe
5.] John Smith
6.] Johnson V. Smith
7.] Matt Doe
8.] Matt V. Doe
9.] Matt Doe
10.] Matthew V. Doe
11.] Matthew Smith
12.] Matthew V. Smith

Querying ...
(a) Matt/Matt* should return records 7-12
(b) John/John* should return records 1-6
(c) Doe/Doe* should return records 1-4, 7-10
(d) Smith/Smith* should return records 5,6,11,12
(e) V/V./V.*/V* should return records 2,4,6,8,10,12
(f) V. Doe/V. Doe* should return records 2,4,8,10
(g) John V/John V./John V*/John V.* should return record 2
(h) V. Smith/V. Smith* should return records 6,12

Any guidance would be appreciated!
I have tried ComplexPhraseQueryParser, but with a single token like Doe*, there is an error that indicates that the query is being identified as a prefix query. I may be missing something in the syntax.
 SRK 

Re: Wildcard searches with space in TextField/StrField

Posted by Sandeep Khanzode <sa...@yahoo.com.INVALID>.
Hi All, Erick,
Please suggest. Would like to use the ComplexPhraseQueryParser for searching text (with wildcard) that may contain special characters.
For example ...
John* should match John V. Doe
John* should match Johnson Smith
Bruce-Willis* should match Bruce-Willis
V.* should match John V. F. Doe
SRK 

Re: Wildcard searches with space in TextField/StrField

Posted by Sandeep Khanzode <sa...@yahoo.com.INVALID>.
Hi,
This is the typical TextField with ...
  <fieldType name="text123" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>



SRK 

Re: Wildcard searches with space in TextField/StrField

Posted by Reth RM <re...@gmail.com>.
what is the fieldType of those records?


Re: Wildcard searches with space in TextField/StrField

Posted by Sandeep Khanzode <sa...@yahoo.com.INVALID>.
Hi Erick,
I gave this a try. 
These are my results. There is a record with "John D. Smith", and another named "John Doe".

1.] {!complexphrase inOrder=true}name:"John D.*" ... does not fetch any results. 

2.] {!complexphrase inOrder=true}name:"John D*" ... fetches both results. 
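
One possible reading of the first observation, offered only as a sketch: the thread does not show the analyzer configured on the name field, so the code below assumes a chain that lowercases and drops punctuation at index time, while a wildcard term in the query is lowercased but otherwise left alone. Under that assumption the indexed token for "D." is "d", and the prefix "d." can never match it. The functions analyze and phrase_prefix_match are hypothetical stand-ins, not Solr APIs.

```python
import re


def analyze(text):
    # Assumed index-time analysis: lowercase, keep only runs of letters/digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_prefix_match(query, text):
    """Toy in-order, adjacent phrase match; a trailing '*' means prefix."""
    tokens = analyze(text)
    words = query.split()

    def word_ok(word, token):
        if word.endswith("*"):
            # Wildcard terms are only lowercased here, not stripped of punctuation.
            return token.startswith(word[:-1].lower())
        return analyze(word) == [token]

    for start in range(len(tokens) - len(words) + 1):
        if all(word_ok(w, tokens[start + i]) for i, w in enumerate(words)):
            return True
    return False


for q in ('John D.*', 'John D*'):
    hits = [name for name in ("John D. Smith", "John Doe")
            if phrase_prefix_match(q, name)]
    print(q, "->", hits)
```

If the real field type keeps the period in the indexed token, this explanation does not apply; the Analysis screen in the Solr admin UI shows the actual tokens.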



Second observation: There is a record with "John D Smith".
1.] {!complexphrase inOrder=true}name:"John*" ... does not fetch any results. 

2.] {!complexphrase inOrder=true}name:"John D*" ... fetches that record. 

3.] {!complexphrase inOrder=true}name:"John D S*" ... fetches that record. 

SRK 

    On Sunday, November 13, 2016 7:43 AM, Erick Erickson <er...@gmail.com> wrote:
 

 Right, for that kind of use case you want complexPhraseQueryParser,
see: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

Best,
Erick

On Sat, Nov 12, 2016 at 9:39 AM, Sandeep Khanzode
<sa...@yahoo.com> wrote:
> Thanks, Erick.
>
> I am actually not trying to use the String field (prefer a TextField here).
> But, in my comparisons with TextField, it seems that something like phrase
> matching with whitespace and wildcard (like, 'my do*' or say, 'my dog*', or
> say, 'my dog has*') can only be accomplished with a string type field,
> especially because, with a WhitespaceTokenizer in TextField, the space will
> be lost, and all tokens will be individually considered. Am I missing
> something?
>
> SRK


   

Re: Wildcard searches with space in TextField/StrField

Posted by Sandeep Khanzode <sa...@yahoo.com.INVALID>.
Thanks, Erick.
I am actually not trying to use the String field; I would prefer a TextField here. But in my comparisons with TextField, it seems that phrase matching with whitespace and a wildcard (like 'my do*', 'my dog*', or 'my dog has*') can only be accomplished with a string-type field, because with a WhitespaceTokenizer in a TextField the spaces are lost and all tokens are considered individually. Am I missing something?

SRK

    On Friday, November 11, 2016 10:05 PM, Erick Erickson <er...@gmail.com> wrote:
 

 You have to query text and string fields differently, that's just the
way it works. The problem is getting the query string through the
parser as a _single_ token or as multiple tokens.

Let's say you have a string field with the "a b" example. You have a
single token
a b that starts at offset 0.

But with a text field, you have two tokens,
a at position 0
b at position 1

But when the query parser sees "a b" (without quotes) it splits it
into two tokens, and only the text field has both tokens so the string
field won't match.

OTOH, when the query parser sees "a\ b" it passes this through as a
single token, which only matches the string field as there's no
_single_ token "a b" in the text field.

But a more interesting question is why you want to search this way.
String fields are intended for keywords, machine-generated IDs and the
like. They're pretty useless for searching anything except
1> exact tokens
2> prefixes

While if you have "my dog has fleas" in a string field, you _can_
search "*dog*" and get a hit but the performance is poor when you get
a large corpus. Performance for "my*" will be pretty good though.

In all this sounds like an XY problem, what's the use-case you're
trying to solve?

Best,
Erick




   

Re: Wildcard searches with space in TextField/StrField

Posted by Erick Erickson <er...@gmail.com>.
You have to query text and string fields differently; that's just the
way it works. The problem is getting the query string through the
parser as a _single_ token or as multiple tokens.

Let's say you have a string field with the "a b" example. You have a
single token, "a b", that starts at offset 0.

But with a text field, you have two tokens:
a at position 0
b at position 1

But when the query parser sees "a b" (without quotes) it splits it
into two tokens, and only the text field has both tokens so the string
field won't match.

OTOH, when the query parser sees "a\ b" it passes this through as a
single token, which only matches the string field as there's no
_single_ token "a b" in the text field.
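
The behavior described above can be sketched with a toy model. This is not Solr's query parser or index; it assumes a string field stores the whole value as one token, a text field splits on whitespace, and a query matches only if every query term equals an indexed token. All function names are made up for illustration.

```python
def index_string_field(value):
    # A string field keeps the entire value as one token.
    return [value]


def index_text_field(value):
    # A whitespace-tokenized text field yields one token per word.
    return value.split()


def parse_query(raw):
    """Split on unescaped spaces; a backslash-escaped space stays in the term."""
    terms, current, i = [], "", 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            current += raw[i + 1]  # keep the escaped character literally
            i += 2
            continue
        if ch == " ":
            if current:
                terms.append(current)
            current = ""
        else:
            current += ch
        i += 1
    if current:
        terms.append(current)
    return terms


def matches(query_terms, indexed_tokens):
    # Every query term must equal some indexed token.
    return all(term in indexed_tokens for term in query_terms)


for raw in ("a b", "a\\ b"):
    terms = parse_query(raw)
    print(repr(raw), terms,
          "string:", matches(terms, index_string_field("a b")),
          "text:", matches(terms, index_text_field("a b")))
```

In this model the unescaped query matches only the text field and the escaped query matches only the string field, which is the asymmetry described above.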

But a more interesting question is why you want to search this way.
String fields are intended for keywords, machine-generated IDs and the
like. They're pretty useless for searching anything except
1> exact tokens
2> prefixes

If you have "my dog has fleas" in a string field, you _can_ search
"*dog*" and get a hit, but performance is poor once the corpus gets
large. Performance for "my*" will be pretty good, though.
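
A rough illustration of why the two cases differ, assuming only that terms are kept in sorted order. This is a toy model and not Lucene's actual term dictionary: a prefix can be located with a binary search and then read off as a contiguous run, while a leading wildcard gives no starting point, so every term has to be examined.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    # Jump to the first term >= prefix, then walk while terms share the prefix.
    start = bisect_left(sorted_terms, prefix)
    hits = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        hits.append(term)
    return hits


def infix_matches(sorted_terms, needle):
    # Leading wildcard: no useful start position, so scan every term.
    return [term for term in sorted_terms if needle in term]


terms = sorted(["my cat naps", "my dog has fleas", "your dog barks"])
print(prefix_matches(terms, "my"))   # touches only the matching run
print(infix_matches(terms, "dog"))   # touches all terms
```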

All in all, this sounds like an XY problem. What's the use case you're
trying to solve?

Best,
Erick



On Thu, Nov 10, 2016 at 10:11 PM, Sandeep Khanzode
<sa...@yahoo.com.invalid> wrote:
> Hi Erick, Reth,
>
> The 'a\ b*' as well as the q.op=AND approach worked (successfully) only for StrField for me.
>
> Any attempt at creating a 'a\ b*' for a TextField does not match any documents. The parsedQuery in debug mode does show 'field:a b*'. I am sure there are documents that should match.
> Another (maybe unrelated) observation is if I have 'field:a\ b', then the parsedQuery is field:a field:b. Which does not match as expected (matches individually).
>
> Can you please provide an example that I can use in Solr Query dashboard? That will be helpful.
>
> I have also seen that wildcard queries work irrespective of field type i.e. StrField as well as TextField. That makes sense because with a WhitespaceTokenizer only creates word boundaries when we do not use a EdgeNGramFilter. If I am not wrong, that is. SRK

Re: Wildcard searches with space in TextField/StrField

Posted by Sandeep Khanzode <sa...@yahoo.com.INVALID>.
Hi Erick, Reth,

The 'a\ b*' as well as the q.op=AND approach worked (successfully) only for StrField for me.

Any attempt at running 'a\ b*' against a TextField does not match any documents. The parsedQuery in debug mode does show 'field:a b*'. I am sure there are documents that should match.
Another (maybe unrelated) observation: if I have 'field:a\ b', then the parsedQuery is field:a field:b, which does not match as expected (the terms match individually).

Can you please provide an example that I can use in Solr Query dashboard? That will be helpful. 

I have also seen that wildcard queries work irrespective of field type, i.e. StrField as well as TextField. That makes sense, because a WhitespaceTokenizer only creates word boundaries when we do not use an EdgeNGramFilter. If I am not wrong, that is.

SRK

    On Friday, November 11, 2016 5:00 AM, Erick Erickson <er...@gmail.com> wrote:
 

 You can escape the space with a backslash as  'a\ b*'

Best,
Erick



   

Re: Wildcard searches with space in TextField/StrField

Posted by Erick Erickson <er...@gmail.com>.
You can escape the space with a backslash, as in 'a\ b*'
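
A minimal sketch of building such a query from client code. The field name name_s is hypothetical, and only spaces are escaped here (other query-syntax characters are left alone, and the trailing * is deliberately kept as a wildcard). urlencode then takes care of encoding the backslash and space for the request URL.

```python
from urllib.parse import parse_qs, urlencode

FIELD = "name_s"  # hypothetical string-field name, not from the thread


def escape_spaces(term):
    # Put a backslash in front of each space so the parser keeps one term.
    return term.replace(" ", "\\ ")


def build_query_string(field, term):
    # Produce the URL-encoded q parameter for a select request.
    return urlencode({"q": f"{field}:{escape_spaces(term)}"})


query_string = build_query_string(FIELD, "a b*")
print(query_string)
print(parse_qs(query_string)["q"][0])  # what the server would decode
```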

Best,
Erick

On Thu, Nov 10, 2016 at 2:37 PM, Reth RM <re...@gmail.com> wrote:
> I don't think you can do wildcard on StrField. For text field, if your
> query is "category:(test m*)"  the parsed query will be  "category:test OR
> category:m*"
> You can add q.op=AND to make an AND between those terms.
>
> For phrase type wild card query support, as per docs, it
> is ComplexPhraseQueryParser that supports it. (I haven't tested it myself)
>
> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

Re: Wildcard searches with space in TextField/StrField

Posted by Reth RM <re...@gmail.com>.
I don't think you can do wildcard on StrField. For a text field, if your
query is "category:(test m*)", the parsed query will be "category:test OR
category:m*".
You can add q.op=AND to make an AND between those terms.
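
To make the difference concrete, here is a toy evaluator, not Solr itself, that applies the two terms with OR and with AND over whitespace-tokenized documents. The sample documents are invented. Note that AND only requires both terms to be present somewhere in the field; it does not require them to be adjacent or in order.

```python
def term_matches(query_term, tokens):
    # A trailing '*' is treated as a prefix wildcard; otherwise exact match.
    if query_term.endswith("*"):
        prefix = query_term[:-1]
        return any(token.startswith(prefix) for token in tokens)
    return query_term in tokens


def search(docs, query_terms, op="OR"):
    # op plays the role of q.op: "OR" needs any term, "AND" needs all terms.
    combine = all if op == "AND" else any
    return [doc for doc in docs
            if combine(term_matches(t, doc.split()) for t in query_terms)]


docs = ["test match", "test case", "mock data", "match a test"]
print(search(docs, ["test", "m*"], op="OR"))
print(search(docs, ["test", "m*"], op="AND"))
```

The AND result includes "match a test", which shows why q.op=AND alone is not a contiguous phrase match.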

For phrase-type wildcard query support, as per the docs, it
is ComplexPhraseQueryParser that supports it. (I haven't tested it myself.)

https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

On Thu, Nov 10, 2016 at 11:40 AM, Sandeep Khanzode <
sandeep_khanzode@yahoo.com.invalid> wrote:

> Hi,
> How does a search like abc* work in StrField. Since the entire thing is
> stored as a single token, is it a type of a trie structure that allows such
> wildcard matching?
> How can searches with space like 'a b*' be executed for text fields
> (tokenized on whitespace)? If we specify this type of query, it is broken
> down into two queries with field:a and field:b*. I would like them to be
> contiguous, sort of, like a phrase search with wild card.
> SRK