Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2021/06/03 12:38:26 UTC

[GitHub] [solr] dsmiley commented on pull request #129: SOLR-15407 untokenized field type with sow=false fix + tests

dsmiley commented on pull request #129:
URL: https://github.com/apache/solr/pull/129#issuecomment-853836054


   The core of my point is that Solr *already* has an API to interrogate the field type to see whether it produces one token or more: `FieldType.isTokenized`.  SolrQueryParserBase already uses this in a number of places.  It appears you want to make StrField a further special case because isTokenized==false is not distinguishing enough, and that strikes me as going too far / hacky.  It might be justifiable if we felt the result was clearly the correct behavior, but we don't even think that (I don't, nor does Michael).
   
   Nonetheless, some user control could be added so that someone with a normalizing text field that uses TextField (likely with KeywordTokenizer, but not necessarily) could get the same query parser behavior as StrField. It'd be nice if a user could set `tokenized="false"` on a fieldType definition that uses TextField, provided they can attest that the processing doesn't tokenize.  I experimented with that this morning and was able to do it by changing the first line of TextField.init to this snippet:
   ```
       if ((falseProperties & TOKENIZED) == 0) {
         properties |= TOKENIZED;
       }
   ```
   (instead of unconditionally setting TOKENIZED, as it did before).
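
To make the effect of that guard concrete, here is a minimal, self-contained sketch of the flag logic. It is a toy model, not Solr code: the class name, the helper methods, and the bit value chosen for `TOKENIZED` are assumptions made for illustration; only the `if` condition is taken from the snippet above.

```java
// Toy model of the flag logic; these are not Solr's actual classes.
final class TokenizedFlagSketch {
    // Illustrative bit value (assumption); Solr defines its own constant.
    static final int TOKENIZED = 0b10;

    /**
     * Mirrors the guard from the snippet: set TOKENIZED unless the
     * field type definition explicitly listed it as false.
     */
    static int applyTextFieldDefault(int properties, int falseProperties) {
        if ((falseProperties & TOKENIZED) == 0) {
            properties |= TOKENIZED;
        }
        return properties;
    }

    /** Hypothetical stand-in for an isTokenized-style check on the flags. */
    static boolean isTokenized(int properties) {
        return (properties & TOKENIZED) != 0;
    }

    public static void main(String[] args) {
        // Nothing set explicitly: the field stays tokenized, as before.
        System.out.println(isTokenized(applyTextFieldDefault(0, 0)));
        // tokenized="false" was given: the flag is left unset.
        System.out.println(isTokenized(applyTextFieldDefault(0, TOKENIZED)));
    }
}
```

In this model the default stays unchanged for existing definitions, and only an explicit `tokenized="false"` opts out.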
   
   And maybe the existing logic in SolrQueryParserBase could be improved somehow; I'm definitely not claiming it's perfect.  It just strikes me that a cast on StrField is a strong code smell, a sign that we're doing something wrong when there is already isTokenized, which seems suitable for your purpose.
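
As a rough illustration of why the type check is the weaker design, the sketch below contrasts the two styles of branching. Every class and method name here is a hypothetical stand-in (none of this is Solr's real code or the PR's code); it only shows that a capability check also covers a TextField-like type that declares itself untokenized, while a check on the concrete class does not.

```java
// Hypothetical stand-ins for illustration; these are not Solr's real classes.
abstract class SketchFieldType {
    abstract boolean isTokenized();
}

final class SketchStrField extends SketchFieldType {
    @Override boolean isTokenized() { return false; }
}

final class SketchTextField extends SketchFieldType {
    private final boolean tokenized;
    SketchTextField(boolean tokenized) { this.tokenized = tokenized; }
    @Override boolean isTokenized() { return tokenized; }
}

final class ParserBranchSketch {
    // Type check: only the one concrete class qualifies.
    static boolean singleTokenByType(SketchFieldType ft) {
        return ft instanceof SketchStrField;
    }

    // Capability check: any type that reports itself untokenized qualifies.
    static boolean singleTokenByCapability(SketchFieldType ft) {
        return !ft.isTokenized();
    }

    public static void main(String[] args) {
        SketchFieldType normalizing = new SketchTextField(false);
        System.out.println(singleTokenByType(normalizing));       // false
        System.out.println(singleTokenByCapability(normalizing)); // true
    }
}
```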
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


