Posted to issues@solr.apache.org by "Ratna Puttaswamy (Jira)" <ji...@apache.org> on 2021/12/09 09:44:00 UTC
[jira] [Updated] (SOLR-15841) Multi-word Synonym words in between
[ https://issues.apache.org/jira/browse/SOLR-15841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ratna Puttaswamy updated SOLR-15841:
------------------------------------
Description:
Hi all,
There is a multi-word synonym configured as follows:
boys pajamas => boys pajamas, sleep play
When we search for "boys pajamas", we get results for "sleep play", so that part works fine. However, when someone searches for "boys red pajamas", we also want the "sleep play" results to be returned, but the synonym is not triggered because of "red" in between.
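To make the reported behavior concrete, here is a toy Python model of multi-word synonym matching. It is only a sketch, assuming a rule fires solely when its left-hand tokens appear consecutively in the query; the function `expand_synonyms` and the `RULES` structure are hypothetical illustrations, not Solr or Lucene APIs.

```python
def expand_synonyms(tokens, rules):
    """Toy model: a rule fires only when its tokens appear consecutively.

    Returns one entry per matched span; each entry is a list of
    alternative token sequences for that span.
    """
    out = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try longer rules first so the longest contiguous match wins.
        for lhs in sorted(rules, key=len, reverse=True):
            n = len(lhs)
            if tuple(tokens[i:i + n]) == lhs:
                out.append(rules[lhs])
                i += n
                matched = True
                break
        if not matched:
            # No rule starts here: the token passes through unchanged.
            out.append([[tokens[i]]])
            i += 1
    return out


# Hypothetical in-memory form of: boys pajamas => boys pajamas, sleep play
RULES = {("boys", "pajamas"): [["boys", "pajamas"], ["sleep", "play"]]}

# Contiguous match: the span gains the "sleep play" alternative.
print(expand_synonyms("boys pajamas".split(), RULES))

# "red" breaks the sequence, so no rule fires and tokens pass through.
print(expand_synonyms("boys red pajamas".split(), RULES))
```

In this model the second query never gains the "sleep play" alternative, which mirrors what we observe.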
Is it possible to have the same synonym triggered even when the words are not adjacent or not in the same order? My schema is shown below:
<fieldType name="graph_enricher_field_type" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <!-- The idea of how multi-word stop phrases should be removed:
         - SynonymGraphFilterFactory replaces each stop phrase with '*'
         - PatternReplaceFilterFactory reduces this token to an empty token
         - LengthFilterFactory filters out this token
         So please keep these three filters together (at least in one chain) -->
    <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="stopphrases.txt"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[^a-z0-9]" replacement="" replace="all"/>
    <filter class="solr.LengthFilterFactory" min="1" max="1024"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" format="solr" ignoreCase="true" expand="true" tokenizerFactory="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[^a-z0-9]" replacement="" replace="all"/>
    <filter class="solr.LengthFilterFactory" min="1" max="1024"/>
    <filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict.txt" ignoreCase="true"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
</fieldType>
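The three-step stop-phrase removal described in the schema comment can be sketched in Python as follows. This is a rough simulation under stated assumptions, not the actual filter code: the contents of stopphrases.txt are not shown here, so the stop phrase "on sale" is a made-up example, the function name `remove_stop_phrases` is hypothetical, and ASCII folding is left out for brevity.

```python
import re


def remove_stop_phrases(text, stop_phrases):
    # Whitespace tokenization plus lowercasing, as in the start of the chain.
    tokens = text.lower().split()

    # Step 1 (synonym filter with stopphrases.txt): replace each
    # contiguous stop phrase with a single '*' token.
    marked = []
    i = 0
    while i < len(tokens):
        for phrase in stop_phrases:
            n = len(phrase)
            if tokens[i:i + n] == phrase:
                marked.append("*")
                i += n
                break
        else:
            # No stop phrase starts at this position.
            marked.append(tokens[i])
            i += 1

    # Step 2 (pattern replace): strip every character outside [a-z0-9],
    # which turns '*' into an empty token.
    stripped = [re.sub(r"[^a-z0-9]", "", t) for t in marked]

    # Step 3 (length filter, min=1 max=1024): drop the empty tokens.
    return [t for t in stripped if 1 <= len(t) <= 1024]


# Hypothetical stop phrase; the real entries live in stopphrases.txt.
STOP_PHRASES = [["on", "sale"]]

print(remove_stop_phrases("Boys pajamas on sale", STOP_PHRASES))
```

The sketch shows why the three filters have to stay together: the '*' marker only disappears once both the pattern replacement and the length filter have run.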
was:
Hi all,
There is a multi-word synonym configured as follows:
boys pajamas => boys pajamas, sleep play
When we search for "boys pajamas", we get results for "sleep play", so that part works fine. However, when someone searches for "boys red pajamas", we also want the "sleep play" results to be returned, but the synonym is not triggered because of "red" in between.
Is this possible? My schema is shown below:
<fieldType name="graph_enricher_field_type" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <!-- The idea of how multi-word stop phrases should be removed:
         - SynonymGraphFilterFactory replaces each stop phrase with '*'
         - PatternReplaceFilterFactory reduces this token to an empty token
         - LengthFilterFactory filters out this token
         So please keep these three filters together (at least in one chain) -->
    <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="stopphrases.txt"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[^a-z0-9]" replacement="" replace="all"/>
    <filter class="solr.LengthFilterFactory" min="1" max="1024"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" format="solr" ignoreCase="true" expand="true" tokenizerFactory="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[^a-z0-9]" replacement="" replace="all"/>
    <filter class="solr.LengthFilterFactory" min="1" max="1024"/>
    <filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict.txt" ignoreCase="true"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
</fieldType>
> Multi-word Synonym words in between
> -------------------------------------
>
> Key: SOLR-15841
> URL: https://issues.apache.org/jira/browse/SOLR-15841
> Project: Solr
> Issue Type: Wish
> Security Level: Public(Default Security Level. Issues are Public)
> Components: search
> Affects Versions: 8.8.2
> Reporter: Ratna Puttaswamy
> Priority: Minor
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)