Posted to solr-user@lucene.apache.org by aneeshkappu <ha...@gmail.com> on 2018/02/24 15:58:43 UTC
Solr Phrase Count : How to get count of a phrase in a text field
Hi all, I want to get the number of times a phrase occurs in a document.
I am currently using ShingleFilterFactory, but it consumes a lot of disk
space: about 40GB for just 46K records.
Is there an alternative approach, or a way to optimize this?
My schema settings are given below.
<field name="data_text" type="texto_indexado" indexed="true" stored="true" multiValued="false"/>

<fieldType name="texto_indexado" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="10" outputUnigrams="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Solr Phrase Count : How to get count of a phrase in a text field
Posted by Emir Arnautović <em...@sematext.com>.
For a start, you don't have to store the field (stored="true" keeps a full copy of the text in the index). Also, are 10-word shingles really needed?
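To make the shingle-size question concrete, here is a rough sketch of how many tokens the index analyzer emits for a single field value. It assumes the default minShingleSize of 2 and ignores edge cases such as very short inputs with unigrams disabled; the function name and the 1000-word document length are illustrative only, not taken from the original post.

```python
def shingle_token_count(num_tokens: int, max_shingle_size: int,
                        output_unigrams: bool = True) -> int:
    """Estimate tokens emitted by a shingle filter for one field value.

    Assumes minShingleSize=2 (the default). A text of n words yields
    n - k + 1 shingles of size k, for every k up to max_shingle_size.
    """
    smallest = 1 if output_unigrams else 2
    return sum(
        max(0, num_tokens - size + 1)
        for size in range(smallest, max_shingle_size + 1)
    )


if __name__ == "__main__":
    # Compare a few shingle sizes for a hypothetical 1000-word document.
    for size in (2, 3, 10):
        print(size, shingle_token_count(1000, size))
```

Under these assumptions, maxShingleSize="10" emits roughly ten times as many tokens as the plain text contains, and most of the long shingles are unique terms, which is a plausible explanation for the index size reported above.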
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/
Re: Solr Phrase Count : How to get count of a phrase in a text field
Posted by aneeshkappu <ha...@gmail.com>.
Found the solution:
append `debug=results` to the Solr request URL.
The explain output for each matching document will then also include the phrase frequency.
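As a minimal sketch of that approach, the Python below builds a phrase query with `debug=results` and pulls the phrase frequency out of the per-document explain text. The select URL, the helper names, and the assumption that the explain text contains a `phraseFreq=<number>` entry are illustrative; the exact explain format depends on the Solr version and similarity in use, so check it against your own output.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the explain text for a phrase query contains "phraseFreq=<number>".
PHRASE_FREQ = re.compile(r"phraseFreq=([0-9]+(?:\.[0-9]+)?)")


def build_phrase_query_url(select_url: str, field: str, phrase: str) -> str:
    """Build a select URL that runs a phrase query with debug=results."""
    params = {"q": f'{field}:"{phrase}"', "wt": "json", "debug": "results"}
    return f"{select_url}?{urlencode(params)}"


def extract_phrase_freqs(explain_by_doc: dict) -> dict:
    """Map document id to phrase frequency, skipping docs without a match."""
    freqs = {}
    for doc_id, explain_text in explain_by_doc.items():
        match = PHRASE_FREQ.search(explain_text)
        if match:
            freqs[doc_id] = float(match.group(1))
    return freqs


def fetch_phrase_freqs(select_url: str, field: str, phrase: str) -> dict:
    """Run the query against a live Solr and return phrase counts per document."""
    with urlopen(build_phrase_query_url(select_url, field, phrase)) as response:
        body = json.load(response)
    # With debug=results, the explain strings are keyed by document id.
    return extract_phrase_freqs(body["debug"]["explain"])
```

A call might look like `fetch_phrase_freqs("http://localhost:8983/solr/mycollection/select", "data_text", "some phrase")`, where the host and collection name are placeholders. Because a phrase query relies on token positions, this does not need the shingled terms at all.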