Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/06/05 09:57:20 UTC
[jira] [Resolved] (STANBOL-1089) Provide Topic Engine SolrConfiguration that uses n-grams
[ https://issues.apache.org/jira/browse/STANBOL-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rupert Westenthaler resolved STANBOL-1089.
------------------------------------------
Resolution: Fixed
Added the configuration with http://svn.apache.org/r1489730
> Provide Topic Engine SolrConfiguration that uses n-grams
> --------------------------------------------------------
>
> Key: STANBOL-1089
> URL: https://issues.apache.org/jira/browse/STANBOL-1089
> Project: Stanbol
> Issue Type: New Feature
> Components: Enhancement Engines
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
>
> Now that the Topic Classification Engine supports configuring different SolrCore configurations, we should provide a configuration that uses n-grams for topic classification.
> While this will not scale to very big classification schemes, it should provide improvements for small and medium-sized models.
> Indexing of n-grams will be based on the Solr ShingleFilterFactory [1].
> The SolrCore configuration will be provided under the name 'shingle-topic-model.solrindex.zip' by the Topic Classification Engine bundle to the DataFileProvider. This means that users will need to set the 'org.apache.stanbol.enhancer.engine.topic.solrCoreConfig' property of the TopicClassificationEngine to this name. This property was added by STANBOL-1087.
> [1] http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
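For readers unfamiliar with shingles: a shingle is a word-level n-gram built from adjacent tokens. The following is a minimal Python sketch of that idea only; it is not Solr's implementation and not the configuration shipped with this issue. The function name `shingles` and its parameters are hypothetical, and the default values are assumed to mirror the ShingleFilterFactory defaults (shingle size 2, unigrams also emitted, single space as separator).

```python
def shingles(tokens, min_size=2, max_size=2, output_unigrams=True, sep=" "):
    """Return word n-grams (shingles) for a list of tokens.

    Illustration only: parameter names and defaults are assumptions
    modelled on the Solr ShingleFilterFactory options.
    """
    out = []
    for i, tok in enumerate(tokens):
        # Optionally keep the original single token.
        if output_unigrams:
            out.append(tok)
        # Emit every shingle of the allowed sizes that starts at position i.
        for n in range(min_size, max_size + 1):
            if i + n <= len(tokens):
                out.append(sep.join(tokens[i:i + n]))
    return out


tokens = "please divide this sentence".split()
print(shingles(tokens))
```

Indexing such shingles in addition to single words lets the classifier match short phrases, which is why the index grows with the vocabulary and why the approach is expected to suit small and medium sized models rather than very large ones.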