Posted to solr-user@lucene.apache.org by Tomas Ramanauskas <To...@springer.com> on 2016/07/15 11:11:59 UTC
Re: A working example to play with Naive Bayes classifier
Hi, Alessandro,
sorry for the delay. What do you mean?
As I mentioned earlier, I followed a super simple set of steps.
1. Download Solr
2. Configure classification
3. Create some documents using curl over HTTP.
Is it difficult to reproduce the steps / problem?
Tomas
> On 23 Jun 2016, at 16:42, Alessandro Benedetti <be...@gmail.com> wrote:
>
> Can you give an example of your schema, and can you run a simple query against
> your index? I'm curious to see how the input fields are analyzed.
>
> Cheers
>
> On Wed, Jun 22, 2016 at 6:05 PM, Alessandro Benedetti <
> benedetti.alex85@gmail.com> wrote:
>
>> This is better! At least the classifier is invoked!
>> How many docs in the index have the class assigned?
>> Take a look at the stack trace and you should find the cause!
>> I am on mobile now; I will check the code tomorrow!
>> Cheers
>> On 22 Jun 2016 5:26 pm, "Tomas Ramanauskas" <
>> Tomas.Ramanauskas@springer.com> wrote:
>>
>>>
>>> I also tried with this config (adding **):
>>>
>>>
>>> <initParams path="/update/**">
>>> <lst name="defaults">
>>> <str name="update.chain">classification</str>
>>> </lst>
>>> </initParams>
>>>
>>>
>>>
>>>
>>>
>>> And I get the error:
>>>
>>>
>>>
>>> $ curl http://localhost:8983/solr/demo/update -d '
>>> [
>>> {"id" : "book15",
>>> "title_t":["The Way of Kings"],
>>> "author_s":"Brandon Sanderson",
>>> "cat_s": null,
>>> "pubyear_i":2010,
>>> "ISBN_s":"978-0-7653-2635-5"
>>> }
>>> ]'
>>> {"responseHeader":{"status":500,"QTime":29},"error":{"trace":"java.lang.NullPointerException\n\tat
>>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.getTokenArray(SimpleNaiveBayesDocumentClassifier.java:202)\n\tat
>>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.analyzeSeedDocument(SimpleNaiveBayesDocumentClassifier.java:162)\n\tat
>>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignNormClasses(SimpleNaiveBayesDocumentClassifier.java:121)\n\tat
>>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignClass(SimpleNaiveBayesDocumentClassifier.java:81)\n\tat
>>> org.apache.solr.update.processor.ClassificationUpdateProcessor.processAdd(ClassificationUpdateProcessor.java:94)\n\tat
>>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:474)\n\tat
>>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:138)\n\tat
>>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:114)\n\tat
>>> org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:77)\n\tat
>>> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)\n\tat
>>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)\n\tat
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)\n\tat
>>> org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)\n\tat
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)\n\tat
>>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)\n\tat
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)\n\tat
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)\n\tat
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)\n\tat
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)\n\tat
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)\n\tat
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)\n\tat
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)\n\tat
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
>>> org.eclipse.jetty.server.Server.handle(Server.java:518)\n\tat
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)\n\tat
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)\n\tat
>>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\n\tat
>>> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\n\tat
>>> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat
>>> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)\n\tat
>>> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)\n\tat
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)\n\tat
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)\n\tat
>>> java.lang.Thread.run(Thread.java:745)\n","code":500}}
>>>
>>>
>>> Tomas
>>>
>>>
>>> On 22 Jun 2016, at 17:22, Tomas Ramanauskas <Tomas.Ramanauskas@springer.com>
>>> wrote:
>>>
>>> Thanks for the response, Alessandro.
>>>
>>> I tried this and it didn’t work either:
>>>
>>>
>>>
>>> $ curl http://localhost:8983/solr/demo/update -d '
>>> [
>>> {"id" : "book14",
>>> "title_t":["The Way of Kings"],
>>> "author_s":"Brandon Sanderson",
>>> "cat_s": null,
>>> "pubyear_i":2010,
>>> "ISBN_s":"978-0-7653-2635-5"
>>> }
>>> ]'
>>>
>>> {"responseHeader":{"status":0,"QTime":2}}
>>>
>>> $ curl http://localhost:8983/solr/demo/get?id=book14
>>> {
>>> "doc":
>>> {
>>> "id":"book14",
>>> "title_t":["The Way of Kings"],
>>> "author_s":"Brandon Sanderson",
>>> "pubyear_i":2010,
>>> "ISBN_s":"978-0-7653-2635-5",
>>> "_version_":1537854598189940736}}
>>>
>>>
>>> I don’t see “cat_s” field in the results at all.
>>>
>>>
>>> Tomas
>>>
>>>
>>> On 22 Jun 2016, at 16:39, Alessandro Benedetti <abenedetti@apache.org>
>>> wrote:
>>>
>>> Hi Tomas,
>>> first consideration:
>>> an empty string is different from a NULL string.
>>> This is controversial, but I would suggest that you never use the empty
>>> string, as this can cause other side effects.
>>> Apart from that, the plugin will add the class only if the class field
>>> has no value at all:
>>>
>>> Object documentClass = doc.getFieldValue(classFieldName);
>>> if (documentClass == null) {
>>>
>>> That said, I would suggest you build a sample index with some
>>> documents and then try to classify.
>>> If this doesn't solve your issue, I can help you further.
>>>
>>> Cheers
>>>
>>> On Wed, Jun 22, 2016 at 3:45 PM, Tomas Ramanauskas
>>> <Tomas.Ramanauskas@springer.com> wrote:
>>>
>>> I also tried this configuration, but couldn't get the feature to work:
>>>
>>>
>>>
>>> <initParams path="/update/">
>>> <lst name="defaults">
>>> <str name="update.chain">classification</str>
>>> </lst>
>>> </initParams>
>>>
>>>
>>> <updateRequestProcessorChain name="classification">
>>> <processor class="solr.ClassificationUpdateProcessorFactory">
>>> <str name="inputFields">title_t,author_s</str>
>>> <str name="classField">cat_s</str>
>>> <str name="algorithm">bayes</str>
>>> </processor>
>>> </updateRequestProcessorChain>
>>>
>>>
>>> Tomas
>>>
>>> On 22 Jun 2016, at 13:46, Tomas Ramanauskas <Tomas.Ramanauskas@springer.com>
>>> wrote:
>>>
>>> P.S. The version I use:
>>>
>>> 6.1.0-68
>>>
>>> Also, earlier I said “If I modify an existing record, I think the
>>> functionality works:”, but I think it doesn’t work for me at all.
>>>
>>> $ curl http://localhost:8983/solr/demo/get?id=book1
>>> {
>>> "doc":
>>> {
>>> "id":"book1",
>>> "title_t":["The Way of Kings"],
>>> "author_s":"Brandon Sanderson",
>>> "cat_s":"fantasy",
>>> "pubyear_i":2010,
>>> "ISBN_s":"978-0-7653-2635-5",
>>> "_version_":1535488016326328320}}
>>>
>>> $ curl http://localhost:8983/solr/demo/update -d '
>>> [
>>> {"id" : "book1",
>>> "title_t":["The Way of Kings"],
>>> "author_s":"Brandon Sanderson",
>>> "cat_s":"aaa",
>>> "pubyear_i":2010,
>>> "ISBN_s":"978-0-7653-2635-5"
>>> }
>>> ]'
>>> {"responseHeader":{"status":0,"QTime":0}}
>>>
>>> $ curl http://localhost:8983/solr/demo/get?id=book1
>>> {
>>> "doc":
>>> {
>>> "id":"book1",
>>> "title_t":["The Way of Kings"],
>>> "author_s":"Brandon Sanderson",
>>> "cat_s":"fantasy",
>>> "pubyear_i":2010,
>>> "ISBN_s":"978-0-7653-2635-5",
>>> "_version_":1535488016326328320}}
>>>
>>>
>>> Tomas
>>>
>>>
>>> On 22 Jun 2016, at 12:47, Tomas Ramanauskas <Tomas.Ramanauskas@springer.com>
>>> wrote:
>>>
>>> Hi, everyone,
>>>
>>>
>>> would someone be able to share a working example (step by step) that
>>> demonstrates the use of Naive Bayes classifier in Solr?
>>>
>>>
>>> I followed this Blog post:
>>>
>>>
>>> https://alexbenedetti.blogspot.co.uk/2015/07/solr-document-classification-part-1.html?showComment=1464358093048#c2489902302085000947
>>>
>>> And this tutorial:
>>> http://yonik.com/solr-tutorial/
>>>
>>> And this JIRA ticket:
>>> https://issues.apache.org/jira/browse/SOLR-7739
>>>
>>>
>>>
>>> So this is my configuration file (only what I added or modified):
>>>
>>> <initParams path="/update/**">
>>> <lst name="defaults">
>>> <str name="update.chain">classification</str>
>>> </lst>
>>> </initParams>
>>>
>>>
>>> <updateRequestProcessorChain name="classification">
>>> <processor class="solr.ClassificationUpdateProcessorFactory">
>>> <str name="inputFields">title_t,author_s</str>
>>> <str name="classField">cat_s</str>
>>> <str name="algorithm">bayes</str>
>>> </processor>
>>> </updateRequestProcessorChain>
>>>
>>>
>>>
>>> If I modify an existing record, I think the functionality works:
>>>
>>>
>>> $ curl http://localhost:8983/solr/demo/update -d '
>>> [
>>> {"id" : "book1",
>>> "title_t":["The Way of Kings"],
>>> "author_s":"Brandon Sanderson",
>>> "cat_s":"",
>>> "pubyear_i":2010,
>>> "ISBN_s":"978-0-7653-2635-5"
>>> }
>>> ]'
>>> {"responseHeader":{"status":0,"QTime":8}}
>>> $ curl http://localhost:8983/solr/demo/get?id=book1
>>> {
>>> "doc":
>>> {
>>> "id":"book1",
>>> "title_t":["The Way of Kings"],
>>> "author_s":"Brandon Sanderson",
>>> "cat_s":"fantasy",
>>> "pubyear_i":2010,
>>> "ISBN_s":"978-0-7653-2635-5",
>>> "_version_":1535488016326328320}}
>>>
>>>
>>>
>>>
>>> If I add a new document, something isn’t quite working:
>>>
>>> $ curl http://localhost:8983/solr/demo/update -d '
>>> [
>>> {"id" : "book7",
>>> "title_t":["The Way of Kings"],
>>> "author_s":"Brandon Sanderson",
>>> "cat_s":"",
>>> "pubyear_i":2010,
>>> "ISBN_s":"978-0-7653-2635-5"
>>> }
>>> ]'
>>> {"responseHeader":{"status":0,"QTime":0}}
>>> $ curl http://localhost:8983/solr/demo/get?id=book7
>>> {
>>> "doc":null}
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> --------------------------
>>>
>>> Benedetti Alessandro
>>> Visiting card : http://about.me/alessandro_benedetti
>>>
>>> "Tyger, tyger burning bright
>>> In the forests of the night,
>>> What immortal hand or eye
>>> Could frame thy fearful symmetry?"
>>>
>>> William Blake - Songs of Experience -1794 England
>>>
>>>
>>>
>
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card - http://about.me/alessandro_benedetti
> Blog - http://alexbenedetti.blogspot.co.uk
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
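A note on the guard quoted above (`documentClass == null`): the small Java sketch below only illustrates why an empty string and a missing value are treated differently by such a check. It uses a plain `Map` as a hypothetical stand-in for the Solr input document; the class name `ClassFieldGuardSketch` and the method `wouldClassify` are made up for illustration and are not part of Solr.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the guard quoted in the thread; this is not Solr code.
class ClassFieldGuardSketch {

    // Mirrors the quoted check: classify only when the class field has no value at all.
    static boolean wouldClassify(Map<String, Object> doc, String classFieldName) {
        Object documentClass = doc.get(classFieldName);
        return documentClass == null;
    }

    public static void main(String[] args) {
        // Class field left out entirely.
        Map<String, Object> omitted = new HashMap<>();
        omitted.put("id", "book14");

        // Class field present but set to the empty string.
        Map<String, Object> empty = new HashMap<>(omitted);
        empty.put("cat_s", "");

        // Class field already labelled by hand.
        Map<String, Object> labelled = new HashMap<>(omitted);
        labelled.put("cat_s", "fantasy");

        System.out.println("omitted  -> classify? " + wouldClassify(omitted, "cat_s"));
        System.out.println("empty    -> classify? " + wouldClassify(empty, "cat_s"));
        System.out.println("labelled -> classify? " + wouldClassify(labelled, "cat_s"));
    }
}
```

Under this check, a document posted with "cat_s":"" already carries a value, so a guard of this shape would skip it; only a document whose class field is absent (or null) would be handed to the classifier.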
Re: A working example to play with Naive Bayes classifier
Posted by koponk <ar...@gdplabs.id>.
Hi, I have a problem implementing this Solr classification.
This is my schema:
<field name="pagetext_mlt" type="text_mlt" indexed="true" stored="true"
required="false" multiValued="false" termVectors="true"/>
<field name="knn_tags" type="string" indexed="true" stored="true"
required="false" multiValued="true"/>
<fieldType name="string" class="solr.StrField" sortMissingLast="true"
docValues="true" useDocValuesAsStored="true"/>
<fieldType name="text_mlt" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="lang/stopwords_id.txt"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
And this is my solrconfig:
<requestHandler name="/update" class="solr.UpdateRequestHandler">
<lst name="defaults">
<str name="update.chain">classi</str>
</lst>
</requestHandler>
<updateRequestProcessorChain name="classi">
<processor class="solr.ClassificationUpdateProcessorFactory">
<str name="inputFields">pagetext_mlt</str>
<str name="classField">knn_tags</str>
<str name="predictedClassField">prebayes_tags</str>
<field name="prebayes_tags" type="string" indexed="true" stored="true"
required="false" multiValued="true"/>
<str name="algorithm">bayes</str>
</processor>
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
But this is not working. Steps:
1. insert document A with pagetext_mlt="something A" and knn_tags="aaa"
2. insert document B with pagetext_mlt="something B" and knn_tags="bbb"
3. insert document C with pagetext_mlt="something B" and knn_tags=null
But the field prebayes_tags is always empty (I can't see it, even though the
field is stored). Is there something I'm missing?
Thanks,
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: A working example to play with Naive Bayes classifier
Posted by Alessandro Benedetti <ab...@apache.org>.
But how big is your index? Are you expecting Solr to automatically
classify your documents without any knowledge base?
Please attach an example of your schema.
There was a reason I asked for it :)
This seems related to the fact that we get no tokens from the text analysis.
Cheers
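
To make the "knowledge base" point concrete, here is a toy naive Bayes classifier in Java. It is a minimal sketch, not Lucene's SimpleNaiveBayesDocumentClassifier: the class name `ToyNaiveBayes`, the whitespace tokenizer, and the add-one smoothing are all assumptions chosen for illustration. It shows that the classifier can only pick among classes it has already seen on labelled documents, and returns nothing when there are none.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy multinomial naive Bayes with add-one smoothing (illustration only, not Lucene's code).
class ToyNaiveBayes {
    private final Map<String, Integer> docsPerClass = new HashMap<>();
    private final Map<String, Map<String, Integer>> termCounts = new HashMap<>();
    private final Map<String, Integer> tokensPerClass = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Assumed analysis step: lowercase and split on whitespace.
    private static String[] tokenize(String text) {
        return text.toLowerCase().trim().split("\\s+");
    }

    // Add one labelled document to the "knowledge base".
    void train(String text, String label) {
        totalDocs++;
        docsPerClass.merge(label, 1, Integer::sum);
        Map<String, Integer> counts = termCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            if (token.isEmpty()) continue;
            counts.merge(token, 1, Integer::sum);
            tokensPerClass.merge(label, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    // Return the most probable known class, or null when nothing has been learned.
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docsPerClass.keySet()) {
            // log P(class), estimated from the share of labelled documents
            double score = Math.log(docsPerClass.get(label) / (double) totalDocs);
            Map<String, Integer> counts = termCounts.get(label);
            int classTokens = tokensPerClass.getOrDefault(label, 0);
            // The extra +1 reserves a slot for unseen terms and keeps the denominator non-zero.
            double denominator = classTokens + vocabulary.size() + 1;
            for (String token : tokenize(text)) {
                if (token.isEmpty()) continue;
                int count = counts.getOrDefault(token, 0);
                score += Math.log((count + 1.0) / denominator); // log P(term | class)
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        ToyNaiveBayes nb = new ToyNaiveBayes();
        System.out.println("untrained: " + nb.classify("the way of kings"));
        nb.train("the way of kings brandon sanderson", "fantasy");
        nb.train("a brief history of time stephen hawking", "science");
        System.out.println("trained:   " + nb.classify("words of radiance brandon sanderson"));
    }
}
```

The same reasoning applies to the index: unless some indexed documents already have the class field filled in, and the analysis of the input fields yields tokens, there is nothing for the classifier to learn from.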
On Fri, Jul 15, 2016 at 12:11 PM, Tomas Ramanauskas <
Tomas.Ramanauskas@springer.com> wrote:
> Hi, Alessandro,
>
> sorry for the delay. What do you mean?
>
>
> As I mentioned earlier, I followed a super simple set of steps.
>
> 1. Download Solr
> 2. Configure classification
> 3. Create some documents using curl over HTTP.
>
>
> Is it difficult to reproduce the steps / problem?
>
>
> Tomas
>
>
>
> > On 23 Jun 2016, at 16:42, Alessandro Benedetti <
> benedetti.alex85@gmail.com> wrote:
> >
> > Can you give an example of your schema, and can you run a simple query against
> > your index? I'm curious to see how the input fields are analyzed.
> >
> > Cheers
> >
> > On Wed, Jun 22, 2016 at 6:05 PM, Alessandro Benedetti <
> > benedetti.alex85@gmail.com> wrote:
> >
> >> This is better! At least the classifier is invoked!
> >> How many docs in the index have the class assigned?
> >> Take a look at the stack trace and you should find the cause!
> >> I am on mobile now; I will check the code tomorrow!
> >> Cheers
> >> On 22 Jun 2016 5:26 pm, "Tomas Ramanauskas" <
> >> Tomas.Ramanauskas@springer.com> wrote:
> >>
> >>>
> >>> I also tried with this config (adding **):
> >>>
> >>>
> >>> <initParams path="/update/**">
> >>> <lst name="defaults">
> >>> <str name="update.chain">classification</str>
> >>> </lst>
> >>> </initParams>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> And I get the error:
> >>>
> >>>
> >>>
> >>> $ curl http://localhost:8983/solr/demo/update -d '
> >>> [
> >>> {"id" : "book15",
> >>> "title_t":["The Way of Kings"],
> >>> "author_s":"Brandon Sanderson",
> >>> "cat_s": null,
> >>> "pubyear_i":2010,
> >>> "ISBN_s":"978-0-7653-2635-5"
> >>> }
> >>> ]'
> >>>
> >>> {"responseHeader":{"status":500,"QTime":29},"error":{"trace":"java.lang.NullPointerException\n\tat
> >>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.getTokenArray(SimpleNaiveBayesDocumentClassifier.java:202)\n\tat
> >>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.analyzeSeedDocument(SimpleNaiveBayesDocumentClassifier.java:162)\n\tat
> >>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignNormClasses(SimpleNaiveBayesDocumentClassifier.java:121)\n\tat
> >>> org.apache.lucene.classification.document.SimpleNaiveBayesDocumentClassifier.assignClass(SimpleNaiveBayesDocumentClassifier.java:81)\n\tat
> >>> org.apache.solr.update.processor.ClassificationUpdateProcessor.processAdd(ClassificationUpdateProcessor.java:94)\n\tat
> >>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:474)\n\tat
> >>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:138)\n\tat
> >>> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:114)\n\tat
> >>> org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:77)\n\tat
> >>> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)\n\tat
> >>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)\n\tat
> >>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)\n\tat
> >>> org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)\n\tat
> >>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)\n\tat
> >>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)\n\tat
> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)\n\tat
> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)\n\tat
> >>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)\n\tat
> >>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)\n\tat
> >>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
> >>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
> >>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
> >>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)\n\tat
> >>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)\n\tat
> >>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
> >>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)\n\tat
> >>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
> >>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
> >>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
> >>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
> >>> org.eclipse.jetty.server.Server.handle(Server.java:518)\n\tat
> >>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)\n\tat
> >>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)\n\tat
> >>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\n\tat
> >>> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\n\tat
> >>> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat
> >>> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)\n\tat
> >>> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)\n\tat
> >>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)\n\tat
> >>> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)\n\tat
> >>> java.lang.Thread.run(Thread.java:745)\n","code":500}}
> >>>
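The error response above carries the whole Java stack trace as one JSON string, with frames separated by an escaped newline, a tab and "at". A minimal Python sketch for pulling out the exception and the first few frames might look like this. The helper name top_frames and the shortened sample response are illustrative assumptions, not part of Solr:

```python
import json


def top_frames(response_text, limit=5):
    """Return the exception line plus up to `limit` stack frames.

    Hypothetical helper: it only assumes the JSON shape seen in the
    error response quoted in this thread ({"error": {"trace": "..."}}).
    """
    body = json.loads(response_text)
    trace = body.get("error", {}).get("trace", "")
    # After JSON decoding, every frame sits on its own line, indented by a tab.
    lines = [line.strip() for line in trace.splitlines() if line.strip()]
    return lines[: limit + 1]


# Shortened sample in the same shape as the response quoted above.
sample = json.dumps({
    "responseHeader": {"status": 500, "QTime": 29},
    "error": {
        "trace": "java.lang.NullPointerException\n\tat "
                 "org.apache.lucene.classification.document."
                 "SimpleNaiveBayesDocumentClassifier.getTokenArray"
                 "(SimpleNaiveBayesDocumentClassifier.java:202)\n\tat "
                 "org.apache.solr.update.processor."
                 "ClassificationUpdateProcessor.processAdd"
                 "(ClassificationUpdateProcessor.java:94)\n",
        "code": 500,
    },
})

for line in top_frames(sample):
    print(line)
```

Reading the first frames this way makes it easier to see that the exception is raised inside SimpleNaiveBayesDocumentClassifier.getTokenArray, reached from ClassificationUpdateProcessor.processAdd.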
> >>>
> >>> Tomas
> >>>
> >>>
> >>> On 22 Jun 2016, at 17:22, Tomas Ramanauskas <Tomas.Ramanauskas@springer.com> wrote:
> >>>
> >>> Thanks for the response, Alessandro.
> >>>
> >>> I tried this and it didn’t work either:
> >>>
> >>>
> >>>
> >>> $ curl http://localhost:8983/solr/demo/update -d '
> >>> [
> >>> {"id" : "book14",
> >>> "title_t":["The Way of Kings"],
> >>> "author_s":"Brandon Sanderson",
> >>> "cat_s": null,
> >>> "pubyear_i":2010,
> >>> "ISBN_s":"978-0-7653-2635-5"
> >>> }
> >>> ]'
> >>>
> >>> {"responseHeader":{"status":0,"QTime":2}}
> >>>
> >>> $ curl http://localhost:8983/solr/demo/get?id=book14
> >>> {
> >>> "doc":
> >>> {
> >>> "id":"book14",
> >>> "title_t":["The Way of Kings"],
> >>> "author_s":"Brandon Sanderson",
> >>> "pubyear_i":2010,
> >>> "ISBN_s":"978-0-7653-2635-5",
> >>> "_version_":1537854598189940736}}
> >>>
> >>>
> >>> I don’t see “cat_s” field in the results at all.
> >>>
> >>>
> >>> Tomas
> >>>
> >>>
> >>> On 22 Jun 2016, at 16:39, Alessandro Benedetti <abenedetti@apache.org> wrote:
> >>>
> >>> Hi Tomas,
> >>> first consideration:
> >>> an empty string is different from a NULL string.
> >>> This is a controversial point, but I would suggest that you never use the
> >>> empty string, as it can cause other side effects.
> >>> Apart from that, the plugin will add the class only if the class field
> >>> has no value:
> >>>
> >>> Object documentClass = doc.getFieldValue(classFieldName);
> >>> if (documentClass == null) {
> >>>
> >>> That said, I would suggest that you build a sample index with some
> >>> documents and then try to classify.
> >>> If this doesn't solve your issue, I can help you further.
> >>>
> >>> Cheers
> >>>
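The distinction Alessandro draws between an empty string and a missing value can be sketched on the client side. The snippet below is a minimal Python illustration, not part of Solr: the helper prepare_for_classification is hypothetical, and it assumes that a document sent without any cat_s field is what the `documentClass == null` check treats as unclassified.

```python
import json

CLASS_FIELD = "cat_s"  # the classField used in the configuration in this thread


def prepare_for_classification(doc, class_field=CLASS_FIELD):
    """Return a copy of `doc` with a blank class value dropped entirely.

    Hypothetical helper: an empty or whitespace-only string is removed so
    that the field is absent rather than set to "".
    """
    cleaned = dict(doc)
    value = cleaned.get(class_field)
    if value is None or (isinstance(value, str) and value.strip() == ""):
        cleaned.pop(class_field, None)
    return cleaned


# A labelled document (training data) and an unlabelled one to classify.
labelled = {
    "id": "book1",
    "title_t": ["The Way of Kings"],
    "author_s": "Brandon Sanderson",
    "cat_s": "fantasy",
}
unlabelled = {
    "id": "book7",
    "title_t": ["The Way of Kings"],
    "author_s": "Brandon Sanderson",
    "cat_s": "",
}

# JSON body that could be passed to curl -d for the /update handler.
payload = json.dumps(
    [prepare_for_classification(d) for d in (labelled, unlabelled)]
)
print(payload)
```

The labelled document keeps its category, while the unlabelled one is sent with no cat_s key at all instead of an empty string.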
> >>> On Wed, Jun 22, 2016 at 3:45 PM, Tomas Ramanauskas <Tomas.Ramanauskas@springer.com> wrote:
> >>>
> >>> I also tried this configuration, but couldn't get the feature to work:
> >>>
> >>>
> >>>
> >>> <initParams path="/update/">
> >>> <lst name="defaults">
> >>> <str name="update.chain">classification</str>
> >>> </lst>
> >>> </initParams>
> >>>
> >>>
> >>> <updateRequestProcessorChain name="classification">
> >>> <processor class="solr.ClassificationUpdateProcessorFactory">
> >>> <str name="inputFields">title_t,author_s</str>
> >>> <str name="classField">cat_s</str>
> >>> <str name="algorithm">bayes</str>
> >>> </processor>
> >>> </updateRequestProcessorChain>
> >>>
> >>>
> >>> Tomas
> >>>
> >>> On 22 Jun 2016, at 13:46, Tomas Ramanauskas <Tomas.Ramanauskas@springer.com> wrote:
> >>>
> >>> P.S. The version I use:
> >>>
> >>> 6.1.0-68
> >>>
> >>> Also, earlier I said “If I modify an existing record, I think the
> >>> functionality works:”, but I think it doesn’t work for me at all.
> >>>
> >>> $ curl http://localhost:8983/solr/demo/get?id=book1
> >>> {
> >>> "doc":
> >>> {
> >>> "id":"book1",
> >>> "title_t":["The Way of Kings"],
> >>> "author_s":"Brandon Sanderson",
> >>> "cat_s":"fantasy",
> >>> "pubyear_i":2010,
> >>> "ISBN_s":"978-0-7653-2635-5",
> >>> "_version_":1535488016326328320}}
> >>>
> >>> $ curl http://localhost:8983/solr/demo/update -d '
> >>> [
> >>> {"id" : "book1",
> >>> "title_t":["The Way of Kings"],
> >>> "author_s":"Brandon Sanderson",
> >>> "cat_s":"aaa",
> >>> "pubyear_i":2010,
> >>> "ISBN_s":"978-0-7653-2635-5"
> >>> }
> >>> ]'
> >>> {"responseHeader":{"status":0,"QTime":0}}
> >>>
> >>> $ curl http://localhost:8983/solr/demo/get?id=book1
> >>> {
> >>> "doc":
> >>> {
> >>> "id":"book1",
> >>> "title_t":["The Way of Kings"],
> >>> "author_s":"Brandon Sanderson",
> >>> "cat_s":"fantasy",
> >>> "pubyear_i":2010,
> >>> "ISBN_s":"978-0-7653-2635-5",
> >>> "_version_":1535488016326328320}}
> >>>
> >>>
> >>> Tomas
> >>>
> >>>
> >>> On 22 Jun 2016, at 12:47, Tomas Ramanauskas <Tomas.Ramanauskas@springer.com> wrote:
> >>>
> >>> Hi, everyone,
> >>>
> >>>
> >>> would someone be able to share a working example (step by step) that
> >>> demonstrates the use of Naive Bayes classifier in Solr?
> >>>
> >>>
> >>> I followed this Blog post:
> >>>
> >>>
> >>>
> >>> https://alexbenedetti.blogspot.co.uk/2015/07/solr-document-classification-part-1.html?showComment=1464358093048#c2489902302085000947
> >>>
> >>> And this tutorial:
> >>> http://yonik.com/solr-tutorial/
> >>>
> >>> And this JIRA ticket:
> >>> https://issues.apache.org/jira/browse/SOLR-7739
> >>>
> >>>
> >>>
> >>> So this is my configuration file (only what I added or modified):
> >>>
> >>> <initParams path="/update/**">
> >>> <lst name="defaults">
> >>> <str name="update.chain">classification</str>
> >>> </lst>
> >>> </initParams>
> >>>
> >>>
> >>> <updateRequestProcessorChain name="classification">
> >>> <processor class="solr.ClassificationUpdateProcessorFactory">
> >>> <str name="inputFields">title_t,author_s</str>
> >>> <str name="classField">cat_s</str>
> >>> <str name="algorithm">bayes</str>
> >>> </processor>
> >>> </updateRequestProcessorChain>
> >>>
> >>>
> >>>
> >>> If I modify an existing record, I think the functionality works:
> >>>
> >>>
> >>> $ curl http://localhost:8983/solr/demo/update -d '
> >>> [
> >>> {"id" : "book1",
> >>> "title_t":["The Way of Kings"],
> >>> "author_s":"Brandon Sanderson",
> >>> "cat_s":"",
> >>> "pubyear_i":2010,
> >>> "ISBN_s":"978-0-7653-2635-5"
> >>> }
> >>> ]'
> >>> {"responseHeader":{"status":0,"QTime":8}}
> >>> $ curl http://localhost:8983/solr/demo/get?id=book1
> >>> {
> >>> "doc":
> >>> {
> >>> "id":"book1",
> >>> "title_t":["The Way of Kings"],
> >>> "author_s":"Brandon Sanderson",
> >>> "cat_s":"fantasy",
> >>> "pubyear_i":2010,
> >>> "ISBN_s":"978-0-7653-2635-5",
> >>> "_version_":1535488016326328320}}
> >>>
> >>>
> >>>
> >>>
> >>> If I add a new document, something isn’t quite working:
> >>>
> >>> $ curl http://localhost:8983/solr/demo/update -d '
> >>> [
> >>> {"id" : "book7",
> >>> "title_t":["The Way of Kings"],
> >>> "author_s":"Brandon Sanderson",
> >>> "cat_s":"",
> >>> "pubyear_i":2010,
> >>> "ISBN_s":"978-0-7653-2635-5"
> >>> }
> >>> ]'
> >>> {"responseHeader":{"status":0,"QTime":0}}
> >>> $ curl http://localhost:8983/solr/demo/get?id=book7
> >>> {
> >>> "doc":null}
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> --------------------------
> >>>
> >>> Benedetti Alessandro
> >>> Visiting card : http://about.me/alessandro_benedetti
> >>>
> >>> "Tyger, tyger burning bright
> >>> In the forests of the night,
> >>> What immortal hand or eye
> >>> Could frame thy fearful symmetry?"
> >>>
> >>> William Blake - Songs of Experience -1794 England
> >>>
> >>>
> >>>