Posted to solr-user@lucene.apache.org by Pratik Patel <pr...@semandex.net> on 2017/05/16 17:01:51 UTC

Solr Carrot Clustering query with specific label in it

Hi,

When we run a Carrot clustering query on a set of Solr documents, we get
back a response of the following form.

<arr name="clusters">
    <lst>
      <arr name="labels">
        <str>DDR</str>
      </arr>
      <double name="score">3.9599865057283354</double>
      <arr name="docs">
        <str>TWINX2048-3200PRO</str>
        <str>VS1GB400C3</str>
        <str>VDBDB1A16</str>
      </arr>
    </lst>
    <lst>
      <arr name="labels">
        <str>iPod</str>
      </arr>
      <double name="score">11.959228467119022</double>
      <arr name="docs">
        <str>F8V7067-APL-KIT</str>
        <str>IW-02</str>
        <str>MA147LL/A</str>
      </arr>
    </lst>

    <!-- More clusters here, omitted. -->
</arr>

Each label (cluster) has a corresponding set of documents. The question is:
is it possible to make another Carrot clustering query that specifies a
particular label, so as to get back only the documents for that label?
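
Client-side, pulling the documents for one label out of a response that
has already been returned is straightforward. Below is a minimal sketch in
Python (standard library only) of that mapping. The names docs_by_label and
SAMPLE_RESPONSE are made up for illustration; this is not a Solr or Carrot2
API, and it only reads the XML shown above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the clustering response shown above (illustrative only).
SAMPLE_RESPONSE = """
<arr name="clusters">
  <lst>
    <arr name="labels"><str>DDR</str></arr>
    <double name="score">3.9599865057283354</double>
    <arr name="docs">
      <str>TWINX2048-3200PRO</str>
      <str>VS1GB400C3</str>
      <str>VDBDB1A16</str>
    </arr>
  </lst>
  <lst>
    <arr name="labels"><str>iPod</str></arr>
    <double name="score">11.959228467119022</double>
    <arr name="docs">
      <str>F8V7067-APL-KIT</str>
      <str>IW-02</str>
      <str>MA147LL/A</str>
    </arr>
  </lst>
</arr>
"""


def docs_by_label(clusters_xml):
    """Map each cluster label to the list of document ids in that cluster."""
    root = ET.fromstring(clusters_xml.strip())
    mapping = {}
    # Each <lst> child of the top-level <arr name="clusters"> is one cluster.
    for cluster in root.findall("lst"):
        labels = [s.text for s in cluster.findall("arr[@name='labels']/str")]
        docs = [s.text for s in cluster.findall("arr[@name='docs']/str")]
        # A cluster may carry more than one label; index it under each.
        for label in labels:
            mapping.setdefault(label, []).extend(docs)
    return mapping


if __name__ == "__main__":
    print(docs_by_label(SAMPLE_RESPONSE).get("iPod"))
```

This depends on having the clustering response in hand, which is exactly
the part that does not carry over to a follow-up query.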

In my use case, I am trying to write a streaming expression in which one of
the streams is the set of documents belonging to a label (Carrot cluster)
selected by the user. Hence, I cannot use the data present in the original
response object.

I have been exploring the Carrot2 documentation, but I can't seem to find
any option that lets you specify a label in the query. I am using Solr 6.4.1
in cloud mode, and the clustering algorithm is
"org.carrot2.clustering.lingo.LingoClusteringAlgorithm".

Thanks,

Pratik