Posted to solr-user@lucene.apache.org by rks_lucene <pp...@gmail.com> on 2012/02/23 11:27:46 UTC

Can this type of sorting/boosting be done by solr

Hi,

I have a journal article citation schema like this:
{  AT - article_title
   AID - article_id (Unique id)
   AREFS - article_references_list (List of article ids referred to/cited in
this article. Multi-valued)
   AA - Article Abstract
   ---
   other_article_stuff
   ...
}

So for example, in order to search for all those articles that refer to (cite)
article id 51643, I simply need to search for AREFS:51643 and it will give
me the list of articles that have 51643 listed in AREFS.

Now, I want to be able to search in the text of articles and sort the
results by "most referred" articles. How can I do this?

Say my search query is q=AT:metal and it gives me 1700 results. How can I
sort those 1700 results by the number of citations they have received from
other articles, most cited first?

I have been researching function queries to solve this but have been unable
to do so.

Thanks in advance.
Ritesh


--
View this message in context: http://lucene.472066.n3.nabble.com/Can-this-type-of-sorting-boosting-be-done-by-solr-tp3769315p3769315.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Can this type of sorting/boosting be done by solr

Posted by rks_lucene <pp...@gmail.com>.
Hi Chantal,

Yes, I have thought about the docfreq(field_name,'search_text') function,
but somehow I will have to dereference the article ids (AID) from the result
of the query into the sort. The query below does not work:

q=AT:metal&sort=docfreq(AREFS,$q.AID)

Is there a mistake in the query that I am missing, or is dereferencing not
supported in Relevance Functions?
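
One workaround that does not depend on dereferencing inside the sort is to
re-rank on the client. The following is only a sketch under assumptions: as
far as I can tell from the FunctionQuery wiki, docfreq() takes a constant term
(or a request parameter) rather than a per-document field value. The helper
names count_query and rerank are made up for illustration, and the counts
dictionary is assumed to have been filled from the numFound of one
q=AREFS:<id> request per hit.

```python
from urllib.parse import urlencode


def count_query(aid):
    """Build the query string for counting articles that cite `aid`.

    rows=0 because only the total hit count (numFound) is of interest.
    """
    return urlencode({"q": f"AREFS:{aid}", "rows": 0, "wt": "json"})


def rerank(hits, counts):
    """Return hits sorted by citation count, most cited first.

    hits   -- list of result documents, each a dict with an "AID" key
    counts -- dict mapping AID -> number of citing articles (assumed input)
    Articles missing from counts are treated as having zero citations.
    """
    return sorted(hits, key=lambda doc: counts.get(doc["AID"], 0), reverse=True)
```

This costs one extra request per hit (1700 for the example query), so it is
probably only practical when re-ranking a single page of results.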

Thanks,
Ritesh





Re: Can this type of sorting/boosting be done by solr

Posted by Chantal Ackermann <ch...@btelligent.de>.
Sorry to have misunderstood.
It seems the new Relevance Functions in Solr 4.0 might help - unless you
need to use an official release.

http://wiki.apache.org/solr/FunctionQuery#Relevance_Functions





Re: Can this type of sorting/boosting be done by solr

Posted by Lee Carroll <le...@googlemail.com>.
Have you looked at external file fields (ExternalFileField)?

 http://lucidworks.lucidimagination.com/display/solr/Solr+Field+Types#SolrFieldTypes-WorkingwithExternalFiles

You will need a separate process to compute the counts, and note the
limitation that updates become visible only after a commit, but I think it
would fit your use case.
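
A minimal sketch of such a counting process, under stated assumptions: the
field name citation_count is hypothetical, AID is assumed to be the key field,
and the articles are assumed to be available as dicts outside Solr. The output
follows the key=value line format that external file fields read from a file
named external_<fieldname> in the index data directory.

```python
from collections import Counter


def external_file_lines(articles):
    """Return 'AID=count' lines, one per cited article id.

    Each article is a dict with an "AREFS" list. A citing article is counted
    once per cited id, even if the id appears twice in its AREFS list.
    """
    counts = Counter()
    for article in articles:
        counts.update(set(article["AREFS"]))
    return [f"{aid}={counts[aid]}" for aid in sorted(counts)]


def write_external_file(path, articles):
    """Write the lines to `path`, e.g. <dataDir>/external_citation_count."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(external_file_lines(articles)) + "\n")
```

The field could then be referenced from a function query, for example in the
sort parameter, though the exact syntax should be checked against the Solr
version in use. Articles that nobody cites get no line here and would fall
back to the field's default value.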




Re: Can this type of sorting/boosting be done by solr

Posted by rks_lucene <pp...@gmail.com>.
Dear Chantal,

Thanks for your reply, but that's not what I was asking.

Let me explain. The size of the list in AREFS would give me how many records
are *referred by* an article and NOT how many records *refer to* an article.

Say article id 51463 was published in 2002 and refers to 10 articles dating
from 1990-2002. Then the count of AREFS would be 10, which is static once the
journal has been published.

However, if the same article is being *referred to* by 20 articles published
from 2003-2012, then I am talking about this count of 20. This count is
dynamic: as we keep adding records to the index, more articles will refer to
article 51463 in their AREFS field in the future.
/(Obviously when we are adding article 51463 to the index we have no clue
who will be referring to it in the future, so we cannot have another field in
it for this, nor can we update 51463 every time someone refers to it)/

So today, if I want to know which articles refer to 51463, I actually search
for this id in the AREFS field. The query is as simple as q=AREFS:51463; it
will give the list of articles from 2003 to 2012, and the result count would
be 20.
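
To make the distinction concrete, here is a small sketch on made-up data (the
ids other than 51463 and the function names are hypothetical). The first
function gives the static "refers to" count, the second the dynamic
"referred to by" count that the sort should use.

```python
from collections import Counter

# Toy index: each article lists the ids it cites in AREFS.
articles = [
    {"AID": "51463", "AREFS": ["100", "101"]},
    {"AID": "60001", "AREFS": ["51463", "100"]},
    {"AID": "60002", "AREFS": ["51463"]},
]


def refers_to_count(article):
    """Number of articles this article cites (fixed at publication)."""
    return len(article["AREFS"])


def cited_by_counts(articles):
    """Map each cited id to the number of articles citing it.

    This grows as new citing articles are added to the collection.
    """
    counts = Counter()
    for article in articles:
        counts.update(set(article["AREFS"]))
    return counts
```

Here refers_to_count(articles[0]) is the length of 51463's own AREFS list,
while cited_by_counts(articles)["51463"] is the number a q=AREFS:51463 search
would report.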

So back to the question: say my search query is q=AT:metal and it gives me
1700 results. How can I sort those 1700 results by the number of citations
they have received (to date) from others, most cited first (i.e., by the
number of results I would get if I individually searched for their ids in the
AREFS field)?

Hope this makes it clear. I feel this is a candidate for sorting/boosting by
a function query, but I am not able to figure it out.

Thanks
Ritesh  


Re: Can this type of sorting/boosting be done by solr

Posted by Chantal Ackermann <ch...@btelligent.de>.
Hi Ritesh,

You could add another field that contains the size of the list in the
AREFS field. This way you'd simply sort by that field in descending
order.

Should you update AREFS dynamically, you'd have to update the field with
the size, as well, of course.

Chantal
