Posted to solr-user@lucene.apache.org by Shameema Umer <sh...@gmail.com> on 2012/06/06 10:12:08 UTC

sort by publishedDate and get published Date in solr query results

Hi,
Please help me sort by publishedDate and get publishedDate in Solr query
results. Do I need to install anything (such as a plugin)?

Thanks
Shameema

Re: sort by publishedDate and get published Date in solr query results

Posted by Shameema Umer <sh...@gmail.com>.
OK Jack. Will do.

On Wed, Jun 6, 2012 at 5:29 PM, Jack Krupansky <ja...@basetechnology.com> wrote:

> Check your Solr log file to see whether errors or warnings are issued. If
> Nutch is sending bogus date values, they should produce warnings.
>
> At this stage there are two strong possibilities:
>
> 1. Nutch is simply not sending that date field value at all.
> 2. Solr is rejecting the date field value because it is not in the required
> yyyy-MM-ddTHH:mm:ssZ format (for example, 2012-06-06T10:12:08Z).
>
> If #2, you need to go the update processor route I mentioned previously.
>
>
> -- Jack Krupansky

Re: sort by publishedDate and get published Date in solr query results

Posted by Jack Krupansky <ja...@basetechnology.com>.
Check your Solr log file to see whether errors or warnings are issued. If 
Nutch is sending bogus date values, they should produce warnings.

At this stage there are two strong possibilities:

1. Nutch is simply not sending that date field value at all.
2. Solr is rejecting the date field value because it is not in the required
yyyy-MM-ddTHH:mm:ssZ format (for example, 2012-06-06T10:12:08Z).

If it is #2, you will need to go the update processor route I mentioned previously.
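
A minimal sketch for telling the two possibilities apart on a sample value, assuming you can get hold of the raw date string that Nutch sends (for example from a log line). The helper name `is_solr_date` is hypothetical, and the pattern only covers the canonical UTC form with optional fractional seconds:

```python
import re
from datetime import datetime

# Canonical Solr date form: UTC, 'T' separator, trailing 'Z',
# optional fractional seconds (e.g. 2012-06-06T10:12:08Z).
_SOLR_DATE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z")


def is_solr_date(value: str) -> bool:
    """Return True if value is in the yyyy-MM-ddTHH:mm:ssZ form (hypothetical helper)."""
    if not _SOLR_DATE.fullmatch(value):
        return False
    try:
        # Reject impossible dates such as month 13 or day 32.
        datetime.strptime(value[:19], "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for sample in ("2012-06-06T10:12:08Z", "Wed, 06 Jun 2012 10:12:08 GMT"):
        print(sample, "->", is_solr_date(sample))
```

If a sample value fails this check, possibility #2 is the more likely one; if no value shows up at all, it points to #1.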

-- Jack Krupansky

-----Original Message----- 
From: Shameema Umer
Sent: Wednesday, June 06, 2012 7:37 AM
To: solr-user@lucene.apache.org
Subject: Re: sort by publishedDate and get published Date in solr query 
results

Versions: Nutch: 1.4 and Solr: 3.4

My schema file contains
<!-- fields for feed plugin (tag is also used by microformats-reltag)-->
        <field name="author" type="string" stored="true" indexed="true"/>
        <field name="tag" type="string" stored="true" indexed="true"
multiValued="true"/>
        <field name="feed" type="string" stored="true" indexed="true"/>
        <field name="publishedDate" type="date" stored="true"
            indexed="true"/>
        <field name="updatedDate" type="date" stored="true"
            indexed="true"/>


However, I do not know whether the feed plugin is working, as I am new to
Nutch and Solr.
Here is my query:
http://localhost:8983/solr/select/?q=title:'.$v.'
content:'.$v.'&sort=publishedDate desc&fl=tilte content url
publishedDate&start=0&rows=1&version=2.2&indent=on&hl=true&hl.fl=content&hl.fragsize=300'

But this does not return publishedDate in the results.

Should I post this on the Nutch users mailing list?

Thanks.




Re: sort by publishedDate and get published Date in solr query results

Posted by Shameema Umer <sh...@gmail.com>.
Versions: Nutch: 1.4 and Solr: 3.4

My schema file contains:
<!-- fields for feed plugin (tag is also used by microformats-reltag) -->
        <field name="author" type="string" stored="true" indexed="true"/>
        <field name="tag" type="string" stored="true" indexed="true" multiValued="true"/>
        <field name="feed" type="string" stored="true" indexed="true"/>
        <field name="publishedDate" type="date" stored="true" indexed="true"/>
        <field name="updatedDate" type="date" stored="true" indexed="true"/>


However, I do not know whether the feed plugin is working, as I am new to
Nutch and Solr.
Here is my query:
http://localhost:8983/solr/select/?q=title:'.$v.'
content:'.$v.'&sort=publishedDate desc&fl=tilte content url
publishedDate&start=0&rows=1&version=2.2&indent=on&hl=true&hl.fl=content&hl.fragsize=300'

But this does not return publishedDate in the results.
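
A minimal sketch of how the same request could be built with proper URL encoding, plus a second query that matches only documents holding some value in publishedDate. The function names are hypothetical, the highlighting and version parameters are left out for brevity, and `fl` uses "title" on the assumption that "tilte" in the query above is a typo:

```python
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8983/solr/select/"  # URL taken from the query above


def build_query(term: str) -> str:
    """Build the select URL; urlencode escapes spaces and punctuation."""
    params = {
        "q": f"title:{term} content:{term}",
        "sort": "publishedDate desc",
        # "title" is an assumption: the posted query says "tilte".
        "fl": "title content url publishedDate",
        "start": 0,
        "rows": 1,
        "indent": "on",
    }
    return SOLR_SELECT + "?" + urlencode(params)


def build_has_date_query() -> str:
    """Match only documents that have any value in publishedDate."""
    params = {"q": "publishedDate:[* TO *]", "fl": "url publishedDate", "rows": 5}
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_query("solr"))
    print(build_has_date_query())
```

If the second URL returns zero documents, the field was never indexed, and changing `sort` or `fl` in the first query will not make it appear.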

Should I post this on the Nutch users mailing list?

Thanks.


On Wed, Jun 6, 2012 at 4:52 PM, Jack Krupansky <ja...@basetechnology.com> wrote:

> Step 1: Verify that "publishedDate" is in fact the field name that Nutch
> uses for "published date".
>
> Step 2: Make sure that Nutch is passing the date in the format
> YYYY-MM-DDThh:mm:ssZ (UTC, for example 2012-06-06T10:12:08Z). Whether you
> need a "Nutch plugin" to do that is not a question for this Solr mailing
> list. My (very limited) understanding is that there was a Nutch plugin that
> worked for the old version of Nutch but was not updated for the new version.
>
> Step 3: Have you added the field "publishedDate" to your Solr schema with
> field type of "date" or "tdate"?
>
> If you can't figure out how to fix the problem on the Nutch side of the
> fence, then you will have to write a custom update processor for Solr. Solr
> 4.x has some new tools that should make that easier.
>
> See:
> https://issues.apache.org/jira/browse/SOLR-2802
>
> -- Jack Krupansky

Re: sort by publishedDate and get published Date in solr query results

Posted by Jack Krupansky <ja...@basetechnology.com>.
Step 1: Verify that "publishedDate" is in fact the field name that Nutch 
uses for "published date".

Step 2: Make sure that Nutch is passing the date in the format
YYYY-MM-DDThh:mm:ssZ (UTC, for example 2012-06-06T10:12:08Z). Whether you
need a "Nutch plugin" to do that is not a question for this Solr mailing
list. My (very limited) understanding is that there was a Nutch plugin that
worked for the old version of Nutch but was not updated for the new version.

Step 3: Have you added the field "publishedDate" to your Solr schema with 
field type of "date" or "tdate"?
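
A minimal sketch for checking Step 3 mechanically, assuming a schema fragment shaped like a Solr 3.x schema.xml. The inline snippet and the helper name `field_report` are illustrative stand-ins; a real file would be loaded with `ET.parse(path)` instead:

```python
import xml.etree.ElementTree as ET

# Inline stand-in for schema.xml (an assumption, not a real deployment file).
SCHEMA_SNIPPET = """
<schema>
  <fields>
    <field name="publishedDate" type="date" stored="true" indexed="true"/>
    <field name="updatedDate" type="date" stored="true" indexed="true"/>
  </fields>
</schema>
"""


def field_report(schema_xml: str, name: str):
    """Return the attributes of the named <field>, or None if it is absent."""
    root = ET.fromstring(schema_xml)
    for field in root.iter("field"):
        if field.get("name") == name:
            return dict(field.attrib)
    return None


info = field_report(SCHEMA_SNIPPET, "publishedDate")
print(info)
```

The field needs stored="true" to be returned in results and indexed="true" to be sortable, so both attributes are worth checking alongside the type.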

If you can't figure out how to fix the problem on the Nutch side of the
fence, then you will have to write a custom update processor for Solr. Solr 4.x
has some new tools that should make that easier.
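
A minimal sketch of the kind of conversion such a processor would perform. This is not a Solr update processor (those are Java classes that run inside Solr); it is a client-side stand-in in Python. It assumes the incoming value is an RFC 822 style date, as commonly found in feeds, and `to_solr_date` is a hypothetical name:

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def to_solr_date(rfc822_value: str) -> str:
    """Convert an RFC 822 style date to the UTC form Solr expects."""
    parsed = parsedate_to_datetime(rfc822_value)
    if parsed.tzinfo is None:
        # No usable zone in the input; treating it as UTC is an assumption.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    print(to_solr_date("Wed, 06 Jun 2012 15:42:08 +0530"))
```

Converting to UTC before formatting matters because the trailing Z in the Solr format means UTC, not local time.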

See:
https://issues.apache.org/jira/browse/SOLR-2802

-- Jack Krupansky

-----Original Message----- 
From: Shameema Umer
Sent: Wednesday, June 06, 2012 4:12 AM
To: solr-user@lucene.apache.org
Subject: sort by publishedDate and get published Date in solr query results

Hi,
Please help me sort by publishedDate and get publishedDate in Solr query
results. Do I need to install anything (such as a plugin)?

Thanks
Shameema