Posted to user@manifoldcf.apache.org by Jan Høydahl <ja...@cominvent.com> on 2011/08/22 16:35:50 UTC

Last-modified from Web crawler

Hi,

How can we have the Web connector send the last-modified value from a page's HTTP header to the output connector?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com


Re: Last-modified from Web crawler

Posted by Karl Wright <da...@gmail.com>.
The metadata field name is "header-" plus the actual header name, e.g.
"header-Last-Modified".  You should see it go by in the Solr logs if you
are using the standard logging options for Solr.

Karl
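
A minimal sketch of the naming convention described above, written in Python purely for illustration. This is not ManifoldCF code: the function name headers_to_metadata and the sample header values are hypothetical, and it assumes the header name is kept exactly as the server sent it.

```python
def headers_to_metadata(headers):
    """Build metadata field names as "header-" + header name.

    `headers` is a list of (name, value) pairs, since an HTTP response
    may repeat a header. Hypothetical helper, not part of ManifoldCF.
    """
    metadata = {}
    for name, value in headers:
        # Collect repeated headers under the same metadata field name.
        metadata.setdefault("header-" + name, []).append(value)
    return metadata


# Sample response headers (made-up values for illustration).
response_headers = [
    ("Content-Type", "text/html; charset=UTF-8"),
    ("Last-Modified", "Mon, 22 Aug 2011 14:35:50 GMT"),
]

metadata = headers_to_metadata(response_headers)
print(metadata["header-Last-Modified"])
```

The field to look for on the output side would then be "header-Last-Modified".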


On Mon, Sep 5, 2011 at 6:34 PM, Jan Høydahl <ja...@cominvent.com> wrote:
> Hi,
>
> What would the name of the metadata from HTTP headers be? Could you give an example for the Last-Modified header?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 26. aug. 2011, at 10:35, Karl Wright wrote:
>
>> Jan, did this work for you?
>> Karl
>>
>> On Wed, Aug 24, 2011 at 6:54 AM, Karl Wright <da...@gmail.com> wrote:
>>> If I recall correctly, the Solr output connector has a tab that lets you
>>> map "incoming" metadata to whatever Solr field name you want.  It's called
>>> the Solr Field Mapping tab, and you set it on each job that indexes to
>>> a Solr output connection.  Give it a try and see if it works for you.
>>>
>>> Karl
>>>
>>>
>>> On Wed, Aug 24, 2011 at 4:38 AM, Jan Høydahl <ja...@cominvent.com> wrote:
>>>> Wow, that was quick :)
>>>>
>>>> So, how can we now configure the job so that "Last-Modified" is sent to the Solr output connector as, e.g., literal.last_modified?
>>>>
>>>> --
>>>> Jan Høydahl, search solution architect
>>>> Cominvent AS - www.cominvent.com
>>>> Solr Training - www.solrtraining.com
>>>>
>>>> On 22. aug. 2011, at 17.09, Jan Høydahl wrote:
>>>>
>>>>> CONNECTORS-243
>>>>>
>>>>> --
>>>>> Jan Høydahl, search solution architect
>>>>> Cominvent AS - www.cominvent.com
>>>>> Solr Training - www.solrtraining.com
>>>>>
>>>>> On 22. aug. 2011, at 16.38, Karl Wright wrote:
>>>>>
>>>>>> It would have to be sent as a metadata field.  This should not be
>>>>>> difficult to implement.  Can you create a JIRA ticket for it please?
>>>>>>
>>>>>> Thanks,
>>>>>> Karl
>>>>>>
>>>>>> On Mon, Aug 22, 2011 at 10:35 AM, Jan Høydahl <ja...@cominvent.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> How can we have the Web connector send the last-modified value from a page's HTTP header to the output connector?
>>>>>>>
>>>>>>> --
>>>>>>> Jan Høydahl, search solution architect
>>>>>>> Cominvent AS - www.cominvent.com
>>>>>>> Solr Training - www.solrtraining.com
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>>
>>>
>
>

Re: Last-modified from Web crawler

Posted by Jan Høydahl <ja...@cominvent.com>.
Hi,

What would the name of the metadata from HTTP headers be? Could you give an example for the Last-Modified header?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com



Re: Last-modified from Web crawler

Posted by Karl Wright <da...@gmail.com>.
Jan, did this work for you?
Karl


Re: Last-modified from Web crawler

Posted by Karl Wright <da...@gmail.com>.
If I recall correctly, the Solr output connector has a tab that lets you
map "incoming" metadata to whatever Solr field name you want.  It's called
the Solr Field Mapping tab, and you set it on each job that indexes to
a Solr output connection.  Give it a try and see if it works for you.

Karl
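
A rough sketch of what such a field mapping amounts to, in Python for illustration only. The function names, the mapping dictionary, and the sample value are all hypothetical. It also assumes the mapped fields end up as "literal.<fieldname>" request parameters, which is the form Solr's extracting request handler accepts and the form used in the question. The real mapping is configured in the job's Solr Field Mapping tab, not in code.

```python
from urllib.parse import urlencode


def apply_field_mapping(metadata, field_mapping):
    """Rename metadata fields; unmapped fields keep their original name."""
    mapped = {}
    for name, values in metadata.items():
        target = field_mapping.get(name, name)
        # Merge values if two source fields map to the same target.
        mapped.setdefault(target, []).extend(values)
    return mapped


def to_literal_params(fields):
    """Flatten fields into ("literal.<field>", value) parameter pairs."""
    return [("literal." + name, value)
            for name, values in fields.items()
            for value in values]


# Hypothetical incoming metadata and job-level mapping.
metadata = {"header-Last-Modified": ["Mon, 22 Aug 2011 14:35:50 GMT"]}
field_mapping = {"header-Last-Modified": "last_modified"}

params = to_literal_params(apply_field_mapping(metadata, field_mapping))
print(urlencode(params))
```

With this mapping, the incoming "header-Last-Modified" value would be sent under the Solr field name "last_modified".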



Re: Last-modified from Web crawler

Posted by Jan Høydahl <ja...@cominvent.com>.
Wow, that was quick :)

So, how can we now configure the job so that "Last-Modified" is sent to the Solr output connector as, e.g., literal.last_modified?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com



Re: Last-modified from Web crawler

Posted by Jan Høydahl <ja...@cominvent.com>.
CONNECTORS-243

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com



Re: Last-modified from Web crawler

Posted by Karl Wright <da...@gmail.com>.
It would have to be sent as a metadata field.  This should not be
difficult to implement.  Can you create a JIRA ticket for it please?

Thanks,
Karl
