Posted to user@nutch.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2012/08/16 23:00:49 UTC

Obtaining metadata extracted by parsefilter in 2.x

Hi,

I want to check that my parsefilter (which is similar to our
microformats rel-tag parsefilter) is filtering correctly, so I would
like to test the content of the WebPage metadata to confirm that
everything is working as intended.

Say my filter method mirrors MetaTagParser#filter e.g.

...
last three lines of method
..
ByteBuffer bb = ByteBuffer.wrap(sb.toString().getBytes());
page.putToMetadata(new Utf8(REL_TAG), bb);
return parse;

I would expect every rel="tag" to be put into the page metadata.
To check for this I've tried something similar to

page.getFromMetadata(new Utf8("Rel-Tag"));

however, I'm not sure how to store the result as an integer that
I can check against what I know should be there! Previously I've tried
other assertions using ByteBuffer, but to no avail.
Can someone help me out please?
Thank you very much in advance.
Lewis


-- 
Lewis

Re: Obtaining metadata extracted by parsefilter in 2.x

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Thank you Ferdy.

Lewis

On Fri, Aug 17, 2012 at 9:49 AM, Ferdy Galema <fe...@kalooga.com> wrote:
> Minor correction for first option: no casting is involved. Just unwrapping.



-- 
Lewis

Re: Obtaining metadata extracted by parsefilter in 2.x

Posted by Ferdy Galema <fe...@kalooga.com>.
Minor correction for first option: no casting is involved. Just unwrapping.
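
As a rough illustration of the unwrapping described above: the sketch below reverses ByteBuffer.wrap(String.getBytes()) by reading the buffer's bytes back into a String, with no cast involved. The class and method names are hypothetical, a plain ByteBuffer stands in for the value returned by page.getFromMetadata(...), and UTF-8 is an assumed encoding (the snippet in the original question relies on the platform default charset).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for a test class; not part of Nutch.
public class UnwrapSketch {

    // Copies the readable bytes out of the buffer and decodes them as UTF-8.
    // Working on duplicate() leaves the caller's position and limit untouched.
    static String unwrap(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the value a parse filter would have stored in the metadata.
        String expected = "http://example.org/tag/nutch";
        ByteBuffer stored = ByteBuffer.wrap(expected.getBytes(StandardCharsets.UTF_8));

        String actual = unwrap(stored);
        if (!expected.equals(actual)) {
            throw new AssertionError("unexpected metadata value: " + actual);
        }
        System.out.println(actual);
    }
}
```

Copying the remaining bytes is used here instead of calling array() directly, because array() returns the whole backing array regardless of the buffer's position and limit.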

On Fri, Aug 17, 2012 at 10:48 AM, Ferdy Galema <fe...@kalooga.com> wrote:

> Hi,
>
> You have two options to implement assertions:
>
> Either unwrap the ByteBuffer obtained from metadata and cast it to the right
> type, i.e. the same type you use for the expected value. In other words, just
> reverse the way you wrapped the value.
>
> Or wrap the expected value in the assertion in a ByteBuffer too, and
> compare the two buffers by comparing the byte[] arrays using an appropriate
> array compare function.
>
> Hope this is of any help.
>
> Ferdy.

Re: Obtaining metadata extracted by parsefilter in 2.x

Posted by Ferdy Galema <fe...@kalooga.com>.
Hi,

You have two options to implement assertions:

Either unwrap the ByteBuffer obtained from metadata and cast it to the right
type, i.e. the same type you use for the expected value. In other words, just
reverse the way you wrapped the value.

Or wrap the expected value in the assertion in a ByteBuffer too, and
compare the two buffers by comparing the byte[] arrays using an appropriate
array compare function.

Hope this is of any help.

Ferdy.
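
The second option above might be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, a plain ByteBuffer stands in for the value read back from the page metadata, and UTF-8 is assumed for the encoding. Arrays.equals is used as the array compare function.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for a test class; not part of Nutch.
public class BufferCompareSketch {

    // Copies the readable bytes of a buffer into a fresh array
    // without moving the original buffer's position.
    static byte[] toBytes(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return bytes;
    }

    // Wraps the expected value the same way the stored value was wrapped,
    // then compares the byte[] content of both buffers.
    static boolean sameContent(String expected, ByteBuffer actual) {
        ByteBuffer expectedBuffer = ByteBuffer.wrap(expected.getBytes(StandardCharsets.UTF_8));
        return Arrays.equals(toBytes(expectedBuffer), toBytes(actual));
    }

    public static void main(String[] args) {
        // Stand-in for the value a parse filter would have stored.
        ByteBuffer stored = ByteBuffer.wrap("nutch".getBytes(StandardCharsets.UTF_8));

        System.out.println(sameContent("nutch", stored)); // prints "true"
        System.out.println(sameContent("gora", stored));  // prints "false"
    }
}
```

In a JUnit 4 test the two byte[] arrays could instead be passed to assertArrayEquals, which reports the first differing index on failure.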
