Posted to user@nutch.apache.org by kiran chitturi <ch...@gmail.com> on 2012/09/20 20:33:38 UTC

multiple values for parse-metatags plugin

Hi,

As mentioned in the issue here
(https://issues.apache.org/jira/browse/NUTCH-1467), I have made another patch
for the parse-metatags plugin that parses multiple values and stores them in a
string array.
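
To illustrate the idea, here is a rough sketch of what "multiple values in a
string array" means. This is not the patch itself: the class name
MultiValueMetatags, the method collect, and the lower-casing of tag names are
all made up for illustration, and no Nutch API is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not the code from the NUTCH-1467 patch.
public class MultiValueMetatags {

    /**
     * Groups (name, content) pairs so that a meta tag name appearing more
     * than once keeps every value instead of only the last one.
     * Assumption: names are compared case-insensitively.
     */
    static Map<String, String[]> collect(String[][] tags) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] tag : tags) {
            String name = tag[0].toLowerCase(Locale.ROOT);
            grouped.computeIfAbsent(name, k -> new ArrayList<>()).add(tag[1]);
        }
        // Convert each list to a String[] so callers get a plain array.
        Map<String, String[]> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            result.put(e.getKey(), e.getValue().toArray(new String[0]));
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] tags = {
            {"keywords", "nutch"},
            {"Keywords", "crawler"},
            {"description", "demo"}
        };
        for (Map.Entry<String, String[]> e : collect(tags).entrySet()) {
            System.out.println(e.getKey() + " = " + Arrays.toString(e.getValue()));
        }
        // prints:
        // keywords = [nutch, crawler]
        // description = [demo]
    }
}
```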

I also updated the JUnit test to cover the new feature, but I am stuck on an
error when executing it.

When I execute 'TestMetatagParser.java'
<http://svn.apache.org/repos/asf/nutch/trunk/src/plugin/parse-metatags/src/test/org/apache/nutch/parse/html/TestMetatagParser.java>
from Eclipse, it keeps throwing this error:

org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=file
    at org.apache.nutch.protocol.ProtocolFactory.getProtocol(ProtocolFactory.java:80)
    at org.apache.nutch.parse.html.TestMetatagParser.testIt(TestMetatagParser.java:52)
    ..................


I am able to run other test files through Eclipse, but this one throws the
error above.

Can someone please tell me how to solve this? I can attach the patch to the
issue once I resolve this error. I have also specified protocol-file in
nutch-site.xml, but it did not help.

It looks like JUnit is not able to recognize the HTML file that is present?
Where should it be placed so that it is recognized?

Many Thanks,
-- 
Kiran Chitturi