Posted to user@nutch.apache.org by Israel <we...@gmail.com> on 2010/08/18 21:51:28 UTC

Plugin creative commons

Hello, I was reading about the Creative Commons ccNutch plugin. My
question is whether it is already included in Nutch as the creativecommons
plugin, or whether I have to download the ccNutch plugin separately. Also,
is the process transparent to the user? I read that it works with RDF.

Thank you very much

RE: Plugin creative commons

Posted by Markus Jelsma <ma...@buyways.nl>.
Hi,

 

I'm having a hard time understanding what you're asking, but Nutch uses Apache Tika [1] to extract content from various formats.

 

[1]: http://tika.apache.org/

 

Cheers,
 

Re: Plugin creative commons

Posted by Israel <we...@gmail.com>.
Hello, I was reading about the Creative Commons ccNutch plugin. My
question is whether it is already included in Nutch as the creativecommons
plugin, and whether the process is transparent to the user. I read that it
works with RDF. Is that true?

Thank you very much




Re: Plugin creative commons

Posted by Israel <we...@gmail.com>.
Thank you very much. I have already configured nutch-site. Is the
selection process transparent to the user? Does the plugin search all of
the resources' metadata?




Re: Plugin creative commons

Posted by Israel <we...@gmail.com>.
Thank you very much. I have already configured nutch-site. Is the
selection process transparent to the user? Does the plugin search all of
the resources' metadata?


Re: Plugin creative commons

Posted by André Ricardo <an...@gmail.com>.
Hello,

The Creative Commons plugin looks for a license in this order:
// 1st choice: subject in RDF
// 2nd: anchor w/ rel=license
// 3rd: anchor w/ CC license
See CCParseFilter.java in
"src/plugin/creativecommons/src/java/org/creativecommons".

It stores the license attributes (nc, nd, sa) in the index, in the "cc" field.
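
To make the selection order above concrete, here is a minimal sketch in plain Java. It is not the plugin's code: the class LicensePicker, its method names, and the way the attributes are read from the license URL are all assumptions made for illustration. It only shows the "first available candidate wins" idea and how "nc", "nd" and "sa" could be pulled out of a license URL.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, NOT part of Nutch: it only illustrates the priority order.
final class LicensePicker {

    // Returns the first candidate that is present, in the order listed above.
    static Optional<String> pick(String rdfSubject, String relLicenseHref, List<String> anchorHrefs) {
        if (rdfSubject != null && !rdfSubject.isEmpty()) {
            return Optional.of(rdfSubject);        // 1st choice: subject in RDF
        }
        if (relLicenseHref != null && !relLicenseHref.isEmpty()) {
            return Optional.of(relLicenseHref);    // 2nd: anchor with rel=license
        }
        for (String href : anchorHrefs) {          // 3rd: any anchor pointing at a CC license
            if (href.contains("creativecommons.org/licenses/")) {
                return Optional.of(href);
            }
        }
        return Optional.empty();
    }

    // Assumption: the attributes are the dash-separated codes after "/licenses/".
    static List<String> attributes(String licenseUrl) {
        List<String> found = new ArrayList<>();
        String marker = "/licenses/";
        int start = licenseUrl.indexOf(marker);
        if (start < 0) {
            return found;
        }
        String rest = licenseUrl.substring(start + marker.length());
        int slash = rest.indexOf('/');
        String code = slash >= 0 ? rest.substring(0, slash) : rest;
        for (String part : code.split("-")) {
            if (part.equals("nc") || part.equals("nd") || part.equals("sa")) {
                found.add(part);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // No RDF subject here, so the rel=license anchor is chosen.
        Optional<String> license = pick(null,
                "http://creativecommons.org/licenses/by-nc-sa/3.0/",
                Collections.<String>emptyList());
        license.ifPresent(url -> System.out.println(url + " -> " + attributes(url)));
    }
}
```

In this sketch a by-nc-sa license URL yields the attributes nc and sa, which is the kind of value that would end up in the "cc" field.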

The Creative Commons plugin ships with Nutch (0.9 to 1.1 and so on).
To enable it, just add "creativecommons" to plugin.includes in
conf/nutch-site.xml, like this:

<property>
<name>plugin.includes</name>
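<!-- Keep the value on one line: it is a regular expression, so line breaks
     and trailing backslashes inside it would become part of the pattern. -->
<value>myplugins|protocol-http|urlfilter-regex|parse-(text|html|zip|swf|js)|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|creativecommons</value>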
</property>
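
As a rough illustration of why the exact text of plugin.includes matters, the sketch below treats the value as a regular expression and checks plugin ids against it. The class name PluginIncludesSketch and the helper isIncluded are made up for this example, and it assumes that a plugin is enabled when its whole id matches the pattern.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, NOT Nutch code.
final class PluginIncludesSketch {

    // Assumption: a plugin is enabled when its entire id matches the plugin.includes regex.
    static boolean isIncluded(String pluginIncludes, String pluginId) {
        return Pattern.compile(pluginIncludes).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        String includes = "protocol-http|urlfilter-regex|parse-(text|html|zip|swf|js)"
                + "|index-(basic|anchor)|creativecommons";
        for (String id : new String[] {"creativecommons", "parse-html", "parse-pdf"}) {
            System.out.println(id + " enabled: " + isIncluded(includes, id));
        }
    }
}
```

Under that assumption, stray whitespace around an alternative (for example " creativecommons" with a leading space) no longer matches the plugin id, which is why the value is safest written as a single unbroken line.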

Hope this helps,
André Ricardo

