Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2010/07/03 02:22:30 UTC

[Nutch Wiki] Update of "WritingPluginExample-0.9" by Ramprasad Ramachandran

The "WritingPluginExample-0.9" page has been changed by Ramprasad Ramachandran.
http://wiki.apache.org/nutch/WritingPluginExample-0.9?action=diff&rev1=11&rev2=12

--------------------------------------------------

  </project>
  }}}
  
+ For Nutch-1.0, use the following build file instead:
+ 
+ {{{
+ <?xml version="1.0"?>
+ 
+ <project name="recommended" default="jar-core">
+ 
+   <import file="../build-plugin.xml"/>
+   
+   <!-- Build compilation dependencies -->
+   <target name="deps-jar">
+     <ant target="jar" inheritall="false" dir="../lib-xml"/>
+   </target>
+ 
+   <!-- Add compilation dependencies to classpath -->
+   <path id="plugin.deps">
+     <fileset dir="${nutch.root}/build">
+       <include name="**/lib-xml/*.jar" />
+     </fileset>
+   </path>
+ 
+   <!-- Deploy unit test dependencies -->
+   <target name="deps-test">
+     <ant target="deploy" inheritall="false" dir="../lib-xml"/>
+     <ant target="deploy" inheritall="false" dir="../nutch-extensionpoints"/>
+     <ant target="deploy" inheritall="false" dir="../protocol-file"/>
+   </target>
+ 
+   <!-- Copy test data for the JUnit test -->
+   <mkdir dir="${build.test}/data"/>
+   <copy file="data/recommended.html" todir="${build.test}/data"/>
+ </project>
+ }}}
+ 
  Save this file in directory [!YourCheckoutDir]/src/plugin/recommended
  
  == The HTML Parser Extension ==
  
+ NOTE: Nutch-1.0 users should save all Java files in the plugin's source directory, for example C:\nutch-1.0\src\plugin\recommended\src\java\org\apache\nutch\parse\recommended on Windows.
+ 
- This is the source code for the HTML Parser extension.  It tries to grab the contents of the recommended meta tag and add them to the document being parsed. On the directory above, create a file called RecommendedParser.java and add this as the contents:
+ This is the source code for the HTML Parser extension.  It tries to grab the contents of the recommended meta tag and add them to the document being parsed. In that directory, create a file called RecommendedParser.java and add this as the contents:
  
  {{{
  package org.apache.nutch.parse.recommended;
@@ -273, +310 @@

  }}}
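
To make the idea behind the parser extension concrete, here is a minimal standalone sketch of "grab the contents of the recommended meta tag". It is not the plugin's RecommendedParser and does not use any Nutch API. The class name RecommendedMetaSketch, the method extractRecommended, and the sample page are hypothetical. It also assumes double-quoted attributes with name before content, whereas a real parser would work on the parsed document rather than on raw text.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not part of Nutch. */
class RecommendedMetaSketch {

    // Assumption: the tag is written as <meta name="recommended" content="...">
    // with double quotes and the name attribute first.
    private static final Pattern META = Pattern.compile(
        "<meta\\s+name=\"recommended\"\\s+content=\"([^\"]*)\"\\s*/?>",
        Pattern.CASE_INSENSITIVE);

    /** Returns the content attribute of the first recommended meta tag, if any. */
    static Optional<String> extractRecommended(String html) {
        Matcher m = META.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample page invented for this sketch.
        String page = "<html><head>"
            + "<meta name=\"recommended\" content=\"recommended-content\"/>"
            + "</head><body/></html>";
        System.out.println(extractRecommended(page).orElse("(no recommended tag)"));
    }
}
```

Returning an Optional keeps the "tag is absent" case explicit, which loosely mirrors a parser that leaves the document unchanged when no recommended tag is found.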
  
  == Compiling the plugin ==
+ 
+ To install ant on Windows, refer to the [[http://ant.apache.org/manual/install.html|ant installation manual]].
  
  In order to build the plugin - or Nutch itself - you'll need ant.  If you're using Mac OS X you can easily get it via [[http://fink.sourceforge.net/|fink]].  Let's get junit while we're at it.