Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2014/07/06 12:27:45 UTC

[Nutch Wiki] Update of "FirstReport" by FjodorVershinin

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "FirstReport" page has been changed by FjodorVershinin:
https://wiki.apache.org/nutch/FirstReport?action=diff&rev1=6&rev2=7

  You can clone the Mercurial repository using the "hg clone https://feodorv@bitbucket.org/feodorv/uinutch" command, or download the latest version from this link: https://bitbucket.org/feodorv/uinutch/get/default.tar.gz
  
  == Future Actions ==
- '''To be completed by student'''
- 
+    * Add ability to get logs by REST API
+    * Implement generic crawl cycle in GUI
+    * Add ability to upload seed files (or post seed data) by REST API
+    
  == Mentors Comments ==
  It has been about three weeks since, as far as I am aware, Fjodor actually began coding on this project. He stated that he would be delaying the start due to exam commitments, and I was fine with this. My main concern was what the tangible outcome for mid-term reporting would be, given how quickly he could get up to speed with the rather complex nature of Nutch 2.X. As far as I am concerned, and based on his initial report above, I am pretty satisfied that he understands the high-level view of Nutch 2.X, including the layers: Nutch as the Hadoop-based application, Gora as the storage abstraction layer, HBase (amongst others) as the WebPage/Host storage mechanism, Solr as the indexing server, and Hadoop as the framework for running Nutch jobs on. This in itself takes most people much longer than 3 weeks, so I am reasonably happy with his progress. I would like Fjodor to be contributing his documentation at regular intervals to '''THIS WIKI'''; this serves a number of purposes