Posted to dev@lucene.apache.org by cm...@apache.org on 2002/06/18 02:49:57 UTC

cvs commit: jakarta-lucene-sandbox/contributions/webcrawler-LARM CHANGES.txt

cmarschner    2002/06/17 17:49:57

  Modified:    contributions/webcrawler-LARM CHANGES.txt
  Log:
  added LuceneStorage
  
  Revision  Changes    Path
  1.3       +13 -1     jakarta-lucene-sandbox/contributions/webcrawler-LARM/CHANGES.txt
  
  Index: CHANGES.txt
  ===================================================================
  RCS file: /home/cvs/jakarta-lucene-sandbox/contributions/webcrawler-LARM/CHANGES.txt,v
  retrieving revision 1.2
  retrieving revision 1.3
  diff -u -r1.2 -r1.3
  --- CHANGES.txt	1 Jun 2002 18:55:15 -0000	1.2
  +++ CHANGES.txt	18 Jun 2002 00:49:57 -0000	1.3
  @@ -1,4 +1,16 @@
  -$id: $
  +$Id$
  +
  +2002-06-18 (cmarschner)
  +	* added an experimental version of Lucene storage. see FetcherMain.java for details on how to use it.
  +	  LuceneStorage simply saves all fields as specified in WebDocument. add a converter to the
  +	  storage pipeline before LuceneStorage to do preprocessing
  +
  +2002-06-17 (cmarschner)
  +	* moved HostInfo and HostManager to larm.net package
  +	* included URLNormalizer (todo: source code Docs)
  +	* changed filters to use normalized URLs when appropriate; 
  +	  logs contain normalized version of referer and URL now
  +	  (todo: change description of log format in technical_overview.rtf)
   
   2002-06-01 (cmarschner)
   	* divided Storage into LinkStorage and DocumentStorage
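
The 2002-06-18 entry describes a storage pipeline in which a converter runs
before LuceneStorage, which then saves every field it is handed. A minimal,
self-contained sketch of that arrangement follows. All names in it (Doc, Stage,
LowercaseTitleConverter, FieldDumpStorage, Pipeline) are hypothetical and are
not LARM's actual classes or signatures, and an in-memory list stands in for a
Lucene index; see FetcherMain.java for the real setup.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class StoragePipelineSketch {

    /** Hypothetical stand-in for a crawled document: a bag of named fields. */
    static class Doc {
        final Map<String, String> fields = new LinkedHashMap<>();
    }

    /** Hypothetical stage contract: return the document to pass on, or null to drop it. */
    interface Stage {
        Doc store(Doc doc);
    }

    /** Converter stage: preprocesses a field before the final storage sees it. */
    static class LowercaseTitleConverter implements Stage {
        public Doc store(Doc doc) {
            String title = doc.fields.get("title");
            if (title != null) {
                doc.fields.put("title", title.trim().toLowerCase(Locale.ROOT));
            }
            return doc;
        }
    }

    /** Final stage: saves all fields unchanged (assumption: a list replaces the index). */
    static class FieldDumpStorage implements Stage {
        final List<Map<String, String>> saved = new ArrayList<>();

        public Doc store(Doc doc) {
            saved.add(new LinkedHashMap<>(doc.fields));
            return doc;
        }
    }

    /** Runs the stages in the order they were added. */
    static class Pipeline implements Stage {
        private final List<Stage> stages = new ArrayList<>();

        Pipeline add(Stage stage) {
            stages.add(stage);
            return this;
        }

        public Doc store(Doc doc) {
            for (Stage stage : stages) {
                if (doc == null) {
                    break; // an earlier stage dropped the document
                }
                doc = stage.store(doc);
            }
            return doc;
        }
    }

    public static void main(String[] args) {
        FieldDumpStorage storage = new FieldDumpStorage();
        // The converter is added first, so it runs before the final storage.
        Pipeline pipeline = new Pipeline()
                .add(new LowercaseTitleConverter())
                .add(storage);

        Doc doc = new Doc();
        doc.fields.put("url", "http://example.org/");
        doc.fields.put("title", "  Hello World ");
        pipeline.store(doc);

        System.out.println(storage.saved.get(0));
    }
}
```

Because the final stage stores whatever fields arrive, any cleanup has to happen
in a stage that is added to the pipeline ahead of it; reversing the two add()
calls would save the untrimmed title.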
  
  
  
