Posted to user@nutch.apache.org by Otis Gospodnetic <og...@yahoo.com> on 2009/08/03 05:15:18 UTC

Re: Dumping Crawl DB with XML

Mario,

I think text is the only output format.
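
If XML is needed, one workaround is to post-process the text dump
(the one written by "bin/nutch readdb <crawldb> -dump <out_dir>").
Below is a minimal sketch in Python, not a Nutch feature. It assumes
each record starts with a line of the form "URL<TAB>Version: N" and
is followed by "Key: value" lines; check this against your own dump
first. The function names and the XML element names (crawldb, entry,
field) are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_dump(text):
    """Parse a crawldb text dump into a list of records.

    ASSUMPTION: a record starts with 'URL<TAB>Version: N' and continues
    with 'Key: value' lines. Adjust if your dump looks different.
    """
    records = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        if "\t" in line and line.split("\t", 1)[1].startswith("Version:"):
            # First line of a new record: split off the URL.
            url, line = line.split("\t", 1)
            current = {"url": url, "fields": {}}
            records.append(current)
        if current is None:
            continue  # ignore anything before the first record
        key, sep, value = line.partition(":")
        if sep:
            current["fields"][key.strip()] = value.strip()
    return records


def to_xml(records):
    """Serialize parsed records as XML (hypothetical element names)."""
    root = ET.Element("crawldb")
    for rec in records:
        entry = ET.SubElement(root, "entry", url=rec["url"])
        for key, value in rec["fields"].items():
            field = ET.SubElement(entry, "field", name=key)
            field.text = value
    return ET.tostring(root, encoding="unicode")
```

Usage would be along the lines of to_xml(parse_dump(open(path).read())),
where path points at a file inside the dump output directory.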

 Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: schroedi <sc...@gmail.com>
> To: nutch-user@lucene.apache.org
> Sent: Thursday, July 30, 2009 11:19:42 AM
> Subject: Dumping Crawl DB with XML
> 
> I found out how to dump the crawldb/linkdb into a text file.
> Are there other formats to dump into? Maybe XML?
> I think I read about a CSV format.
> 
> thanks in advance
> 
> Mario
> 
> -- 
> 
> Mario Schröder | http://www.finanz-checks.de
> Phone: +49 34464 62301 Cell: +49 163 27 09 807
> http://www.xing.com/go/invite/6035007.9c143c