Posted to user@nutch.apache.org by schroedi <sc...@gmail.com> on 2009/07/30 17:19:42 UTC

Dumping Crawl DB with XML

I found out how to dump the crawldb/linkdb into a text file.
Are there other formats to dump into? Maybe XML?
I think I read something about a CSV format.

thanks in advance

Mario

-- 

Mario Schröder | http://www.finanz-checks.de
Phone: +49 34464 62301 Cell: +49 163 27 09 807
http://www.xing.com/go/invite/6035007.9c143c



Re: Dumping Crawl DB with XML

Posted by Otis Gospodnetic <og...@yahoo.com>.
Mario,

I think text is the only output format.

 Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
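
If text really is the only output format, one workaround is to post-process the text dump yourself. Below is a minimal sketch in Python, not anything that ships with Nutch. It assumes the crawldb dump consists of records whose first line is the URL, a tab, and "Version: N", followed by "Key: value" lines (Status, Score, and so on); check this against your own dump before relying on it. The function names (parse_dump, to_csv, to_xml) and the XML element names (crawldb, entry, field) are made up for illustration. The linkdb dump has a different layout and is not covered here.

```python
import csv
import io
import xml.etree.ElementTree as ET


def parse_dump(text):
    """Parse a crawldb text dump into a list of dicts.

    Assumed layout (verify against your own dump):
      <url>\tVersion: 7
      Status: 2 (db_fetched)
      Score: 1.0
      ...
    """
    records = []
    current = None
    for line in text.splitlines():
        # A line containing "<TAB>Version:" is assumed to start a new record.
        if "\tVersion:" in line:
            url, _, rest = line.partition("\t")
            current = {"url": url}
            records.append(current)
            line = rest
        if current is None or not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    return records


def to_csv(records):
    """Render the records as CSV; columns are the union of all keys seen."""
    fields = []
    for rec in records:
        for key in rec:
            if key not in fields:
                fields.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


def to_xml(records):
    """Render the records as XML using hypothetical element names."""
    root = ET.Element("crawldb")
    for rec in records:
        entry = ET.SubElement(root, "entry", url=rec["url"])
        for key, value in rec.items():
            if key != "url":
                field = ET.SubElement(entry, "field", name=key)
                field.text = value
    return ET.tostring(root, encoding="unicode")
```

To use it, dump the crawldb first (for example with bin/nutch readdb crawl/crawldb -dump dumpdir, where the paths are placeholders), read the resulting text file into a string, and pass it to parse_dump, then to to_csv or to_xml. For a very large crawldb you would want to stream records instead of holding them all in memory.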


