Posted to dev@nutch.apache.org by "Kim Whitehall (JIRA)" <ji...@apache.org> on 2015/09/16 02:10:45 UTC

[jira] [Closed] (NUTCH-2100) Nutch dump command doesn't dump anything

     [ https://issues.apache.org/jira/browse/NUTCH-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kim Whitehall closed NUTCH-2100.
--------------------------------
    Resolution: Invalid

The command was used incorrectly. There is no bug. 
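
Judging from the log quoted below, FileDumper appears to treat every subdirectory of the -segment path as a segment and to look for content/part-00000/data inside it. Here -segment pointed at a single segment (20150915195411), so its parts (crawl_generate, content, and so on) were each processed as segments and skipped. A minimal Python sketch of a pre-flight check follows, assuming only the directory layout visible in the log; probed_paths and suggest_segment_arg are hypothetical helpers, not part of Nutch.

```python
from pathlib import Path


def probed_paths(segment_arg):
    """Paths FileDumper seems to look for, inferred from the log:
    <child>/content/part-00000/data for each child directory."""
    root = Path(segment_arg)
    return [
        child / "content" / "part-00000" / "data"
        for child in sorted(root.iterdir())
        if child.is_dir()
    ]


def suggest_segment_arg(segment_arg):
    """Return the directory that should probably be passed to -segment,
    or None if neither layout matches (hypothetical helper)."""
    root = Path(segment_arg)
    # Case 1: the argument is a directory of segments; at least one
    # child holds content data, so the argument is already usable.
    if any(p.exists() for p in probed_paths(root)):
        return root
    # Case 2: the argument is itself a single segment (the situation
    # in this issue); its parent is the directory of segments.
    if (root / "content" / "part-00000" / "data").exists():
        return root.parent
    return None
```

If suggest_segment_arg returns the parent of the path that was passed, rerunning the original command with that parent as the -segment value should let the dumper find the content data.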

> Nutch dump command doesn't dump anything
> ----------------------------------------
>
>                 Key: NUTCH-2100
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2100
>             Project: Nutch
>          Issue Type: Bug
>            Reporter: Kim Whitehall
>            Assignee: Chris A. Mattmann
>
> When running the command
> nutch dump -segment segment -outputDir dumpFolder -mimeStats
> I receive the following output:
> Dumper File Stats: 
> TOTAL Stats:
> [
> ]
> The log indicates that segments are being skipped. 
> Note that if I use nutch readseg -dump, I can see that the segment does contain content.
> The log is shown below:
> 2015-09-15 20:10:56,142 INFO  tools.FileDumper - Accepting all mimetypes.
> 2015-09-15 20:10:56,782 WARN  util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2015-09-15 20:10:57,057 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/crawl_generate]
> 2015-09-15 20:10:57,057 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/crawl_generate/content/part-00000/data]: no data directory present
> 2015-09-15 20:10:57,057 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/crawl_fetch]
> 2015-09-15 20:10:57,057 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/crawl_fetch/content/part-00000/data]: no data directory present
> 2015-09-15 20:10:57,058 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/content]
> 2015-09-15 20:10:57,058 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/content/content/part-00000/data]: no data directory present
> 2015-09-15 20:10:57,058 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/parse_text]
> 2015-09-15 20:10:57,058 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/parse_text/content/part-00000/data]: no data directory present
> 2015-09-15 20:10:57,058 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/parse_data]
> 2015-09-15 20:10:57,058 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/parse_data/content/part-00000/data]: no data directory present
> 2015-09-15 20:10:57,058 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/crawl_parse]
> 2015-09-15 20:10:57,058 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/crawl_parse/content/part-00000/data]: no data directory present
> 2015-09-15 20:10:57,059 INFO  tools.FileDumper - Dumper File Stats: 
> TOTAL Stats:
> [
> ]