Posted to dev@mahout.apache.org by "Suneel Marthi (JIRA)" <ji...@apache.org> on 2015/10/25 05:43:27 UTC

[jira] [Work started] (MAHOUT-1775) FileNotFoundException caused by aborting the process of downloading Wikipedia dataset

     [ https://issues.apache.org/jira/browse/MAHOUT-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on MAHOUT-1775 started by Suneel Marthi.
---------------------------------------------
> FileNotFoundException caused by aborting the process of downloading Wikipedia dataset
> -------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1775
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1775
>             Project: Mahout
>          Issue Type: Bug
>          Components: Examples
>            Reporter: Bowei Zhang
>            Assignee: Suneel Marthi
>            Priority: Trivial
>             Fix For: 0.12.0
>
>
> When the script examples/bin/classify-wikipedia.sh is run for the first time, it creates a wikixml folder and starts fetching data via curl. If the download is aborted, later runs of the script will not extract the .bz2 file (extraction is guarded by a condition that the wikixml folder does not exist) and will go straight to running Mahout, which then fails with a FileNotFoundException.
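
One way to avoid the failure described above is to guard each step on the file that step produces, rather than on the existence of the wikixml folder. The following is a minimal sketch of that idea, not the actual contents of classify-wikipedia.sh or the fix applied for this issue. The function names and the ".part" suffix are hypothetical; the sketch assumes curl and bzip2 are installed.

```shell
#!/bin/sh
# Sketch only: all function names and the ".part" suffix are hypothetical
# and are not taken from classify-wikipedia.sh.

# Succeeds (exit status 0) when the extracted file is missing or empty.
# The check looks at the artifact itself, not at its parent directory, so
# a directory left behind by an aborted run does not skip extraction.
needs_extraction() {
  [ ! -s "$1" ]
}

# Download to a temporary name and rename only after curl succeeds, so an
# interrupted transfer never leaves a file that looks complete.
fetch_archive() {
  url="$1"
  archive="$2"
  [ -s "$archive" ] && return 0          # already downloaded
  mkdir -p "$(dirname "$archive")" || return 1
  if curl -fL -o "$archive.part" "$url"; then
    mv "$archive.part" "$archive"
  else
    rm -f "$archive.part"
    return 1
  fi
}

# Decompress to a temporary name for the same reason, keeping the archive.
extract_archive() {
  archive="$1"
  extracted="$2"
  needs_extraction "$extracted" || return 0   # already extracted
  if bzip2 -dc "$archive" > "$extracted.part"; then
    mv "$extracted.part" "$extracted"
  else
    rm -f "$extracted.part"
    return 1
  fi
}
```

A caller would run fetch_archive followed by extract_archive and start Mahout only if both return 0. Because each guard tests its own output file, re-running after an aborted download repeats only the step that did not finish.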



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)