Posted to dev@mahout.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/10/25 05:23:27 UTC

[jira] [Commented] (MAHOUT-1775) FileNotFoundException caused by aborting the process of downloading Wikipedia dataset

    [ https://issues.apache.org/jira/browse/MAHOUT-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973006#comment-14973006 ] 

ASF GitHub Bot commented on MAHOUT-1775:
----------------------------------------

Github user smarthi commented on the pull request:

    https://github.com/apache/mahout/pull/162#issuecomment-150891179
  
    LGTM


> FileNotFoundException caused by aborting the process of downloading Wikipedia dataset
> -------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1775
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1775
>             Project: Mahout
>          Issue Type: Bug
>          Components: Examples
>            Reporter: Bowei Zhang
>            Priority: Trivial
>
> When the script examples/bin/classify-wikipedia.sh is run for the first time, it creates a wikixml folder and starts fetching data via curl. If this download is aborted, later runs of the script will not extract the .bz2 file (since extraction is guarded by the condition that wikixml doesn't exist) and will go straight to running Mahout, which will definitely end up throwing a FileNotFoundException.
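
One possible way to avoid the failure described above is to guard on the extracted file rather than on the directory, and to download into a temporary file that is renamed only on success. The following is a minimal sketch under those assumptions, not the fix from the pull request: the function name prepare_dataset, the ".part" suffix, and the argument layout are all hypothetical and are not taken from classify-wikipedia.sh.

```shell
# Sketch only: prepare_dataset and the ".part" suffix are hypothetical names,
# not part of classify-wikipedia.sh.
#
# Usage: prepare_dataset DIR ARCHIVE_NAME FETCH_CMD...
# FETCH_CMD is run with the destination path appended as its last argument.
prepare_dataset() {
  local dir="$1" archive="$2"
  shift 2
  local target="$dir/$archive"
  local extracted="${target%.bz2}"

  mkdir -p "$dir"

  # Guard on the extracted file itself, not on the existence of the directory.
  if [ -s "$extracted" ]; then
    return 0
  fi

  if [ ! -s "$target" ]; then
    rm -f "$target.part"
    # Download to a temporary name; an aborted or failed download never
    # leaves a file that looks like a complete archive.
    "$@" "$target.part" || { rm -f "$target.part"; return 1; }
    mv "$target.part" "$target"
  fi

  # Extract to a temporary name as well, then rename on success.
  bzip2 -dc "$target" > "$extracted.part" || { rm -f "$extracted.part"; return 1; }
  mv "$extracted.part" "$extracted"
}
```

With curl, a call might look like `prepare_dataset wikixml dump.xml.bz2 curl -fsSL "$URL" -o`, where `dump.xml.bz2` and `$URL` are placeholders. Because the guard checks the extracted file, a run that follows an aborted download simply retries the download instead of skipping straight to Mahout.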



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)