Posted to dev@mahout.apache.org by "seth boyles (JIRA)" <ji...@apache.org> on 2012/07/20 02:11:34 UTC

[jira] [Created] (MAHOUT-1047) CVB hangs after completion

seth boyles created MAHOUT-1047:
-----------------------------------

             Summary: CVB hangs after completion
                 Key: MAHOUT-1047
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1047
             Project: Mahout
          Issue Type: Bug
          Components: Clustering
    Affects Versions: 0.7
         Environment: Ubuntu
            Reporter: seth boyles
            Priority: Minor
             Fix For: 0.8, 0.7


After the new LDA CVB implementation finishes running, the process hangs instead of terminating. Every other Mahout job I run exits normally.

Terminal output:


12/07/19 11:38:49 INFO mapred.LocalJobRunner: 
12/07/19 11:38:49 INFO mapred.Task: Task 'attempt_local_0022_m_000000_0' done.
12/07/19 11:38:49 INFO mapred.JobClient:  map 100% reduce 0%
12/07/19 11:38:49 INFO mapred.JobClient: Job complete: job_local_0022
12/07/19 11:38:49 INFO mapred.JobClient: Counters: 8
12/07/19 11:38:49 INFO mapred.JobClient:   File Output Format Counters 
12/07/19 11:38:49 INFO mapred.JobClient:     Bytes Written=2247793
12/07/19 11:38:49 INFO mapred.JobClient:   File Input Format Counters 
12/07/19 11:38:49 INFO mapred.JobClient:     Bytes Read=1920337
12/07/19 11:38:49 INFO mapred.JobClient:   FileSystemCounters
12/07/19 11:38:49 INFO mapred.JobClient:     FILE_BYTES_READ=1342812616
12/07/19 11:38:49 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1326092302
12/07/19 11:38:49 INFO mapred.JobClient:   Map-Reduce Framework
12/07/19 11:38:49 INFO mapred.JobClient:     Map input records=2772
12/07/19 11:38:49 INFO mapred.JobClient:     Spilled Records=0
12/07/19 11:38:49 INFO mapred.JobClient:     SPLIT_RAW_BYTES=140
12/07/19 11:38:49 INFO mapred.JobClient:     Map output records=2772
12/07/19 11:38:49 INFO driver.MahoutDriver: Program took 4089950 ms (Minutes: 68.16583333333334)

$MAHOUT_HOME/mahout cvb -i /home/seth/Scripted/mahout_data/vectors/vectors/vectors-for-cvb/ -o /home/seth/Scripted/mahout_data/clusters/ -ow -k 90 -dt /home/seth/Scripted/mahout_data/distributions -dict /home/seth/Scripted/mahout_data/vectors/vectors/dictionary.file-0 -mt /home/seth/Scripted/mahout_data/temp/ -x 20 -cd 0.05


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-1047) CVB hangs after completion

Posted by "Andy Schlaikjer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419289#comment-13419289 ] 

Andy Schlaikjer commented on MAHOUT-1047:
-----------------------------------------

Hey Seth, thanks for the report. This is somewhat expected behavior, though it is definitely a bug. There is a thread pool within the CVB implementation whose threads are not daemon threads and aren't halted cleanly at the end of a run, so the JVM keeps running after the main thread has terminated.
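
To illustrate the mechanism, here is a minimal standalone sketch. It is not the Mahout CVB code; the class and method names (PoolShutdownSketch, daemonFactory, runAndShutdown) are hypothetical and use only java.util.concurrent. It shows the two usual remedies for a pool that keeps the JVM alive: shut the pool down explicitly when the work is done, or build it from a factory that creates daemon threads.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

public class PoolShutdownSketch {

    // Hypothetical factory: marks each worker as a daemon thread,
    // so idle workers cannot keep the JVM alive after main returns.
    static ThreadFactory daemonFactory() {
        return runnable -> {
            Thread t = new Thread(runnable);
            t.setDaemon(true);
            return t;
        };
    }

    // Submits one placeholder task, then always shuts the pool down.
    // Returns true if all workers stopped within the timeout.
    static boolean runAndShutdown(ExecutorService pool) {
        Runnable work = () -> { };  // stand-in for the real training work
        try {
            pool.submit(work);
        } finally {
            pool.shutdown();        // stop accepting tasks, let queued ones finish
        }
        try {
            return pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Executors.newFixedThreadPool creates non-daemon workers by default.
        // Without the shutdown() call in runAndShutdown, they would stay
        // parked waiting for tasks and the JVM would not exit.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        System.out.println("pool terminated: " + runAndShutdown(pool));

        // Alternative: a pool built with daemonFactory() does not block exit,
        // though an explicit shutdown is still the cleaner option.
        ExecutorService daemonPool = Executors.newFixedThreadPool(4, daemonFactory());
        daemonPool.submit(() -> { });
    }
}
```

Explicit shutdown in a finally block is generally the safer of the two, since daemon threads can be stopped abruptly in the middle of a task when the JVM exits.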

                
> CVB hangs after completion
> --------------------------
>
>                 Key: MAHOUT-1047
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1047
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.7
>         Environment: Ubuntu
>            Reporter: seth boyles
>            Priority: Minor
>              Labels: cvb, lda
>             Fix For: 0.7, 0.8
