Posted to issues@flink.apache.org by "Jonathan Hasenburg (JIRA)" <ji...@apache.org> on 2015/08/26 14:33:46 UTC

[jira] [Comment Edited] (FLINK-2189) NullPointerException in MutableHashTable

    [ https://issues.apache.org/jira/browse/FLINK-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713031#comment-14713031 ] 

Jonathan Hasenburg edited comment on FLINK-2189 at 8/26/15 12:33 PM:
---------------------------------------------------------------------

Hi, I encountered the problem while testing my implementation of MultinomialNaiveBayes (https://github.com/JonathanH5/flink/blob/newNaiveB/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/classification/MultinomialNaiveBayes.scala). I ran the experiments on the IBM Power cluster at DIMA, which currently runs a 0.9.0-SNAPSHOT version of Flink. The last rebase of the Naive Bayes implementation happened on July 7th; I will rebase again after my experiments are done, to prevent possible problems. Processing 20 MB of data worked, while 250 MB did not.

I am sure the join in line 368 caused the problem.
Here is the complete stack trace:
{code}
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.
	at org.apache.flink.client.program.Client.run(Client.java:412)
	at org.apache.flink.client.program.Client.run(Client.java:355)
	at org.apache.flink.client.program.Client.run(Client.java:348)
	at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:63)
	at org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:590)
	at de.Hasenburg.Job$.main(Job.scala:137)
	at de.Hasenburg.Job.main(Job.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
	at java.lang.reflect.Method.invoke(Method.java:619)
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
	at org.apache.flink.client.program.Client.run(Client.java:315)
	at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:584)
	at org.apache.flink.client.CliFrontend.run(CliFrontend.java:290)
	at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:880)
	at org.apache.flink.client.CliFrontend.main(CliFrontend.java:922)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:315)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:36)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:29)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:29)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NullPointerException
	at org.apache.flink.runtime.operators.hash.HashPartition.spillPartition(HashPartition.java:310)
	at org.apache.flink.runtime.operators.hash.MutableHashTable.spillPartition(MutableHashTable.java:1094)
	at org.apache.flink.runtime.operators.hash.MutableHashTable.insertBucketEntry(MutableHashTable.java:927)
	at org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:783)
	at org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:508)
	at org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:544)
	at org.apache.flink.runtime.operators.hash.NonReusingBuildSecondHashMatchIterator.callWithNextKey(NonReusingBuildSecondHashMatchIterator.java:102)
	at org.apache.flink.runtime.operators.MatchDriver.run(MatchDriver.java:173)
	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
	at java.lang.Thread.run(Thread.java:853)
{code}
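For anyone trying to reproduce this outside the Naive Bayes job: the trace goes through {{MutableHashTable.spillPartition}}, a code path that is only exercised once the build side of a hash join no longer fits in memory. Below is a minimal, self-contained sketch of that shape of join; the dataset names, sizes, and key positions are purely illustrative and are not taken from line 368 of MultinomialNaiveBayes.scala.
{code}
import org.apache.flink.api.scala._

object JoinSpillSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Hypothetical stand-ins for the two join inputs; with enough rows the
    // build side exceeds the available managed memory and must spill.
    val left: DataSet[(String, Int)] =
      env.fromCollection((1 to 1000000).map(i => (s"word$i", i)))
    val right: DataSet[(String, Double)] =
      env.fromCollection((1 to 1000000).map(i => (s"word$i", i.toDouble)))

    // Equi-join on the first tuple field; spilling during the build phase
    // is where the reported NullPointerException surfaces.
    val joined = left.join(right).where(0).equalTo(0)

    joined.first(10).print
    env.execute("join spill sketch")
  }
}
{code}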



> NullPointerException in MutableHashTable
> ----------------------------------------
>
>                 Key: FLINK-2189
>                 URL: https://issues.apache.org/jira/browse/FLINK-2189
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>            Reporter: Till Rohrmann
>
> [~Felix Neutatz] reported a {{NullPointerException}} in the {{MutableHashTable}} when running the {{ALS}} algorithm. The stack trace is the following:
> {code}
> Caused by: java.lang.NullPointerException
> 	at org.apache.flink.runtime.operators.hash.HashPartition.spillPartition(HashPartition.java:310)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.spillPartition(MutableHashTable.java:1094)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.insertBucketEntry(MutableHashTable.java:927)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:783)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:508)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:544)
> 	at org.apache.flink.runtime.operators.hash.NonReusingBuildFirstHashMatchIterator.callWithNextKey(NonReusingBuildFirstHashMatchIterator.java:104)
> 	at org.apache.flink.runtime.operators.MatchDriver.run(MatchDriver.java:173)
> 	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
> 	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> He produced this error on his local machine with the following code:
> {code}
> implicit val env = ExecutionEnvironment.getExecutionEnvironment
> val links = MovieLensUtils.readLinks(movieLensDir + "links.csv")
> val movies = MovieLensUtils.readMovies(movieLensDir + "movies.csv")
> val ratings = MovieLensUtils.readRatings(movieLensDir + "ratings.csv")
> val tags = MovieLensUtils.readTags(movieLensDir + "tags.csv")
>           
> val ratingMatrix =  ratings.map { r => (r.userId.toInt, r.movieId.toInt, r.rating) }
> val testMatrix =  ratings.map { r => (r.userId.toInt, r.movieId.toInt) }
> val als = ALS()
>    .setIterations(10)
>    .setNumFactors(10)
>    .setBlocks(150) 
>      
> als.fit(ratingMatrix)
> val result = als.predict(testMatrix)
>      
> result.print
> val risk = als.empiricalRisk(ratingMatrix).collect().apply(0)
> println("Empirical risk: " + risk) 
> env.execute()
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)