Posted to issues@spark.apache.org by "Prasann modi (JIRA)" <ji...@apache.org> on 2016/10/19 06:03:58 UTC

[jira] [Comment Edited] (SPARK-17588) java.lang.AssertionError: assertion failed: lapack.dppsv returned 105. when running glm using gaussian link function.

    [ https://issues.apache.org/jira/browse/SPARK-17588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15587790#comment-15587790 ] 

Prasann modi edited comment on SPARK-17588 at 10/19/16 6:03 AM:
----------------------------------------------------------------

I'm getting the same issue. I'm using SparkR in RStudio (OS: Windows) and trying to build a binomial glm model, but I get the error below, and the code takes a very long time to run. The dataset contains 30 columns and 200,000 records. Please suggest how I can improve this code and resolve this error.
R Code:
# Set Spark Home
Sys.setenv(SPARK_HOME="C:/spark/spark-2.0.0-bin-hadoop2.7")
# set library path
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"),"R","lib"), .libPaths()))
Sys.setenv(JAVA_HOME="C:/Program Files/Java/jdk1.7.0_71")
# loading SparkR library
library(SparkR)
library(rJava)
sc <- sparkR.session(enableHiveSupport = FALSE, master = "local[*]", appName = "SparkR-Modi",
                     sparkConfig = list(spark.sql.warehouse.dir = "file:///c:/tmp/spark-warehouse"))
sqlContext <- sparkRSQL.init(sc)
spdf <- read.df(sqlContext, "C:/Users/prasann/Desktop/V/bigdata11.csv",
                source = "com.databricks.spark.csv", header = "true")
showDF(spdf)
# glm model
md <- glm(NP_OfferCurrentResponse ~., family = "binomial", data = spdf)

Error :
> md <- glm(NP_OfferCurrentResponse ~., family = "binomial", data = spdf)
Error in invokeJava(isStatic = TRUE, className, methodName, ...) : 
  java.lang.AssertionError: assertion failed: lapack.dppsv returned 226.
	at scala.Predef$.assert(Predef.scala:170)
	at org.apache.spark.mllib.linalg.CholeskyDecomposition$.solve(CholeskyDecomposition.scala:40)
	at org.apache.spark.ml.optim.WeightedLeastSquares.fit(WeightedLeastSquares.scala:140)
	at org.apache.spark.ml.regression.GeneralizedLinearRegression$FamilyAndLink.initialize(GeneralizedLinearRegression.scala:340)
	at org.apache.spark.ml.regression.GeneralizedLinearRegression.train(GeneralizedLinearRegression.scala:275)
	at org.apache.spark.ml.regression.GeneralizedLinearRegression.train(GeneralizedLinearRegression.scala:139)
	at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
	at org.apache.spark.ml.Predictor.fit(Predictor.scala:71)
	at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:149)
	at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:145)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.c
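
For reference, LAPACK's dppsv reports a positive return code i when the leading minor of order i of the packed matrix is not positive definite, which is why the assertion in CholeskyDecomposition.solve fires: the normal-equations matrix that WeightedLeastSquares builds from the features is effectively singular. Because the CSV above is read with header = "true" but without schema inference, every column arrives as a string and is one-hot encoded as a categorical factor, so constant columns, ID-like columns, or exactly collinear columns easily make the design rank deficient. Below is a minimal diagnostic sketch, not a fix: the inferSchema option, the distinct-count checks, and the idea of dropping flagged columns before the glm call are suggestions layered on top of the report above, not something taken from it.

R Code (sketch):
# Re-read the file so numeric columns come back typed instead of as strings;
# string columns are treated as categorical factors by glm and can easily
# make the normal-equations matrix singular.
spdf <- read.df(sqlContext, "C:/Users/prasann/Desktop/V/bigdata11.csv",
                source = "com.databricks.spark.csv",
                header = "true", inferSchema = "true")
printSchema(spdf)                 # check which columns are still strings

n <- nrow(spdf)
for (col in columns(spdf)) {
  # Exact distinct count per column; this can be slow on wide data.
  d <- count(distinct(select(spdf, col)))
  if (d <= 1) {
    cat("Constant column, consider dropping before fitting:", col, "\n")
  } else if (d == n) {
    cat("ID-like column (one level per row), consider dropping:", col, "\n")
  }
}

Dropping or recoding the flagged columns does not change how Spark 2.0 reacts to a singular matrix, but it removes the most common trigger for this assertion.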



> java.lang.AssertionError: assertion failed: lapack.dppsv returned 105. when running glm using gaussian link function.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17588
>                 URL: https://issues.apache.org/jira/browse/SPARK-17588
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, SparkR
>    Affects Versions: 2.0.0
>            Reporter: sai pavan kumar chitti
>            Assignee: Sean Owen
>            Priority: Minor
>
> hi,
> I am getting a java.lang.AssertionError when running glm with the gaussian link function on a dataset with 109 columns and 81,318,461 rows.
> Below is the call trace. Can someone please tell me what the issue is related to and how to go about resolving it? Is it because native acceleration is not working? I am also seeing the following warning messages.
> WARN netlib.BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
> WARN netlib.LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
> WARN netlib.LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
> 16/09/17 13:08:13 ERROR r.RBackendHandler: fit on org.apache.spark.ml.r.GeneralizedLinearRegressionWrapper failed
> Error in invokeJava(isStatic = TRUE, className, methodName, ...) : 
>   java.lang.AssertionError: assertion failed: lapack.dppsv returned 105.
>         at scala.Predef$.assert(Predef.scala:170)
>         at org.apache.spark.mllib.linalg.CholeskyDecomposition$.solve(CholeskyDecomposition.scala:40)
>         at org.apache.spark.ml.optim.WeightedLeastSquares.fit(WeightedLeastSquares.scala:140)
>         at org.apache.spark.ml.regression.GeneralizedLinearRegression.train(GeneralizedLinearRegression.scala:265)
>         at org.apache.spark.ml.regression.GeneralizedLinearRegression.train(GeneralizedLinearRegression.scala:139)
>         at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
>         at org.apache.spark.ml.Predictor.fit(Predictor.scala:71)
>         at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:149)
>         at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:145)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>         at scala.collection.IterableViewLike$Transformed$class.foreach(IterableViewLike.sc
> thanks,
> pavan.
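
The netlib warnings quoted above only mean that netlib-java could not load a native BLAS/LAPACK and fell back to its pure-Java F2J implementation; that costs speed, not correctness, so the assertion comes from the data (the matrix handed to dppsv is not positive definite), not from the missing native acceleration. The following sketch is an attempt, under stated assumptions, to hit the same code path on a small local SparkR 2.0.x session; the column names and values are invented for illustration, and any return code in the message would differ from the one reported here.

R Code (sketch):
# Assumes a SparkR 2.0.x session is already running (e.g. as set up in the
# comment above). "x2" is an exact multiple of "x1", so the normal-equations
# matrix built by WeightedLeastSquares is singular.
local_df <- data.frame(y  = c(1.0, 2.1, 2.9, 4.2),
                       x1 = c(1.0, 2.0, 3.0, 4.0),
                       x2 = c(2.0, 4.0, 6.0, 8.0))   # x2 = 2 * x1, exactly collinear
df <- createDataFrame(local_df)

# On an unpatched 2.0.x build this is expected to fail with an error of the
# form "java.lang.AssertionError: assertion failed: lapack.dppsv returned N."
model <- glm(y ~ x1 + x2, family = "gaussian", data = df)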


