Posted to issues@spark.apache.org by "Yanbo Liang (JIRA)" <ji...@apache.org> on 2015/12/16 11:15:46 UTC

[jira] [Commented] (SPARK-12350) VectorAssembler#transform() initially throws an exception

    [ https://issues.apache.org/jira/browse/SPARK-12350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059788#comment-15059788 ] 

Yanbo Liang commented on SPARK-12350:
-------------------------------------

I can reproduce this issue, but it is not caused by ML, because the transformed DataFrame is still output at the end of the error log. And if we do not use spark-shell to run this program, it works well.
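To illustrate the point above, here is a minimal sketch of the same reproduction packaged as a standalone application (run via spark-submit rather than spark-shell). The object name, app name, and `local[2]` master are illustrative choices, not from the original report; it assumes the Spark 1.x `SQLContext` API used at the time of this issue.

{code}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.ml.feature.VectorAssembler

// Hypothetical standalone app reproducing the steps from the report;
// outside spark-shell the class-serving error does not appear.
object VectorAssemblerRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("SPARK-12350").setMaster("local[2]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._  // needed for .toDF outside the shell

    val df = sc.parallelize(List((1, 2), (3, 4))).toDF
    val assembler = new VectorAssembler()
      .setInputCols(Array("_1", "_2"))
      .setOutputCol("features")

    assembler.transform(df).show()
    sc.stop()
  }
}
{code}

This suggests the exception is tied to how spark-shell serves REPL-compiled classes to executors, not to VectorAssembler itself.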

> VectorAssembler#transform() initially throws an exception
> ---------------------------------------------------------
>
>                 Key: SPARK-12350
>                 URL: https://issues.apache.org/jira/browse/SPARK-12350
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>         Environment: sparkShell command from sbt
>            Reporter: Jakob Odersky
>
> Calling VectorAssembler.transform() initially throws an exception, subsequent calls work.
> h3. Steps to reproduce
> In spark-shell,
> 1. Create a dummy dataframe and define an assembler
> {code}
> import org.apache.spark.ml.feature.VectorAssembler
> val df = sc.parallelize(List((1,2), (3,4))).toDF
> val assembler = new VectorAssembler().setInputCols(Array("_1", "_2")).setOutputCol("features")
> {code}
> 2. Run
> {code}
> assembler.transform(df).show
> {code}
> Initially the following exception is thrown:
> {code}
> 15/12/15 16:20:19 ERROR TransportRequestHandler: Error opening stream /classes/org/apache/spark/sql/catalyst/expressions/Object.class for request from /9.72.139.102:60610
> java.lang.IllegalArgumentException: requirement failed: File not found: /classes/org/apache/spark/sql/catalyst/expressions/Object.class
> 	at scala.Predef$.require(Predef.scala:233)
> 	at org.apache.spark.rpc.netty.NettyStreamManager.openStream(NettyStreamManager.scala:60)
> 	at org.apache.spark.network.server.TransportRequestHandler.processStreamRequest(TransportRequestHandler.java:136)
> 	at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:106)
> 	at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
> 	at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
> 	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
> 	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
> 	at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
> 	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> Subsequent calls work:
> {code}
> +---+---+---------+
> | _1| _2| features|
> +---+---+---------+
> |  1|  2|[1.0,2.0]|
> |  3|  4|[3.0,4.0]|
> +---+---+---------+
> {code}
> It seems as though there is some internal state that is not initialized.
> [~iyounus] originally found this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org