Posted to issues@spark.apache.org by "Arun (JIRA)" <ji...@apache.org> on 2015/06/17 13:48:00 UTC

[jira] [Created] (SPARK-8409) Cannot read .csv files using read.df() in SparkR (Spark 1.4), e.g. mydf<-read.df(sqlContext, "/home/esten/ami/usaf.json", source="json", header="false")

Arun created SPARK-8409:
---------------------------

             Summary: Cannot read .csv files using read.df() in SparkR (Spark 1.4), e.g. mydf<-read.df(sqlContext, "/home/esten/ami/usaf.json", source="json", header="false")
                 Key: SPARK-8409
                 URL: https://issues.apache.org/jira/browse/SPARK-8409
             Project: Spark
          Issue Type: Bug
          Components: Build
    Affects Versions: 1.4.0
         Environment: sparkR API
            Reporter: Arun
            Priority: Critical


Hi,
In the SparkR shell, I invoke:
> mydf<-read.df(sqlContext, "/home/esten/ami/usaf.json", source="json", header="false")
I have also tried other file types (csv, txt); all of them fail.

Response: "ERROR RBackendHandler: load on 1 failed"
The full response is below:
15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(177600) called with curMem=0, maxMem=278302556 
15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 173.4 KB, free 265.2 MB) 
15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(16545) called with curMem=177600, maxMem=278302556 
15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 16.2 KB, free 265.2 MB) 
15/06/16 08:09:13 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:37142 (size: 16.2 KB, free: 265.4 MB) 
15/06/16 08:09:13 INFO SparkContext: Created broadcast 0 from load at NativeMethodAccessorImpl.java:-2 
15/06/16 08:09:16 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded. 
15/06/16 08:09:17 ERROR RBackendHandler: load on 1 failed 
java.lang.reflect.InvocationTargetException 
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
        at java.lang.reflect.Method.invoke(Method.java:606) 
        at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127) 
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74) 
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36) 
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) 
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) 
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) 
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) 
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) 
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) 
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163) 
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) 
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) 
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) 
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130) 
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) 
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) 
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) 
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) 
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) 
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) 
        at java.lang.Thread.run(Thread.java:745) 
Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json 
        at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285) 
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228) 
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313) 
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207) 
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) 
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) 
        at scala.Option.getOrElse(Option.scala:120) 
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) 
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) 
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) 
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) 
        at scala.Option.getOrElse(Option.scala:120) 
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) 
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32) 
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219) 
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217) 
        at scala.Option.getOrElse(Option.scala:120) 
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) 
        at org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1069) 
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148) 
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109) 
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:286) 
        at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1067) 
        at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58) 
        at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139) 
        at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:138) 
        at scala.Option.getOrElse(Option.scala:120) 
        at org.apache.spark.sql.json.JSONRelation.schema$lzycompute(JSONRelation.scala:137) 
        at org.apache.spark.sql.json.JSONRelation.schema(JSONRelation.scala:137) 
        at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30) 
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120) 
        at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1230) 
        ... 25 more 
Error: returnStatus == 0 is not TRUE
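
Note on the trace above: the "Caused by" line shows that the scheme-less path /home/esten/ami/usaf.json was looked up on HDFS (hdfs://smalldata13.hdp:8020), where it does not exist. If the file actually lives on the local filesystem of the machine running SparkR (an assumption, the log does not confirm it), one possible workaround is to pass an explicit file:// URI; otherwise the file would need to be copied into HDFS first. The following is a minimal sketch of the first option, not a confirmed fix. The helper to_local_uri is hypothetical and not part of SparkR, and the header option is left out because it appears to be a CSV-source option rather than a JSON one.

```r
# Minimal sketch: build an explicit local-filesystem URI so the path is not
# resolved against the cluster's default filesystem (HDFS in the log above).

# Hypothetical helper (not part of SparkR): prefix an absolute path with file://
to_local_uri <- function(path) {
  stopifnot(is.character(path), length(path) == 1, substr(path, 1, 1) == "/")
  paste0("file://", path)
}

local_path <- to_local_uri("/home/esten/ami/usaf.json")
print(local_path)

# Only attempt the load inside a SparkR 1.4 shell, where `sqlContext` is predefined.
if (exists("sqlContext")) {
  mydf <- read.df(sqlContext, local_path, source = "json")
  printSchema(mydf)
}
```

For the .csv case, Spark 1.4 has no built-in CSV source as far as I know; the external spark-csv package (source = "com.databricks.spark.csv") is commonly used for that, which again is an assumption about the setup rather than something this log shows.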


