Posted to user@spark.apache.org by anbu <an...@gmail.com> on 2018/04/04 08:13:18 UTC

NumberFormatException while reading and split the file

1st Approach:

error: value split is not a member of org.apache.spark.sql.Row

val newRdd = spark.read.text("/xyz/a/b/filename").rdd

val anotherRDD = newRdd
    .map(ip => ip.split("\\|"))
    .map(ip => Row(
      if (ip(0).isEmpty()) null.asInstanceOf[Int] else ip(0).toInt,
      ip(1), ip(2), ip(3), ip(4), ip(5)))
I'm getting the error on the line 'ip.split("\\|")': value split is not a
member of org.apache.spark.sql.Row.
 
 
Another approach:
 
error: java.lang.NumberFormatException: For input string: ""
 
 
 val newRdd = spark.read.text("/xyz/a/b/filename").rdd

val anotherRDD = newRdd
    .map(ip => ip.toString().split("\\|"))
    .map(ip => Row(
      if (ip(0).isEmpty()) null.asInstanceOf[Int] else ip(0).toInt,
      ip(1), ip(2), ip(3), ip(4), ip(5)))

anotherRDD.collect().foreach(println)
In this case I'm getting the error java.lang.NumberFormatException: For
input string: ""



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: NumberFormatException while reading and split the file

Posted by utkarsh_deep <ut...@gmail.com>.
Response to the 1st approach:

spark.read.text("/xyz/a/b/filename") returns a DataFrame, and calling .rdd
on it gives you an RDD[Row]. So inside map your function receives a Row as
its parameter (ip in your code), and you must use Row's own methods to
access its fields. The error message says it clearly: "value split is not a
member of org.apache.spark.sql.Row". Row has no split method, which is why
the compiler rejects the call.
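
A minimal sketch of the fix, assuming a pipe-delimited text file with six
fields as in the original post (the path is the poster's; getString(0) pulls
out the single "value" column that spark.read.text produces):

```scala
import org.apache.spark.sql.{Row, SparkSession}

val spark = SparkSession.builder().appName("split-example").getOrCreate()

// spark.read.text produces a DataFrame with one string column ("value"),
// so each Row holds the raw line in field 0.
val newRdd = spark.read.text("/xyz/a/b/filename").rdd

val anotherRDD = newRdd
  .map(row => row.getString(0).split("\\|", -1)) // limit -1 keeps trailing empty fields
  .map(ip => Row(
    if (ip(0).isEmpty) null else ip(0).toInt,    // plain null keeps a real null in the Row
    ip(1), ip(2), ip(3), ip(4), ip(5)))
```

Note the -1 limit: the default split("\\|") silently drops trailing empty
fields, which would make ip(5) throw ArrayIndexOutOfBoundsException on lines
ending in "|".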



Response to the 2nd approach:

There is something fishy there. The if condition ip(0).isEmpty() should
catch the case where the first field is an empty string, so when it is not
actually empty ip(0).toInt shouldn't fail. But you also need to make sure
ip(0) is not just some random string which can't be converted to an Int;
note in particular that Row.toString wraps the row's contents in square
brackets, so the first token of ip.toString().split("\\|") carries a leading
"[" and will not parse as an Int either.
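
A defensive sketch along those lines, assuming the goal is a null first
column whenever the field is empty or non-numeric (the sample lines here are
hypothetical; scala.util.Try absorbs the NumberFormatException, and Int.box
is needed because orNull does not compile on an Option of primitive Int):

```scala
import scala.util.Try
import org.apache.spark.sql.Row

// Hypothetical sample lines; the second has an empty first field.
val lines = Seq("1|a|b|c|d|e", "|a|b|c|d|e")

val rows = lines
  .map(_.split("\\|", -1))                    // limit -1 keeps trailing empty fields
  .map { ip =>
    // Try(...).toOption is None for "" or any non-numeric string, so orNull
    // yields a real null. Beware: null.asInstanceOf[Int] evaluates to 0,
    // not null, so the original code never stored a null anyway.
    val first = Try(ip(0).trim.toInt).toOption.map(Int.box).orNull
    Row(first, ip(1), ip(2), ip(3), ip(4), ip(5))
  }
```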


