Posted to user@spark.apache.org by Thijs Haarhuis <th...@oranggo.com> on 2019/02/13 14:01:42 UTC
SparkR + binary type + how to get value
Hi all,
Does anybody have experience accessing the data in a column of binary type in a Spark Data Frame in R?
I have a Spark Data Frame with a column of binary type. I want to access this data and process it.
In my case I collect the Spark Data Frame to an R data frame and access the first row.
When I print this row to the console, it prints all the hex values correctly.
However, when I access the column, it reports a list of 1. When I print the type of the child element, it again reports a list.
I expected this value to be of raw type.
Does anybody have experience with this?
Thanks
Thijs
Re: SparkR + binary type + how to get value
Posted by Felix Cheung <fe...@hotmail.com>.
From the second image it looks like there is a protocol mismatch. I’d check whether the SparkR package running on the Livy machine matches the Spark Java release.
But in any case this seems more of an issue with the Livy config. I’d suggest checking with the community there:
________________________________
From: Thijs Haarhuis <th...@oranggo.com>
Sent: Tuesday, February 19, 2019 5:28 AM
To: Felix Cheung; user@spark.apache.org
Subject: Re: SparkR + binary type + how to get value
Hi Felix,
Thanks. I got it working now by using the unlist function.
I have another question that maybe you can help me with, since I saw your name come up in connection with the spark.lapply function.
I am using Apache Livy and am having trouble using this function; I even filed a Jira ticket for it at:
https://jira.apache.org/jira/browse/LIVY-558
When I call the spark.lapply function it reports that SparkR is not initialized.
I have looked into the spark.lapply function and it seems there is no spark context.
Any idea how I can debug this?
I hope you can help.
Regards,
Thijs
________________________________
From: Felix Cheung <fe...@hotmail.com>
Sent: Sunday, February 17, 2019 7:18 PM
To: Thijs Haarhuis; user@spark.apache.org
Subject: Re: SparkR + binary type + how to get value
A byte buffer in R is the raw vector type, so it seems to be working as expected. What do you have in the raw bytes? You could convert them into other types or access individual bytes directly.
https://stat.ethz.ch/R-manual/R-devel/library/base/html/raw.html
________________________________
From: Thijs Haarhuis <th...@oranggo.com>
Sent: Thursday, February 14, 2019 4:01 AM
To: Felix Cheung; user@spark.apache.org
Subject: Re: SparkR + binary type + how to get value
Hi Felix,
Sure..
I have the following code:
printSchema(results)
cat("\n\n\n")
firstRow <- first(results)
value <- firstRow$value
cat(paste0("Value Type: '",typeof(value),"'\n\n\n"))
cat(paste0("Value: '",value,"'\n\n\n"))
results is a Spark Data Frame here.
When I run this code the following is printed to console:
[inline image of the console output, not preserved in this plain-text archive]
You can see there is only a single column in this SDF, of type binary.
When I collect this value and print its type, it reports a list.
Any idea how to get the actual value, or how to process the individual bytes?
Thanks
Thijs
________________________________
From: Felix Cheung <fe...@hotmail.com>
Sent: Thursday, February 14, 2019 5:31 AM
To: Thijs Haarhuis; user@spark.apache.org
Subject: Re: SparkR + binary type + how to get value
Please share your code
Re: SparkR + binary type + how to get value
Posted by Thijs Haarhuis <th...@oranggo.com>.
Hi Felix,
Thanks. I got it working now by using the unlist function.
I have another question that maybe you can help me with, since I saw your name come up in connection with the spark.lapply function.
I am using Apache Livy and am having trouble using this function; I even filed a Jira ticket for it at:
https://jira.apache.org/jira/browse/LIVY-558
When I call the spark.lapply function it reports that SparkR is not initialized.
I have looked into the spark.lapply function and it seems there is no spark context.
Any idea how I can debug this?
I hope you can help.
Regards,
Thijs
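For readers wondering what the unlist fix mentioned in this message might look like, here is a minimal base-R sketch. It does not use Spark: the nested list below is a hypothetical stand-in for the collected value described in the thread (a list of 1 whose child element is again a list), and the byte values are made up for illustration.

```r
# Hypothetical stand-in for the collected value described in the thread:
# a list of length 1 whose child element is itself a list holding the bytes.
value <- list(list(as.raw(c(0x48, 0x65, 0x6c, 0x6c, 0x6f))))

outer_type <- typeof(value)       # "list"
inner_type <- typeof(value[[1]])  # "list"

# unlist() flattens the nesting; because every leaf is raw,
# the result is a plain raw vector.
bytes <- unlist(value)
bytes_type <- typeof(bytes)       # "raw"
```

The actual shape of the value returned by SparkR may differ between versions, so inspecting it with str(value) first is probably a good idea.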
Re: SparkR + binary type + how to get value
Posted by Felix Cheung <fe...@hotmail.com>.
A byte buffer in R is the raw vector type, so it seems to be working as expected. What do you have in the raw bytes? You could convert them into other types or access individual bytes directly.
https://stat.ethz.ch/R-manual/R-devel/library/base/html/raw.html
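As a rough illustration of the suggestion above, the following base-R sketch shows a few ways to work with a raw vector once you have one. The byte values are invented for the example, since the thread does not say what the binary column actually contains, and which conversion makes sense depends on how the bytes were encoded.

```r
# Illustrative bytes only; the real column contents are not shown in the thread.
bytes <- as.raw(c(0x48, 0x69, 0x21))

# Access an individual byte and convert it to an integer.
first_byte <- bytes[1]
first_int <- as.integer(first_byte)   # 72

# Convert every byte to an integer in the range 0-255.
ints <- as.integer(bytes)             # 72 105 33

# Interpret the bytes as text, assuming they hold character data.
text <- rawToChar(bytes)              # "Hi!"

# Decode a fixed-width binary value, here a 4-byte little-endian integer.
num <- readBin(as.raw(c(0x2a, 0x00, 0x00, 0x00)),
               what = "integer", size = 4, endian = "little")  # 42
```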
Re: SparkR + binary type + how to get value
Posted by Thijs Haarhuis <th...@oranggo.com>.
Hi Felix,
Sure..
I have the following code:
printSchema(results)
cat("\n\n\n")
firstRow <- first(results)
value <- firstRow$value
cat(paste0("Value Type: '",typeof(value),"'\n\n\n"))
cat(paste0("Value: '",value,"'\n\n\n"))
results is a Spark Data Frame here.
When I run this code the following is printed to console:
[inline image of the console output, not preserved in this plain-text archive]
You can see there is only a single column in this SDF, of type binary.
When I collect this value and print its type, it reports a list.
Any idea how to get the actual value, or how to process the individual bytes?
Thanks
Thijs
Re: SparkR + binary type + how to get value
Posted by Felix Cheung <fe...@hotmail.com>.
Please share your code