Posted to user@spark.apache.org by Sree Eedupuganti <sr...@inndata.in> on 2016/08/29 09:27:07 UTC

How to access the WrappedArray

Here is the snippet of code :

// The entry point into all functionality in Spark is the SparkSession
// class. To create a basic SparkSession, just use SparkSession.builder():

SparkSession spark = SparkSession.builder()
        .appName("Java Spark SQL Example")
        .master("local")
        .getOrCreate();

// With a SparkSession, applications can create DataFrames from an
// existing RDD, from a Hive table, or from Spark data sources.

Dataset<Row> rows_salaries = spark.read()
        .json("/Users/sreeharsha/Downloads/rows_salaries.json");

// Register the DataFrame as a SQL temporary view
rows_salaries.createOrReplaceTempView("salaries");

// SQL statements can be run by using the sql methods provided by spark
List<Row> df = spark.sql("select * from salaries").collectAsList();

for (Row r : df) {
    if (r.get(0) != null) {
        System.out.println(r.get(0).toString());
    }
}


Actual Output:

WrappedArray(WrappedArray(1, B9B42DE1-E810-4489-9735-B365A47A4012, 1,
1467358044, 697390, 1467358044, 697390, null, Aaron,Patricia G,
Facilities/Office Services II, A03031, OED-Employment Dev (031),
1979-10-24T00:00:00, 56705.00, 54135.44))

Expected Output:

I need the individual elements from the WrappedArray.

You can find the .json file attached below.
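For what it's worth, the nested WrappedArray in the output is just Scala's view of a JSON array of arrays: the file apparently has one outer array holding one inner array per record, so r.get(0) returns everything at once. In Spark's Java API, Row.getList(int) exposes such an array column as a java.util.List, and the positional access itself is plain nested-list indexing. A minimal stand-alone sketch of that access pattern (the sample values are copied from the output above; the index chosen is illustrative):

```java
import java.util.Arrays;
import java.util.List;

public class NestedArrayDemo {
    public static void main(String[] args) {
        // The outer list corresponds to the outer WrappedArray; each
        // inner list is one salary record from the JSON file.
        List<List<Object>> data = Arrays.asList(
                Arrays.asList(1, "B9B42DE1-E810-4489-9735-B365A47A4012",
                        "Aaron,Patricia G", "Facilities/Office Services II",
                        "56705.00", "54135.44"));

        for (List<Object> record : data) {
            // Elements are reached by position; index 2 is the name here.
            System.out.println(record.get(2));
        }
    }
}
```

In the Spark code itself, replacing `select * from salaries` with something like `select explode(data) from salaries` would yield one Row per record instead of one giant nested array; the column name `data` is a guess at the file's top-level array field, not confirmed from the attachment.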

Re: How to access the WrappedArray

Posted by Bedrytski Aliaksandr <sp...@bedryt.ski>.
Hi,

It depends on how you want the "elements from the WrappedArray" represented.
Is it a List[Any], or do you need a dedicated case class for each line? Or
do you want to create a DataFrame that holds the type of each column?

Will the json file always be < 100mb, so that you can pre-process it with a
*sed* command?
If so, I would recommend transforming this file into a CSV (a more
structured file format) using bash tools, then reading it with Spark while
casting the column types to the expected ones (or keeping the inferred
types if they are sufficient).

Or (if the file is expected to be larger than bash tools can handle) you
could iterate over the resulting WrappedArray and create a case class
for each line.

PS: I wonder where the *meta* object from the json goes.

--
  Bedrytski Aliaksandr
  spark@bedryt.ski



On Mon, Aug 29, 2016, at 11:27, Sreeharsha wrote:
> [original message quoted in full above; trimmed here]
>   *rows_salaries.json* (4M) Download Attachment[1]
>


Links:

  1. http://apache-spark-user-list.1001560.n3.nabble.com/attachment/27615/0/rows_salaries.json

Re: How to access the WrappedArray

Posted by Denis Bolshakov <bo...@gmail.com>.
Hello,

Not sure that it will help, but I would do the following:

1. Create a class that matches your json schema (a case class in Scala; a
bean in Java).
2. Change the following line:
old:
Dataset<Row> rows_salaries = spark.read().json("/Users/
sreeharsha/Downloads/rows_salaries.json");
new:
Dataset<MyCaseClass> rows_salaries = spark.read().json("/Users/
sreeharsha/Downloads/rows_salaries.json").as(Encoders.bean(MyCaseClass.class));
3. Fix the remaining compile errors.
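Since Java has no case classes, the usual equivalent is a plain bean class passed to Encoders.bean(...). A minimal sketch of such a bean; the field names below are illustrative guesses, not the JSON file's actual schema:

```java
// Hypothetical bean for use with Encoders.bean(Salary.class); the field
// names are placeholders, since the real schema is in the attachment.
public class Salary {
    private String name;
    private String jobTitle;
    private double annualSalary;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getJobTitle() { return jobTitle; }
    public void setJobTitle(String jobTitle) { this.jobTitle = jobTitle; }
    public double getAnnualSalary() { return annualSalary; }
    public void setAnnualSalary(double annualSalary) { this.annualSalary = annualSalary; }
}
```

With such a class in place, the read becomes spark.read().json(path).as(Encoders.bean(Salary.class)), giving a typed Dataset<Salary> instead of untyped Rows.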

BR,
Denis

On 29 August 2016 at 12:27, Sree Eedupuganti <sr...@inndata.in> wrote:

> [original message quoted in full above; trimmed here]



-- 
//with Best Regards
--Denis Bolshakov
e-mail: bolshakov.denis@gmail.com