Posted to user@spark.apache.org by Zakaria Hili <za...@gmail.com> on 2016/06/03 14:44:21 UTC

JavaDStream to Dataframe: Java

Hi,
I'm a newbie in Spark and I want to ask a simple question.
I have a JavaDStream that contains data selected from an SQL database,
something like (id, user, score ...),
and I want to convert the JavaDStream to a DataFrame.

How can I do this in Java?
Thank you

Re: JavaDStream to Dataframe: Java

Posted by Alexander Krasheninnikov <a....@corp.badoo.com>.
Hello!
When working with a JavaDStream, you can use the transform() or foreachRDD()
methods, which give you access to the underlying RDD of each batch.

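// Needs Spark 1.x imports: JavaRDD, Function2, JavaDStream, Time, Row, RowFactory,
// DataFrame, HiveContext, DataTypes, StructField, StructType.
JavaDStream<Row> dataFrameStream = ctx.textFileStream("source").transform(
        new Function2<JavaRDD<String>, Time, JavaRDD<Row>>() {
    @Override
    public JavaRDD<Row> call(JavaRDD<String> incomingRdd, Time batchTime) throws Exception {
        // Get an API for operating on DataFrames (named hiveCtx so it does not
        // shadow the outer streaming context, which is also called ctx)
        HiveContext hiveCtx = new HiveContext(incomingRdd.context());
        // Create a schema for the DataFrame (declare columns). The columns and
        // types below are only an example based on the question; adjust them.
        StructType schema = DataTypes.createStructType(new StructField[] {
            DataTypes.createStructField("id", DataTypes.LongType, false),
            DataTypes.createStructField("user", DataTypes.StringType, false),
            DataTypes.createStructField("score", DataTypes.DoubleType, false)
        });
        // Map incoming data into an RDD of DataFrame rows
        // (assumes comma-separated lines such as "1,alice,9.5")
        JavaRDD<Row> rowsRdd = incomingRdd.map(line -> {
            String[] f = line.split(",");
            return RowFactory.create(Long.parseLong(f[0]), f[1], Double.parseDouble(f[2]));
        });
        // DataFrame creation
        DataFrame df = hiveCtx.createDataFrame(rowsRdd, schema);

        // Here you may perform some operations on df, or return it as a stream

        return df.toJavaRDD();
    }
});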
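
The step that maps each incoming record to row values can be prototyped and
tested without a Spark cluster. Below is a minimal plain-Java sketch of that
step. The class name ScoreLineParser is hypothetical, and the comma delimiter
and the long/String/double column types are assumptions based on the
(id, user, score ...) layout mentioned in the question, not something the
thread confirms.

```java
// Plain-Java sketch: turn one delimited text line into typed column values.
// The column layout (id, user, score) comes from the question; the comma
// delimiter and the long/String/double types are assumptions.
final class ScoreLineParser {

    private ScoreLineParser() {}

    /**
     * Parses a line such as "1,alice,9.5" into {Long id, String user, Double score}.
     * Throws IllegalArgumentException when the line is malformed.
     */
    static Object[] parse(String line) {
        // limit -1 keeps trailing empty fields so the count check is reliable
        String[] parts = line.split(",", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "Expected 3 fields but got " + parts.length + ": " + line);
        }
        try {
            long id = Long.parseLong(parts[0].trim());
            String user = parts[1].trim();
            double score = Double.parseDouble(parts[2].trim());
            // Autoboxing yields Long, String, Double in column order
            return new Object[] { id, user, score };
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Malformed numeric field in: " + line, e);
        }
    }
}
```

Inside transform() or foreachRDD(), the resulting array could then be wrapped
with RowFactory.create(values) to build each Row, with the schema declaring the
same three columns in the same order.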


