Posted to user@spark.apache.org by Nathan Kronenfeld <nk...@uncharted.software> on 2016/10/26 16:11:00 UTC

CSV conversion

We are finally converting from Spark 1.6 to Spark 2.0, and are finding one
barrier we can't get past.

In the past, we converted CSV RDDs (not files) to DataFrames using the
Databricks spark-csv library: we created a CsvParser and called
parser.csvRdd.
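
For concreteness, the old pattern looked roughly like the sketch below. It
assumes spark-csv 1.x is on the classpath with Spark 1.6. The helper name
toDataFrame and the header setting are illustrative assumptions, not our
exact code.

```scala
import com.databricks.spark.csv.CsvParser
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}

// Sketch of the Spark 1.6-era approach (assumes spark-csv 1.x).
// `toDataFrame` is a hypothetical helper name used only for illustration.
def toDataFrame(sqlContext: SQLContext, lines: RDD[String]): DataFrame =
  new CsvParser()
    .withUseHeader(true)        // assumption: the first line holds column names
    .csvRdd(sqlContext, lines)  // parse an already-loaded RDD of CSV lines
```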

The current incarnation of spark-csv seems to expose only a CSV file format,
and the only entry points we can find are for reading files.

What is the modern pattern for converting an already-read RDD of CSV lines
into a DataFrame?

Thanks,
                    Nathan Kronenfeld
                    Uncharted Software