Posted to user@spark.apache.org by upendra 1991 <up...@yahoo.com.INVALID> on 2017/06/05 21:15:36 UTC

Adding header to an rdd before saving to text file

I am reading a CSV file (it has the headers header1, header2) and generating an RDD. After a few transformations I create a new RDD and finally write it to a text file.
What's the best way to take the header from the source file, add it to the RDD, and have it appear as the header of the new file? That is, when I write the RDD out using saveAsTextFile("newfile"), header1 and header2 should be present.

Thanks,
Upendra

Re: Adding header to an rdd before saving to text file

Posted by Irving Duran <ir...@gmail.com>.
Not the best option, but I've done this before. If you know the column
structure, you could manually write the header to the file before exporting.
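
One way to read the suggestion above, as a rough sketch rather than a
definitive recipe: take the header line from the source RDD, then emit it
ahead of the rows of the first partition just before calling
saveAsTextFile. The helper names (split_header, prepend_header,
save_with_header) are made up for illustration. The sketch assumes the RDD
holds plain text lines and that no data row is identical to the header.

```python
def split_header(lines_rdd):
    """Return (header, data_rdd) for an RDD of text lines whose first line is the header."""
    header = lines_rdd.first()
    # Drop every line equal to the header (assumes no data row matches it exactly).
    data = lines_rdd.filter(lambda line: line != header)
    return header, data


def prepend_header(header):
    """Build a function for mapPartitionsWithIndex that yields the header before partition 0."""
    def _with_header(index, rows):
        if index == 0:
            yield header
        for row in rows:
            yield row
    return _with_header


def save_with_header(rdd, header, path):
    # Only the first part file (part-00000) will start with the header line.
    rdd.mapPartitionsWithIndex(prepend_header(header)).saveAsTextFile(path)
```

With several partitions the output directory still contains several part
files, and only the first one carries the header. If a single file is
needed, calling coalesce(1) on the RDD before saving is a common
workaround, at the cost of writing everything through one task.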

On Tue, Jun 6, 2017 at 12:39 AM 颜发才(Yan Facai) <fa...@gmail.com> wrote:

> Hi, upendra.
> It will be easier to use DataFrame to read/save csv file with header, if
> you'd like.

Thank You,

Irving Duran

Re: Adding header to an rdd before saving to text file

Posted by "颜发才 (Yan Facai)" <fa...@gmail.com>.
Hi, Upendra.
It will be easier to use a DataFrame to read and save the CSV file with its
header, if you'd like.
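
A minimal sketch of the DataFrame route, assuming Spark 2.x or later and
an existing SparkSession passed in as spark. The function name
copy_csv_with_header and the paths are placeholders, and the
transformations are left out.

```python
def copy_csv_with_header(spark, src_path, dst_path):
    """Read a CSV with a header row and write it back out with the header."""
    # header=True: use the first line of the input as column names.
    df = spark.read.csv(src_path, header=True)
    # ... DataFrame transformations would go here ...
    # header=True: write the column names as the first line of the output.
    df.write.csv(dst_path, header=True)
    return df
```

As far as I know, the writer puts the header at the top of every part
file it produces, so the output directory can still hold more than one
file.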
