Posted to users@nifi.apache.org by "Hellstroem Christian (AE/PJ-ASC)" <Ch...@de.bosch.com> on 2016/01/29 14:43:13 UTC

Split header from CSV and use for JSON

Hi,

I'm just starting out with NiFi and I am ingesting a CSV file with headers. I would like to strip the header from each file and then use the header (with column names) to map each row to JSON, after which more processing is done. Right now I'm stuck on which processors to use and how to configure them.

The CSV files are pretty much as follows:

SOME GENERIC INFO
COL1,COL2,COL3
VAL11,VAL12,VAL13
VAL21,VAL22,VAL23
...

Basically I want to remove the first line with 'some generic info' that is different for each file, and I want the columns from line 2 to become fields in JSON, so that I end up with

{
  "COL1": "VAL11",
  "COL2": "VAL12",
  "COL3": "VAL13"
}

etc.
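
To make the intended mapping concrete, here is a minimal sketch of the same transformation in plain Python, outside NiFi. It is only an illustration of the desired result, not a NiFi configuration; the function name csv_rows_to_json and the skip_lines parameter are made up for this example, and it assumes exactly one generic info line precedes the header.

```python
import csv
import json


def csv_rows_to_json(text, skip_lines=1):
    """Drop the leading non-CSV line(s), then emit one JSON object per row.

    Hypothetical helper for illustration: skip_lines is the number of
    'generic info' lines before the header (assumed to be 1 here).
    Note: splitting on line breaks assumes no field contains an embedded newline.
    """
    lines = text.splitlines()[skip_lines:]
    # DictReader takes the first remaining line as the header, so the column
    # names never have to be listed by hand, even with a hundred columns.
    reader = csv.DictReader(lines)
    return [json.dumps(row) for row in reader]


sample = """SOME GENERIC INFO
COL1,COL2,COL3
VAL11,VAL12,VAL13
VAL21,VAL22,VAL23
"""

for record in csv_rows_to_json(sample):
    print(record)
```

Each printed line is one JSON object whose keys come from the header row, which is the per-row output described above.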

I currently have a GetFile processor for the CSV file, followed by an AttributesToJSON processor with default settings, so there is no manual mapping done anywhere. I checked out the CSV2JSON template by Hortonworks, but it is a very manual process, which would be extremely tedious as I have files with roughly a hundred columns.

I'm really stuck and any pointers would be greatly appreciated.

Also, a content preview feature would be really helpful...

Cheers,

Ian.

Re: Split header from CSV and use for JSON

Posted by Joe Witt <jo...@gmail.com>.
Ian

Makes sense.  Happy to help put together a template to see if we can
get you closer.  Working on some 0.5.0 release items at the moment but
will flag this to come back to.

Thanks
Joe
