Posted to users@nifi.apache.org by Shawn Weeks <sw...@weeksconsulting.us> on 2018/08/18 17:29:39 UTC

CSV Illegal Initial Character

I was building some example NiFi workflows from the CSV files at https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html, specifically nile.csv, and it appears that NiFi is including the quoted header, quotes and all, in the Avro schema it generates. This is an all-defaults CSVReader used with a JsonRecordSetWriter in ConvertRecord. Is this a bug or expected behavior? I'm using the latest 1.7 binaries from nifi.apache.org.

org.apache.avro.SchemaParseException: Illegal initial character: "Flood"
              at org.apache.avro.Schema.validateName(Schema.java:1147)
              at org.apache.avro.Schema.access$200(Schema.java:81)
              at org.apache.avro.Schema$Field.<init>(Schema.java:403)
              at org.apache.avro.Schema$Field.<init>(Schema.java:423)
              at org.apache.avro.Schema$Field.<init>(Schema.java:415)
              at org.apache.nifi.avro.AvroTypeUtil.buildAvroField(AvroTypeUtil.java:123)
              at org.apache.nifi.avro.AvroTypeUtil.buildAvroSchema(AvroTypeUtil.java:114)
              at org.apache.nifi.avro.AvroTypeUtil.extractAvroSchema(AvroTypeUtil.java:94)
              at org.apache.nifi.schema.access.WriteAvroSchemaAttributeStrategy.getAttributes(WriteAvroSchemaAttributeStrategy.java:58)
              at org.apache.nifi.json.WriteJsonResult.writeRecord(WriteJsonResult.java:137)
              at org.apache.nifi.serialization.AbstractRecordSetWriter.write(AbstractRecordSetWriter.java:59)
              at org.apache.nifi.processors.standard.AbstractRecordProcessor$1.process(AbstractRecordProcessor.java:122)
              at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2885)
              at org.apache.nifi.processors.standard.AbstractRecordProcessor.onTrigger(AbstractRecordProcessor.java:109)
              at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
              at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
              at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
              at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
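
For readers following along, here is a minimal sketch of why a field name that still carries its quotes is rejected. It assumes the naming rule from the Avro specification (a name starts with a letter or underscore and continues with letters, digits, or underscores); the exact check inside the Java Schema.validateName may differ slightly. The helper names is_valid_avro_name and strip_quotes are hypothetical and are not part of NiFi or Avro.

```python
import re

# Naming rule as described in the Avro specification (assumed here):
# first character is a letter or underscore, the rest are letters,
# digits, or underscores.
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name: str) -> bool:
    """Return True if the whole string satisfies the Avro name rule."""
    return AVRO_NAME.fullmatch(name) is not None


def strip_quotes(name: str) -> str:
    """Hypothetical helper: trim whitespace and one pair of enclosing double quotes."""
    name = name.strip()
    if len(name) >= 2 and name[0] == '"' and name[-1] == '"':
        return name[1:-1]
    return name


# The header value from the exception message, with its quotes still attached.
raw = '"Flood"'
print(is_valid_avro_name(raw))                # False: the name starts with a quote
print(is_valid_avro_name(strip_quotes(raw)))  # True: Flood is a legal name
```

So the exception is consistent with the header being passed to the schema builder without its quotes removed.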

Re: CSV Illegal Initial Character

Posted by Pierre Villard <pi...@gmail.com>.
Hi Shawn,
Sounds like a legitimate ask. Could you file a JIRA for that?

Thanks,
Pierre

On Mon, Aug 20, 2018 at 16:05, Shawn Weeks <sw...@weeksconsulting.us>
wrote:

> That option is ignored if you're deriving the schema from the header.
> Since the CSV standard supports quoting every field, a quoted header
> should be supported.
>
>
> Thanks
>
> Shawn
>
>
> ------------------------------
> *From:* Mark Payne <ma...@hotmail.com>
> *Sent:* Saturday, August 18, 2018 6:56 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: CSV Illegal Initial Character
>
> Hey Shawn,
>
> It sounds like you need to set the CSV reader’s “Treat First Line as
> Header” property to true. By default it treats the first line as the first
> record (as opposed to the header), which looks like the case here.
>

Re: CSV Illegal Initial Character

Posted by Shawn Weeks <sw...@weeksconsulting.us>.
That option is ignored if you're deriving the schema from the header. Since the CSV standard supports quoting every field, a quoted header should be supported.
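
As an illustration of what "quoted header should be supported" means, here is a small sketch using Python's csv module rather than NiFi's reader. The sample lines are hypothetical stand-ins, not the actual contents of nile.csv. The second sample, with a space after the comma, is only an assumption about one way a quote character could survive parsing; it is not a confirmed diagnosis of the NiFi behavior.

```python
import csv
import io


def read_header(text, **fmt):
    """Parse CSV text and return the first row (the header) as a list."""
    return next(csv.reader(io.StringIO(text), **fmt))


# Hypothetical sample data, not the real file.
tight = '"Year","Flood"\n1,100\n'
spaced = '"Year", "Flood"\n1, 100\n'

# A fully quoted header parses to bare names.
print(read_header(tight))                           # ['Year', 'Flood']

# With a space before the opening quote, the default dialect keeps
# the space and the quotes as literal characters of the field.
print(read_header(spaced))                          # ['Year', ' "Flood"']

# Telling the parser to skip the space after the delimiter restores
# normal quote handling.
print(read_header(spaced, skipinitialspace=True))   # ['Year', 'Flood']
```

Whether NiFi's CSVReader behaves the same way depends on its CSV format settings, so treat this only as a way to reason about the input file.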


Thanks

Shawn



Re: CSV Illegal Initial Character

Posted by Mark Payne <ma...@hotmail.com>.
Hey Shawn,

It sounds like you need to set the CSV reader’s “Treat First Line as Header” property to true. By default it treats the first line as the first record (as opposed to the header), which looks like the case here.
