Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/11/15 22:34:50 UTC
[GitHub] [incubator-druid] abk-code opened a new issue #8877: csv based
ingestion question
URL: https://github.com/apache/incubator-druid/issues/8877
Hi folks,
I have a POC where I'm ingesting CSV-formatted data. The issue I'm facing is that when the payload contains a comma as part of a field's value, ingestion gets messed up because the column-to-value mapping gets confused.
For example, my row is as below:
timestamp,a,b,c,10.0.0.1,10.0.10.2,hello,world
But 10.0.0.1,10.0.10.2 is actually the value of a single column.
So what is the way to ingest in such cases, where the delimiter is part of the data itself? For instance:
timestamp,a,b,c,"10.0.0.1,10.0.10.2",hello,world
Will this work, i.e. enclosing the column that contains the comma in double quotes? At least that is what I gather from https://druid.apache.org/docs/latest/ingestion/data-formats.html, but I wanted to confirm with the experts here.
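To check my understanding of the quoting rule, here is a quick local test with Python's standard `csv` module (illustrative only, not Druid itself): a comma inside a double-quoted field is treated as data, not as a delimiter, per standard CSV (RFC 4180) quoting.

```python
import csv
import io

# The same row as above, with the multi-valued field enclosed in double quotes.
row = 'timestamp,a,b,c,"10.0.0.1,10.0.10.2",hello,world\n'

# csv.reader applies standard quoting rules: the quoted field stays intact.
parsed = next(csv.reader(io.StringIO(row)))
print(parsed)
# ['timestamp', 'a', 'b', 'c', '10.0.0.1,10.0.10.2', 'hello', 'world']
print(len(parsed))  # 7 columns, as intended
```

So if Druid's CSV parser honors the same quoting convention, the quoted row should map to the intended columns.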
Also, I know JSON would definitely solve this, since every row carries explicit key-value pairs, but the over-the-wire (OTW) Kafka payload would be much larger due to that overhead.
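If quoting does work, I imagine the CSV parseSpec would look roughly like this (a sketch only; the column names after the quoted field, such as `ips`, are my own made-up names for illustration):

```json
"parser": {
  "type": "string",
  "parseSpec": {
    "format": "csv",
    "timestampSpec": { "column": "timestamp", "format": "auto" },
    "columns": ["timestamp", "a", "b", "c", "ips", "greeting", "target"],
    "dimensionsSpec": {
      "dimensions": ["a", "b", "c", "ips", "greeting", "target"]
    }
  }
}
```

The key point being that `"10.0.0.1,10.0.10.2"` would land in the single `ips` column rather than being split across two.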
Regards.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org