Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/09/25 03:36:45 UTC

[GitHub] [druid] FrankChen021 commented on a change in pull request #10383: Fix ingestion failure of pretty-formatted JSON message

FrankChen021 commented on a change in pull request #10383:
URL: https://github.com/apache/druid/pull/10383#discussion_r494728667



##########
File path: core/src/main/java/org/apache/druid/data/input/impl/JsonReader.java
##########
@@ -33,13 +40,98 @@
 
 import java.io.IOException;
 import java.util.Collections;
-import java.util.List;
+import java.util.Iterator;
 import java.util.Map;
+import java.util.NoSuchElementException;
 
-public class JsonReader extends TextReader
+/**
+ * <pre>
+ * In contrast to {@link JsonLineReader}, which processes input text line by line independently,
+ * this class tries to parse the input text as a whole to an array of objects.
+ *
+ * The input text can be:
+ * 1. a JSON string of an object on one line or across multiple lines (such as pretty-printed JSON text)
+ * 2. multiple JSON object strings concatenated by whitespace character(s)
+ *
+ * For case 2, note that if an exception is thrown when parsing one JSON string,
+ * the rest of the JSON text is ignored.
+ *
+ * For more information, see: https://github.com/apache/druid/pull/10383
+ * </pre>
+ */
+public class JsonReader implements InputEntityReader

Review comment:
       > The sampler currently assumes that there is only one JSON object in an input chunk which could have either an array or a nested object. 
   
   That's the reason `ExceptionThrowingIterator` was extracted and `JsonReader` implements `InputEntityReader` directly.
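
As a rough illustration of that idea, here is a hypothetical sketch, not the class from this PR: the name `FailOnceIterator` and its exact behavior are assumptions made for this example. It shows an iterator that carries a parse failure and throws it on the first `next()` call, so a caller walking the rows of a chunk sees the error instead of silently getting an empty result.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: an iterator that yields no elements and instead
// surfaces a stored failure exactly once.
public class FailOnceIterator<T> implements Iterator<T> {
    private final RuntimeException failure;
    private boolean thrown = false;

    public FailOnceIterator(RuntimeException failure) {
        this.failure = failure;
    }

    @Override
    public boolean hasNext() {
        // Report one pending "element" so the caller invokes next() and sees the failure.
        return !thrown;
    }

    @Override
    public T next() {
        if (thrown) {
            throw new NoSuchElementException();
        }
        thrown = true;
        throw failure;
    }
}
```

A reader could return such an iterator when parsing a chunk fails, which keeps the error attached to the chunk that caused it.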
   
   Your suggestion provides a new and simple way to deal with it. I'll test the code later.
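
For reference, the two input shapes listed in the Javadoc above (one pretty-printed object, or several objects separated by whitespace) can be sketched with a small standalone splitter. This is only an illustration under stated assumptions: `ConcatenatedJsonSplitter` is a made-up name, it handles top-level objects only, and a real reader would delegate to a JSON parser rather than counting braces. Unlike the behavior described in the Javadoc, this sketch rejects the whole input on malformed text instead of returning the objects parsed so far.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits text containing whitespace-separated top-level
// JSON objects into one string per object. It tracks brace depth and string
// state only; it does not validate the JSON inside each object.
public class ConcatenatedJsonSplitter {
    public static List<String> split(String text) {
        List<String> objects = new ArrayList<>();
        int depth = 0;            // current nesting depth of '{'
        int start = -1;           // offset where the current top-level object begins
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true right after a backslash inside a string

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (depth == 0) {
                if (Character.isWhitespace(c)) {
                    continue; // separator between objects
                }
                if (c != '{') {
                    throw new IllegalArgumentException("Unexpected character at offset " + i);
                }
                start = i;
                depth = 1;
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    objects.add(text.substring(start, i + 1));
                }
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Truncated JSON object");
        }
        return objects;
    }
}
```

Because line breaks are treated like any other whitespace, a pretty-printed object spanning several lines comes back as a single entry, which is the case a line-by-line reader cannot handle.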



