You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/02/15 13:44:55 UTC

[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #12259: Make ParseExceptions more informative

abhishekagarwal87 commented on a change in pull request #12259:
URL: https://github.com/apache/druid/pull/12259#discussion_r806836691



##########
File path: core/src/main/java/org/apache/druid/data/input/IntermediateRowParsingReader.java
##########
@@ -155,6 +239,24 @@ public void close() throws IOException
    */
   protected abstract List<Map<String, Object>> toMap(T intermediateRow) throws IOException;
 
+  private String buildParseExceptionMessage(
+      String formatString,
+      @Nullable InputEntity source,
+      @Nullable Long recordNumber,
+      Map<String, Object> metadata,

Review comment:
       this can be null too I suppose. since not every implementation has metadata. 

##########
File path: core/src/main/java/org/apache/druid/data/input/IntermediateRowParsingReader.java
##########
@@ -135,9 +198,30 @@ public void close() throws IOException
 
   /**
    * Creates an iterator of intermediate rows. The returned rows will be consumed by {@link #parseInputRows} and
-   * {@link #toMap}.
+   * {@link #toMap}. Either this or {@link #intermediateRowIteratorWithMetadata()} should be implemented
+   */
+  protected CloseableIterator<T> intermediateRowIterator() throws IOException
+  {
+    throw new UnsupportedEncodingException("intermediateRowIterator not implemented");
+  }
+
+  /**
+   * Same as {@code intermediateRowIterator}, but it also contains the metadata such as the line number to generate
+   * more informative {@link ParseException}.
+   */
+  protected CloseableIteratorWithMetadata<T> intermediateRowIteratorWithMetadata() throws IOException
+  {
+    return CloseableIteratorWithMetadata.fromCloseableIterator(intermediateRowIterator());
+  }
+
+  /**
+   * @return InputEntity which the subclass is reading from. Useful in generating informative {@link ParseException}s

Review comment:
       can you add an example here e.g. for filename can be a useful info. 

##########
File path: core/src/main/java/org/apache/druid/data/input/IntermediateRowParsingReader.java
##########
@@ -52,28 +57,46 @@
       // good idea. Subclasses could implement read() with some duplicate codes to avoid unnecessary iteration on
       // a singleton list.
       Iterator<InputRow> rows = null;
+      long currentRecordNumber = 1;
 
       @Override
       public boolean hasNext()
       {
         if (rows == null || !rows.hasNext()) {
-          if (!intermediateRowIterator.hasNext()) {
+          if (!intermediateRowIteratorWithMetadata.hasNext()) {
             return false;
           }
-          final T row = intermediateRowIterator.next();
+          final T row = intermediateRowIteratorWithMetadata.next();
+          final Map<String, Object> metadata = intermediateRowIteratorWithMetadata.metadata();

Review comment:
       we can move it to the catch block so metadata is prepared only if needed. 

##########
File path: core/src/main/java/org/apache/druid/data/input/IntermediateRowParsingReader.java
##########
@@ -93,39 +116,79 @@ public InputRow next()
       @Override
       public void close() throws IOException
       {
-        intermediateRowIterator.close();
+        intermediateRowIteratorWithMetadata.close();
       }
     };
   }
 
   @Override
   public CloseableIterator<InputRowListPlusRawValues> sample() throws IOException
   {
-    return intermediateRowIterator().map(row -> {

Review comment:
       could we just use regular `hasNext, next()` etc and get rid of `mapWithMetadata` entirely? 

##########
File path: core/src/main/java/org/apache/druid/data/input/IntermediateRowParsingReader.java
##########
@@ -155,6 +239,24 @@ public void close() throws IOException
    */
   protected abstract List<Map<String, Object>> toMap(T intermediateRow) throws IOException;
 
+  private String buildParseExceptionMessage(
+      String formatString,
+      @Nullable InputEntity source,
+      @Nullable Long recordNumber,
+      Map<String, Object> metadata,
+      Object... baseArgs
+  )
+  {
+    Map<String, Object> temp = Maps.newHashMap(metadata);
+    if (source != null) {
+      temp.put("source", source.getUri());
+    }
+    if (recordNumber != null) {
+      temp.put("recordNumber", recordNumber);
+    }
+    return StringUtils.nonStrictFormat(formatString, baseArgs, temp);

Review comment:
       your format string has only one '%s', right? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org