Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/09/13 09:10:54 UTC

[GitHub] [druid] clintropolis opened a new pull request, #13079: escape smoosh file names, fix bug with empty json paths

clintropolis opened a new pull request, #13079:
URL: https://github.com/apache/druid/pull/13079

   ### Description
   This PR fixes issues encountered with nested column objects whose key names contain newlines or commas. Such names are valid JSON, but they are incompatible with the way the `meta.smoosh` file currently works.
   
   To remedy this, commas and newlines in file names are now escaped when writing the `meta.smoosh` file and unescaped upon mapping.
   
   This PR also fixes an issue with the jsonpath and jq parsers when handling empty keys, which are also valid JSON property names. Both parsers were overly strict and should now allow empty keys when they are enclosed in syntax-appropriate quotes.
   
   <hr>
   
   This PR has:
   - [ ] been self-reviewed.
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a diff in pull request #13079: escape smoosh file names, fix bug with empty json paths

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on code in PR #13079:
URL: https://github.com/apache/druid/pull/13079#discussion_r1027678714


##########
core/src/main/java/org/apache/druid/java/util/common/io/smoosh/SmooshedFileMapper.java:
##########
@@ -166,4 +167,52 @@ public void close()
       throw new RuntimeException(thrown);
     }
   }
+
+  public static String escapeFilename(String fileName)
+  {
+    StringBuilder sb = new StringBuilder(fileName.length());
+    for (int i = 0; i < fileName.length(); i++) {
+      if ('\n' == fileName.charAt(i)) {
+        sb.append("\\n");
+      } else if (',' == fileName.charAt(i)) {
+        sb.append("\\u002c");
+      } else {
+        sb.append(fileName.charAt(i));
+      }
+    }
+    return sb.toString();
+  }
+
+  public static String unescapeFilename(String fileName)
+  {
+    StringBuilder sb = new StringBuilder(fileName.length());
+    boolean escaped = false;
+    for (int i = 0; i < fileName.length(); i++) {
+      if ('\\' == fileName.charAt(i)) {
+        escaped = true;
+      } else {
+        if (escaped) {
+          escaped = false;
+          if ('n' == fileName.charAt(i)) {
+            sb.append("\n");
+          } else if (
+              'u' == fileName.charAt(i) &&
+              '0' == fileName.charAt(i + 1) &&
+              '0' == fileName.charAt(i + 2) &&
+              '2' == fileName.charAt(i + 3) &&
+              'c' == fileName.charAt(i + 4)

Review Comment:
   what if we are exceeding the string length here? 
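
To illustrate the concern above, here is a minimal sketch of a bounds-safe variant. It is not the code that was merged or proposed in the PR; the class name `SmooshNameEscapingSketch` is hypothetical, and escaping the backslash itself is an added assumption so that names already containing a backslash round-trip. It relies on `String.startsWith(String, int)`, which returns false instead of throwing when the prefix does not fit in the remaining characters.

```java
// Hypothetical sketch, not the PR's implementation.
final class SmooshNameEscapingSketch
{
  static String escapeFilename(String fileName)
  {
    StringBuilder sb = new StringBuilder(fileName.length());
    for (int i = 0; i < fileName.length(); i++) {
      char c = fileName.charAt(i);
      if (c == '\n') {
        sb.append("\\n");
      } else if (c == ',') {
        sb.append("\\u002c");
      } else if (c == '\\') {
        // Assumption: also escape the backslash so unescaping is unambiguous.
        sb.append("\\\\");
      } else {
        sb.append(c);
      }
    }
    return sb.toString();
  }

  static String unescapeFilename(String fileName)
  {
    StringBuilder sb = new StringBuilder(fileName.length());
    int i = 0;
    while (i < fileName.length()) {
      char c = fileName.charAt(i);
      if (c != '\\') {
        sb.append(c);
        i++;
      } else if (fileName.startsWith("\\n", i)) {
        sb.append('\n');
        i += 2;
      } else if (fileName.startsWith("\\u002c", i)) {
        // startsWith is false when fewer than six characters remain,
        // so no index past the end of the string is ever read.
        sb.append(',');
        i += 6;
      } else if (fileName.startsWith("\\\\", i)) {
        sb.append('\\');
        i += 2;
      } else {
        // Unknown or truncated escape: keep the backslash as a literal.
        sb.append(c);
        i++;
      }
    }
    return sb.toString();
  }
}
```

With this shape, a name that ends in a truncated escape sequence is passed through unchanged rather than raising `StringIndexOutOfBoundsException`.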



##########
core/src/main/java/org/apache/druid/java/util/common/io/smoosh/SmooshedFileMapper.java:
##########
@@ -48,6 +48,7 @@
  */
 public class SmooshedFileMapper implements Closeable
 {
+  private static final String COMMA = "\u002c";

Review Comment:
   is it used anywhere? 
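
A related point about the Java language itself: Unicode escapes in source code are translated before string literals are parsed, so a literal written as a single backslash followed by `u002c` is the one-character string `","`, not the six-character sequence that `escapeFilename` appends. The short demonstration below uses a hypothetical class name and only shows the language behavior; it does not claim how the constant was meant to be used in the PR.

```java
// Hypothetical demo class; shows how javac reads the two literals.
final class UnicodeEscapeDemo
{
  // The compiler turns this literal into a plain comma.
  static final String COMMA = "\u002c";
  // The doubled backslash keeps the six characters of the escape text.
  static final String ESCAPED_COMMA = "\\u002c";

  public static void main(String[] args)
  {
    System.out.println(COMMA.equals(","));      // prints "true"
    System.out.println(COMMA.length());         // prints "1"
    System.out.println(ESCAPED_COMMA.length()); // prints "6"
  }
}
```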





Re: [PR] escape smoosh file names, fix bug with empty json paths (druid)

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #13079:
URL: https://github.com/apache/druid/pull/13079#issuecomment-1882031594

   This pull request has been marked as stale due to 60 days of inactivity.
   It will be closed in 4 weeks if no further activity occurs. If you think
   that's incorrect or this pull request should instead be reviewed, please simply
   write any comment. Even if closed, you can still revive the PR at any time or
   discuss it on the dev@druid.apache.org list.
   Thank you for your contributions.




Re: [PR] escape smoosh file names, fix bug with empty json paths (druid)

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #13079:
URL: https://github.com/apache/druid/pull/13079#issuecomment-1989680046

   This pull request has been marked as stale due to 60 days of inactivity.
   It will be closed in 4 weeks if no further activity occurs. If you think
   that's incorrect or this pull request should instead be reviewed, please simply
   write any comment. Even if closed, you can still revive the PR at any time or
   discuss it on the dev@druid.apache.org list.
   Thank you for your contributions.

