Posted to github@arrow.apache.org by "davisusanibar (via GitHub)" <gi...@apache.org> on 2023/06/30 22:47:02 UTC

[GitHub] [arrow] davisusanibar opened a new pull request, #36422: GH-36421: [Java] Enable Support for reading JSON Datasets

davisusanibar opened a new pull request, #36422:
URL: https://github.com/apache/arrow/pull/36422

   ### Rationale for this change
   
   Enable the JSON dataset reading support from https://github.com/apache/arrow/pull/33732 on the Java side.
   
   ### What changes are included in this PR?
   
   Support for reading JSON Datasets
   
   ### Are these changes tested?
   
   Unit test added
   
   ### Are there any user-facing changes?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #36422: GH-36421: [Java] Enable Support for reading JSON Datasets

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #36422:
URL: https://github.com/apache/arrow/pull/36422#discussion_r1251290913


##########
java/dataset/src/test/java/org/apache/arrow/dataset/TextBasedWriteSupport.java:
##########
@@ -24,17 +24,18 @@
 import java.net.URISyntaxException;
 import java.util.Random;
 
-public class CsvWriteSupport {
+public class TextBasedWriteSupport {
   private final URI uri;
   private final Random random = new Random();
 
-  public CsvWriteSupport(File outputFolder) throws URISyntaxException {
-    uri = new URI("file", outputFolder.getPath() + "/" + "generated-" + random.nextLong() + ".csv", null);
+  public TextBasedWriteSupport(File outputFolder, String jsonOrCsvFileExtension) throws URISyntaxException {
+    uri = new URI("file", outputFolder.getPath() + File.separator +
+        "generated-" + random.nextLong() + jsonOrCsvFileExtension, null);

Review Comment:
   ```suggestion
     public TextBasedWriteSupport(File outputFolder, String fileExtension) throws URISyntaxException {
       uri = new URI("file", outputFolder.getPath() + File.separator +
           "generated-" + random.nextLong() + fileExtension, null);
   ```



##########
java/dataset/src/test/java/org/apache/arrow/dataset/TextBasedWriteSupport.java:
##########
@@ -24,17 +24,18 @@
 import java.net.URISyntaxException;
 import java.util.Random;
 
-public class CsvWriteSupport {
+public class TextBasedWriteSupport {
   private final URI uri;
   private final Random random = new Random();
 
-  public CsvWriteSupport(File outputFolder) throws URISyntaxException {
-    uri = new URI("file", outputFolder.getPath() + "/" + "generated-" + random.nextLong() + ".csv", null);
+  public TextBasedWriteSupport(File outputFolder, String jsonOrCsvFileExtension) throws URISyntaxException {
+    uri = new URI("file", outputFolder.getPath() + File.separator +
+        "generated-" + random.nextLong() + jsonOrCsvFileExtension, null);
   }
 
-  public static CsvWriteSupport writeTempFile(File outputFolder, String... values)
+  public static TextBasedWriteSupport writeTempFile(File outputFolder, String jsonOrCsvFileExtension, String... values)

Review Comment:
   ```suggestion
     public static TextBasedWriteSupport writeTempFile(File outputFolder, String fileExtension, String... values)
   ```



##########
java/dataset/src/test/java/org/apache/arrow/dataset/TextBasedWriteSupport.java:
##########
@@ -24,17 +24,18 @@
 import java.net.URISyntaxException;
 import java.util.Random;
 
-public class CsvWriteSupport {
+public class TextBasedWriteSupport {
   private final URI uri;
   private final Random random = new Random();
 
-  public CsvWriteSupport(File outputFolder) throws URISyntaxException {
-    uri = new URI("file", outputFolder.getPath() + "/" + "generated-" + random.nextLong() + ".csv", null);
+  public TextBasedWriteSupport(File outputFolder, String jsonOrCsvFileExtension) throws URISyntaxException {
+    uri = new URI("file", outputFolder.getPath() + File.separator +
+        "generated-" + random.nextLong() + jsonOrCsvFileExtension, null);
   }
 
-  public static CsvWriteSupport writeTempFile(File outputFolder, String... values)
+  public static TextBasedWriteSupport writeTempFile(File outputFolder, String jsonOrCsvFileExtension, String... values)
       throws URISyntaxException, IOException {
-    CsvWriteSupport writer = new CsvWriteSupport(outputFolder);
+    TextBasedWriteSupport writer = new TextBasedWriteSupport(outputFolder, jsonOrCsvFileExtension);

Review Comment:
   ```suggestion
       TextBasedWriteSupport writer = new TextBasedWriteSupport(outputFolder, fileExtension);
   ```
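
For reference, the three suggestions above, taken together, amount to a helper that is agnostic about the file format. The following is only a minimal sketch of that shape, not the merged code: the class name `TextBasedWriteSupportSketch`, the `getUri()` accessor, and the body of `writeTempFile` are assumptions for illustration, and it builds the URI with `File.toURI()` rather than the three-argument `URI` constructor shown in the diff.

```java
import java.io.File;
import java.io.IOException;
import java.io.Writer;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Random;

// Hypothetical stand-in for the test helper under review (assumption, not the merged class).
final class TextBasedWriteSupportSketch {
  private final URI uri;
  private final Random random = new Random();

  // The caller supplies the extension (for example ".csv" or ".json"),
  // so neither the class name nor the parameter name lists formats.
  TextBasedWriteSupportSketch(File outputFolder, String fileExtension) {
    String fileName = "generated-" + random.nextLong() + fileExtension;
    // Assumption: File.toURI() is used here instead of new URI("file", path, null).
    uri = new File(outputFolder, fileName).toURI();
  }

  // Writes each value on its own line to a new file in outputFolder.
  // The folder must already exist.
  static TextBasedWriteSupportSketch writeTempFile(
      File outputFolder, String fileExtension, String... values) throws IOException {
    TextBasedWriteSupportSketch writer =
        new TextBasedWriteSupportSketch(outputFolder, fileExtension);
    try (Writer out = Files.newBufferedWriter(Paths.get(writer.uri), StandardCharsets.UTF_8)) {
      for (String value : values) {
        out.write(value);
        out.write('\n');
      }
    }
    return writer;
  }

  // Hypothetical accessor so a test can pass the location to a dataset factory.
  URI getUri() {
    return uri;
  }
}
```

A test could then call `writeTempFile(folder, ".json", "{\"id\": 1}")` or `writeTempFile(folder, ".csv", "id", "1")` through the same entry point, which is the point of dropping the format names from the identifiers.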




[GitHub] [arrow] github-actions[bot] commented on pull request #36422: GH-36421: [Java] Enable Support for reading JSON Datasets

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #36422:
URL: https://github.com/apache/arrow/pull/36422#issuecomment-1615255432

   :warning: GitHub issue #36421 **has been automatically assigned in GitHub** to PR creator.



[GitHub] [arrow] kou commented on a diff in pull request #36422: GH-36421: [Java] Enable Support for reading JSON Datasets

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #36422:
URL: https://github.com/apache/arrow/pull/36422#discussion_r1248989539


##########
java/dataset/src/test/java/org/apache/arrow/dataset/CsvJsonWriteSupport.java:
##########
@@ -24,17 +24,18 @@
 import java.net.URISyntaxException;
 import java.util.Random;
 
-public class CsvWriteSupport {
+public class CsvJsonWriteSupport {

Review Comment:
   How about using `TextBasedWriteSupport` or something similar?
   If we keep this naming style, we may need to add more formats to the class name in the future, such as `CsvJsonXXXYYYWriteSupport`.




[GitHub] [arrow] kou merged pull request #36422: GH-36421: [Java] Enable Support for reading JSON Datasets

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou merged PR #36422:
URL: https://github.com/apache/arrow/pull/36422



[GitHub] [arrow] conbench-apache-arrow[bot] commented on pull request #36422: GH-36421: [Java] Enable Support for reading JSON Datasets

Posted by "conbench-apache-arrow[bot] (via GitHub)" <gi...@apache.org>.
conbench-apache-arrow[bot] commented on PR #36422:
URL: https://github.com/apache/arrow/pull/36422#issuecomment-1626279841

   Conbench analyzed the 6 benchmark runs on commit `a2ccd460`.
   
   There were 5 benchmark results indicating a performance regression:
   
   - Commit Run on `ursa-i9-9960x` at [2023-07-07 19:20:06Z](http://conbench.ursa.dev/compare/runs/7bcb409fb7714240a5235d55bdd61e22...814bf8272baf4f9bb1221a540eed4d60/)
     - [compression=uncompressed, dataset=fanniemae_2016Q4, file_type=feather, language=R, output_type=dataframe](http://conbench.ursa.dev/compare/benchmarks/064a84a6331f788080006d6b8006e264...064a879053497a728000f47eca13c7a7)
     - [compression=uncompressed, dataset=fanniemae_2016Q4, file_type=feather, language=R, output_type=table](http://conbench.ursa.dev/compare/benchmarks/064a84a57c1878d08000a88c7e3874ff...064a878f35c47c4c8000ac92d2ea15ed)
   - and 3 more (see the report linked below)
   
   The [full Conbench report](https://github.com/apache/arrow/runs/14871726701) has more details.

