Posted to notifications@asterixdb.apache.org by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu> on 2021/12/17 14:26:57 UTC

Change in asterixdb[master]: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing w...

From Hussain Towaileb <hu...@gmail.com>:

Hussain Towaileb has uploaded this change for review. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )


Change subject: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing when json file not found
......................................................................

[NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing when json file not found

Details:
- When a JSON file is not found while reading external azure
  datalake dataset, skip the file and continue reading, do
  not fail.

Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
1 file changed, 2 insertions(+), 3 deletions(-)



  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/83/14583/1

diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
index f0c185e..b7d142f 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
@@ -35,10 +35,10 @@
 import org.apache.hyracks.util.LogRedactionUtil;
 
 import com.azure.storage.blob.models.BlobErrorCode;
+import com.azure.storage.blob.models.BlobStorageException;
 import com.azure.storage.file.datalake.DataLakeFileClient;
 import com.azure.storage.file.datalake.DataLakeFileSystemClient;
 import com.azure.storage.file.datalake.DataLakeServiceClient;
-import com.azure.storage.file.datalake.models.DataLakeStorageException;
 
 public class AzureDataLakeInputStream extends AbstractExternalInputStream {
 
@@ -67,8 +67,7 @@
             if (lowerCaseFileName.endsWith(".gz") || lowerCaseFileName.endsWith(".gzip")) {
                 in = new GZIPInputStream(in, ExternalDataConstants.DEFAULT_BUFFER_SIZE);
             }
-        } catch (DataLakeStorageException ex) {
-            // TODO(htowaileb): need to find the right error for Azure Data Lake
+        } catch (BlobStorageException ex) {
             if (ex.getErrorCode().equals(BlobErrorCode.BLOB_NOT_FOUND)) {
                 LOGGER.debug(() -> "Key " + LogRedactionUtil.userData(filePaths.get(nextFileIndex)) + " was not "
                         + "found in container " + container);

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583
To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 1
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-MessageType: newchange

Change in asterixdb[master]: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found

Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Hussain Towaileb <hu...@gmail.com>:

Hussain Towaileb has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )

Change subject: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found
......................................................................


Patch Set 2: Code-Review+2



Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 2
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Anon. E. Moose #1000171
Gerrit-Comment-Date: Fri, 17 Dec 2021 15:28:26 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment

Change in asterixdb[master]: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found

Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Hussain Towaileb <hu...@gmail.com>:

Hussain Towaileb has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )

Change subject: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found
......................................................................


Patch Set 2: Verified+1 Code-Review+1 Integration-Tests+1



Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 2
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Anon. E. Moose #1000171
Gerrit-Comment-Date: Fri, 17 Dec 2021 15:28:19 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment

Change in asterixdb[master]: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing w...

Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
Anon. E. Moose #1000171 has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )

Change subject: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing when json file not found
......................................................................


Patch Set 1:

Analytics Compatibility Compilation Successful
https://cbjenkins.page.link/1DKuJNryebU2d9ALA : SUCCESS



Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 1
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Anon. E. Moose #1000171
Gerrit-CC: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Comment-Date: Fri, 17 Dec 2021 14:33:12 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: No
Gerrit-MessageType: comment

Change in asterixdb[master]: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found

Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Hussain Towaileb <hu...@gmail.com>:

Hussain Towaileb has submitted this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )

Change subject: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found
......................................................................

[NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found

Details:
- When a JSON file is not found while reading external azure
  datalake dataset, skip the file and continue reading, do
  not fail.

Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583
Tested-by: Hussain Towaileb <hu...@gmail.com>
Integration-Tests: Hussain Towaileb <hu...@gmail.com>
Reviewed-by: Hussain Towaileb <hu...@gmail.com>
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
1 file changed, 2 insertions(+), 3 deletions(-)

Approvals:
  Hussain Towaileb: Looks good to me, approved; Verified; Verified




Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 3
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Anon. E. Moose #1000171
Gerrit-MessageType: merged

Change in asterixdb[master]: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing w...

Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Michael Blow <mb...@apache.org>:

Michael Blow has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )

Change subject: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing when json file not found
......................................................................


Patch Set 1: Code-Review+2

(1 comment)

Do we intend to add a test case for this at some point?

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583/1//COMMIT_MSG 
Commit Message:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583/1//COMMIT_MSG@1 
PS1, Line 1: Parent:     311d6f7d ([ASTERIXDB-3000] Incorrect result in SQL-compat mode)
Maybe we should consider a sub-category tag for each kind of external dataset, just to keep the subject brief, e.g. [EXT][AZDL].




Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 1
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Comment-Date: Fri, 17 Dec 2021 14:33:12 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment

Change in asterixdb[master]: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing w...

Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Hussain Towaileb <hu...@gmail.com>:

Hussain Towaileb has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )

Change subject: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing when json file not found
......................................................................


Patch Set 1:

(1 comment)

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583/1/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java 
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583/1/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java@70 
PS1, Line 70: BlobStorageException
Since Azure Data Lake is built on top of Azure Blob Storage, it throws a BlobStorageException with the BLOB_NOT_FOUND error code when a file is not found.
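
For readers following the thread, the skip-instead-of-fail behavior described in this change can be sketched in isolation. The snippet below is only an illustration under stated assumptions, not the AsterixDB implementation: StoreException, the NOT_FOUND string, the in-memory map used as a container, and the open/readAll helpers are all hypothetical stand-ins for the Azure SDK types that appear in the diff. It shows the general pattern of catching the storage exception, skipping the file when the error code means "not found", and rethrowing anything else.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a storage SDK exception that carries an error code.
class StoreException extends RuntimeException {
    private final String errorCode;

    StoreException(String errorCode) {
        super(errorCode);
        this.errorCode = errorCode;
    }

    String getErrorCode() {
        return errorCode;
    }
}

class SkipMissingFiles {
    // Placeholder error code used only by this sketch.
    static final String NOT_FOUND = "NotFound";

    // Hypothetical "open": the map plays the role of the container.
    static InputStream open(Map<String, String> store, String path) {
        String content = store.get(path);
        if (content == null) {
            throw new StoreException(NOT_FOUND);
        }
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }

    // Reads every listed path in order, skipping paths that no longer exist.
    static List<String> readAll(Map<String, String> store, List<String> paths) {
        List<String> contents = new ArrayList<>();
        for (String path : paths) {
            try (InputStream in = open(store, path)) {
                contents.add(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            } catch (StoreException ex) {
                // Constant-first comparison stays safe if the error code is null.
                if (NOT_FOUND.equals(ex.getErrorCode())) {
                    continue; // file vanished between listing and reading: skip it
                }
                throw ex; // any other storage error still fails the read
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
        }
        return contents;
    }

    public static void main(String[] args) {
        Map<String, String> store = Map.of("a.json", "{\"id\":1}", "c.json", "{\"id\":3}");
        // "b.json" is listed but missing, so it is skipped rather than failing.
        System.out.println(readAll(store, List.of("a.json", "b.json", "c.json")));
    }
}
```

The sketch compares with the constant on the left so that a missing error code cannot cause a NullPointerException; whether the SDK can return a null error code here is not established in this thread.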




Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 1
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Comment-Date: Fri, 17 Dec 2021 14:31:53 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Gerrit-MessageType: comment