Posted to notifications@asterixdb.apache.org by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu> on 2021/12/17 14:26:57 UTC
Change in asterixdb[master]: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing w...
From Hussain Towaileb <hu...@gmail.com>:
Hussain Towaileb has uploaded this change for review. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )
Change subject: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing when json file not found
......................................................................
[NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing when json file not found
Details:
- When a JSON file is not found while reading an external Azure
Data Lake dataset, skip the file and continue reading instead
of failing.
Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
1 file changed, 2 insertions(+), 3 deletions(-)
git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/83/14583/1
diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
index f0c185e..b7d142f 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
@@ -35,10 +35,10 @@
import org.apache.hyracks.util.LogRedactionUtil;
import com.azure.storage.blob.models.BlobErrorCode;
+import com.azure.storage.blob.models.BlobStorageException;
import com.azure.storage.file.datalake.DataLakeFileClient;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import com.azure.storage.file.datalake.DataLakeServiceClient;
-import com.azure.storage.file.datalake.models.DataLakeStorageException;
public class AzureDataLakeInputStream extends AbstractExternalInputStream {
@@ -67,8 +67,7 @@
if (lowerCaseFileName.endsWith(".gz") || lowerCaseFileName.endsWith(".gzip")) {
in = new GZIPInputStream(in, ExternalDataConstants.DEFAULT_BUFFER_SIZE);
}
- } catch (DataLakeStorageException ex) {
- // TODO(htowaileb): need to find the right error for Azure Data Lake
+ } catch (BlobStorageException ex) {
if (ex.getErrorCode().equals(BlobErrorCode.BLOB_NOT_FOUND)) {
LOGGER.debug(() -> "Key " + LogRedactionUtil.userData(filePaths.get(nextFileIndex)) + " was not "
+ "found in container " + container);
--
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583
To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 1
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-MessageType: newchange
Change in asterixdb[master]: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found
Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Hussain Towaileb <hu...@gmail.com>:
Hussain Towaileb has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )
Change subject: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found
......................................................................
Patch Set 2: Code-Review+2
--
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 2
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Anon. E. Moose #1000171
Gerrit-Comment-Date: Fri, 17 Dec 2021 15:28:26 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment
Change in asterixdb[master]: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found
Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Hussain Towaileb <hu...@gmail.com>:
Hussain Towaileb has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )
Change subject: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found
......................................................................
Patch Set 2: Verified+1 Code-Review+1 Integration-Tests+1
--
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 2
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Anon. E. Moose #1000171
Gerrit-Comment-Date: Fri, 17 Dec 2021 15:28:19 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment
Change in asterixdb[master]: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing w...
Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
Anon. E. Moose #1000171 has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )
Change subject: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing when json file not found
......................................................................
Patch Set 1:
Analytics Compatibility Compilation Successful
https://cbjenkins.page.link/1DKuJNryebU2d9ALA : SUCCESS
--
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 1
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Anon. E. Moose #1000171
Gerrit-CC: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Comment-Date: Fri, 17 Dec 2021 14:33:12 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: No
Gerrit-MessageType: comment
Change in asterixdb[master]: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found
Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Hussain Towaileb <hu...@gmail.com>:
Hussain Towaileb has submitted this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )
Change subject: [NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found
......................................................................
[NO ISSUE][EXT][AZDL]: Skip instead of failing when json file not found
Details:
- When a JSON file is not found while reading an external Azure
Data Lake dataset, skip the file and continue reading instead
of failing.
Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583
Tested-by: Hussain Towaileb <hu...@gmail.com>
Integration-Tests: Hussain Towaileb <hu...@gmail.com>
Reviewed-by: Hussain Towaileb <hu...@gmail.com>
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
1 file changed, 2 insertions(+), 3 deletions(-)
Approvals:
Hussain Towaileb: Looks good to me, approved; Verified; Verified
diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
index f0c185e..b7d142f 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
@@ -35,10 +35,10 @@
import org.apache.hyracks.util.LogRedactionUtil;
import com.azure.storage.blob.models.BlobErrorCode;
+import com.azure.storage.blob.models.BlobStorageException;
import com.azure.storage.file.datalake.DataLakeFileClient;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import com.azure.storage.file.datalake.DataLakeServiceClient;
-import com.azure.storage.file.datalake.models.DataLakeStorageException;
public class AzureDataLakeInputStream extends AbstractExternalInputStream {
@@ -67,8 +67,7 @@
if (lowerCaseFileName.endsWith(".gz") || lowerCaseFileName.endsWith(".gzip")) {
in = new GZIPInputStream(in, ExternalDataConstants.DEFAULT_BUFFER_SIZE);
}
- } catch (DataLakeStorageException ex) {
- // TODO(htowaileb): need to find the right error for Azure Data Lake
+ } catch (BlobStorageException ex) {
if (ex.getErrorCode().equals(BlobErrorCode.BLOB_NOT_FOUND)) {
LOGGER.debug(() -> "Key " + LogRedactionUtil.userData(filePaths.get(nextFileIndex)) + " was not "
+ "found in container " + container);
--
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 3
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Anon. E. Moose #1000171
Gerrit-MessageType: merged
Change in asterixdb[master]: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing w...
Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Michael Blow <mb...@apache.org>:
Michael Blow has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )
Change subject: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing when json file not found
......................................................................
Patch Set 1: Code-Review+2
(1 comment)
Do we intend to add a test case for this at some point?
https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583/1//COMMIT_MSG
Commit Message:
https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583/1//COMMIT_MSG@1
PS1, Line 1: Parent: 311d6f7d ([ASTERIXDB-3000] Incorrect result in SQL-compat mode)
Maybe we should consider introducing a sub-category tag for each kind of external dataset, for brevity, e.g. [EXT][AZDL].
--
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 1
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Comment-Date: Fri, 17 Dec 2021 14:33:12 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: Yes
Gerrit-MessageType: comment
Change in asterixdb[master]: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing w...
Posted by AsterixDB Code Review <do...@asterix-gerrit.ics.uci.edu>.
From Hussain Towaileb <hu...@gmail.com>:
Hussain Towaileb has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583 )
Change subject: [NO ISSUE][EXT] Azure Datalake ext dataset: Skip instead of failing when json file not found
......................................................................
Patch Set 1:
(1 comment)
https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583/1/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java
File asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java:
https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/14583/1/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/record/reader/azure/datalake/AzureDataLakeInputStream.java@70
PS1, Line 70: BlobStorageException
Since Azure Data Lake is built on top of Azure Blob Storage, the client throws a BlobStorageException with error code BLOB_NOT_FOUND when a file is not found, so that is the exception to catch here.
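The skip-on-not-found behavior described in this comment can be sketched without the Azure SDK. This is only an illustration under stated assumptions, not the code in AzureDataLakeInputStream: StorageException, the NOT_FOUND constant, the in-memory store, and the open() and readAll() helpers are all hypothetical stand-ins for the SDK exception, its error code, and the Data Lake client. The point is the control flow: a not-found error skips the file, and any other storage error still propagates.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SkipMissingFilesSketch {

    /** Hypothetical stand-in for the SDK's storage exception; NOT the Azure class. */
    static class StorageException extends RuntimeException {
        private final String errorCode;

        StorageException(String errorCode, String message) {
            super(message);
            this.errorCode = errorCode;
        }

        String getErrorCode() {
            return errorCode;
        }
    }

    /** Hypothetical error code playing the role of a "not found" code. */
    static final String NOT_FOUND = "NOT_FOUND";

    /** Hypothetical opener backed by an in-memory map instead of a storage client. */
    static InputStream open(Map<String, String> store, String path) {
        String content = store.get(path);
        if (content == null) {
            throw new StorageException(NOT_FOUND, "missing: " + path);
        }
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }

    /** Reads every path in order, skipping files that no longer exist. */
    static List<String> readAll(Map<String, String> store, List<String> paths) {
        List<String> contents = new ArrayList<>();
        for (String path : paths) {
            try (InputStream in = open(store, path)) {
                contents.add(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            } catch (StorageException ex) {
                if (NOT_FOUND.equals(ex.getErrorCode())) {
                    // The file disappeared between listing and reading: skip it.
                    continue;
                }
                // Any other storage error is still a failure.
                throw ex;
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
        }
        return contents;
    }

    public static void main(String[] args) {
        Map<String, String> store = new LinkedHashMap<>();
        store.put("a.json", "{\"id\": 1}");
        store.put("c.json", "{\"id\": 3}");
        // "b.json" is listed but absent, so it is skipped rather than failing the read.
        System.out.println(readAll(store, List.of("a.json", "b.json", "c.json")));
    }
}
```

Catching the wrong exception type silently defeats the skip, because the not-found error never reaches the error-code check; that is the failure mode this change addresses by switching the catch clause.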
--
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ic9b04e418483cc245379e35c9a20f1a4c4389e87
Gerrit-Change-Number: 14583
Gerrit-PatchSet: 1
Gerrit-Owner: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Ali Alsuliman <al...@gmail.com>
Gerrit-Reviewer: Dmitry Lychagin <dm...@couchbase.com>
Gerrit-Reviewer: Hussain Towaileb <hu...@gmail.com>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-CC: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Comment-Date: Fri, 17 Dec 2021 14:31:53 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Gerrit-MessageType: comment