Posted to notifications@asterixdb.apache.org by "Hussain Towaileb (Jira)" <ji...@apache.org> on 2022/09/07 12:32:00 UTC
[jira] [Created] (ASTERIXDB-3073) Dynamic Prefixes for External Datasets
Hussain Towaileb created ASTERIXDB-3073:
-------------------------------------------
Summary: Dynamic Prefixes for External Datasets
Key: ASTERIXDB-3073
URL: https://issues.apache.org/jira/browse/ASTERIXDB-3073
Project: Apache AsterixDB
Issue Type: Epic
Components: EXT - External data
Affects Versions: 0.9.8
Reporter: Hussain Towaileb
Assignee: Hussain Towaileb
Fix For: 0.9.9
Currently, when a user creates an external dataset, they can provide a prefix that directs the external dataset to the location the files need to be read from. This has a major impact on performance, as it allows us to read only the files we are interested in and avoid reading unnecessary files.
However, a limitation of the current implementation is that the prefix is always a static path. This makes it difficult to read, for example, only the files of all userId > 1, or all files of userId IN [1, 2, 3]. In such scenarios we always end up reading all the files, which can be a very expensive operation, and then use our WHERE clause to get the desired result.
This feature aims to support a more dynamic approach to allow for a flexible prefix that can support different scenarios (for example, the user passing the desired userId in the prefix instead of a single prefix value) and still maintain the behavior of reading the minimal number of files.
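To make the difference concrete, here is a minimal sketch in Python, not AsterixDB code. The file layout, the "users/{userId}/" template syntax, and the helper names static_prefix_scan and dynamic_prefix_scan are all hypothetical and only illustrate the idea. The actual syntax and implementation for this feature are not defined in this issue.

```python
import re

# Hypothetical object keys in an external store (the layout is an assumption).
FILES = [
    "users/1/a.json",
    "users/2/a.json",
    "users/2/b.json",
    "users/3/a.json",
    "users/10/a.json",
    "orders/1/a.json",
]


def static_prefix_scan(files, prefix):
    """Static prefix: every file under the prefix has to be read."""
    return [f for f in files if f.startswith(prefix)]


def dynamic_prefix_scan(files, template, predicate):
    """Dynamic prefix: pull named fields out of each path and keep only
    the files whose fields satisfy the predicate, before reading them."""
    # Turn "users/{userId}/" into the regex "users/(?P<userId>[^/]+)/".
    parts = re.split(r"(\{\w+\})", template)
    pattern = re.compile(
        "".join(
            f"(?P<{p[1:-1]}>[^/]+)" if p.startswith("{") else re.escape(p)
            for p in parts
        )
    )
    selected = []
    for f in files:
        m = pattern.match(f)
        if m and predicate(m.groupdict()):
            selected.append(f)
    return selected


# Static prefix: all five user files are read, then WHERE filters the records.
all_user_files = static_prefix_scan(FILES, "users/")

# Dynamic prefix: only files whose path already satisfies the condition.
gt_one = dynamic_prefix_scan(
    FILES, "users/{userId}/", lambda f: int(f["userId"]) > 1
)
in_list = dynamic_prefix_scan(
    FILES, "users/{userId}/", lambda f: int(f["userId"]) in (1, 2, 3)
)
```

In this sketch the pruning happens on file names only, so the file contents of non-matching paths are never opened. The WHERE clause would still apply to the records inside the selected files.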
--
This message was sent by Atlassian Jira
(v8.20.10#820010)