Posted to notifications@asterixdb.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/12/17 13:02:00 UTC
[jira] [Commented] (ASTERIXDB-2194) Introduce datasource functions
[ https://issues.apache.org/jira/browse/ASTERIXDB-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16294130#comment-16294130 ]
ASF subversion and git services commented on ASTERIXDB-2194:
------------------------------------------------------------
Commit b4d166b3ca042ce34d737f5d2a4fb758fa45d3e5 in asterixdb's branch refs/heads/master from [~alamoudi]
[ https://git-wip-us.apache.org/repos/asf?p=asterixdb.git;h=b4d166b ]
[ASTERIXDB-2194][COMP] Introduce datasource functions
- user model changes: yes
  Some functions can be datasources
- storage format changes: no
- interface changes: yes
  - Add IDatasourceFunction: A function that is also a datasource
  - Add IFunctionToDataSourceTransformer: transform an unnest
    function into a datascan during compilation
Details:
- Currently, functions are location agnostic and run on
parameters that are passed to them at either compile time
or runtime.
- An exception to this is the dataset function, which has
associated location constraints: it runs on the nodes
that host the dataset.
- In this change, we introduce a general framework that allows
creation of new functions similar to the dataset function.
- Such functions are called datasource functions.
- A datasource function takes constant parameters and runs on
a set of partitions, similar to the dataset function.
- The first example of such a function is the DatasetResources
function.
- The DatasetResources function takes two parameters, a dataverse
and a dataset. It is then run on all nodes and returns a set
of dataset resources.
- Test cases are added for this function.
Change-Id: Ibcf807ac713a21e8f4d59868525467386e801303
Reviewed-on: https://asterix-gerrit.ics.uci.edu/2216
Sonar-Qube: Jenkins <je...@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: abdullah alamoudi <ba...@gmail.com>
Tested-by: abdullah alamoudi <ba...@gmail.com>
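The behaviour described in the commit message above (constant parameters, evaluation on a set of nodes, and a combined result) can be pictured with a small, self-contained Java sketch. This is only an illustration under stated assumptions: it does not use or reproduce the actual IDatasourceFunction or IFunctionToDataSourceTransformer interfaces, and the names NodeLocalSource, datasetResources, scanAllNodes, the node ids, and the in-memory map of resource paths are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DatasourceFunctionSketch {

    /** Hypothetical stand-in for "a function that is also a datasource". */
    public interface NodeLocalSource {
        // Returns the records this function produces on a single node.
        List<String> readOnNode(String nodeId);
    }

    /**
     * Hypothetical dataset_resources-like function. Both arguments are
     * constants fixed before execution, so the source can be fully built
     * up front. resourcesByNode fakes the per-node disk contents.
     */
    public static NodeLocalSource datasetResources(String dataverse, String dataset,
            Map<String, List<String>> resourcesByNode) {
        String prefix = dataverse + "/" + dataset + "/";
        return nodeId -> {
            List<String> out = new ArrayList<>();
            for (String path : resourcesByNode.getOrDefault(nodeId, List.of())) {
                if (path.startsWith(prefix)) {
                    out.add(nodeId + ":" + path);
                }
            }
            return out;
        };
    }

    /** Location constraint "all nodes": evaluate on every node and union the results. */
    public static List<String> scanAllNodes(NodeLocalSource source, List<String> nodes) {
        List<String> result = new ArrayList<>();
        for (String node : nodes) {
            result.addAll(source.readOnNode(node));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up node ids and resource paths, for illustration only.
        Map<String, List<String>> disk = new TreeMap<>();
        disk.put("nc1", List.of("Metadata/Dataset/0", "Other/Foo/0"));
        disk.put("nc2", List.of("Metadata/Dataset/1"));
        NodeLocalSource source = datasetResources("Metadata", "Dataset", disk);
        System.out.println(scanAllNodes(source, List.of("nc1", "nc2")));
    }
}
```

Because the arguments are constants, the sketch can build the source once and hand it to each node unchanged, which is roughly why a function of this kind can be treated like a scan rather than a per-tuple call.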
> Introduce datasource functions
> ------------------------------
>
> Key: ASTERIXDB-2194
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2194
> Project: Apache AsterixDB
> Issue Type: Improvement
> Components: COMP - Compiler
> Reporter: Abdullah Alamoudi
> Assignee: Abdullah Alamoudi
>
> Sometimes, we would like to be able to query system status. For example:
> 1. Disk space.
> 2. Number of components of a dataset.
> 3. Memory usage.
> And many others. Being able to query such information and utilize the power of the query language and the runtime makes a great investigative/diagnostic tool.
> Currently, there is no easy way to do that. Such functionality can be achieved through:
> 1. External datasets, but that takes a lot of work in terms of development and usage.
> 2. Specific diagnostic endpoints, but that is also a lot of development work, and you lose the ability to use the query language.
> The current proposal is to introduce datasource functions. A datasource function differs from a normal function in that it:
> 1. Takes constants (as opposed to variables).
> 2. Has location constraints (for a start, it can run on all nodes).
> An example would be the function dataset_resources(String dataverse, String dataset).
> This function takes a dataverse and a dataset and produces a set of JSON objects representing the disk resources of the dataset.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)