Posted to notifications@asterixdb.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/12/17 13:02:00 UTC

[jira] [Commented] (ASTERIXDB-2194) Introduce datasource functions

    [ https://issues.apache.org/jira/browse/ASTERIXDB-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16294130#comment-16294130 ] 

ASF subversion and git services commented on ASTERIXDB-2194:
------------------------------------------------------------

Commit b4d166b3ca042ce34d737f5d2a4fb758fa45d3e5 in asterixdb's branch refs/heads/master from [~alamoudi]
[ https://git-wip-us.apache.org/repos/asf?p=asterixdb.git;h=b4d166b ]

[ASTERIXDB-2194][COMP] Introduce datasource functions

- user model changes: yes
  Some functions can be datasources
- storage format changes: no
- interface changes: yes
  - Add IDatasourceFunction: A function that is also a datasource
  - Add IFunctionToDataSourceTransformer: transforms an unnest
    function into a data scan during compilation
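
The compile-time rewrite named above can be pictured with a small
standalone sketch. It does not use the real AsterixDB or Algebricks
types: Arg, DataScan and toDataScan are hypothetical names invented for
illustration, and leaving the call untouched when an argument is not a
constant is an assumption of the sketch, not something this commit
message states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class UnnestToScanSketch {

    /** A call argument: either a compile-time constant or a variable reference. */
    static final class Arg {
        final String text;
        final boolean constant;

        private Arg(String text, boolean constant) {
            this.text = text;
            this.constant = constant;
        }

        static Arg constant(String value) { return new Arg(value, true); }
        static Arg variable(String name) { return new Arg(name, false); }
    }

    /** Hypothetical stand-in for the data scan that replaces the unnest. */
    static final class DataScan {
        final String functionName;
        final List<String> constantArgs;

        DataScan(String functionName, List<String> constantArgs) {
            this.functionName = functionName;
            this.constantArgs = constantArgs;
        }
    }

    /**
     * Rewrites unnest(f(args)) into a scan when every argument is a constant.
     * Returning Optional.empty() otherwise is an assumption of this sketch.
     */
    static Optional<DataScan> toDataScan(String functionName, List<Arg> args) {
        List<String> values = new ArrayList<>();
        for (Arg arg : args) {
            if (!arg.constant) {
                return Optional.empty(); // leave the unnest as it is
            }
            values.add(arg.text);
        }
        return Optional.of(new DataScan(functionName, values));
    }

    public static void main(String[] argv) {
        // "MyDataverse" and "MyDataset" are made-up argument values.
        Optional<DataScan> scan = toDataScan("dataset_resources",
                Arrays.asList(Arg.constant("MyDataverse"), Arg.constant("MyDataset")));
        System.out.println(scan.isPresent()
                ? "rewritten to a scan over " + scan.get().constantArgs
                : "left as an unnest");
    }
}
```

The point of the sketch is only the shape of the decision: the rewrite
needs the arguments at compile time, which is why a datasource function
takes constants rather than variables.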

Details:
- Currently, functions are location agnostic and run on
  parameters that are passed to them at either compile
  time or runtime.
- An exception is the dataset function, which has
  associated location constraints and runs on the nodes
  that host the dataset.
- In this change, we introduce a general framework that allows
  creation of new functions similar to the dataset function.
- Such functions are called datasource functions.
- A datasource function takes constant parameters and runs on
  a set of partitions, similar to the dataset function.
- The first example of such functions is the DatasetResources
  function.
- The DatasetResources function takes two parameters, a dataverse
  and a dataset. It is then run on all nodes and returns a set
  of dataset resources.
- Test cases are added for this function.
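
As a rough illustration of the details above, the following standalone
sketch models a datasource function as something that declares where it
runs and what each location produces. Every name in it
(DatasourceFunction, DatasetResourcesSketch, run, sampleResources, the
node names and the "dataverse/dataset/resource" strings) is hypothetical
and chosen for the example; the actual IDatasourceFunction interface and
the real DatasetResources implementation are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DatasourceFunctionSketch {

    /** Hypothetical shape of a function that is also a datasource. */
    interface DatasourceFunction {
        /** Location constraints: the nodes this function must run on. */
        List<String> locations(List<String> clusterNodes);

        /** The records produced by the partition hosted on one node. */
        List<String> readPartition(String node);
    }

    /** Toy counterpart of the DatasetResources function, backed by an in-memory map. */
    static final class DatasetResourcesSketch implements DatasourceFunction {
        private final String dataverse;
        private final String dataset;
        private final Map<String, List<String>> resourcesByNode;

        DatasetResourcesSketch(String dataverse, String dataset,
                Map<String, List<String>> resourcesByNode) {
            this.dataverse = dataverse;
            this.dataset = dataset;
            this.resourcesByNode = resourcesByNode;
        }

        @Override
        public List<String> locations(List<String> clusterNodes) {
            return clusterNodes; // run on all nodes
        }

        @Override
        public List<String> readPartition(String node) {
            String prefix = dataverse + "/" + dataset + "/";
            List<String> out = new ArrayList<>();
            for (String resource : resourcesByNode.getOrDefault(node, Collections.<String>emptyList())) {
                if (resource.startsWith(prefix)) {
                    out.add(node + ":" + resource);
                }
            }
            return out;
        }
    }

    /** Runs the function on every constrained location and concatenates the results. */
    static List<String> run(DatasourceFunction fn, List<String> clusterNodes) {
        List<String> result = new ArrayList<>();
        for (String node : fn.locations(clusterNodes)) {
            result.addAll(fn.readPartition(node));
        }
        return result;
    }

    /** Made-up resources for a two-node cluster. */
    static Map<String, List<String>> sampleResources() {
        Map<String, List<String>> byNode = new HashMap<>();
        byNode.put("nc1", Arrays.asList("dv/ds/r0", "dv/other/r0"));
        byNode.put("nc2", Arrays.asList("dv/ds/r1"));
        return byNode;
    }

    public static void main(String[] args) {
        DatasourceFunction fn = new DatasetResourcesSketch("dv", "ds", sampleResources());
        for (String record : run(fn, Arrays.asList("nc1", "nc2"))) {
            System.out.println(record);
        }
    }
}
```

In the sketch the two constant parameters are fixed when the function
object is built, and the location constraint is simply "all nodes",
matching the description of the DatasetResources function above.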

Change-Id: Ibcf807ac713a21e8f4d59868525467386e801303
Reviewed-on: https://asterix-gerrit.ics.uci.edu/2216
Sonar-Qube: Jenkins <je...@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: abdullah alamoudi <ba...@gmail.com>
Tested-by: abdullah alamoudi <ba...@gmail.com>


> Introduce datasource functions
> ------------------------------
>
>                 Key: ASTERIXDB-2194
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2194
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>          Components: COMP - Compiler
>            Reporter: Abdullah Alamoudi
>            Assignee: Abdullah Alamoudi
>
> Sometimes, we would like to be able to query system status. For example:
>    1. Disk space.
>    2. Number of components of a dataset.
>    3. Memory usage.
> There are many others. Being able to query such information with the full power of the query language and the runtime would make a great investigative and diagnostic tool.
> Currently, there is no easy way to do that. Such functionality can be achieved through:
> 1. External datasets, but these take a lot of work in terms of both development and usage.
> 2. Dedicated diagnostic endpoints, but these also take a lot of development work and lose the ability to use the query language.
> The current proposal is to introduce datasource functions. A datasource function differs from a normal function in that it:
> 1. Takes constants (as opposed to variables).
> 2. Has location constraints (for a start, it can run on all nodes).
> An example would be the function dataset_resources(String dataverse, String dataset).
> This function takes a dataverse and a dataset and produces a set of JSON objects representing the disk resources of the dataset.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)