Posted to dev@metamodel.apache.org by Ashish Mukherjee <as...@gmail.com> on 2015/05/20 19:12:35 UTC

API suggestion for new Data Connectors

Hello,

I was looking through some of the DataContext classes, such as the
ElasticSearch one, for my needs and was working on the Solr class recently.
It seems fairly common for connectors not to push down many query operations,
because implementing executeQuery() can be tricky: it has to support every
kind of SQL-like query. As a result, some of the connectors may not scale
very well, due to a lot of in-memory processing of large data sets on the
application side.

I was wondering if the following design would simplify the implementation of
executeQuery() in each DataContext implementation:

Let QueryPostProcessDataContext expose a few more granular hooks (much like
the existing executeCountQuery()), such as createFilters(), createHaving(),
etc.

These can have default implementations in the QueryPostProcessDataContext
class, which are called in a pipeline pattern for the entire query
construction and execution.

Implementing classes that can or want to implement joins etc. natively can do
so; otherwise, the functionality is provided by the base class method called
in the pipeline.

With such a design, it may be easier to support as many query functions
natively as possible while leaving the rest to the MM framework. It would
then not be an all-or-nothing implementation for a data connector, and the
memory footprint of the application would stay manageable even for large
data sets.
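
The proposal above could be sketched roughly as follows. This is only an
illustration under assumptions, not MetaModel code: SimpleQuery,
PipelineDataContext, ScanOnlyContext, IndexedContext and the hook names
fetchFiltered() and applyMaxRows() are all hypothetical, and the query model
is reduced to a single equality filter plus a row limit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical, heavily simplified query: WHERE column = value, limited to maxRows.
final class SimpleQuery {
    final String column;
    final Object value;
    final int maxRows; // negative means "no limit"

    SimpleQuery(String column, Object value, int maxRows) {
        this.column = column;
        this.value = value;
        this.maxRows = maxRows;
    }
}

// Hypothetical base class. The hook names are illustrative, not MetaModel API.
abstract class PipelineDataContext {
    // Counts rows pulled into application memory, only to make the effect visible.
    int rowsMaterialized = 0;

    // The one method every connector must provide: a full scan of the table.
    protected abstract List<Map<String, Object>> scanAll();

    // Hook 1: rows matching the WHERE clause. Default: full scan, filter in memory.
    protected List<Map<String, Object>> fetchFiltered(SimpleQuery query) {
        List<Map<String, Object>> all = scanAll();
        rowsMaterialized += all.size();
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> row : all) {
            if (Objects.equals(row.get(query.column), query.value)) {
                matches.add(row);
            }
        }
        return matches;
    }

    // Hook 2: row limit. Default: truncate the list in memory.
    protected List<Map<String, Object>> applyMaxRows(List<Map<String, Object>> rows, SimpleQuery query) {
        if (query.maxRows < 0 || rows.size() <= query.maxRows) {
            return rows;
        }
        return new ArrayList<>(rows.subList(0, query.maxRows));
    }

    // The pipeline: a fixed order of hooks, each one overridable by a connector.
    public final List<Map<String, Object>> executeQuery(SimpleQuery query) {
        return applyMaxRows(fetchFiltered(query), query);
    }
}

// A connector that overrides no hook: everything is post-processed in memory.
final class ScanOnlyContext extends PipelineDataContext {
    private final List<Map<String, Object>> rows;

    ScanOnlyContext(List<Map<String, Object>> rows) { this.rows = rows; }

    @Override protected List<Map<String, Object>> scanAll() { return rows; }
}

// A connector that answers equality filters on "id" from an index (its "native"
// path) and leaves every other case to the base class.
final class IndexedContext extends PipelineDataContext {
    private final List<Map<String, Object>> rows;
    private final Map<Object, List<Map<String, Object>>> byId = new HashMap<>();

    IndexedContext(List<Map<String, Object>> rows) {
        this.rows = rows;
        for (Map<String, Object> row : rows) {
            byId.computeIfAbsent(row.get("id"), key -> new ArrayList<>()).add(row);
        }
    }

    @Override protected List<Map<String, Object>> scanAll() { return rows; }

    @Override protected List<Map<String, Object>> fetchFiltered(SimpleQuery query) {
        if (!"id".equals(query.column)) {
            return super.fetchFiltered(query); // not supported natively: fall back
        }
        List<Map<String, Object>> hits =
                byId.getOrDefault(query.value, Collections.emptyList());
        rowsMaterialized += hits.size();
        return new ArrayList<>(hits);
    }
}
```

In this sketch both connectors return the same rows for a filter on "id", but
IndexedContext only materializes the matching rows, while ScanOnlyContext
pulls the whole table into memory first. A filter on any other column falls
back to the base class in both cases, which is the "not all or nothing" part.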

These are just high-level thoughts for now, which I wanted to bounce off
the group.

Regards,
Ashish

Re: API suggestion for new Data Connectors

Posted by Ashish Mukherjee <as...@gmail.com>.
Hi Kasper,

Thanks for the feedback on that thought.

Yes, it won't be trivial, I agree. I will spend some time on it as soon as I
can find a few hours, perhaps next weekend.

Regards,
Ashish

On Wed, May 20, 2015 at 11:35 PM, Kasper Sørensen <
i.am.kasper.sorensen@gmail.com> wrote:

> Hi Ashish,
>
> I support that idea fully. I don't yet have any great ideas on how to pull
> it off - that would probably require more than a few experiments and design
> strategies, but what you describe sounds good. Would love to see a simple
> example of it, maybe to begin with just to implement it for a single aspect
> such as WHERE items or so.
>
> Best regards,
> Kasper

Re: API suggestion for new Data Connectors

Posted by Kasper Sørensen <i....@gmail.com>.
Hi Ashish,

I support that idea fully. I don't yet have any great ideas on how to pull
it off - that would probably require more than a few experiments and design
strategies, but what you describe sounds good. Would love to see a simple
example of it, maybe to begin with just to implement it for a single aspect
such as WHERE items or so.
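
One possible starting point for the WHERE-only experiment is to split the
WHERE items into those a connector can evaluate natively and those left for
in-memory post-processing. The sketch below is an assumption-laden
illustration, not MetaModel API: WhereItem and WhereSplit are hypothetical
names, and the split is only valid when all WHERE items are combined with AND.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical WHERE item: "column operator operand", e.g. age > 3.
final class WhereItem {
    final String column;
    final String operator;
    final Object operand;

    WhereItem(String column, String operator, Object operand) {
        this.column = column;
        this.operator = operator;
        this.operand = operand;
    }
}

// Result of splitting AND-combined WHERE items into two groups.
final class WhereSplit {
    final List<WhereItem> pushedDown;    // handed to the data store
    final List<WhereItem> postProcessed; // evaluated in memory afterwards

    private WhereSplit(List<WhereItem> pushedDown, List<WhereItem> postProcessed) {
        this.pushedDown = Collections.unmodifiableList(pushedDown);
        this.postProcessed = Collections.unmodifiableList(postProcessed);
    }

    // The connector supplies a predicate that says which items it supports natively.
    static WhereSplit of(List<WhereItem> items, Predicate<WhereItem> supportedNatively) {
        List<WhereItem> nativeItems = new ArrayList<>();
        List<WhereItem> remaining = new ArrayList<>();
        for (WhereItem item : items) {
            if (supportedNatively.test(item)) {
                nativeItems.add(item);
            } else {
                remaining.add(item);
            }
        }
        return new WhereSplit(nativeItems, remaining);
    }
}
```

A connector that cannot evaluate LIKE natively, for instance, would pass a
predicate rejecting LIKE items; those items would then end up in
postProcessed and be applied by the framework after the native query returns.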

Best regards,
Kasper
