Posted to user@ignite.apache.org by Dmitri Bronnikov <dm...@gmail.com> on 2017/04/05 22:48:11 UTC

External data sources

What's the best way to implement SQL that joins an Ignite cache with an
external data source, e.g. a Cassandra column family or a MySQL table? A
very big external table, not one that can be cached in memory prior to
running the query.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/External-data-sources-tp11766.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: External data sources

Posted by vkulichenko <va...@gmail.com>.
Dmitri,

Correct, there are no plans.

-Val




Re: External data sources

Posted by Jörn Franke <jo...@gmail.com>.
That being said, it is rather easy to include the Hadoop client libraries in your application and use any of the available InputFormats. You do not need a Hadoop cluster to read files; the libraries can even read from the local file system. Spark and other frameworks do the same.

> On 10. Apr 2017, at 19:58, Dmitri Bronnikov <dm...@gmail.com> wrote:
> 
> Hi Val, so nothing in progress and no immediate plans, is that right?

Re: External data sources

Posted by Dmitri Bronnikov <dm...@gmail.com>.
Hi Val, so nothing in progress and no immediate plans, is that right?




Re: External data sources

Posted by vkulichenko <va...@gmail.com>.
Dmitri,

The main advantage of Ignite SQL is in-memory indexing, which makes
super-fast query performance possible. I believe it's not possible to add
arbitrary external data sources without losing this.

-Val




Re: External data sources

Posted by Dmitri Bronnikov <dm...@gmail.com>.
As I understand it, Ignite has two completely disconnected features:
write-through/read-through caching and SQL. If SQL fetched what's not
currently cached in memory from the cache store, that would solve the
problem. This is only one possible solution, and it covers only the sources
described/cached by Ignite caches. It's also possible to add an API to
expose external sources to SQL without threading them through an Ignite
cache (something like external RDBMS tables, a Presto plugin, etc.). Are
there any plans to implement either of the two, or anything equivalent?
Thanks.




Re: External data sources

Posted by Andrey Mashenkov <an...@gmail.com>.
Hi Dmitri,

You can try creating an additional table that holds only the row ID and the
field used for the join. You will then be able to run joins, but you will
need to retrieve the detailed data from the external store manually.

To speed up external store lookups, you can create one more cache with an
eviction policy, backed by the external store.

You can also configure a cache interceptor to automatically substitute
values from the external store on basic get/put operations. However, this
will not work for Scan and SQL queries.
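
A minimal plain-Java sketch of the first two suggestions, for illustration
only. It does not use the Ignite API: the class and method names
(SlimJoinSketch, ReadThroughLru, join) are hypothetical, the nested loop
stands in for the SQL join that would run over the in-memory tables, and the
loader function stands in for the external store lookup.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Plain-Java sketch; none of these names are Ignite APIs. */
final class SlimJoinSketch {
    /** Bounded LRU cache that loads misses from the external store (read-through). */
    static final class ReadThroughLru<K, V> {
        private final Function<K, V> loader; // stands in for the external store lookup
        private final Map<K, V> entries;
        int loads;                           // counts external lookups, for illustration

        ReadThroughLru(int maxSize, Function<K, V> loader) {
            this.loader = loader;
            // accessOrder=true keeps the least recently used entry first.
            this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxSize; // evict once the bound is exceeded
                }
            };
        }

        V get(K key) {
            V value = entries.get(key);
            if (value == null) {
                value = loader.apply(key);   // cache miss: go to the external store
                loads++;
                if (value != null)
                    entries.put(key, value);
            }
            return value;
        }
    }

    /**
     * Joins in-memory orders (orderId -> customerId) with the slim table
     * (external rowId -> customerId), then fetches details for matching rows only.
     */
    static List<String> join(Map<Integer, Integer> orders,
                             Map<Long, Integer> slim,
                             ReadThroughLru<Long, String> details) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, Integer> order : orders.entrySet())
            for (Map.Entry<Long, Integer> ref : slim.entrySet())
                if (order.getValue().equals(ref.getValue()))
                    out.add("order " + order.getKey() + " -> " + details.get(ref.getKey()));
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> orders = new TreeMap<>();
        orders.put(1, 100);
        orders.put(2, 200);

        Map<Long, Integer> slim = new TreeMap<>();
        slim.put(10L, 100);
        slim.put(20L, 200);
        slim.put(30L, 300); // no matching order, so its details are never fetched

        ReadThroughLru<Long, String> details = new ReadThroughLru<>(2, id -> "row-" + id);
        System.out.println(join(orders, slim, details));
        System.out.println("external lookups: " + details.loads);
    }
}
```

Only the rows that survive the join trigger an external lookup, and repeated
lookups for the same row ID are served from the bounded cache until they are
evicted.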

On Thu, Apr 6, 2017 at 7:55 PM, Dmitri Bronnikov <dmitri.bronnikov@gmail.com> wrote:

> Thanks, Andrey, I know it doesn't just work out of the box, but what's the
> best way to add something like that? It's a fairly common feature, Oracle
> has external tables, Presto has a plug-in mechanism that allows, among other
> things, to pushdown predicates to the external table driver.



-- 
Best regards,
Andrey V. Mashenkov

Re: External data sources

Posted by Dmitri Bronnikov <dm...@gmail.com>.
Thanks, Andrey. I know it doesn't just work out of the box, but what's the
best way to add something like that? It's a fairly common feature: Oracle
has external tables, and Presto has a plug-in mechanism that allows, among
other things, pushing predicates down to the external table driver.
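
To make "pushing predicates down" concrete, here is a rough plain-Java
sketch. It is neither the Presto connector API nor anything in Ignite; every
name in it (PushdownSketch, EqFilter, ExternalTable, InMemoryTable) is
hypothetical. The idea is that the engine hands a structured filter to the
driver, so only matching rows cross the driver boundary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch; hypothetical names, not the Presto or Ignite API. */
final class PushdownSketch {
    /** Structured filter "column = value" that an engine can hand to a driver. */
    static final class EqFilter {
        final String column;
        final Object value;

        EqFilter(String column, Object value) {
            this.column = column;
            this.value = value;
        }
    }

    /** Hypothetical driver contract: the source applies the filter itself. */
    interface ExternalTable {
        List<Map<String, Object>> scan(EqFilter filter);
    }

    /**
     * The statement a JDBC-backed driver might send instead of a full table scan.
     * Identifiers are concatenated unescaped here purely for brevity.
     */
    static String toSql(String table, EqFilter filter) {
        return "SELECT * FROM " + table + " WHERE " + filter.column + " = ?";
    }

    /** Toy driver over a list, standing in for the remote database. */
    static final class InMemoryTable implements ExternalTable {
        private final List<Map<String, Object>> rows;

        InMemoryTable(List<Map<String, Object>> rows) {
            this.rows = rows;
        }

        @Override public List<Map<String, Object>> scan(EqFilter filter) {
            List<Map<String, Object>> out = new ArrayList<>();
            for (Map<String, Object> row : rows)
                if (filter.value.equals(row.get(filter.column)))
                    out.add(row); // filtering happens at the source
            return out;
        }
    }

    /** Builds a sample row with an id and a country column. */
    static Map<String, Object> row(long id, String country) {
        Map<String, Object> r = new HashMap<>();
        r.put("id", id);
        r.put("country", country);
        return r;
    }

    public static void main(String[] args) {
        ExternalTable customers = new InMemoryTable(
            Arrays.asList(row(1, "NZ"), row(2, "DE"), row(3, "NZ")));
        EqFilter filter = new EqFilter("country", "NZ");

        System.out.println(toSql("customers", filter));
        System.out.println(customers.scan(filter).size() + " rows crossed the driver boundary");
    }
}
```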




Re: External data sources

Posted by Andrey Mashenkov <an...@gmail.com>.
Hi Dmitri,

There is no way to run a SQL join against an external data source. The SQL
engine requires all data to be in the cache.


On Thu, Apr 6, 2017 at 1:48 AM, Dmitri Bronnikov <dmitri.bronnikov@gmail.com> wrote:

> What's the best way to implement SQL that joins an Ignite cache with an
> external data source, e.g. a Cassandra column family or a MySql table? A
> very big external table, not one that can be cached in memory prior to
> running the query.



-- 
Best regards,
Andrey V. Mashenkov