Posted to user@impala.apache.org by Sachin Gaikwad <sa...@gmail.com> on 2016/05/16 09:07:50 UTC

Using external data source for storing tables

Hi all,

1) We have our own data source where we store tables. This data store is
a NoSQL store written in C++ (key-value based). We would now like to use
these tables with Impala. Can we use the external-data-source APIs for this
purpose?

2) I looked at this
<https://github.com/cloudera/Impala/blob/master/testdata/bin/create-data-source-table.sql>
and this
<https://github.com/cloudera/Impala/blob/master/ext-data-source/test/src/main/java/com/cloudera/impala/extdatasource/AllTypesDataSource.java>
but I am not able to understand how we would create tables in the external
source. I am thinking we will need to provide C++ or Java APIs to
create/read/write/query tables, which Impala will then use. Is this correct?

Thanks,
Sachin

Re: Using external data source for storing tables

Posted by Matthew Jacobs <mj...@cloudera.com>.
Hi Sachin,

The "External Data Source" mechanism was implemented for internal use and
isn't a supported feature, but you can try to use it for this purpose if
you'd like. You should know it has a number of significant limitations, most
notably that it cannot be distributed (i.e. it is not scalable and runs on a
single node) and is not performance-optimized in any way. There may be more
limitations that I don't recall off the top of my head. You would need to
create a Java class (there is no C++ interface) that implements the
ExternalDataSource interface
<https://github.com/cloudera/Impala/blob/master/ext-data-source/api/src/main/java/com/cloudera/impala/extdatasource/v1/ExternalDataSource.java>,
and then you would need to register the data source and create a table
using the data source. The links you already found are the best examples
that we have of this.
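
To make the shape of such a class a bit more concrete, here is a rough,
self-contained sketch. It is not the real API: SimpleDataSource is a
hypothetical, simplified stand-in that only mirrors a
prepare/open/getNext/close scan lifecycle using plain Java types, whereas the
actual ExternalDataSource interface linked above takes and returns
Thrift-generated types. KeyValueDataSource and the in-memory TreeMap are also
assumptions; the map stands in for whatever JNI or network client would talk
to the C++ key-value store.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified stand-in for Impala's ExternalDataSource interface.
// The real interface uses Thrift-generated parameter and result types; plain
// Java types are used here so that the sketch compiles on its own.
interface SimpleDataSource {
    long prepare(String initString);                 // row-count estimate for planning
    String open(String initString, int batchSize);   // start a scan, return a scan handle
    List<String[]> getNext(String scanHandle);       // next batch; an empty list ends the scan
    void close(String scanHandle);                   // release scan resources
}

// Hypothetical adapter that exposes a key-value store as two-column rows
// (key, value). It supports one scan at a time and ignores the scan handle.
final class KeyValueDataSource implements SimpleDataSource {
    private final Map<String, String> store;
    private Iterator<Map.Entry<String, String>> cursor;
    private int batchSize;

    KeyValueDataSource(Map<String, String> store) {
        // Copy into a TreeMap so rows come back in a stable key order.
        this.store = new TreeMap<>(store);
    }

    @Override
    public long prepare(String initString) {
        return store.size();
    }

    @Override
    public String open(String initString, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.cursor = store.entrySet().iterator();
        return "scan-1";
    }

    @Override
    public List<String[]> getNext(String scanHandle) {
        if (cursor == null) {
            throw new IllegalStateException("scan is not open");
        }
        List<String[]> batch = new ArrayList<>();
        while (batch.size() < batchSize && cursor.hasNext()) {
            Map.Entry<String, String> entry = cursor.next();
            batch.add(new String[] {entry.getKey(), entry.getValue()});
        }
        return batch;
    }

    @Override
    public void close(String scanHandle) {
        cursor = null;
    }
}
```

A caller would invoke open once, call getNext repeatedly until it returns an
empty batch, and then call close. Note that the sketch only covers reading
rows during a scan; how a real implementation maps table columns and
predicates onto the store would depend on the actual interface and on the
store itself.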

Best,
Matt
