You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Sebastien Rosset <sr...@gmail.com> on 2023/04/02 15:06:56 UTC

Using calcite to parse untrusted SQL queries and add implicit filters

Would it make sense to use Apache Calcite as a security mediator? Has
calcite already been used for that purpose? The mediator would parse
untrusted SQL queries, reject queries other than SELECT (including any
sub-queries), and inject multi-tenancy WHERE filters in every SELECT query,
including sub-queries?

More specifically, consider a Druid database which is configured to
implement multi-tenancy with shared datasources:
https://druid.apache.org/docs/latest/querying/multitenancy.html#partitioning-shared-datasources.
Every Druid datasource would have a "tenant_id" attribute that can be used
in a Druid-SQL WHERE clause.

Thank you. Sebastien

Re: Using calcite to parse untrusted SQL queries and add implicit filters

Posted by Sebastien Rosset <sr...@gmail.com>.
Thank you for the insights. Sharing a summary of the steps. I was able to
create a basic transformer that performs the following steps:
1. Construct a SqlParser.
2. Parse a SQL query provided as input string, using the above SqlParser.
3. Visit every SELECT node using SqlShuffle. Transform the WHERE clause
using a custom SqlStdOperatorTable.AND.createCall()

Reading the calcite tutorial and documentation, I saw the calcite library
implements a Druid adapter. I initially thought that meant the following:
1. The Druid adapter would implement the *SqlConformance* interface.
2. It would be possible to create a builder that conforms with the Druid
SQL dialect. This would be done by invoking ConfigBuilder.setConformance()
and ConfigBuilder.setParserFactory().

However, it looks like the Druid conformance is implemented outside the org.
apache.calcite library. Instead, it appears the conformance is implemented
outside the adapter:
https://github.com/apache/druid/blob/master/sql/src/main/java/org/apache/druid/sql/calcite/planner/PlannerFactory.java
.
So to get the Druid conformance, the org.apache.druid.sql library is
needed. https://github.com/apache/druid/blob/master/sql/pom.xml


On Sun, Apr 2, 2023 at 12:25 PM Sebastien Rosset <sr...@gmail.com> wrote:

> > Also you could give users access to views. A particular user might have
> a view with “WHERE tenant = 100” so if they ask for any any other tenant is
> they would get no results.
>
> I'm not sure if you are referring to the Druid View manager
> https://docs.imply.io/latest/druid/operations/views/view-manager/, which
> is an alpha feature. It would elegantly solve the problem, but it's still
> an alpha feature.
>
> On Sun, Apr 2, 2023 at 11:35 AM Julian Hyde <jh...@gmail.com>
> wrote:
>
>> +1 what Askar said
>>
>> Also you could give users access to views. A particular user might have a
>> view with “WHERE tenant = 100” so if they ask for any any other tenant is
>> they would get no results.
>>
>> It’s also possible to use Calcite to check grants (table and column
>> access) and fail a query if they access objects they are not allowed to see
>> or use. We should add features to support this use case better.
>>
>> Julian
>>
>> > On Apr 2, 2023, at 11:09, Askar Bozcan <as...@gmail.com> wrote:
>> >
>> > (EDIT)
>> > ... *For this case *there's also a Druid adapter which uses the native
>> JSON
>> > intf.
>> >
>> >> On Sun, 2 Apr 2023 at 21:08, Askar Bozcan <as...@gmail.com>
>> wrote:
>> >>
>> >> What's your use case? Do you want to:
>> >> a) Use Calcite as a "frontend" DB of sorts, to accept all queries and
>> send
>> >> the processed SQL query (from Calcite) only the accepted queries? For
>> this
>> >> case
>> >> b) Just use Calcite as a query processor, and send the processed query
>> >> yourself?
>> >>
>> >> - Askar
>> >>
>> >>> On Sun, 2 Apr 2023 at 20:38, Sebastien Rosset <sr...@gmail.com>
>> wrote:
>> >>>
>> >>> Thank you for the quick response. I am new to Calcite, it's good to
>> hear
>> >>> there might be a possibility. I will investigate. Regarding the
>> tenant_id,
>> >>> if the input SQL statement is:
>> >>> SELECT a, b, c
>> >>> FROM datasource
>> >>> WHERE input_expression
>> >>>
>> >>> Then the tool should add a "security filter" as shown below:
>> >>>
>> >>> SELECT a, b, c
>> >>> FROM datasource
>> >>> WHERE tenant_id = 'abcd123' AND input_expression
>> >>>
>> >>>
>> >>> On Sun, Apr 2, 2023 at 9:56 AM Askar Bozcan <as...@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> Hello Sebastien,
>> >>>> I'd say it's quite feasible for this purpose by having Calcite
>> push-down
>> >>>> pre-processed queries to Druid and being a kind of a "front" to
>> >>> underlying
>> >>>> Druid DB.
>> >>>>
>> >>>> Regarding rejecting queries other than SELECT:
>> >>>> 1) Parse the query and get the SqlNode representing the root of the
>> >>> syntax
>> >>>> tree.
>> >>>> 2) Extend SqlShuttle
>> >>>> <
>> >>>>
>> >>>
>> https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/util/SqlShuttle.html
>> >>>>>
>> >>>> to check & manipulate the syntax tree, returning null for rejected
>> >>> nodes &
>> >>>> recursively iterating for SELECT sub-queries.
>> >>>> Note that you can use SqlSelect's getFrom, getGroup, etc to get all
>> of
>> >>> the
>> >>>> possible nodes which can contain subqueries. Also note that SqlSelect
>> >>>> itself is a subtype of SqlCall.
>> >>>>
>> >>>> Regarding "tenant_id" in WHERE case, I'm not too sure as I'm neither
>> >>>> familiar with Druid nor with Druid adapter in Calcite. Regardless, it
>> >>>> should be doable on the adapter level AFAIK.
>> >>>>
>> >>>> Hope that was helpful!
>> >>>>
>> >>>> Regards,
>> >>>> Askar Bozcan
>> >>>>
>> >>>> On Sun, 2 Apr 2023 at 18:07, Sebastien Rosset <sr...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>>> Would it make sense to use Apache Calcite as a security mediator?
>> Has
>> >>>>> calcite already been used for that purpose? The mediator would parse
>> >>>>> untrusted SQL queries, reject queries other than SELECT (including
>> any
>> >>>>> sub-queries), and inject multi-tenancy WHERE filters in every SELECT
>> >>>> query,
>> >>>>> including sub-queries?
>> >>>>>
>> >>>>> More specifically, consider a Druid database which is configured to
>> >>>>> implement multi-tenancy with shared datasources:
>> >>>>>
>> >>>>>
>> >>>>
>> >>>
>> https://druid.apache.org/docs/latest/querying/multitenancy.html#partitioning-shared-datasources
>> >>>>> .
>> >>>>> Every Druid datasource would have a "tenant_id" attribute that can
>> be
>> >>>> used
>> >>>>> in a Druid-SQL WHERE clause.
>> >>>>>
>> >>>>> Thank you. Sebastien
>> >>>>>
>> >>>>
>> >>>
>> >>
>>
>

Re: Using calcite to parse untrusted SQL queries and add implicit filters

Posted by Sebastien Rosset <sr...@gmail.com>.
> Also you could give users access to views. A particular user might have a
view with “WHERE tenant = 100” so if they ask for any any other tenant is
they would get no results.

I'm not sure if you are referring to the Druid View manager
https://docs.imply.io/latest/druid/operations/views/view-manager/, which is
an alpha feature. It would elegantly solve the problem, but it's still an
alpha feature.

On Sun, Apr 2, 2023 at 11:35 AM Julian Hyde <jh...@gmail.com> wrote:

> +1 what Askar said
>
> Also you could give users access to views. A particular user might have a
> view with “WHERE tenant = 100” so if they ask for any any other tenant is
> they would get no results.
>
> It’s also possible to use Calcite to check grants (table and column
> access) and fail a query if they access objects they are not allowed to see
> or use. We should add features to support this use case better.
>
> Julian
>
> > On Apr 2, 2023, at 11:09, Askar Bozcan <as...@gmail.com> wrote:
> >
> > (EDIT)
> > ... *For this case *there's also a Druid adapter which uses the native
> JSON
> > intf.
> >
> >> On Sun, 2 Apr 2023 at 21:08, Askar Bozcan <as...@gmail.com>
> wrote:
> >>
> >> What's your use case? Do you want to:
> >> a) Use Calcite as a "frontend" DB of sorts, to accept all queries and
> send
> >> the processed SQL query (from Calcite) only the accepted queries? For
> this
> >> case
> >> b) Just use Calcite as a query processor, and send the processed query
> >> yourself?
> >>
> >> - Askar
> >>
> >>> On Sun, 2 Apr 2023 at 20:38, Sebastien Rosset <sr...@gmail.com>
> wrote:
> >>>
> >>> Thank you for the quick response. I am new to Calcite, it's good to
> hear
> >>> there might be a possibility. I will investigate. Regarding the
> tenant_id,
> >>> if the input SQL statement is:
> >>> SELECT a, b, c
> >>> FROM datasource
> >>> WHERE input_expression
> >>>
> >>> Then the tool should add a "security filter" as shown below:
> >>>
> >>> SELECT a, b, c
> >>> FROM datasource
> >>> WHERE tenant_id = 'abcd123' AND input_expression
> >>>
> >>>
> >>> On Sun, Apr 2, 2023 at 9:56 AM Askar Bozcan <as...@gmail.com>
> >>> wrote:
> >>>
> >>>> Hello Sebastien,
> >>>> I'd say it's quite feasible for this purpose by having Calcite
> push-down
> >>>> pre-processed queries to Druid and being a kind of a "front" to
> >>> underlying
> >>>> Druid DB.
> >>>>
> >>>> Regarding rejecting queries other than SELECT:
> >>>> 1) Parse the query and get the SqlNode representing the root of the
> >>> syntax
> >>>> tree.
> >>>> 2) Extend SqlShuttle
> >>>> <
> >>>>
> >>>
> https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/util/SqlShuttle.html
> >>>>>
> >>>> to check & manipulate the syntax tree, returning null for rejected
> >>> nodes &
> >>>> recursively iterating for SELECT sub-queries.
> >>>> Note that you can use SqlSelect's getFrom, getGroup, etc to get all of
> >>> the
> >>>> possible nodes which can contain subqueries. Also note that SqlSelect
> >>>> itself is a subtype of SqlCall.
> >>>>
> >>>> Regarding "tenant_id" in WHERE case, I'm not too sure as I'm neither
> >>>> familiar with Druid nor with Druid adapter in Calcite. Regardless, it
> >>>> should be doable on the adapter level AFAIK.
> >>>>
> >>>> Hope that was helpful!
> >>>>
> >>>> Regards,
> >>>> Askar Bozcan
> >>>>
> >>>> On Sun, 2 Apr 2023 at 18:07, Sebastien Rosset <sr...@gmail.com>
> >>> wrote:
> >>>>
> >>>>> Would it make sense to use Apache Calcite as a security mediator? Has
> >>>>> calcite already been used for that purpose? The mediator would parse
> >>>>> untrusted SQL queries, reject queries other than SELECT (including
> any
> >>>>> sub-queries), and inject multi-tenancy WHERE filters in every SELECT
> >>>> query,
> >>>>> including sub-queries?
> >>>>>
> >>>>> More specifically, consider a Druid database which is configured to
> >>>>> implement multi-tenancy with shared datasources:
> >>>>>
> >>>>>
> >>>>
> >>>
> https://druid.apache.org/docs/latest/querying/multitenancy.html#partitioning-shared-datasources
> >>>>> .
> >>>>> Every Druid datasource would have a "tenant_id" attribute that can be
> >>>> used
> >>>>> in a Druid-SQL WHERE clause.
> >>>>>
> >>>>> Thank you. Sebastien
> >>>>>
> >>>>
> >>>
> >>
>

Re: Using calcite to parse untrusted SQL queries and add implicit filters

Posted by Julian Hyde <jh...@gmail.com>.
+1 what Askar said

Also you could give users access to views. A particular user might have a view with “WHERE tenant = 100” so if they ask for any any other tenant is they would get no results. 

It’s also possible to use Calcite to check grants (table and column access) and fail a query if they access objects they are not allowed to see or use. We should add features to support this use case better. 

Julian

> On Apr 2, 2023, at 11:09, Askar Bozcan <as...@gmail.com> wrote:
> 
> (EDIT)
> ... *For this case *there's also a Druid adapter which uses the native JSON
> intf.
> 
>> On Sun, 2 Apr 2023 at 21:08, Askar Bozcan <as...@gmail.com> wrote:
>> 
>> What's your use case? Do you want to:
>> a) Use Calcite as a "frontend" DB of sorts, to accept all queries and send
>> the processed SQL query (from Calcite) only the accepted queries? For this
>> case
>> b) Just use Calcite as a query processor, and send the processed query
>> yourself?
>> 
>> - Askar
>> 
>>> On Sun, 2 Apr 2023 at 20:38, Sebastien Rosset <sr...@gmail.com> wrote:
>>> 
>>> Thank you for the quick response. I am new to Calcite, it's good to hear
>>> there might be a possibility. I will investigate. Regarding the tenant_id,
>>> if the input SQL statement is:
>>> SELECT a, b, c
>>> FROM datasource
>>> WHERE input_expression
>>> 
>>> Then the tool should add a "security filter" as shown below:
>>> 
>>> SELECT a, b, c
>>> FROM datasource
>>> WHERE tenant_id = 'abcd123' AND input_expression
>>> 
>>> 
>>> On Sun, Apr 2, 2023 at 9:56 AM Askar Bozcan <as...@gmail.com>
>>> wrote:
>>> 
>>>> Hello Sebastien,
>>>> I'd say it's quite feasible for this purpose by having Calcite push-down
>>>> pre-processed queries to Druid and being a kind of a "front" to
>>> underlying
>>>> Druid DB.
>>>> 
>>>> Regarding rejecting queries other than SELECT:
>>>> 1) Parse the query and get the SqlNode representing the root of the
>>> syntax
>>>> tree.
>>>> 2) Extend SqlShuttle
>>>> <
>>>> 
>>> https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/util/SqlShuttle.html
>>>>> 
>>>> to check & manipulate the syntax tree, returning null for rejected
>>> nodes &
>>>> recursively iterating for SELECT sub-queries.
>>>> Note that you can use SqlSelect's getFrom, getGroup, etc to get all of
>>> the
>>>> possible nodes which can contain subqueries. Also note that SqlSelect
>>>> itself is a subtype of SqlCall.
>>>> 
>>>> Regarding "tenant_id" in WHERE case, I'm not too sure as I'm neither
>>>> familiar with Druid nor with Druid adapter in Calcite. Regardless, it
>>>> should be doable on the adapter level AFAIK.
>>>> 
>>>> Hope that was helpful!
>>>> 
>>>> Regards,
>>>> Askar Bozcan
>>>> 
>>>> On Sun, 2 Apr 2023 at 18:07, Sebastien Rosset <sr...@gmail.com>
>>> wrote:
>>>> 
>>>>> Would it make sense to use Apache Calcite as a security mediator? Has
>>>>> calcite already been used for that purpose? The mediator would parse
>>>>> untrusted SQL queries, reject queries other than SELECT (including any
>>>>> sub-queries), and inject multi-tenancy WHERE filters in every SELECT
>>>> query,
>>>>> including sub-queries?
>>>>> 
>>>>> More specifically, consider a Druid database which is configured to
>>>>> implement multi-tenancy with shared datasources:
>>>>> 
>>>>> 
>>>> 
>>> https://druid.apache.org/docs/latest/querying/multitenancy.html#partitioning-shared-datasources
>>>>> .
>>>>> Every Druid datasource would have a "tenant_id" attribute that can be
>>>> used
>>>>> in a Druid-SQL WHERE clause.
>>>>> 
>>>>> Thank you. Sebastien
>>>>> 
>>>> 
>>> 
>> 

Re: Using calcite to parse untrusted SQL queries and add implicit filters

Posted by Askar Bozcan <as...@gmail.com>.
(EDIT)
... *For this case *there's also a Druid adapter which uses the native JSON
intf.

On Sun, 2 Apr 2023 at 21:08, Askar Bozcan <as...@gmail.com> wrote:

> What's your use case? Do you want to:
> a) Use Calcite as a "frontend" DB of sorts, to accept all queries and send
> the processed SQL query (from Calcite) only the accepted queries? For this
> case
> b) Just use Calcite as a query processor, and send the processed query
> yourself?
>
> - Askar
>
> On Sun, 2 Apr 2023 at 20:38, Sebastien Rosset <sr...@gmail.com> wrote:
>
>> Thank you for the quick response. I am new to Calcite, it's good to hear
>> there might be a possibility. I will investigate. Regarding the tenant_id,
>> if the input SQL statement is:
>> SELECT a, b, c
>> FROM datasource
>> WHERE input_expression
>>
>> Then the tool should add a "security filter" as shown below:
>>
>> SELECT a, b, c
>> FROM datasource
>> WHERE tenant_id = 'abcd123' AND input_expression
>>
>>
>> On Sun, Apr 2, 2023 at 9:56 AM Askar Bozcan <as...@gmail.com>
>> wrote:
>>
>> > Hello Sebastien,
>> > I'd say it's quite feasible for this purpose by having Calcite push-down
>> > pre-processed queries to Druid and being a kind of a "front" to
>> underlying
>> > Druid DB.
>> >
>> > Regarding rejecting queries other than SELECT:
>> > 1) Parse the query and get the SqlNode representing the root of the
>> syntax
>> > tree.
>> > 2) Extend SqlShuttle
>> > <
>> >
>> https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/util/SqlShuttle.html
>> > >
>> > to check & manipulate the syntax tree, returning null for rejected
>> nodes &
>> > recursively iterating for SELECT sub-queries.
>> > Note that you can use SqlSelect's getFrom, getGroup, etc to get all of
>> the
>> > possible nodes which can contain subqueries. Also note that SqlSelect
>> > itself is a subtype of SqlCall.
>> >
>> > Regarding "tenant_id" in WHERE case, I'm not too sure as I'm neither
>> > familiar with Druid nor with Druid adapter in Calcite. Regardless, it
>> > should be doable on the adapter level AFAIK.
>> >
>> > Hope that was helpful!
>> >
>> > Regards,
>> > Askar Bozcan
>> >
>> > On Sun, 2 Apr 2023 at 18:07, Sebastien Rosset <sr...@gmail.com>
>> wrote:
>> >
>> > > Would it make sense to use Apache Calcite as a security mediator? Has
>> > > calcite already been used for that purpose? The mediator would parse
>> > > untrusted SQL queries, reject queries other than SELECT (including any
>> > > sub-queries), and inject multi-tenancy WHERE filters in every SELECT
>> > query,
>> > > including sub-queries?
>> > >
>> > > More specifically, consider a Druid database which is configured to
>> > > implement multi-tenancy with shared datasources:
>> > >
>> > >
>> >
>> https://druid.apache.org/docs/latest/querying/multitenancy.html#partitioning-shared-datasources
>> > > .
>> > > Every Druid datasource would have a "tenant_id" attribute that can be
>> > used
>> > > in a Druid-SQL WHERE clause.
>> > >
>> > > Thank you. Sebastien
>> > >
>> >
>>
>

Re: Using calcite to parse untrusted SQL queries and add implicit filters

Posted by Sebastien Rosset <sr...@gmail.com>.
Askar,
Yes, I am exploring whether I could use Calcite as a "frontend" DB that
intercepts a SQL query, validates it, rejects queries that attempts to
mutate data, and transform the WHERE clauses. After filtering and
transformation, the query would be sent to the real database.

I've already done that with native Druid queries. I am now investigating
how to do it with Druid SQL.

On Sun, Apr 2, 2023 at 11:09 AM Askar Bozcan <as...@gmail.com> wrote:

> What's your use case? Do you want to:
> a) Use Calcite as a "frontend" DB of sorts, to accept all queries and send
> the processed SQL query (from Calcite) only the accepted queries? For this
> case
> b) Just use Calcite as a query processor, and send the processed query
> yourself?
>
> - Askar
>
> On Sun, 2 Apr 2023 at 20:38, Sebastien Rosset <sr...@gmail.com> wrote:
>
> > Thank you for the quick response. I am new to Calcite, it's good to hear
> > there might be a possibility. I will investigate. Regarding the
> tenant_id,
> > if the input SQL statement is:
> > SELECT a, b, c
> > FROM datasource
> > WHERE input_expression
> >
> > Then the tool should add a "security filter" as shown below:
> >
> > SELECT a, b, c
> > FROM datasource
> > WHERE tenant_id = 'abcd123' AND input_expression
> >
> >
> > On Sun, Apr 2, 2023 at 9:56 AM Askar Bozcan <as...@gmail.com>
> wrote:
> >
> > > Hello Sebastien,
> > > I'd say it's quite feasible for this purpose by having Calcite
> push-down
> > > pre-processed queries to Druid and being a kind of a "front" to
> > underlying
> > > Druid DB.
> > >
> > > Regarding rejecting queries other than SELECT:
> > > 1) Parse the query and get the SqlNode representing the root of the
> > syntax
> > > tree.
> > > 2) Extend SqlShuttle
> > > <
> > >
> >
> https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/util/SqlShuttle.html
> > > >
> > > to check & manipulate the syntax tree, returning null for rejected
> nodes
> > &
> > > recursively iterating for SELECT sub-queries.
> > > Note that you can use SqlSelect's getFrom, getGroup, etc to get all of
> > the
> > > possible nodes which can contain subqueries. Also note that SqlSelect
> > > itself is a subtype of SqlCall.
> > >
> > > Regarding "tenant_id" in WHERE case, I'm not too sure as I'm neither
> > > familiar with Druid nor with Druid adapter in Calcite. Regardless, it
> > > should be doable on the adapter level AFAIK.
> > >
> > > Hope that was helpful!
> > >
> > > Regards,
> > > Askar Bozcan
> > >
> > > On Sun, 2 Apr 2023 at 18:07, Sebastien Rosset <sr...@gmail.com>
> wrote:
> > >
> > > > Would it make sense to use Apache Calcite as a security mediator? Has
> > > > calcite already been used for that purpose? The mediator would parse
> > > > untrusted SQL queries, reject queries other than SELECT (including
> any
> > > > sub-queries), and inject multi-tenancy WHERE filters in every SELECT
> > > query,
> > > > including sub-queries?
> > > >
> > > > More specifically, consider a Druid database which is configured to
> > > > implement multi-tenancy with shared datasources:
> > > >
> > > >
> > >
> >
> https://druid.apache.org/docs/latest/querying/multitenancy.html#partitioning-shared-datasources
> > > > .
> > > > Every Druid datasource would have a "tenant_id" attribute that can be
> > > used
> > > > in a Druid-SQL WHERE clause.
> > > >
> > > > Thank you. Sebastien
> > > >
> > >
> >
>

Re: Using calcite to parse untrusted SQL queries and add implicit filters

Posted by Askar Bozcan <as...@gmail.com>.
What's your use case? Do you want to:
a) Use Calcite as a "frontend" DB of sorts, to accept all queries and send
the processed SQL query (from Calcite) only the accepted queries? For this
case
b) Just use Calcite as a query processor, and send the processed query
yourself?

- Askar

On Sun, 2 Apr 2023 at 20:38, Sebastien Rosset <sr...@gmail.com> wrote:

> Thank you for the quick response. I am new to Calcite, it's good to hear
> there might be a possibility. I will investigate. Regarding the tenant_id,
> if the input SQL statement is:
> SELECT a, b, c
> FROM datasource
> WHERE input_expression
>
> Then the tool should add a "security filter" as shown below:
>
> SELECT a, b, c
> FROM datasource
> WHERE tenant_id = 'abcd123' AND input_expression
>
>
> On Sun, Apr 2, 2023 at 9:56 AM Askar Bozcan <as...@gmail.com> wrote:
>
> > Hello Sebastien,
> > I'd say it's quite feasible for this purpose by having Calcite push-down
> > pre-processed queries to Druid and being a kind of a "front" to
> underlying
> > Druid DB.
> >
> > Regarding rejecting queries other than SELECT:
> > 1) Parse the query and get the SqlNode representing the root of the
> syntax
> > tree.
> > 2) Extend SqlShuttle
> > <
> >
> https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/util/SqlShuttle.html
> > >
> > to check & manipulate the syntax tree, returning null for rejected nodes
> &
> > recursively iterating for SELECT sub-queries.
> > Note that you can use SqlSelect's getFrom, getGroup, etc to get all of
> the
> > possible nodes which can contain subqueries. Also note that SqlSelect
> > itself is a subtype of SqlCall.
> >
> > Regarding "tenant_id" in WHERE case, I'm not too sure as I'm neither
> > familiar with Druid nor with Druid adapter in Calcite. Regardless, it
> > should be doable on the adapter level AFAIK.
> >
> > Hope that was helpful!
> >
> > Regards,
> > Askar Bozcan
> >
> > On Sun, 2 Apr 2023 at 18:07, Sebastien Rosset <sr...@gmail.com> wrote:
> >
> > > Would it make sense to use Apache Calcite as a security mediator? Has
> > > calcite already been used for that purpose? The mediator would parse
> > > untrusted SQL queries, reject queries other than SELECT (including any
> > > sub-queries), and inject multi-tenancy WHERE filters in every SELECT
> > query,
> > > including sub-queries?
> > >
> > > More specifically, consider a Druid database which is configured to
> > > implement multi-tenancy with shared datasources:
> > >
> > >
> >
> https://druid.apache.org/docs/latest/querying/multitenancy.html#partitioning-shared-datasources
> > > .
> > > Every Druid datasource would have a "tenant_id" attribute that can be
> > used
> > > in a Druid-SQL WHERE clause.
> > >
> > > Thank you. Sebastien
> > >
> >
>

Re: Using calcite to parse untrusted SQL queries and add implicit filters

Posted by Sebastien Rosset <sr...@gmail.com>.
Thank you for the quick response. I am new to Calcite, it's good to hear
there might be a possibility. I will investigate. Regarding the tenant_id,
if the input SQL statement is:
SELECT a, b, c
FROM datasource
WHERE input_expression

Then the tool should add a "security filter" as shown below:

SELECT a, b, c
FROM datasource
WHERE tenant_id = 'abcd123' AND input_expression


On Sun, Apr 2, 2023 at 9:56 AM Askar Bozcan <as...@gmail.com> wrote:

> Hello Sebastien,
> I'd say it's quite feasible for this purpose by having Calcite push-down
> pre-processed queries to Druid and being a kind of a "front" to underlying
> Druid DB.
>
> Regarding rejecting queries other than SELECT:
> 1) Parse the query and get the SqlNode representing the root of the syntax
> tree.
> 2) Extend SqlShuttle
> <
> https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/util/SqlShuttle.html
> >
> to check & manipulate the syntax tree, returning null for rejected nodes &
> recursively iterating for SELECT sub-queries.
> Note that you can use SqlSelect's getFrom, getGroup, etc to get all of the
> possible nodes which can contain subqueries. Also note that SqlSelect
> itself is a subtype of SqlCall.
>
> Regarding "tenant_id" in WHERE case, I'm not too sure as I'm neither
> familiar with Druid nor with Druid adapter in Calcite. Regardless, it
> should be doable on the adapter level AFAIK.
>
> Hope that was helpful!
>
> Regards,
> Askar Bozcan
>
> On Sun, 2 Apr 2023 at 18:07, Sebastien Rosset <sr...@gmail.com> wrote:
>
> > Would it make sense to use Apache Calcite as a security mediator? Has
> > calcite already been used for that purpose? The mediator would parse
> > untrusted SQL queries, reject queries other than SELECT (including any
> > sub-queries), and inject multi-tenancy WHERE filters in every SELECT
> query,
> > including sub-queries?
> >
> > More specifically, consider a Druid database which is configured to
> > implement multi-tenancy with shared datasources:
> >
> >
> https://druid.apache.org/docs/latest/querying/multitenancy.html#partitioning-shared-datasources
> > .
> > Every Druid datasource would have a "tenant_id" attribute that can be
> used
> > in a Druid-SQL WHERE clause.
> >
> > Thank you. Sebastien
> >
>

Re: Using calcite to parse untrusted SQL queries and add implicit filters

Posted by Askar Bozcan <as...@gmail.com>.
Hello Sebastien,
I'd say it's quite feasible for this purpose by having Calcite push-down
pre-processed queries to Druid and being a kind of a "front" to underlying
Druid DB.

Regarding rejecting queries other than SELECT:
1) Parse the query and get the SqlNode representing the root of the syntax
tree.
2) Extend SqlShuttle
<https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/util/SqlShuttle.html>
to check & manipulate the syntax tree, returning null for rejected nodes &
recursively iterating for SELECT sub-queries.
Note that you can use SqlSelect's getFrom, getGroup, etc to get all of the
possible nodes which can contain subqueries. Also note that SqlSelect
itself is a subtype of SqlCall.

Regarding "tenant_id" in WHERE case, I'm not too sure as I'm neither
familiar with Druid nor with Druid adapter in Calcite. Regardless, it
should be doable on the adapter level AFAIK.

Hope that was helpful!

Regards,
Askar Bozcan

On Sun, 2 Apr 2023 at 18:07, Sebastien Rosset <sr...@gmail.com> wrote:

> Would it make sense to use Apache Calcite as a security mediator? Has
> calcite already been used for that purpose? The mediator would parse
> untrusted SQL queries, reject queries other than SELECT (including any
> sub-queries), and inject multi-tenancy WHERE filters in every SELECT query,
> including sub-queries?
>
> More specifically, consider a Druid database which is configured to
> implement multi-tenancy with shared datasources:
>
> https://druid.apache.org/docs/latest/querying/multitenancy.html#partitioning-shared-datasources
> .
> Every Druid datasource would have a "tenant_id" attribute that can be used
> in a Druid-SQL WHERE clause.
>
> Thank you. Sebastien
>