You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Lisen Mu <im...@gmail.com> on 2013/05/07 19:02:42 UTC
about mysql se
Hi Jacques,
If you got time, would you like to check the test case at:
git@github.com:immars/incubator-drill.git
branch mysqlse-wip
which forked from your execwork and fixed some protoc problem in pom.
There is a test at MySQLSETest, which only parses physical plans at:
mysql_plan.json
mysql_plan2.json
Am I representing push down correctly, with .json and MySQLScanPOP? It's
all in ops field.
Many thanks...
//mysql_plan2.json
{
head:{
type:"APACHE_DRILL_PHYSICAL",
version:"1",
generator:{
type:"manual",
info:"na"
}
},
graph:[
{
@id:1,
pop:"scan-mysql",
entries:[
{id:0, location: "jdbc:mysql://localhost:3306/drilltest", tables:
["employee","department"], parent: 1},
{id:1, location: "jdbc:mysql://localhost:3306/drilltest2", tables:
["employee","department"], parent: 1}
],
output: [ //todo: map column to outputs?
{mode: "VECTOR", type: "INT32"}, //field 0
{mode: "VECTOR", type: "BYTES"}, //field 1
{mode: "VECTOR", type: "BYTES"} //field 2
],
ops:[
{
@id:5,
child:1,
pop:"filter",
expr: "id==16"
},
{
@id: 6,
pop: "join",
type: "INNER",
left: 5,
right: 1,
conditions:[{
relationship: "==",
left: "employee.deptId",
right: "department.id"
}]
},
{
@id: 4,
child: 1,
pop:"project",
exprs: [
{ ref: "output.dept_name", expr: "department.name"},
{ ref: "output.guy_name", expr: "employee.name"},
{ ref: "output.guy_salary", expr: "employee.salary"}
]
}
]
},
{
@id:10,
child:1,
pop:"log-store"
}
]
}
Re: about mysql se
Posted by Lisen Mu <im...@gmail.com>.
Jacques,
Thanks for the reply!
So POP -> sql conversion happens in optimization rule implementation, and
ScanPOP deals with sql query with much less effort.
Well, as I'm not quite clear about how optimization works, I think I would
wait for a test case with optimizer...
Thanks,
Lisen
On Mon, May 13, 2013 at 12:03 PM, Jacques Nadeau <ja...@apache.org> wrote:
> Generally, inside a particular storage engine, it is the storage
> engine's optimization rules which decide how to represent the what
> happens with optimization. So while what you have is reasonable, I
> would actually guess that you'd be more likely to have something where
> all the ops simply turn into a sql query within the plan (as opposed
> to a complicated set of operations). For an example, I'll use HBase.
> In that case, HBase has the concept of filters and column family and
> column qualifiers. The optimization rules that convert a filter would
> be defined to transform the physical filter operator into a an HBase
> filter. The HBase filter would be something that could be easily
> passed to HBase.
>
> Does that make sense?
>
> Thanks,
> Jacques
>
>
> On Tue, May 7, 2013 at 10:02 AM, Lisen Mu <im...@gmail.com> wrote:
> > Hi Jacques,
> >
> > If you got time, would you like to check the test case at:
> >
> > git@github.com:immars/incubator-drill.git
> >
> > branch mysqlse-wip
> >
> > which forked from your execwork and fixed some protoc problem in pom.
> >
> > There is a test at MySQLSETest, which only parses physical plans at:
> >
> > mysql_plan.json
> > mysql_plan2.json
> >
> > Am I representing push down correctly, with .json and MySQLScanPOP? It's
> > all in ops field.
> >
> > Many thanks...
> >
> > //mysql_plan2.json
> > {
> > head:{
> > type:"APACHE_DRILL_PHYSICAL",
> > version:"1",
> > generator:{
> > type:"manual",
> > info:"na"
> > }
> > },
> > graph:[
> > {
> > @id:1,
> > pop:"scan-mysql",
> > entries:[
> > {id:0, location: "jdbc:mysql://localhost:3306/drilltest", tables:
> > ["employee","department"], parent: 1},
> > {id:1, location: "jdbc:mysql://localhost:3306/drilltest2",
> tables:
> > ["employee","department"], parent: 1}
> > ],
> > output: [ //todo: map column to outputs?
> > {mode: "VECTOR", type: "INT32"}, //field 0
> > {mode: "VECTOR", type: "BYTES"}, //field 1
> > {mode: "VECTOR", type: "BYTES"} //field 2
> > ],
> > ops:[
> >
> > {
> > @id:5,
> > child:1,
> > pop:"filter",
> > expr: "id==16"
> > },
> > {
> > @id: 6,
> > pop: "join",
> > type: "INNER",
> > left: 5,
> > right: 1,
> > conditions:[{
> > relationship: "==",
> > left: "employee.deptId",
> > right: "department.id"
> > }]
> > },
> > {
> > @id: 4,
> > child: 1,
> > pop:"project",
> > exprs: [
> > { ref: "output.dept_name", expr: "department.name"},
> > { ref: "output.guy_name", expr: "employee.name"},
> > { ref: "output.guy_salary", expr: "employee.salary"}
> > ]
> > }
> > ]
> > },
> > {
> > @id:10,
> > child:1,
> > pop:"log-store"
> > }
> > ]
> > }
>
Re: about mysql se
Posted by Jacques Nadeau <ja...@apache.org>.
Generally, inside a particular storage engine, it is the storage
engine's optimization rules which decide how to represent the what
happens with optimization. So while what you have is reasonable, I
would actually guess that you'd be more likely to have something where
all the ops simply turn into a sql query within the plan (as opposed
to a complicated set of operations). For an example, I'll use HBase.
In that case, HBase has the concept of filters and column family and
column qualifiers. The optimization rules that convert a filter would
be defined to transform the physical filter operator into a an HBase
filter. The HBase filter would be something that could be easily
passed to HBase.
Does that make sense?
Thanks,
Jacques
On Tue, May 7, 2013 at 10:02 AM, Lisen Mu <im...@gmail.com> wrote:
> Hi Jacques,
>
> If you got time, would you like to check the test case at:
>
> git@github.com:immars/incubator-drill.git
>
> branch mysqlse-wip
>
> which forked from your execwork and fixed some protoc problem in pom.
>
> There is a test at MySQLSETest, which only parses physical plans at:
>
> mysql_plan.json
> mysql_plan2.json
>
> Am I representing push down correctly, with .json and MySQLScanPOP? It's
> all in ops field.
>
> Many thanks...
>
> //mysql_plan2.json
> {
> head:{
> type:"APACHE_DRILL_PHYSICAL",
> version:"1",
> generator:{
> type:"manual",
> info:"na"
> }
> },
> graph:[
> {
> @id:1,
> pop:"scan-mysql",
> entries:[
> {id:0, location: "jdbc:mysql://localhost:3306/drilltest", tables:
> ["employee","department"], parent: 1},
> {id:1, location: "jdbc:mysql://localhost:3306/drilltest2", tables:
> ["employee","department"], parent: 1}
> ],
> output: [ //todo: map column to outputs?
> {mode: "VECTOR", type: "INT32"}, //field 0
> {mode: "VECTOR", type: "BYTES"}, //field 1
> {mode: "VECTOR", type: "BYTES"} //field 2
> ],
> ops:[
>
> {
> @id:5,
> child:1,
> pop:"filter",
> expr: "id==16"
> },
> {
> @id: 6,
> pop: "join",
> type: "INNER",
> left: 5,
> right: 1,
> conditions:[{
> relationship: "==",
> left: "employee.deptId",
> right: "department.id"
> }]
> },
> {
> @id: 4,
> child: 1,
> pop:"project",
> exprs: [
> { ref: "output.dept_name", expr: "department.name"},
> { ref: "output.guy_name", expr: "employee.name"},
> { ref: "output.guy_salary", expr: "employee.salary"}
> ]
> }
> ]
> },
> {
> @id:10,
> child:1,
> pop:"log-store"
> }
> ]
> }