Posted to user@pig.apache.org by Russell Jurney <ru...@gmail.com> on 2012/04/28 09:22:01 UTC

DBStorage to LOAD data via a SQL query?

Is it possible to use DBStorage to load data from MySQL by running a
supplied SQL query? Something like:

mydata = LOAD 'jdbc://localhost/enron' USING DBStorage('SELECT foo.value1,
bar.value2 FROM foo JOIN bar on foo.bar_id = bar.id');


Even if I have to LOAD AS and specify a schema, that would be great.

It is problematic that there are no docs for DBStorage. If someone clues me
in, I'll write it up :)

-- 
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com

Re: DBStorage to LOAD data via a SQL query?

Posted by Russell Jurney <ru...@gmail.com>.
Thanks, I'll take a look.

http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.19/src/mapred/org/apache/hadoop/mapred/lib/db/DBInputFormat.java

On Sun, Apr 29, 2012 at 12:53 PM, Prashant Kommireddi
<pr...@gmail.com> wrote:

> Hadoop does it with DBInputFormat, you could take a look at that.

Re: DBStorage to LOAD data via a SQL query?

Posted by Prashant Kommireddi <pr...@gmail.com>.
Hadoop does it with DBInputFormat, you could take a look at that.

On Sun, Apr 29, 2012 at 12:21 PM, Russell Jurney
<ru...@gmail.com> wrote:

> Mightn't it be easy to write the SELECT part?  Not sure if the JDBC stuff
> is convenient that way.
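For readers following the DBInputFormat suggestion: the rough idea, as far as the linked source goes, is that a count query sizes the input and each map task then reads one contiguous page of the SELECT. The sketch below only illustrates that paging idea in plain Java. It does not use Hadoop or Pig, and the class name SplitQuerySketch, the method splitQueries, and the foo.id ordering column are hypothetical names introduced here. Check the real behaviour against the DBInputFormat source before relying on it.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitQuerySketch {
    /**
     * Divides rowCount rows into numSplits contiguous ranges and returns one
     * paged SELECT per range. The last range absorbs any remainder.
     */
    static List<String> splitQueries(String baseQuery, long rowCount, int numSplits) {
        if (numSplits < 1) {
            throw new IllegalArgumentException("numSplits must be >= 1");
        }
        List<String> queries = new ArrayList<>();
        long chunk = rowCount / numSplits;
        for (int i = 0; i < numSplits; i++) {
            long start = i * chunk;
            long length = (i == numSplits - 1) ? rowCount - start : chunk;
            // MySQL-style paging; other databases may need different syntax.
            queries.add(baseQuery + " LIMIT " + length + " OFFSET " + start);
        }
        return queries;
    }

    public static void main(String[] args) {
        // foo.id is an assumed column; a stable ORDER BY keeps pages from overlapping.
        String query = "SELECT foo.value1, bar.value2 FROM foo"
                + " JOIN bar ON foo.bar_id = bar.id ORDER BY foo.id";
        for (String q : splitQueries(query, 10, 3)) {
            System.out.println(q);
        }
    }
}
```

A Pig loader built on this idea would still need to turn each result row into a tuple and declare a schema, which is the "LOAD AS" part of the original question.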

Re: DBStorage to LOAD data via a SQL query?

Posted by Russell Jurney <ru...@gmail.com>.
Mightn't it be easy to write the SELECT part?  Not sure if the JDBC stuff
is convenient that way.

On Sun, Apr 29, 2012 at 12:18 PM, Prashant Kommireddi
<pr...@gmail.com> wrote:

> Ah I see what you are saying. You are right, I missed the SELECT part
> entirely.

Re: DBStorage to LOAD data via a SQL query?

Posted by Prashant Kommireddi <pr...@gmail.com>.
Ah I see what you are saying. You are right, I missed the SELECT part
entirely.

On Sun, Apr 29, 2012 at 12:09 PM, Russell Jurney
<ru...@gmail.com> wrote:

> Prashant, it has an INSERT query, but no SELECT query.  It does not
> implement getNext(), so it looks like it is STORE only, not LOAD.  Am I
> mistaken?  I read the source, but it was late :)

Re: DBStorage to LOAD data via a SQL query?

Posted by Russell Jurney <ru...@gmail.com>.
Prashant, it has an INSERT query, but no SELECT query.  It does not
implement getNext(), so it looks like it is STORE only, not LOAD.  Am I
mistaken?  I read the source, but it was late :)

On Sun, Apr 29, 2012 at 12:04 PM, Prashant Kommireddi
<pr...@gmail.com> wrote:

> Russell,
>
> Looking at source code for DBStorage, seems like it does exactly that. Can
> you try it out?
>
> public DBStorage(String driver, String jdbcURL, String user, String pass,
>      String insertQuery, String batchSize)
>
> Thanks,
> Prashant

Re: DBStorage to LOAD data via a SQL query?

Posted by Prashant Kommireddi <pr...@gmail.com>.
Russell,

Looking at source code for DBStorage, seems like it does exactly that. Can
you try it out?

public DBStorage(String driver, String jdbcURL, String user, String pass,
      String insertQuery, String batchSize)

Thanks,
Prashant



On Sat, Apr 28, 2012 at 12:22 AM, Russell Jurney
<ru...@gmail.com> wrote:

> Is it possible to use DBStorage to load data from MySQL by running a
> supplied SQL query? Something like:
>
> mydata = LOAD 'jdbc://localhost/enron' USING DBStorage('SELECT foo.value1,
> bar.value2 FROM foo JOIN bar on foo.bar_id = bar.id');
>
>
> Even if I have to LOAD AS and specify a schema, that would be great.
>
> It is problematic that there are no docs for DBStorage. If someone clues me
> in, I'll write it up :)
>
> --
> Russell Jurney twitter.com/rjurney russell.jurney@gmail.com
> datasyndrome.com
>
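The constructor quoted in this message takes an insertQuery and a batchSize, which suggests the write path: bind each record to the '?' placeholders of a prepared INSERT and send rows to the database in batches. The following is a minimal sketch of that pattern using only standard JDBC interfaces. It is not taken from the DBStorage source. The names BatchedInsertSketch, storeRows, recordingStatement, and simulate are hypothetical, and the recording proxy stands in for a real database connection so the example runs without one.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class BatchedInsertSketch {
    /**
     * Binds each row to the '?' placeholders of an already-prepared INSERT
     * and sends the accumulated rows every batchSize rows.
     * Returns the number of executeBatch() calls made.
     */
    static int storeRows(PreparedStatement insert, List<Object[]> rows, int batchSize)
            throws SQLException {
        int pending = 0;
        int flushes = 0;
        for (Object[] row : rows) {
            for (int i = 0; i < row.length; i++) {
                insert.setObject(i + 1, row[i]); // JDBC parameter indexes start at 1
            }
            insert.addBatch();
            if (++pending == batchSize) {
                insert.executeBatch();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) { // flush the final partial batch
            insert.executeBatch();
            flushes++;
        }
        return flushes;
    }

    /** Hypothetical stand-in for a real statement: records method names only. */
    static PreparedStatement recordingStatement(List<String> calls) {
        InvocationHandler handler = (proxy, method, args) -> {
            calls.add(method.getName());
            return method.getName().equals("executeBatch") ? new int[0] : null;
        };
        return (PreparedStatement) Proxy.newProxyInstance(
                BatchedInsertSketch.class.getClassLoader(),
                new Class<?>[] {PreparedStatement.class},
                handler);
    }

    /** Runs storeRows against the recording stand-in and returns the recorded calls. */
    static List<String> simulate(int rowCount, int batchSize) {
        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            rows.add(new Object[] {"value1-" + i, "value2-" + i});
        }
        List<String> calls = new ArrayList<>();
        try {
            storeRows(recordingStatement(calls), rows, batchSize);
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(simulate(5, 2));
    }
}
```

With a real database, the statement would come from Connection.prepareStatement with an INSERT such as the insertQuery argument. Nothing in this path issues a SELECT, which is consistent with the later observation in the thread that DBStorage looks STORE-only.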