You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@sqoop.apache.org by James Warren <ja...@stanfordalumni.org> on 2012/01/05 01:45:06 UTC

does sqoop support SQL_BUFFER_RESULT to help with long running queries?

Greetings -

Was wondering / hoping if it were possible to configure sqoop to use
SQL_BUFFER_RESULT in the generated select queries.  From my understanding
sqoop uses READ_COMMITTED as the isolation level, but I'd ideally like to
release the write locks as soon as possible to help alleviate any
replication lag.

Many thanks,
-jw

Re: does sqoop support SQL_BUFFER_RESULT to help with long running queries?

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
You're welcome sir. I'm glad that sqoop is working for you as you need after all :-)

Jarcec

On Sun, Jan 08, 2012 at 01:20:10AM -0800, James Warren wrote:
> Finally got around to testing this earlier today and it ran smooth as silk.
>  I really should browse the JIRAs more often.  Thanks for the advice (and
> patch), Jarcec!
> 
> cheers,
> -James
> 
> On Thu, Jan 5, 2012 at 11:20 PM, Jarek Jarcec Cecho <ja...@apache.org>wrote:
> 
> > Hi James,
> > you can bypass this particular problem by specifying your own boundary
> > query using --boundary-query command line parameter.
> >
> > Jarcec
> >
> > On Thu, Jan 05, 2012 at 06:35:12PM -0800, James Warren wrote:
> > > Hi Arvind -
> > >
> > > Thanks for the suggestions! ?I tried the first option but it
> > > fortunately failed. ?From the logs:
> > >
> > > 12/01/05 18:24:56 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
> > > SELECT MIN(id), MAX(id) FROM (SELECT SQL_BUFFER_RESULT id ......
> > >
> > > and SQL_BUFFER_RESULT apparently only works for top-level select
> > > queries. ?Nice idea, though.
> > >
> > > Looks like I'm left with option #2. ?More work, but I like getting my
> > > hands dirty.
> > >
> > > cheers,
> > > -James
> > >
> > > On Thu, Jan 5, 2012 at 10:08 AM, Arvind Prabhakar <ar...@apache.org>
> > wrote:
> > > >
> > > > Hi James,
> > > >
> > > > This has not been tested before - but you could possibly use the
> > > > --query option to supply your own query that contains the necessary
> > > > optimization hints. If you find that not working, or limiting in some
> > > > way - you could also try extending the
> > > > org.apache.sqoop.manager.MySQLManager implementation to create your
> > > > own connector that does understand such optimizations.
> > > >
> > > > Thanks,
> > > > Arvind
> > > >
> > > > On Wed, Jan 4, 2012 at 4:45 PM, James Warren
> > > > <ja...@stanfordalumni.org> wrote:
> > > > > Greetings -
> > > > >
> > > > > Was wondering / hoping if it were possible to configure sqoop to use
> > > > > SQL_BUFFER_RESULT in the generated select queries. ?From my
> > understanding
> > > > > sqoop uses READ_COMMITTED as the isolation level, but I'd ideally
> > like to
> > > > > release the write locks as soon as possible to help alleviate any
> > > > > replication lag.
> > > > >
> > > > > Many thanks,
> > > > > -jw
> >

Re: does sqoop support SQL_BUFFER_RESULT to help with long running queries?

Posted by James Warren <ja...@stanfordalumni.org>.
Finally got around to testing this earlier today and it ran smooth as silk.
 I really should browse the JIRAs more often.  Thanks for the advice (and
patch), Jarcec!

cheers,
-James

On Thu, Jan 5, 2012 at 11:20 PM, Jarek Jarcec Cecho <ja...@apache.org>wrote:

> Hi James,
> you can bypass this particular problem by specifying your own boundary
> query using --boundary-query command line parameter.
>
> Jarcec
>
> On Thu, Jan 05, 2012 at 06:35:12PM -0800, James Warren wrote:
> > Hi Arvind -
> >
> > Thanks for the suggestions! ?I tried the first option but it
> > fortunately failed. ?From the logs:
> >
> > 12/01/05 18:24:56 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
> > SELECT MIN(id), MAX(id) FROM (SELECT SQL_BUFFER_RESULT id ......
> >
> > and SQL_BUFFER_RESULT apparently only works for top-level select
> > queries. ?Nice idea, though.
> >
> > Looks like I'm left with option #2. ?More work, but I like getting my
> > hands dirty.
> >
> > cheers,
> > -James
> >
> > On Thu, Jan 5, 2012 at 10:08 AM, Arvind Prabhakar <ar...@apache.org>
> wrote:
> > >
> > > Hi James,
> > >
> > > This has not been tested before - but you could possibly use the
> > > --query option to supply your own query that contains the necessary
> > > optimization hints. If you find that not working, or limiting in some
> > > way - you could also try extending the
> > > org.apache.sqoop.manager.MySQLManager implementation to create your
> > > own connector that does understand such optimizations.
> > >
> > > Thanks,
> > > Arvind
> > >
> > > On Wed, Jan 4, 2012 at 4:45 PM, James Warren
> > > <ja...@stanfordalumni.org> wrote:
> > > > Greetings -
> > > >
> > > > Was wondering / hoping if it were possible to configure sqoop to use
> > > > SQL_BUFFER_RESULT in the generated select queries. ?From my
> understanding
> > > > sqoop uses READ_COMMITTED as the isolation level, but I'd ideally
> like to
> > > > release the write locks as soon as possible to help alleviate any
> > > > replication lag.
> > > >
> > > > Many thanks,
> > > > -jw
>

Re: does sqoop support SQL_BUFFER_RESULT to help with long running queries?

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi James,
you can bypass this particular problem by specifying your own boundary query using --boundary-query command line parameter.

Jarcec

On Thu, Jan 05, 2012 at 06:35:12PM -0800, James Warren wrote:
> Hi Arvind -
> 
> Thanks for the suggestions! ?I tried the first option but it
> fortunately failed. ?From the logs:
> 
> 12/01/05 18:24:56 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
> SELECT MIN(id), MAX(id) FROM (SELECT SQL_BUFFER_RESULT id ......
> 
> and SQL_BUFFER_RESULT apparently only works for top-level select
> queries. ?Nice idea, though.
> 
> Looks like I'm left with option #2. ?More work, but I like getting my
> hands dirty.
> 
> cheers,
> -James
> 
> On Thu, Jan 5, 2012 at 10:08 AM, Arvind Prabhakar <ar...@apache.org> wrote:
> >
> > Hi James,
> >
> > This has not been tested before - but you could possibly use the
> > --query option to supply your own query that contains the necessary
> > optimization hints. If you find that not working, or limiting in some
> > way - you could also try extending the
> > org.apache.sqoop.manager.MySQLManager implementation to create your
> > own connector that does understand such optimizations.
> >
> > Thanks,
> > Arvind
> >
> > On Wed, Jan 4, 2012 at 4:45 PM, James Warren
> > <ja...@stanfordalumni.org> wrote:
> > > Greetings -
> > >
> > > Was wondering / hoping if it were possible to configure sqoop to use
> > > SQL_BUFFER_RESULT in the generated select queries. ?From my understanding
> > > sqoop uses READ_COMMITTED as the isolation level, but I'd ideally like to
> > > release the write locks as soon as possible to help alleviate any
> > > replication lag.
> > >
> > > Many thanks,
> > > -jw

Re: does sqoop support SQL_BUFFER_RESULT to help with long running queries?

Posted by James Warren <ja...@stanfordalumni.org>.
Hi Arvind -

Thanks for the suggestions!  I tried the first option but it
fortunately failed.  From the logs:

12/01/05 18:24:56 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
SELECT MIN(id), MAX(id) FROM (SELECT SQL_BUFFER_RESULT id ......

and SQL_BUFFER_RESULT apparently only works for top-level select
queries.  Nice idea, though.

Looks like I'm left with option #2.  More work, but I like getting my
hands dirty.

cheers,
-James

On Thu, Jan 5, 2012 at 10:08 AM, Arvind Prabhakar <ar...@apache.org> wrote:
>
> Hi James,
>
> This has not been tested before - but you could possibly use the
> --query option to supply your own query that contains the necessary
> optimization hints. If you find that not working, or limiting in some
> way - you could also try extending the
> org.apache.sqoop.manager.MySQLManager implementation to create your
> own connector that does understand such optimizations.
>
> Thanks,
> Arvind
>
> On Wed, Jan 4, 2012 at 4:45 PM, James Warren
> <ja...@stanfordalumni.org> wrote:
> > Greetings -
> >
> > Was wondering / hoping if it were possible to configure sqoop to use
> > SQL_BUFFER_RESULT in the generated select queries.  From my understanding
> > sqoop uses READ_COMMITTED as the isolation level, but I'd ideally like to
> > release the write locks as soon as possible to help alleviate any
> > replication lag.
> >
> > Many thanks,
> > -jw

Re: does sqoop support SQL_BUFFER_RESULT to help with long running queries?

Posted by Arvind Prabhakar <ar...@apache.org>.
Hi James,

This has not been tested before - but you could possibly use the
--query option to supply your own query that contains the necessary
optimization hints. If you find that not working, or limiting in some
way - you could also try extending the
org.apache.sqoop.manager.MySQLManager implementation to create your
own connector that does understand such optimizations.

Thanks,
Arvind

On Wed, Jan 4, 2012 at 4:45 PM, James Warren
<ja...@stanfordalumni.org> wrote:
> Greetings -
>
> Was wondering / hoping if it were possible to configure sqoop to use
> SQL_BUFFER_RESULT in the generated select queries.  From my understanding
> sqoop uses READ_COMMITTED as the isolation level, but I'd ideally like to
> release the write locks as soon as possible to help alleviate any
> replication lag.
>
> Many thanks,
> -jw