Posted to dev@sqoop.apache.org by Zoltán Tóth-Czifra <gp...@vipmail.hu> on 2012/11/05 19:25:01 UTC

Review Request: SQOOP-683 Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7880/
-----------------------------------------------------------

Review request for Sqoop.


Description
-------

Code review for SQOOP-683, see https://issues.apache.org/jira/browse/SQOOP-683.


Diffs
-----

  src/docs/user/compatibility.txt 3576fd7 

Diff: https://reviews.apache.org/r/7880/diff/


Testing
-------

Converted to XML with asciidoc, the affected part:

<simpara>Sometimes you need to export large data with Sqoop to a live MySQL cluster that
is under a high load serving random queries from the users of our product.
While data consistency issues during the export can be easily solved with a
staging table, there is still a problem: the performance impact caused by the
heavy export.</simpara>
<simpara>First off, the resources of MySQL dedicated to the import process can affect
the performance of the live product, both on the master and on the slaves.
Second, even if the servers can handle the import with no significant
performance impact (mysqlimport should be relatively "cheap"), importing big
tables can cause serious replication lag in the cluster risking data
inconsistency.</simpara>
<simpara>With <literal>-D sqoop.mysql.export.sleep.ms=time</literal>, where <emphasis>time</emphasis> is a value in
milliseconds, you can let the server relax between checkpoints and the replicas
catch up by pausing the export process after transferring the number of bytes
specified in <literal>sqoop.mysql.export.checkpoint.bytes</literal>. Experiment with different
settings of these two parameters to achieve an export pace that doesn&#8217;t
endanger the stability of your MySQL cluster.</simpara>
<important><simpara>Note that any arguments to Sqoop that are of the form <literal>-D
parameter=value</literal> are Hadoop <emphasis>generic arguments</emphasis> and must appear before
any tool-specific arguments (for example, <literal>--connect</literal>, <literal>--table</literal>, etc.).
Don&#8217;t forget that these parameters only work with the <literal>--direct</literal> flag set.</simpara></important>
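
To make the argument ordering and the effect of the two parameters concrete, here is a minimal sketch in Python. It is not part of the patch. The host, database, table, and HDFS path are hypothetical, the numeric values are illustrative only, and the helper names are invented for this example. The rate estimate assumes each export task pauses independently and ignores the time spent transferring data, so it is only a rough upper bound.

```python
def build_export_command(connect, table, export_dir, checkpoint_bytes, sleep_ms):
    """Return the argument list for a throttled direct MySQL export."""
    # Hadoop generic arguments (-D parameter=value) come first.
    generic = [
        "-D", f"sqoop.mysql.export.checkpoint.bytes={checkpoint_bytes}",
        "-D", f"sqoop.mysql.export.sleep.ms={sleep_ms}",
    ]
    # Tool-specific arguments follow; --direct is needed for the
    # throttling parameters to have any effect.
    tool_specific = [
        "--connect", connect,
        "--table", table,
        "--export-dir", export_dir,
        "--direct",
    ]
    return ["sqoop", "export"] + generic + tool_specific


def max_bytes_per_second(checkpoint_bytes, sleep_ms):
    """Rough upper bound on one task's average rate, ignoring transfer time."""
    if sleep_ms <= 0:
        raise ValueError("sleep_ms must be positive to compute a bound")
    return checkpoint_bytes / (sleep_ms / 1000.0)


if __name__ == "__main__":
    # Hypothetical connection string, table, and export directory.
    cmd = build_export_command(
        "jdbc:mysql://db.example.com/shop", "orders", "/user/etl/orders",
        checkpoint_bytes=10_000_000, sleep_ms=500,
    )
    print(" ".join(cmd))
    print(max_bytes_per_second(10_000_000, 500))
```

With these illustrative values, a task that pauses 500 ms after every 10,000,000 bytes cannot average more than 20,000,000 bytes per second. The real pace will be lower, because the transfer itself takes time.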


Thanks,

Zoltán Tóth-Czifra


Re: Review Request: SQOOP-683 Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7880/#review13251
-----------------------------------------------------------


Hi Zoltan,
thank you very much for your effort to make Sqoop better! I have just a few nits:


src/docs/user/compatibility.txt
<https://reviews.apache.org/r/7880/#comment28492>

    Would you mind dropping "of your product" and rephrasing the sentence to something like "from other users"?



src/docs/user/compatibility.txt
<https://reviews.apache.org/r/7880/#comment28493>

    Would you mind replacing the ":" with "with"? E.g., "there is still a problem with the performance impact".



src/docs/user/compatibility.txt
<https://reviews.apache.org/r/7880/#comment28494>

    Would you mind replacing "work" with "are supported", giving something like "these parameters are supported only with the"?


Jarcec

- Jarek Cecho




Re: Review Request: SQOOP-683 Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7880/#review13260
-----------------------------------------------------------

Ship it!


Thank you for your changes, Zoltan. Please upload your patch to the JIRA (as a file) and I'll commit it.

Jarcec

- Jarek Cecho




Re: Review Request: SQOOP-683 Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports

Posted by Zoltán Tóth-Czifra <gp...@vipmail.hu>.

(Updated Nov. 8, 2012, 6:35 p.m.)


Review request for Sqoop.


Changes
-------

One more patch, missing R in YOUR


Description
-------

Code review for SQOOP-683, see https://issues.apache.org/jira/browse/SQOOP-683.


Diffs (updated)
-----

  src/docs/user/compatibility.txt 3576fd7 

Diff: https://reviews.apache.org/r/7880/diff/




Thanks,

Zoltán Tóth-Czifra


Re: Review Request: SQOOP-683 Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports

Posted by Zoltán Tóth-Czifra <gp...@vipmail.hu>.

(Updated Nov. 8, 2012, 6:33 p.m.)


Review request for Sqoop.


Changes
-------

Thank you for the suggestions! All of them make sense to me, see new patch :)


Description
-------

Code review for SQOOP-683, see https://issues.apache.org/jira/browse/SQOOP-683.


Diffs (updated)
-----

  src/docs/user/compatibility.txt 3576fd7 

Diff: https://reviews.apache.org/r/7880/diff/




Thanks,

Zoltán Tóth-Czifra