Posted to issues@spark.apache.org by "Maciej Bryński (JIRA)" <ji...@apache.org> on 2017/07/03 14:00:01 UTC
[jira] [Comment Edited] (SPARK-21287) Cannot use Int.MIN_VALUE as Spark SQL fetchsize

    [ https://issues.apache.org/jira/browse/SPARK-21287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072474#comment-16072474 ] 

Maciej Bryński edited comment on SPARK-21287 at 7/3/17 1:59 PM:
----------------------------------------------------------------

Quote
{quote}
By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate and, due to the design of the MySQL network protocol, is easier to implement. If you are working with ResultSets that have a large number of rows or large values and cannot allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time.
{quote}
https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html


was (Author: maver1ck):
Quote
{code}
By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate and, due to the design of the MySQL network protocol, is easier to implement. If you are working with ResultSets that have a large number of rows or large values and cannot allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time.
{code}
https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html

> Cannot use Int.MIN_VALUE as Spark SQL fetchsize
> -----------------------------------------------
>
>                 Key: SPARK-21287
>                 URL: https://issues.apache.org/jira/browse/SPARK-21287
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.1
>            Reporter: Maciej Bryński
>
> The MySQL JDBC driver can stream a ResultSet instead of holding it entirely in memory.
> This is enabled by setting fetchSize to Int.MIN_VALUE (Integer.MIN_VALUE in Java, i.e. -2147483648).
> Unfortunately, Spark rejects this value as invalid:
> {code}
> java.lang.IllegalArgumentException: requirement failed: Invalid value `-2147483648` for parameter `fetchsize`. The minimum value is 0. When the value is 0, the JDBC driver ignores the value and does the estimates.
> 	at scala.Predef$.require(Predef.scala:224)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:105)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:34)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
> 	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
> 	at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:166)
> 	at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:206)
> 	at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> 	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
> 	at py4j.Gateway.invoke(Gateway.java:280)
> 	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> 	at py4j.commands.CallCommand.execute(CallCommand.java:79)
> 	at py4j.GatewayConnection.run(GatewayConnection.java:214)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}
> https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html
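
The rejection in the stack trace above comes from a lower-bound check on the `fetchsize` option. The sketch below is a simplified Python model of that check, written only from the error message. It is not Spark's actual source, and the names `validate_fetchsize` and `INT_MIN` are hypothetical. It shows why the MySQL streaming sentinel can never pass a "minimum value is 0" rule.

```python
# Simplified model of the fetchsize validation implied by the error message.
# Assumption: Spark parses the option as an integer and requires it to be >= 0.

INT_MIN = -2**31  # same numeric value as Java's Integer.MIN_VALUE


def validate_fetchsize(raw: str) -> int:
    """Return the parsed fetch size, or raise if it is negative."""
    value = int(raw)
    if value < 0:
        raise ValueError(
            f"requirement failed: Invalid value `{value}` for parameter "
            "`fetchsize`. The minimum value is 0."
        )
    return value


if __name__ == "__main__":
    # 0 and positive sizes pass; the MySQL streaming sentinel does not.
    for candidate in ("0", "1000", str(INT_MIN)):
        try:
            print(candidate, "accepted as", validate_fetchsize(candidate))
        except ValueError as err:
            print(candidate, "rejected:", err)
```

In PySpark, the failing call would presumably be something like `spark.read.jdbc(url, table, properties={"fetchsize": "-2147483648"})`, with `url` and `table` as placeholders. The exact invocation is not given in the report.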


