Posted to user@hive.apache.org by Christian Schneider <cs...@gmail.com> on 2013/07/03 13:59:12 UTC

Re: Fetching Results from Hive Select (JDBC ResultSet.next() vs HiveClient.fetchN())

Hi, I browsed through the sources and found a way to tune the JDBC
ResultSet.next() performance.

final Connection con =
    DriverManager.getConnection("jdbc:hive2://carolin:10000/default", "hive", "");
final Statement stmt = con.createStatement();
final String tableName = "bigdata";

final String sql = "select * from " + tableName + " limit 150000";
System.out.println("Running: " + sql);
final ResultSet res = stmt.executeQuery(sql);

// enlarge the fetch size (default is just 50!)
((HiveQueryResultSet) res).setFetchSize(10000);

Best Regards,
Christian.
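
A rough sketch of why the fetch size matters for the query above. It assumes the driver makes one server round trip per fetched batch, which this thread does not measure; the class and method names are made up for illustration.

```java
public class FetchRoundTrips {
    // Number of fetch calls needed to drain `rows` rows when the driver
    // pulls `fetchSize` rows per call (ceiling division).
    static long roundTrips(long rows, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        return (rows + fetchSize - 1) / fetchSize;
    }

    public static void main(String[] args) {
        long rows = 150000; // the LIMIT used in the query above
        System.out.println("fetchSize=50    -> " + roundTrips(rows, 50) + " round trips");
        System.out.println("fetchSize=10000 -> " + roundTrips(rows, 10000) + " round trips");
    }
}
```

Under that assumption, the default of 50 needs 3000 fetch calls for 150000 rows, while a fetch size of 10000 needs 15.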


2013/6/26 Christian Schneider <cs...@gmail.com>

> I just tested the same statement with beeline and got the same poor
> performance.
>
> Any ideas?
>
> Best Regards,
> Christian.
>
>
> 2013/6/26 Christian Schneider <cs...@gmail.com>
>
>> Hi,
>> currently we are using HiveServer1 with the native HiveClient interface.
>> Our application design looks horrible because (for whatever reason) it
>> spawns a dedicated HiveServer for every query.
>>
>> We thought it would be a good idea to switch to HiveServer2 (because the
>> MetaStore is used by many different applications).
>>
>> The JDBC setup was straightforward, but the performance is not what we
>> expected.
>>
>> If we fetch a large result set (with fetchN() over HiveClient), we read
>> at around 10 MB/s.
>>
>> If I use JDBC (with resultSet.next()), I get a throughput of 1 MB/*min*.
>>
>> Any chance to speed this up (like bulk fetching)?
>>
>> Best Regards,
>> Christian.
>>
>
>

Re: Fetching Results from Hive Select (JDBC ResultSet.next() vs HiveClient.fetchN())

Posted by Navis류승우 <na...@nexr.com>.
It seems stmt.setFetchSize(10000); can be called before execution
(without casting).
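
A minimal sketch of that variant, using only the standard java.sql API. It assumes the Hive JDBC driver in use honors the fetch-size hint set on the Statement, which may depend on the driver version. The class name, the countRows helper, and passing the JDBC URL as the first program argument are all illustrative choices, not something from the messages above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FetchSizeBeforeExecute {
    // Runs `sql` with the fetch-size hint set on the Statement before
    // execution and returns the number of rows read.
    static long countRows(Connection con, String sql, int fetchSize) throws SQLException {
        try (Statement stmt = con.createStatement()) {
            stmt.setFetchSize(fetchSize); // standard JDBC call, no Hive-specific cast
            try (ResultSet res = stmt.executeQuery(sql)) {
                long rows = 0;
                while (res.next()) {
                    rows++;
                }
                return rows;
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 1) {
            System.err.println("usage: FetchSizeBeforeExecute <jdbc-url>");
            return;
        }
        // User, password, table, and limit are the ones from the earlier message.
        try (Connection con = DriverManager.getConnection(args[0], "hive", "")) {
            long start = System.currentTimeMillis();
            long rows = countRows(con, "select * from bigdata limit 150000", 10000);
            long elapsed = System.currentTimeMillis() - start;
            System.out.println(rows + " rows in " + elapsed + " ms");
        }
    }
}
```

Running it once with a fetch size of 50 and once with 10000 against the same table would show whether the hint takes effect on a given driver.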
