
Posted to user@ignite.apache.org by "Vladimir St." <vl...@yandex.ru> on 2017/01/31 16:09:04 UTC

Distributed query is too slow.

Hello, everyone!
 
Could I get some help with a query performance problem? I'm measuring the
performance of basic SQL-like queries over a cache with various settings and
deployments, and comparing the results with Oracle 11g running on the same
hardware. Overall, I need something significantly faster than that database.
Local queries and queries on replicated caches are just perfect: they are
about 5 times faster than the DB. The problem appears when I run the search
on a partitioned cache from each additional node in the cluster: the query
gets much slower as soon as the cluster has more than one node :(. The
difference looks completely unreasonable. For example, the test takes
approx. 15 seconds on a single node, approx. 55 sec. (!) on two local nodes,
approx. 65 sec. on three local nodes and approx. 80 sec. on four nodes :(.
The database does an equivalent search in about 75 seconds. There is some
CPU/GC/memory interference when I launch more than two nodes on the same
machine, but I take that into account. Sadly, the performance drop is even
bigger when I run the tests on separate machines: the same test may take
20 times longer than on local nodes. I don't understand what is going on.

The hardware is average: i5-4570 3.2 GHz, 4 cores, 16 GB, Win7
x64 on Linux x64. The JVM is HotSpot 1.8.0.101. The Ignite version is 1.8.
The network shows 17Mb/sec. of real throughput.


*The entity is:*

import java.io.Serializable;
import java.math.BigInteger;

import org.apache.ignite.cache.query.annotations.QuerySqlField;

@Indexed(index = "testRecord")
public class Record implements Serializable {

    // str, str2 and l form the ordered group index "g1", in this order.
    @QuerySqlField(index = true, orderedGroups = {@QuerySqlField.Group(name = "g1", order = 0)})
    private String str;

    @QuerySqlField(index = true, orderedGroups = {@QuerySqlField.Group(name = "g1", order = 1)})
    private String str2;

    @QuerySqlField(index = true, orderedGroups = {@QuerySqlField.Group(name = "g1", order = 2)})
    private long l;

    // The remaining fields are not visible to SQL queries.
    private byte[] bArr;
    private int i;
    private Double d;
    private BigInteger bi;

    public static Record generate() {
        …//Fills fields with random values
    }
…
}

I search by three indexed fields: two strings and one long.


*The test routine is pretty simple:*

- A node starts (the first node). Nodes are started one at a time.
- The node acquires the cache and, if it is empty, fills it with 100-500k
records depending on the setting. Only the first node fills the cache. The
key is a string.
- The node prepares the search set: if the fields were generated randomly,
it preloads all the records from the cache; if they were generated
sequentially, it builds an array of the known field values.
- The node runs through this search set several times, querying for the
records one by one. I usually run this cycle 5 times to make sure the GC and
warm-up effects have settled.
- The measurement begins right before each run (after the data preloading)
and ends right after it. The average search time is recorded.
- The next node starts.
- Cache rebalancing is performed. The cache is acquired on the next node. No
additional data is put into the cache.
- The new node starts and runs the same searches.

I.e. if 300k records were generated, 300k searches are launched over the
cache of 300k records.



*The searching core is:*

SqlQuery<String, Record> query =
    new SqlQuery<>(Record.class, "str=? and str2=? and l=?");

IgniteCache<String, Record> cache = …;

List<Record> dataToFind = …;
long cycleTime, avgTime = 0;

for (int attempt = 0; attempt < cycles; ++attempt) {
    cycleTime = System.currentTimeMillis();

    // One query per record; Entry is javax.cache.Cache.Entry.
    dataToFind.forEach(r -> {
        List<Entry<String, Record>> lst =
            cache.query(query.setArgs(r.getStr(), r.getStr2(), r.getL())).getAll();
        …
    });

    avgTime += System.currentTimeMillis() - cycleTime;
}

avgTime /= cycles;
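
A slightly more defensive way to take the same kind of measurement is sketched below. It uses System.nanoTime(), which is monotonic, and leaves the warm-up cycles out of the average. The class name CycleTimer, the warmupCycles parameter and the Runnable that stands in for one pass over dataToFind are assumptions made for illustration; they are not part of the original test.

```java
/** Hypothetical helper: averages the wall-clock time of repeated search passes. */
class CycleTimer {

    /**
     * Runs searchPass (warmupCycles + measuredCycles) times and returns the
     * average duration of the measured cycles only, in milliseconds.
     */
    static double averageMillis(int warmupCycles, int measuredCycles, Runnable searchPass) {
        if (measuredCycles <= 0) {
            throw new IllegalArgumentException("measuredCycles must be positive");
        }
        // Warm-up passes give the JIT and the GC time to settle; they are not timed.
        for (int i = 0; i < warmupCycles; i++) {
            searchPass.run();
        }
        long totalNanos = 0;
        for (int i = 0; i < measuredCycles; i++) {
            long start = System.nanoTime(); // monotonic, unlike currentTimeMillis()
            searchPass.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / measuredCycles;
    }

    public static void main(String[] args) {
        // Stand-in workload for "query every record in dataToFind once".
        Runnable fakePass = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        };
        System.out.println("avg cycle, ms: " + averageMillis(2, 5, fakePass));
    }
}
```

In the real test the Runnable would wrap the dataToFind.forEach(...) pass shown above.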


The first node always does well, much faster than the DB, but the next ones
don't. My working machine does the 100k-on-100k-records search in 2 seconds
on average. A second node running on an identical machine in the network
takes about 45-50 seconds!!! Why? The database takes about 35 seconds to
search a table with a similar structure and indexes (I mean searching via
JDBC templates).


*The cache settings are:*

setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
setIndexedTypes(String.class, Record.class);
setCacheMode(CacheMode.PARTITIONED);
setBackups(1);
setRebalanceMode(CacheRebalanceMode.SYNC);
setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
setRebalanceBatchSize(4096);
//setOffHeapMaxMemory(maxOffheapMegabytes * 1024L * 1024L);
//setSqlOnheapRowCacheSize(100000);
//setMemoryMode(CacheMemoryMode.OFFHEAP_TIERED);


I've tried settings such as atomic mode, 0 backups, async rebalancing,
async write mode, off-heap features and so on. I also tested non-grouped
indexes, SqlFieldsQuery, and searching by only one field. Nothing helped.


*The explain-plan shows:*

[[SELECT
    RECORD._KEY AS __C0,
    RECORD._VAL AS __C1,
    RECORD.STR AS __C2,
    RECORD.STR2 AS __C3,
    RECORD.L AS __C4
FROM "testCache".RECORD
    /* "testCache"."g1": L = ?3
        AND STR = ?1
        AND STR2 = ?2
     */
WHERE (L = ?3)
    AND ((STR = ?1)
    AND (STR2 = ?2))], [SELECT
    __C0 AS _KEY,
    __C1 AS _VAL,
    __C2 AS STR,
    __C3 AS STR2,
    __C4 AS L
FROM PUBLIC.__T0
    /* "testCache"."merge_scan" */]]



As mentioned at the beginning, local searches and searches on a replicated
cache work well. So I decided to implement a basic manual map-reduce of
split local searches, the way the documentation describes it. I broadcast
an IgniteCallable that launches a local query ( Query.setLocal(true) ).
Afterwards, the results were collected by id. It worked perfectly, 4.5-5
times faster than the DB!
Of course, I'm not saying it's a perfect way to search. I guess each callable
works in its own transaction. I can't imagine what happens if the data is
being changed or if the topology is changing during the search. I was just
trying to find the root of the problem.
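
The shape of that manual map-reduce can be sketched in plain Java. This is only a simulation under stated assumptions: each "node" is a Map holding its own partition, a thread pool plays the role of the broadcast, and the class and method names (ManualMapReduce, search) are made up for illustration. No Ignite API is used, so it says nothing about Ignite's actual timings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Plain-Java simulation of "broadcast a local search, then merge the results". */
class ManualMapReduce {

    /** Returns the keys whose value equals wanted, searching all partitions in parallel. */
    static List<String> search(List<Map<String, String>> partitions, String wanted) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            // Map step: each simulated node scans only its own partition.
            List<Future<List<String>>> futures = new ArrayList<>();
            for (Map<String, String> partition : partitions) {
                Callable<List<String>> localSearch = () -> {
                    List<String> hits = new ArrayList<>();
                    for (Map.Entry<String, String> e : partition.entrySet()) {
                        if (e.getValue().equals(wanted)) {
                            hits.add(e.getKey());
                        }
                    }
                    return hits;
                };
                futures.add(pool.submit(localSearch));
            }
            // Reduce step: the caller concatenates the partial results.
            List<String> merged = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                merged.addAll(f.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("search interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("local search failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, String> node1 = new HashMap<>();
        node1.put("k1", "a");
        node1.put("k2", "b");
        Map<String, String> node2 = new HashMap<>();
        node2.put("k3", "a");
        System.out.println(search(Arrays.asList(node1, node2), "a")); // prints [k1, k3]
    }
}
```

As in the approach described above, every partition is searched independently and nothing coordinates the partial results, so concurrent updates or topology changes are not handled.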


So, what's wrong with the distributed query? Or what am I doing wrong? What
do I need to check? The profiler says the time is spent somewhere inside
H2/indexes, just as in the case of the successful local searches.




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Distributed-query-is-too-slow-tp10342.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Distributed query is too slow.

Posted by dkarachentsev <dk...@gridgain.com>.

OK, now I see that locally we have equal results. As I understand, remote
means that the nodes are connected over the network and placed on different
machines. Then 5 sec per 10K requests (0.5 ms per request) is probably
caused by network delays.
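
The arithmetic behind that per-request figure, as a minimal sketch (the class and method names are made up for illustration):

```java
/** Back-of-the-envelope check of the per-request figure quoted above. */
class PerRequestCost {

    /** Total elapsed time spread evenly over sequential requests, in milliseconds. */
    static double millisPerRequest(double totalMillis, int requests) {
        if (requests <= 0) {
            throw new IllegalArgumentException("requests must be positive");
        }
        return totalMillis / requests;
    }

    public static void main(String[] args) {
        // 5 s for 10,000 one-by-one queries, the figures from the message above.
        System.out.println(millisPerRequest(5_000, 10_000) + " ms per request"); // prints "0.5 ms per request"
    }
}
```

Because the test issues its queries one after another, any fixed per-query network cost is paid 10,000 times in a row.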



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Distributed-query-is-too-slow-tp10342p10484.html

Re: Distributed query is too slow.

Posted by "Vladimir St." <vl...@yandex.ru>.

dkarachentsev wrote
> I've just run your tests; the project is attached.
> Could you please, in turn, send yours? This will help to understand what
> goes wrong.

My test project is a jumble of many experiments, and it needs serious
cleaning before publishing, so I used yours. It works just like mine. In the
zip I attached you can find a few slight modifications, but nothing
fundamental.

The nodes I need are server nodes. You should run the test on each next node
once the previous one is done, without terminating the finished nodes.

So, the timings I got with your code are:
- searching local 1-4 nodes: 184ms, 548ms, 635ms, 722ms
- searching remote 1-4 nodes: 146ms, 4484ms, 6080ms, 5442ms

Interestingly, the 4th remote node sped the search up a bit.

In the local case we pay nearly 3 times more to search over 2 nodes instead
of 1. In the remote case this extra cost grows to about 30 times.
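
Those two factors can be recomputed directly from the one-node and two-node timings listed above. The sketch below does only that; the class and method names are made up for illustration.

```java
import java.util.Locale;

/** Recomputes the one-node vs. two-node slowdown from the timings listed above. */
class SlowdownFactor {

    /** How many times slower the observed run is than the baseline run. */
    static double factor(double baselineMillis, double observedMillis) {
        if (baselineMillis <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return observedMillis / baselineMillis;
    }

    public static void main(String[] args) {
        // Local nodes: 184 ms on one node, 548 ms on two.
        System.out.println(String.format(Locale.ROOT, "local:  x%.1f", factor(184, 548)));  // local:  x3.0
        // Remote nodes: 146 ms on one node, 4484 ms on two.
        System.out.println(String.format(Locale.ROOT, "remote: x%.1f", factor(146, 4484))); // remote: x30.7
    }
}
```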

If we launch from a client node, we might not see such a huge difference
between a cluster of one node and a cluster of two nodes. But my overall
goal is to get a search platform that is fast compared to the DB. In my
case, Ignite performs searches over 2-4 remote nodes approximately two times
faster than searching a remote Oracle with a similar indexed table for the
records. Still, it's a bit frustrating, because any kind of local search
shows 10-15 times higher performance compared to the DB. The
result-gathering procedures break all my expectations: they make cache
searching less perfect than desired. I still hope some settings are simply
not applied somewhere.

ignite-perf-modified-test.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n10422/ignite-perf-modified-test.zip>

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Distributed-query-is-too-slow-tp10342p10422.html

Re: Distributed query is too slow.

Posted by dkarachentsev <dk...@gridgain.com>.

I've just run your tests; the project is attached.
Could you please, in turn, send yours? This will help to understand what
goes wrong.

Thanks!

ignite-perf-test.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n10411/ignite-perf-test.zip>  

-Dmitry.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Distributed-query-is-too-slow-tp10342p10411.html

Re: Distributed query is too slow.

Posted by "Vladimir St." <vl...@yandex.ru>.

Hi Dmitry. Thanks for the answer.


dkarachentsev wrote
> When you run the test on a single node, it works purely locally, without
> node communication overhead. It would be a good idea to test from a client
> node. In that case, even with one server node, the query is sent over the
> network with all the communication overhead. Also, if the cache is
> REPLICATED, the query is always executed locally or, if you query from a
> client, sent to only one node.

If I'm correct, the search is always local: any node searches only its local
data. In the case of a partitioned cache, the results are afterwards merged
within a shared transaction scope. That's why I tried splitting and reducing
the query results manually.


dkarachentsev wrote
> here are my results:
> 10_000 records (request one-by-one), 
> 1 node: 445ms
> 2 nodes: 512ms
> 3 nodes: 696ms
> 4 nodes: 839ms

Your timings are of the sort I expected but didn't get. Of course, there
will be some communication costs when we run on several nodes, especially
over the network. But they shouldn't be as huge as in my case.


Here are my results for 10 000 records on the same machine:

1 node: 391ms
2 nodes:  1470ms
3 nodes: 1689ms
4 nodes: 2223ms

There is a very large gap between one and two nodes, i.e. when the
partitioning really appears.

Could the problem come from the Ignite settings rather than the cache settings?

How do you run your test? My scenario:
- The first node starts and acquires the cache.
- It sees that the cache is empty and fills it with n records.
- Then the node prepares a list of known parameters (fields) to search for.
- It runs through this list and executes the query for each entry.
- When the first (previous) node is done, I launch the next one.
- The next node does the same, except for filling the cache.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Distributed-query-is-too-slow-tp10342p10387.html

Re: Distributed query is too slow.

Posted by dkarachentsev <dk...@gridgain.com>.

Hi Vladimir,

When you run the test on a single node, it works purely locally, without
node communication overhead. It would be a good idea to test from a client
node. In that case, even with one server node, the query is sent over the
network with all the communication overhead. Also, if the cache is
REPLICATED, the query is always executed locally or, if you query from a
client, sent to only one node.

I tried your example and don't see such a big gap in performance with many
nodes; here are my results:
10_000 records (request one-by-one), 
1 node: 445ms
2 nodes: 512ms
3 nodes: 696ms
4 nodes: 839ms

The time to fetch increases, but not as fast as you mentioned. The root of
this slowdown could be that I run all nodes on a single machine (simultaneous
search on all nodes) and that it requires a few more messages.
For a replicated cache this value will be constant.

BTW, with broadcast compute on 3 nodes I got a bigger result: 1489ms.

-Dmitry.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Distributed-query-is-too-slow-tp10342p10380.html