Posted to dev@asterixdb.apache.org by as...@googlecode.com on 2015/08/08 21:25:56 UTC

Issue 922 in asterixdb: OOM during creating ngram index

Status: Accepted
Owner: wangs...@gmail.com
CC: pouria.p...@gmail.com
Labels: Type-Defect Priority-Medium

New issue 922 by buyingyi@gmail.com: OOM during creating ngram index
https://code.google.com/p/asterixdb/issues/detail?id=922

Use the generated twitter message dataset.
13.75 GB per partition, 4 partitions per machine, 6 GB heap size, 256 MB
in-memory components for each index.

Asterix Configuration
nc.java.opts                             :-Xmx6144m -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=15
cc.java.opts                             :-Xmx4096m -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=15
max.wait.active.cluster                  :60
storage.buffercache.pagesize             :131072
storage.buffercache.size                 :1073741824
storage.buffercache.maxopenfiles         :214748364
storage.memorycomponent.pagesize         :131072
storage.memorycomponent.numpages         :2048
storage.metadata.memorycomponent.numpages:64
storage.memorycomponent.numcomponents    :2
storage.memorycomponent.globalbudget     :2147483648
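
As a sanity check on the numbers above, the memory-component settings can be tallied with simple arithmetic. This is only a sketch in Python, not AsterixDB code: it assumes the page size and global budget are in bytes (as their magnitudes suggest), and the constant names are made up for illustration.

```python
# Values copied from the configuration above.
# Assumption: page size and global budget are expressed in bytes.
MEMCOMPONENT_PAGESIZE = 131072
MEMCOMPONENT_NUMPAGES = 2048
GLOBAL_BUDGET = 2147483648

MIB = 1024 * 1024


def mib(num_bytes):
    """Convert a byte count to whole MiB."""
    return num_bytes // MIB


# Page size times page count gives the in-memory budget of one index.
component_bytes = MEMCOMPONENT_PAGESIZE * MEMCOMPONENT_NUMPAGES

# How many such budgets fit into the global budget.
budgets_in_global = GLOBAL_BUDGET // component_bytes

print("in-memory budget per index:", mib(component_bytes), "MiB")
print("global budget:", mib(GLOBAL_BUDGET), "MiB")
print("per-index budgets that fit in the global budget:", budgets_in_global)
```

The first figure comes out at 256 MiB, which agrees with the "256 MB in-memory components for each index" stated in the report.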

DDL:
create index tNGramIdx on TwitterMessages(message-text) type ngram(2);

Error:
java.lang.OutOfMemoryError: unable to create new native thread


-- 
You received this message because this project is configured to send all  
issue notifications to this address.
You may adjust your notification preferences at:
https://code.google.com/hosting/settings

Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #13 on issue 922 by wangs...@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

It is still creating ExternalSortRunGenerator?????.waf files. The number of  
files is 805. And it is still growing. Currently, the temporary files are  
occupying 805 * 320MB = 257GB. Let's see what happens.
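
The estimate above is plain multiplication; a small Python sketch makes it easy to redo as the file count grows. The function names are hypothetical, 1 GB is taken as 1000 MB to match the figure quoted, and the comparison against 5 * 16 GB of input assumes this run is the five-partition setup from comment #9.

```python
def temp_space_gb(num_files, mb_per_file):
    """Total size of the run files in GB, taking 1 GB as 1000 MB."""
    return num_files * mb_per_file / 1000


def blowup_factor(temp_gb, input_gb):
    """Ratio of temporary space used to the size of the input data."""
    return temp_gb / input_gb


used = temp_space_gb(805, 320)
print("temporary files:", used, "GB")

# Assumption: input is five partitions of 16 GB each (see comment #9).
print("ratio to input:", round(blowup_factor(used, 5 * 16), 2))
```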


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #12 on issue 922 by ima...@uci.edu: OOM during creating ngram index
https://code.google.com/p/asterixdb/issues/detail?id=922

So something below 1024 is probably a reasonable, lower default bound on
threads. We should probably also lower our current default allowance for
open file handles, since 1024 again looks like a normal per-user limit. The
open question is how far below 1024 we should go (i.e., how many other
resources we should expect that user to be consuming).


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #11 on issue 922 by wangs...@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

I'm testing on two machines. They are independent (not on the cluster).

machine #1:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127645
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 127645
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

machine #2:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #10 on issue 922 by buyingyi@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

BTW, here is the "ulimit -a" of my setting:

$ ulimit -a
core file size          (blocks, -c) 2000
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63280
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 40960
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #9 on issue 922 by wangs...@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

Trying to reproduce the issue on my machine with five partitions - each  
16GB. Let's see what happens.


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #8 on issue 922 by che...@gmail.com: OOM during creating ngram index
https://code.google.com/p/asterixdb/issues/detail?id=922

Hmmm, I think Alex Behm implemented this ngram index. Can we reproduce
this issue using an earlier version of the master?


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #7 on issue 922 by buyingyi@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

Yes. Both secondary B-Tree and R-Tree work fine.

Best,
Yingyi


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #6 on issue 922 by che...@srch2.com: OOM during creating ngram index
https://code.google.com/p/asterixdb/issues/detail?id=922

Is it only specific to ngram index?  How about a secondary B-tree?


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #5 on issue 922 by zheilb...@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

"Unable to create new native thread" usually means the ulimit for the
number of user processes (nproc, "ulimit -u") is set too low. Just a thought.
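
One way to check this from inside a process, rather than from the shell, is to read the limits directly. The following is a minimal Python sketch using the standard `resource` module; it assumes a Linux machine, where threads count against the per-user process limit, and the helper names are made up for illustration.

```python
import resource


def describe(limit):
    """Render a limit value the way "ulimit -a" does."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def current_limits():
    """Return (soft, hard) limits for user processes and open files."""
    names = {
        "max user processes (-u)": resource.RLIMIT_NPROC,
        "open files (-n)": resource.RLIMIT_NOFILE,
    }
    return {label: resource.getrlimit(res) for label, res in names.items()}


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label}: soft={describe(soft)} hard={describe(hard)}")
```

The soft value is the one that is actually enforced; a soft nproc limit near 1024 would be worth a look when the job spawns many threads.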


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #4 on issue 922 by buyingyi@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

I think so.  You can let the dataset size be larger than your memory.

Best,
Yingyi


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #3 on issue 922 by wangs...@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

Can it be reproduced on one machine? Otherwise, I can't reproduce this.


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #2 on issue 922 by buyingyi@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

It is deterministic in my setting...

Best,
Yingyi


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #1 on issue 922 by parshim...@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

Ah, interesting... Is this deterministic? It seems really similar to the  
issue that started to crop up on the CI server just as the feeds patch was  
merged.


Re: Issue 922 in asterixdb: OOM during creating ngram index

Posted by as...@googlecode.com.
Comment #15 on issue 922 by wangs...@gmail.com: OOM during creating ngram  
index
https://code.google.com/p/asterixdb/issues/detail?id=922

More than 100 threads are being executed per sec. The CC log shows the  
following message:

Aug 11, 2015 1:53:01 PM  
edu.uci.ics.hyracks.control.common.work.WorkQueue$WorkerThread run
INFO: Executing: GetResultPartitionLocations: JobId@JID:15  
ResultSetId@RSID:0 Known@[127.0.0.1:58800 SUCCESS (empty), 127.0.0.1:58800  
SUCCESS (empty), 127.0.0.1:58800 SUCCESS (empty), null, 127.0.0.1:58800  
SUCCESS (empty)]
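
A rate like this can be measured by counting log records per timestamp. Below is a rough Python sketch, not part of Hyracks: it assumes each record starts with a header line of the form shown above (timestamp, class name, "run") on a single line in the actual log file, that the timestamp uses English month names, and that one header line corresponds to one executed work item.

```python
import re
from collections import Counter
from datetime import datetime

# Header line of a WorkQueue record, e.g.
# "Aug 11, 2015 1:53:01 PM edu...work.WorkQueue$WorkerThread run"
HEADER = re.compile(
    r"^(\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\s+"
    r"\S*WorkQueue\$WorkerThread run"
)


def executions_per_second(lines):
    """Count WorkQueue header lines per one-second timestamp."""
    counts = Counter()
    for line in lines:
        match = HEADER.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%b %d, %Y %I:%M:%S %p")
            counts[stamp] += 1
    return counts


def busiest_second(lines):
    """Return (timestamp, count) for the busiest second, or None if empty."""
    counts = executions_per_second(lines)
    return counts.most_common(1)[0] if counts else None
```

Usage would be along the lines of `busiest_second(open(path))`, where `path` points at the CC log file.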
