Posted to issues@impala.apache.org by "Matthew Jacobs (JIRA)" <ji...@apache.org> on 2017/05/09 20:51:04 UTC

[jira] [Created] (IMPALA-5297) free-pool-test may be OOM killed on jenkins.impala.io runs

Matthew Jacobs created IMPALA-5297:
--------------------------------------

             Summary: free-pool-test may be OOM killed on jenkins.impala.io runs
                 Key: IMPALA-5297
                 URL: https://issues.apache.org/jira/browse/IMPALA-5297
             Project: IMPALA
          Issue Type: Bug
          Components: Infrastructure
    Affects Versions: Impala 2.9.0
            Reporter: Matthew Jacobs
            Priority: Critical


On gerrit-verify-dryrun jobs, attempting to submit [a change to update the Kudu version|https://gerrit.cloudera.org/#/c/6797/] seems to cause free-pool-test to run out of memory.

The free-pool-test makes some large allocations (I think around 7 GB in total), and when other processes are running, the gerrit jobs seem to get close to the 15 GB CommitLimit on these AWS hosts.

Here's the output from kern.log:
{code}
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153878] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153882] java cpuset=/ mems_allowed=0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153884] CPU: 1 PID: 19555 Comm: java Not tainted 3.13.0-100-generic #147-Ubuntu
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153886] Hardware name: Xen HVM domU, BIOS 4.2.amazon 02/16/2017
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153887]  0000000000000000 ffff88066708f970 ffffffff8172a4bb ffff88047b3b1800
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153891]  0000000000000000 ffff88066708f9f8 ffffffff81724a5a 0000000000000000
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153894]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153897] Call Trace:
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153904]  [<ffffffff8172a4bb>] dump_stack+0x64/0x82
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153908]  [<ffffffff81724a5a>] dump_header+0x7f/0x1f1
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153912]  [<ffffffff81155d11>] oom_kill_process+0x201/0x360
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153917]  [<ffffffff812dcab5>] ? security_capable_noaudit+0x15/0x20
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153919]  [<ffffffff811564a1>] out_of_memory+0x471/0x4b0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153922]  [<ffffffff8115c7bc>] __alloc_pages_nodemask+0xa6c/0xb90
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153926]  [<ffffffff8119ae83>] alloc_pages_current+0xa3/0x160
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153930]  [<ffffffff811527c7>] __page_cache_alloc+0x97/0xc0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153932]  [<ffffffff81154235>] filemap_fault+0x185/0x410
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153936]  [<ffffffff8117944f>] __do_fault+0x6f/0x530
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153941]  [<ffffffff810135db>] ? __switch_to+0x16b/0x4f0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153943]  [<ffffffff8117d2a2>] handle_mm_fault+0x482/0xf00
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153947]  [<ffffffff81090df7>] ? hrtimer_try_to_cancel+0x47/0x100
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153950]  [<ffffffff8172df0e>] ? schedule_hrtimeout_range_clock+0xce/0x170
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153954]  [<ffffffff81736644>] __do_page_fault+0x184/0x560
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153957]  [<ffffffff8120a45f>] ? ep_poll+0x2ff/0x330
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153961]  [<ffffffff8109d2f0>] ? wake_up_state+0x20/0x20
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153964]  [<ffffffff81736a3a>] do_page_fault+0x1a/0x70
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153966]  [<ffffffff8120b5cc>] ? SyS_epoll_wait+0xac/0x100
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153968]  [<ffffffff81732d68>] page_fault+0x28/0x30
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153970] Mem-Info:
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153971] Node 0 DMA per-cpu:
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153973] CPU    0: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153974] CPU    1: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153975] CPU    2: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153976] CPU    3: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153977] CPU    4: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153978] CPU    5: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153979] CPU    6: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153980] CPU    7: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153981] CPU    8: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153982] CPU    9: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153984] CPU   10: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153985] CPU   11: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153986] CPU   12: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153987] CPU   13: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153988] CPU   14: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153989] CPU   15: hi:    0, btch:   1 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153990] Node 0 DMA32 per-cpu:
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153991] CPU    0: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153992] CPU    1: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153993] CPU    2: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153995] CPU    3: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153996] CPU    4: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153997] CPU    5: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153998] CPU    6: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.153999] CPU    7: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154001] CPU    8: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154002] CPU    9: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154003] CPU   10: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154004] CPU   11: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154005] CPU   12: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154006] CPU   13: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154007] CPU   14: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154008] CPU   15: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154009] Node 0 Normal per-cpu:
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154010] CPU    0: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154011] CPU    1: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154012] CPU    2: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154013] CPU    3: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154014] CPU    4: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154015] CPU    5: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154016] CPU    6: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154017] CPU    7: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154018] CPU    8: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154019] CPU    9: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154020] CPU   10: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154021] CPU   11: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154023] CPU   12: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154024] CPU   13: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154025] CPU   14: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154026] CPU   15: hi:  186, btch:  31 usd:   0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028] active_anon:7546116 inactive_anon:3718 isolated_anon:0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028]  active_file:405 inactive_file:19 isolated_file:0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028]  unevictable:5 dirty:193 writeback:0 unstable:0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028]  free:47219 slab_reclaimable:15089 slab_unreclaimable:22997
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028]  mapped:6952 shmem:7403 pagetables:22940 bounce:0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154028]  free_cma:0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154031] Node 0 DMA free:15904kB min:32kB low:40kB high:48kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154035] lowmem_reserve[]: 0 3744 30129 30129
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154038] Node 0 DMA32 free:113892kB min:8392kB low:10488kB high:12588kB active_anon:3689064kB inactive_anon:1468kB active_file:260kB inactive_file:20kB unevictable:4kB isolated(anon):0kB isolated(file):0kB present:3915776kB managed:3836720kB mlocked:4kB dirty:104kB writeback:0kB mapped:4468kB shmem:4496kB slab_reclaimable:6640kB slab_unreclaimable:9296kB kernel_stack:3832kB pagetables:10816kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:451 all_unreclaimable? yes
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154041] lowmem_reserve[]: 0 0 26385 26385
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154044] Node 0 Normal free:59080kB min:59152kB low:73940kB high:88728kB active_anon:26495400kB inactive_anon:13404kB active_file:1360kB inactive_file:56kB unevictable:16kB isolated(anon):0kB isolated(file):0kB present:27525120kB managed:27019008kB mlocked:16kB dirty:668kB writeback:0kB mapped:23340kB shmem:25116kB slab_reclaimable:53716kB slab_unreclaimable:82692kB kernel_stack:33392kB pagetables:80944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2483 all_unreclaimable? yes
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154047] lowmem_reserve[]: 0 0 0 0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154049] Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15904kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154058] Node 0 DMA32: 167*4kB (UEM) 1991*8kB (UEM) 590*16kB (UEM) 275*32kB (UEM) 204*64kB (UEM) 139*128kB (UEM) 66*256kB (UEM) 32*512kB (EM) 11*1024kB (UEM) 2*2048kB (EM) 0*4096kB = 114324kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154068] Node 0 Normal: 15098*4kB (UEM) 36*8kB (EM) 3*16kB (EM) 1*32kB (E) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 60760kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154076] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154077] 7651 total pagecache pages
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154078] 0 pages in swap cache
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154079] Swap cache stats: add 0, delete 0, find 0/0
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154080] Free swap  = 0kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154081] Total swap = 0kB
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154082] 7864221 pages RAM
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154083] 0 pages HighMem/MovableOnly
May  9 18:32:31 ip-172-31-7-2 kernel: [ 7498.154084] 126528 pages reserved
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344897] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344918] [  740]     0   740     4868       49      13        0             0 upstart-udev-br
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344920] [  747]     0   747    12521      234      27        0         -1000 systemd-udevd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344922] [  912]     0   912     3814       51      12        0             0 upstart-socket-
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344924] [  960]     0   960     2554      574       8        0             0 dhclient
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344926] [ 1256]     0  1256     3818       55      13        0             0 upstart-file-br
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344928] [ 1402]     0  1402     3633       41      12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344930] [ 1405]     0  1405     3633       40      12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344932] [ 1407]   101  1407    65017      688      29        0             0 rsyslogd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344934] [ 1410]     0  1410     3633       42      12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344935] [ 1411]     0  1411     3633       40      12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344937] [ 1413]     0  1413     3633       39      10        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344939] [ 1437]     0  1437    15344      172      34        0         -1000 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344941] [ 1465]     0  1465     4783       40      13        0             0 atd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344943] [ 1466]     0  1466     5912       53      17        0             0 cron
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344944] [ 1481]     0  1481     1091       36       7        0             0 acpid
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344946] [ 1490]   102  1490     9802      100      24        0             0 dbus-daemon
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344948] [ 1501]     0  1501    10861       89      25        0             0 systemd-logind
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344950] [ 1507]     0  1507     4863      112      14        0             0 irqbalance
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344952] [ 1608]     0  1608    26411      252      54        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344954] [ 1729]  1000  1729    26999      847      56        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344956] [ 1936]     0  1936     3633       39      12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344957] [ 1937]     0  1937     3195       38      12        0             0 getty
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344959] [ 2445]   106  2445     7863      151      19        0             0 ntpd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344961] [ 3564]   108  3564    33045     1466      55        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344962] [ 3566]   108  3566    33073     5407      65        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344964] [ 3567]   108  3567    33045      331      54        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344966] [ 3568]   108  3568    33045      528      52        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344968] [ 3569]   108  3569    33253      534      54        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344969] [ 3570]   108  3570    25222      376      49        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344971] [ 3797]  1000  3797  2656388    54163     197        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344973] [ 3840]  1000  3840     2826       97       9        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344975] [14391]     0 14391    26410      245      53        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344977] [14454]  1000 14454    26410      252      51        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344979] [14455]  1000 14455     5660      837      16        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344981] [18767]  1000 18767   421689    75153     280        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344982] [18818]  1000 18818   427646    72071     276        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344984] [18845]  1000 18845   422808    76361     287        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344986] [18986]  1000 18986   413940    85847     272        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344988] [19231]  1000 19231   423114    68900     237        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344989] [19255]  1000 19255   422304    98381     297        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344991] [19281]  1000 19281   453916    50170     285        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344993] [19307]  1000 19307   421858    73572     251        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344994] [20004]  1000 20004  2625480    62614     227        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344996] [20059]  1000 20059  2417786   642515    1907        0             0 kudu-tserver
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344998] [20075]  1000 20075   170158     4632     136        0             0 kudu-master
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.344999] [20091]  1000 20091  2534506   681406    2028        0             0 kudu-tserver
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345001] [20100]  1000 20100  2480736   678210    1996        0             0 kudu-tserver
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345003] [21180]  1000 21180     2812       85      10        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345004] [21194]  1000 21194  2131760    52636     186        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345006] [21277]  1000 21277     3354      114      11        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345008] [21291]  1000 21291  2176412    96677     367        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345010] [21441]  1000 21441     3354      114      12        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345011] [21455]  1000 21455  2171189    84863     323        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345013] [21619]  1000 21619     3354      115      11        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345015] [21633]  1000 21633  2174403   126755     412        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345016] [21773]  1000 21773     3354      115      10        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345018] [21787]  1000 21787  2165621   105043     368        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345019] [22327]  1000 22327   699864    64647     359        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345021] [22650]   108 22650    34060     1947      62        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345023] [22651]   108 22651    34067     1837      61        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345025] [22695]   108 22695    34564     5014      67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345026] [22696]   108 22696    34668     5491      67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345028] [22701]  1000 22701   499743   165103     749        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345030] [22966]  1000 22966   404221    82336     258        0             0 java
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345031] [49266]   108 49266    34579     5136      67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345033] [49267]   108 49267    34487     5004      67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345034] [49434]   108 49434    34298     1980      61        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345036] [49435]   108 49435    34140     1836      60        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345037] [49438]   108 49438    34575     5159      67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345039] [49439]   108 49439    34505     5146      67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345041] [49441]   108 49441    34590     5236      67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345042] [49442]   108 49442    34553     5045      67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345044] [49486]   108 49486    34507     5106      67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345046] [49487]   108 49487    34573     5240      67        0             0 postgres
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345048] [12270]  1000 12270     3345      105      11        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345049] [12634]  1000 12634   106243     2424     105        0             0 statestored
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345051] [12642]  1000 12642  2177880    69774     301        0             0 catalogd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345052] [12708]  1000 12708  2599753    71304     505        0             0 impalad
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345054] [12775]  1000 12775  2599480    69011     503        0             0 impalad
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345055] [12844]  1000 12844  2599354    70697     503        0             0 impalad
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345057] [13564]  1000 13564     3411      159      12        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345059] [13565]  1000 13565     2623      114      11        0             0 make
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345060] [13568]  1000 13568     6091      149      16        0             0 ctest
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345062] [70617]     0 70617    26410      246      55        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345063] [71103]  1000 71103    26444      247      53        0             0 sshd
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345065] [71113]  1000 71113     5628      806      16        0             0 bash
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345066] [ 2745]  1000  2745  2286602   253086     810        0             0 buffered-tuple-
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345068] [ 3922]  1000  3921  5818436  3452952    6822        0             0 free-pool-test
May  9 18:35:52 ip-172-31-7-2 kernel: [ 7699.345070] Out of memory: Kill process 3922 (free-pool-test) score 448 or sacrifice child
{code}
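
As a rough sanity check on the task dump above, the rss column can be converted to GiB. This is only a sketch: it assumes the usual 4 kB page size on x86-64 and that the rss column is in pages, and the process labels are just the names and pids copied from the dump.

```python
# Rough sketch: convert the rss column of the OOM task dump (in pages) to GiB.
# Assumption: 4 KiB pages. The rss values are copied from the kern.log dump above.
PAGE_KIB = 4

rss_pages = {
    "free-pool-test": 3452952,
    "kudu-tserver (20091)": 681406,
    "kudu-tserver (20100)": 678210,
    "kudu-tserver (20059)": 642515,
    "buffered-tuple- (2745)": 253086,
}


def pages_to_gib(pages, page_kib=PAGE_KIB):
    """Convert a page count to GiB, given the page size in KiB."""
    return pages * page_kib / (1024 * 1024)


# Print the largest resident processes first.
for name, pages in sorted(rss_pages.items(), key=lambda kv: -kv[1]):
    print(f"{name:24s} {pages_to_gib(pages):6.2f} GiB")
```

By this conversion free-pool-test was holding roughly 13 GiB resident when it was killed, noticeably more than the estimate of around 7 GB above, with the three kudu-tserver processes at about 2.5 GiB each.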

And shortly after, the output of /proc/meminfo:
{code}
ubuntu@ip-172-31-7-2:~/Impala/logs/be_tests$ cat /proc/meminfo 
MemTotal:       30871632 kB
MemFree:         7011044 kB
Buffers:           40700 kB
Cached:          5438488 kB
SwapCached:            0 kB
Active:         21124408 kB
Inactive:        2125936 kB
Active(anon):   17793440 kB
Inactive(anon):     7516 kB
Active(file):    3330968 kB
Inactive(file):  2118420 kB
Unevictable:          20 kB
Mlocked:              20 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:              1132 kB
Writeback:             0 kB
AnonPages:      17781632 kB
Mapped:           138604 kB
Shmem:             29780 kB
Slab:             276868 kB
SReclaimable:     182520 kB
SUnreclaim:        94348 kB
KernelStack:       39152 kB
PageTables:        66140 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    15435816 kB
Committed_AS:   49437784 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       72396 kB
VmallocChunk:   34359655000 kB
HardwareCorrupted:     0 kB
AnonHugePages:  15386624 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       38912 kB
DirectMap2M:     3237888 kB
DirectMap1G:    28311552 kB
{code}
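
For reference, the CommitLimit above can be reproduced from MemTotal and SwapTotal. This is a sketch under assumptions: the kernel documents CommitLimit as swap plus a percentage (vm.overcommit_ratio) of RAM excluding hugetlb pages, and the ratio of 50 used here is the kernel default, not something shown in the output above.

```python
# Sketch: reproduce CommitLimit from the /proc/meminfo values above.
# Assumptions: vm.overcommit_ratio is at its default of 50 (not shown above),
# and no hugetlb pages are reserved (HugePages_Total is 0 above).
MEMINFO_KB = {
    "MemTotal": 30871632,
    "SwapTotal": 0,
    "CommitLimit": 15435816,
    "Committed_AS": 49437784,
}
OVERCOMMIT_RATIO = 50  # assumed default


def commit_limit_kb(mem_total_kb, swap_total_kb, ratio=OVERCOMMIT_RATIO):
    """CommitLimit = SwapTotal + MemTotal * overcommit_ratio / 100 (in kB)."""
    return swap_total_kb + mem_total_kb * ratio // 100


computed = commit_limit_kb(MEMINFO_KB["MemTotal"], MEMINFO_KB["SwapTotal"])
print("computed CommitLimit:", computed, "kB")
print("reported CommitLimit:", MEMINFO_KB["CommitLimit"], "kB")
print("Committed_AS / CommitLimit: %.2f"
      % (MEMINFO_KB["Committed_AS"] / MEMINFO_KB["CommitLimit"]))
```

With no swap configured, the limit works out to exactly half of MemTotal. Committed_AS is already more than three times that, which suggests the limit is not being strictly enforced on this host (as far as I know it only is when vm.overcommit_memory is set to 2).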

We should probably use larger VMs for these jobs, but we may also need to consider reducing the memory needed for the BE tests.


