Posted to users@jena.apache.org by Richard Francis <rf...@talis.com> on 2011/07/19 17:08:48 UTC

Re: Loading 650m triples on Ubuntu vs Centos

Hi,

Another update;

We now believe that this is an issue specific to Ubuntu - we were able to
reproduce the same behaviour in 10.04 Lucid, which we believe is down to the
flush mechanism. We have now built a Centos 5.6 image using the latest
Amazon kernel and performance is the same as our Centos 5.4 image, so in
that respect we will be moving back to Centos to run tdbloader.

Using the tips suggested in this email thread we were able to get the
performance on Ubuntu to be almost comparable, but I believe that 600m
triples is about the maximum number of triples that you can load with
tdbloader(1) on Ubuntu on an m2.2xlarge in Amazon ... so I would recommend
moving to Centos if you are having issues with Ubuntu.

Rich

On Mon, Jun 20, 2011 at 11:04 AM, Richard Francis <rf...@talis.com> wrote:

> Hi, another update;
>
> Initially when the load started I observed a difference in the output of
> iostat - the Ubuntu instance was always showing a non-zero kB_wrtn/s value
> for the partition. In the early part of the load this turned out to be down
> to atime updates [1] being enabled on the mount point. Remounting with
> noatime changed the behaviour of tdbloader - reducing the load average -
> and, more importantly, the behaviour for the vast majority of the load was
> then the same as Centos (actually a lot better, up to 165m triples). As the
> io pattern was now the same I was able to see immediately when there was a
> change in the behaviour of the load (i.e. increased load average and a
> slow-down in the load). This started to happen somewhere between 164m and
> 177m triples. iostat showed a constant value for kB_wrtn/s, suggesting we
> had again hit a memory limit and dirty pages were being written out to
> disk. Looking at lsof -p <pid of load> I saw that the accumulated size of
> all the TDB indexes handled by the process was 30Gb (coincidence?) - the
> same as my dirty_bytes value. Increasing this to 40Gb saw the load average
> and io pattern return to "normal".
>
> Sadly this wasn't enough to see the load through to the end of the add
> phase - despite being 35m triples ahead of the Centos machine at 503m
> triples, the Ubuntu load had slowed right down by the end and Centos
> completed the add phase first.
>
> I'm running another clean test now - to see whether I can tune anything
> else - but I think the ultimate problem is the way Ubuntu deals with mapped
> files via the flush-202 process (vs. the pdflush mechanism in Centos).
>
> Rich
>
> PS. What didn't work;
>
> nr_hugepages should be left at 0 (the default) - whatever value is set here
> gets reserved in memory (memory reserved = nr_hugepages * Hugepagesize) and
> that memory can no longer be used by the memory mapped files.
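>
> A quick sketch of the check (numbers from my instance):
>
> cat /proc/sys/vm/nr_hugepages         # 15360 pages reserved
> grep Hugepagesize /proc/meminfo       # 2048 kB per page
> # reserved = 15360 * 2048 kB = 31,457,280 kB (~30Gb) unavailable to the mapped files
> echo "0" > /proc/sys/vm/nr_hugepages  # release the reservation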
>
> [1]
> http://www.howtoforge.com/reducing-disk-io-by-mounting-partitions-with-noatime
>
>
> On Thu, Jun 16, 2011 at 9:52 AM, Richard Francis <rf...@talis.com> wrote:
>
>> Hi Andy,
>>
>> Just a quick update;
>>
>> I tried playing with huge pages and I could see that they weren't getting
>> used by the load - the load promptly slowed to a crawl at c. 60 million
>> triples again. Turning them off (echo "0" > /proc/sys/vm/nr_hugepages)
>> promptly sped the load up. It appears that reserving hugepages prevents the
>> Mapped value from growing larger than the amount of memory left after the
>> hugepage allocation (4Gb in my case) - I had set nr_hugepages to 15360
>> (15360 * 2048 kB = 31457280 kB = 30Gb RAM) ... If only I had read the man
>> pages first :).
>>
>> I suspect that further down the line the Mapped values will hit another
>> limit - both are at around 6.5Gb @ 100m triples atm.
>>
>> Rich
>>
>>
>> On Thu, Jun 16, 2011 at 8:50 AM, Richard Francis <rf...@talis.com> wrote:
>>
>>> Sorry Andy,
>>>
>>> The resulting data set is 94Gb for both.
>>>
>>> Rich
>>>
>>>
>>> On Thu, Jun 16, 2011 at 8:45 AM, Richard Francis <rf...@talis.com> wrote:
>>>
>>>> Hi Andy,
>>>>
>>>> Both file systems are the same - we're using the ephemeral storage on
>>>> the ec2 node - both machines are ext3;
>>>>
>>>> Ubuntu:
>>>>
>>>> df -Th /mnt
>>>> Filesystem    Type    Size  Used Avail Use% Mounted on
>>>> /dev/xvdb     ext3    827G  240G  545G  31% /mnt
>>>>
>>>> Centos;
>>>>
>>>> df -Th /mnt
>>>> Filesystem    Type    Size  Used Avail Use% Mounted on
>>>> /dev/sdb      ext3    827G  191G  595G  25% /mnt
>>>>
>>>> Both the input ntriples and the output indexes are written to this
>>>> partition.
>>>> meminfo does show some differences - I believe mainly because the Ubuntu
>>>> instance runs a later kernel (2.6.38-8-virtual vs. 2.6.16.33-xenU). There does
>>>> seem to be a difference between the Mapped values, and I think I should
>>>> investigate the HugePages & DirectMap settings.
>>>>
>>>> Ubuntu;
>>>> cat /proc/meminfo
>>>> MemTotal:       35129364 kB
>>>> MemFree:          817100 kB
>>>> Buffers:           70780 kB
>>>> Cached:         32674868 kB
>>>> SwapCached:            0 kB
>>>> Active:         17471436 kB
>>>> Inactive:       15297084 kB
>>>> Active(anon):      25752 kB
>>>> Inactive(anon):       44 kB
>>>> Active(file):   17445684 kB
>>>> Inactive(file): 15297040 kB
>>>> Unevictable:        3800 kB
>>>> Mlocked:            3800 kB
>>>> SwapTotal:             0 kB
>>>> SwapFree:              0 kB
>>>> Dirty:             10664 kB
>>>> Writeback:             0 kB
>>>> AnonPages:         26808 kB
>>>> Mapped:             7012 kB
>>>> Shmem:               176 kB
>>>> Slab:             855516 kB
>>>> SReclaimable:     847652 kB
>>>> SUnreclaim:         7864 kB
>>>> KernelStack:         680 kB
>>>> PageTables:         2044 kB
>>>> NFS_Unstable:          0 kB
>>>> Bounce:                0 kB
>>>> WritebackTmp:          0 kB
>>>> CommitLimit:    17564680 kB
>>>> Committed_AS:      39488 kB
>>>> VmallocTotal:   34359738367 kB
>>>> VmallocUsed:      114504 kB
>>>> VmallocChunk:   34359623800 kB
>>>> HardwareCorrupted:     0 kB
>>>> HugePages_Total:       0
>>>> HugePages_Free:        0
>>>> HugePages_Rsvd:        0
>>>> HugePages_Surp:        0
>>>> Hugepagesize:       2048 kB
>>>> DirectMap4k:    35848192 kB
>>>> DirectMap2M:           0 kB
>>>>
>>>> ***************************************************
>>>> Centos;
>>>> cat /proc/meminfo
>>>> MemTotal:     35840000 kB
>>>> MemFree:         31424 kB
>>>> Buffers:        166428 kB
>>>> Cached:       34658344 kB
>>>> SwapCached:          0 kB
>>>> Active:        1033384 kB
>>>> Inactive:     33803304 kB
>>>> HighTotal:           0 kB
>>>> HighFree:            0 kB
>>>> LowTotal:     35840000 kB
>>>> LowFree:         31424 kB
>>>> SwapTotal:           0 kB
>>>> SwapFree:            0 kB
>>>> Dirty:             220 kB
>>>> Writeback:           0 kB
>>>> Mapped:          17976 kB
>>>> Slab:           223256 kB
>>>> CommitLimit:  17920000 kB
>>>> Committed_AS:    38020 kB
>>>> PageTables:       1528 kB
>>>> VmallocTotal: 34359738367 kB
>>>> VmallocUsed:       164 kB
>>>> VmallocChunk: 34359738203 kB
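>>>>
>>>> To keep an eye on these values while the next load runs, something along
>>>> these lines should do (a rough sketch, interval is arbitrary):
>>>>
>>>> watch -n 60 'grep -E "Mapped|^Dirty|HugePages_Total" /proc/meminfo'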
>>>>
>>>> Thanks,
>>>> Rich
>>>>
>>>> On Wed, Jun 15, 2011 at 10:07 PM, Andy Seaborne <
>>>> andy.seaborne@epimorphics.com> wrote:
>>>> >
>>>> > > So my questions are, has anyone else observed this? - can anyone
>>>> suggest any
>>>> > > further improvements - or things to try? - what is the best OS to
>>>> perform a
>>>> > > tdbload on?
>>>> >
>>>> > Richard - very useful feedback, thank you.
>>>> >
>>>> > I haven't come across this before - and the difference is quite
>>>> surprising.
>>>> >
>>>> > What is the "mapped" value on each machine?
>>>> > Could you "cat /proc/meminfo"?
>>>> >
>>>> > TDB is using memory mapped files - I'm wondering if the amount of RAM
>>>> available to the processes is different in some way.  Together with the
>>>> parameters you have found to make a difference, this might have an effect
>>>> (speculation, I'm afraid).
>>>> >
>>>> > Is the filesystem the same?
>>>> > How big is the resulting dataset?
>>>> >
>>>> > (sorry for all the questions!)
>>>> >
>>>> > tdbloader2 works differently from tdbloader even during the data
>>>> phase. It seems it is the B+trees slowing down: there is only one in
>>>> tdbloader2 phase one, but two in tdbloader phase one.  That might explain
>>>> the roughly 80million -> 150million shift (i.e. roughly x2).
>>>> >
>>>> >        Andy
>>>> >
>>>> > On 15/06/11 16:23, Richard Francis wrote:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> I'm using two identical machines in ec2 running tdbloader on centos
>>>> (CentOS
>>>> >> release 5 (Final)) and Ubuntu 11.04 (natty)
>>>> >>
>>>> >> I've observed an issue where Centos will run happily at a consistent
>>>> speed
>>>> >> and complete a load of 650million triples in around 12 hours, whereas
>>>> the
>>>> >> load on Ubuntu, after just 15million triples, tails off and gets
>>>> progressively slower.
>>>> >>
>>>> >> On initial observation of the Ubuntu machine I noticed that the
>>>> flush-202
>>>> >> process was running quite high, also running iostat showed that io
>>>> was the
>>>> >> real bottle neck - with the ubuntu machine showing a constant use of
>>>> the
>>>> >> disk for both reads and writes (the centos machine had periods of no
>>>> usage
>>>> >> followed by periods of writes). This led me to investigate how memory
>>>> was
>>>> >> being used by the Ubuntu machine - and a few blog posts / tutorials
>>>> later I
>>>> >> found a couple of settings to tweak - the first I tried
>>>> >> was dirty_writeback_centisecs - setting this to 0 had an immediate
>>>> positive
>>>> >> effect on the load that I was performing - but after some more
>>>> testing I
>>>> >> found that the problem was just pushed back to around 80million triples
>>>> before
>>>> >> I saw a drop-off in performance.
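>>>> >>
>>>> >> (For reference, that tweak is just
>>>> >> sysctl -w vm.dirty_writeback_centisecs=0
>>>> >> i.e. echo 0 > /proc/sys/vm/dirty_writeback_centisecs.)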
>>>> >>
>>>> >> This led me to investigate whether there was the same issue with
>>>> tdbloader2 -
>>>> >>  From my observations I got the same problem - but this time around
>>>> 150m
>>>> >> triples.
>>>> >>
>>>> >> Again - I focused on "dirty" settings - and this time tweaking
>>>> dirty_bytes
>>>> >> = 30000000000 and dirty_background_bytes = 15000000000 saw a massive
>>>> >> performance increase, and for the vast majority of the add phase
>>>> tdbloader
>>>> >> kept up with the Centos machine.
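>>>> >>
>>>> >> (Concretely, that was something like:
>>>> >> sysctl -w vm.dirty_bytes=30000000000
>>>> >> sysctl -w vm.dirty_background_bytes=15000000000
>>>> >> i.e. roughly 30Gb and 15Gb on a 34Gb box.)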
>>>> >>
>>>> >> Finally, last night I stopped all loads, and raced the centos machine
>>>> and
>>>> >> the ubuntu machine - both have completed - but the Centos machine
>>>> (around 12
>>>> >> hours) was still far quicker than the Ubuntu machine (20 hours).
>>>> >>
>>>> >> So my questions are, has anyone else observed this? - can anyone
>>>> suggest any
>>>> >> further improvements - or things to try? - what is the best OS to
>>>> perform a
>>>> >> tdbload on?
>>>> >>
>>>> >> Rich
>>>> >>
>>>> >>
>>>> >> Tests were performed on three different machines, 1 x Centos and 2 x
>>>> Ubuntu -
>>>> >> to rule out EC2 being a bottleneck - all were (from
>>>> >> http://aws.amazon.com/ec2/instance-types/):
>>>> >>
>>>> >> High-Memory Double Extra Large Instance
>>>> >>
>>>> >> 34.2 GB of memory
>>>> >> 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units
>>>> each)
>>>> >> 850 GB of instance storage
>>>> >> 64-bit platform
>>>> >> I/O Performance: High
>>>> >> API name: m2.2xlarge
>>>> >> All machines are configured with no swap
>>>> >>
>>>> >> Here's the summary from the only completed load on Ubuntu;
>>>> >>
>>>> >> ** Index SPO->OSP: 685,552,449 slots indexed in 18,337.75 seconds
>>>> [Rate:
>>>> >> 37,384.76 per second]
>>>> >> -- Finish triples index phase
>>>> >> ** 685,552,449 triples indexed in 37,063.51 seconds [Rate: 18,496.69
>>>> per
>>>> >> second]
>>>> >> -- Finish triples load
>>>> >> ** Completed: 685,552,449 triples loaded in 78,626.27 seconds [Rate:
>>>> >> 8,719.13 per second]
>>>> >> -- Finish quads load
>>>> >>
>>>> >> Some resources I used;
>>>> >> http://www.westnet.com/~gsmith/content/linux-pdflush.htm
>>>> >> http://arighi.blogspot.com/2008/10/fine-grained-dirtyratio-and.html
>>>> >>
>>>>
>>>>
>>>
>>
>