You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2017/03/24 01:50:18 UTC
[kudu-CR] cfile set: reduce memory usage of reader map
Hello Adar Dembo,
I'd like you to do a code review. Please visit
http://gerrit.cloudera.org:8080/6466
to review the following change.
Change subject: cfile_set: reduce memory usage of reader map
......................................................................
cfile_set: reduce memory usage of reader map
We previously used an unordered_map<int, shared_ptr<...>>. This patch
changes to using boost's flat_map container, which is based on sorted
entries in a vector. This means we get one contiguous memory location vs
a bunch of smaller per-item allocations in the separate-chained
hashtable. Additionally switches to unique_ptr which avoids an extra
16-byte control structure per CFileReader.
It's hard to measure the exact win here, but I wrote a little test
program which compared inserting 40 entries into the two types of maps,
leaking the maps, and looking at the LeakSanitizer reports. With the old
map and shared_ptr, it cost 2832 bytes, and with the flat_map and
unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per
entry.
Extrapolating this out to a server with 600k cfiles open, we'll save
around 28MB. Not huge, but worth doing.
Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 7 insertions(+), 6 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6466/1
--
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
[kudu-CR] cfile set: reduce memory usage of reader map
Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.
Change subject: cfile_set: reduce memory usage of reader map
......................................................................
Patch Set 3: Code-Review+2
--
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No
[kudu-CR] cfile set: reduce memory usage of reader map
Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/6466
to look at the new patch set (#2).
Change subject: cfile_set: reduce memory usage of reader map
......................................................................
cfile_set: reduce memory usage of reader map
We previously used an unordered_map<int, shared_ptr<...>>. This patch
changes to using boost's flat_map container, which is based on sorted
entries in a vector. This means we get one contiguous memory location vs
a bunch of smaller per-item allocations in the separate-chained
hashtable. Additionally switches to unique_ptr which avoids an extra
16-byte control structure per CFileReader.
It's hard to measure the exact win here, but I wrote a little test
program which compared inserting 40 entries into the two types of maps,
leaking the maps, and looking at the LeakSanitizer reports. With the old
map and shared_ptr, it cost 2832 bytes, and with the flat_map and
unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per
entry.
Extrapolating this out to a server with 600k cfiles open, we'll save
around 28MB. Not huge, but worth doing.
Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 9 insertions(+), 6 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6466/2
--
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
[kudu-CR] cfile set: reduce memory usage of reader map
Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.
Change subject: cfile_set: reduce memory usage of reader map
......................................................................
Patch Set 1:
(3 comments)
http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:
Line 103: gscoped_ptr<CFileReader> reader;
Can you convert to unique_ptr in the affected call sites? Or does that spill out all over the place?
http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:
Line 121: // Map of column ID to reader. These are lazily initialized as needed.
May want to leave a comment justifying flat_map, since it's somewhat unorthodox for us.
PS1, Line 122: >
Nit: eliminate empty space.
--
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes
[kudu-CR] cfile set: reduce memory usage of reader map
Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.
Change subject: cfile_set: reduce memory usage of reader map
......................................................................
Patch Set 2: Code-Review+2
(1 comment)
http://gerrit.cloudera.org:8080/#/c/6466/2/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:
PS2, Line 122: flat_hash_map
Not "flat_map"?
--
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes
[kudu-CR] cfile set: reduce memory usage of reader map
Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/6466
to look at the new patch set (#3).
Change subject: cfile_set: reduce memory usage of reader map
......................................................................
cfile_set: reduce memory usage of reader map
We previously used an unordered_map<int, shared_ptr<...>>. This patch
changes to using boost's flat_map container, which is based on sorted
entries in a vector. This means we get one contiguous memory location vs
a bunch of smaller per-item allocations in the separate-chained
hashtable. Additionally switches to unique_ptr which avoids an extra
16-byte control structure per CFileReader.
It's hard to measure the exact win here, but I wrote a little test
program which compared inserting 40 entries into the two types of maps,
leaking the maps, and looking at the LeakSanitizer reports. With the old
map and shared_ptr, it cost 2832 bytes, and with the flat_map and
unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per
entry.
Extrapolating this out to a server with 600k cfiles open, we'll save
around 28MB. Not huge, but worth doing.
Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 9 insertions(+), 6 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6466/3
--
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
[kudu-CR] cfile set: reduce memory usage of reader map
Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.
Change subject: cfile_set: reduce memory usage of reader map
......................................................................
cfile_set: reduce memory usage of reader map
We previously used an unordered_map<int, shared_ptr<...>>. This patch
changes to using boost's flat_map container, which is based on sorted
entries in a vector. This means we get one contiguous memory location vs
a bunch of smaller per-item allocations in the separate-chained
hashtable. Additionally switches to unique_ptr which avoids an extra
16-byte control structure per CFileReader.
It's hard to measure the exact win here, but I wrote a little test
program which compared inserting 40 entries into the two types of maps,
leaking the maps, and looking at the LeakSanitizer reports. With the old
map and shared_ptr, it cost 2832 bytes, and with the flat_map and
unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per
entry.
Extrapolating this out to a server with 600k cfiles open, we'll save
around 28MB. Not huge, but worth doing.
Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Reviewed-on: http://gerrit.cloudera.org:8080/6466
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Kudu Jenkins
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 9 insertions(+), 6 deletions(-)
Approvals:
Adar Dembo: Looks good to me, approved
Kudu Jenkins: Verified
--
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
[kudu-CR] cfile set: reduce memory usage of reader map
Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.
Change subject: cfile_set: reduce memory usage of reader map
......................................................................
Patch Set 1:
(3 comments)
http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:
Line 103: gscoped_ptr<CFileReader> reader;
> Can you convert to unique_ptr in the affected call sites? Or does that spil
probably the latter, but I'll give it a go in a second patch on top of this one to keep the patches focused
http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:
Line 121: // Map of column ID to reader. These are lazily initialized as needed.
> May want to leave a comment justifying flat_map, since it's somewhat unorth
Done
PS1, Line 122: >
> Nit: eliminate empty space.
Done
--
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes