Posted to issues@kudu.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2017/07/14 20:05:00 UTC

[jira] [Commented] (KUDU-2071) disk size is much large than actually data size

    [ https://issues.apache.org/jira/browse/KUDU-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087949#comment-16087949 ] 

Jean-Daniel Cryans commented on KUDU-2071:
------------------------------------------

Hey [~King Lee], this looks like a classic case of KUDU-1943.

> disk size is much large than actually data size
> -----------------------------------------------
>
>                 Key: KUDU-2071
>                 URL: https://issues.apache.org/jira/browse/KUDU-2071
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tserver
>    Affects Versions: 1.3.0
>         Environment: system version
> 4.9.20-11.31.amzn1.x86_64 #1 SMP Thu Apr 13 01:53:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> software version:
> kudu 1.3.0-cdh5.11.0
> revision 4dcf4a9d516865d249f4cb9b07f93c67e84614ae
> build type RELEASE
> built by jenkins at 12 Apr 2017 14:02:51 PST on kudu-centos66-046c.vpc.cloudera.com
> build id 2017-04-12_13-25-42
>            Reporter: KingLee
>              Labels: patch
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I ran rm -rf on all the data dirs before reinstalling the cluster, then inserted 1000000 records into the cluster using YCSB. The data size is about 5GB, but it takes up 260GB of disk. The disk usage on one of the nodes is as follows:
> Before writing data:
> [root@ip-10-1-42-124 ~]# du -sh /data1/server/kudu/tserver_wal/wals/ /data2/server/kudu/tserver_data/ /data3/server/kudu/tserver_data/data/ /data4/server/kudu/tserver_data/data/
> 4.0K    /data1/server/kudu/tserver_wal/wals/
> 24K     /data2/server/kudu/tserver_data/
> 8.0K    /data3/server/kudu/tserver_data/data/
> 8.0K    /data4/server/kudu/tserver_data/data/
> After writing data:
> [root@ip-10-1-42-124 ~]# du -sh /data1/server/kudu/tserver_wal/wals/ /data2/server/kudu/tserver_data/ /data3/server/kudu/tserver_data/data/ /data4/server/kudu/tserver_data/data/
> 2.7G    /data1/server/kudu/tserver_wal/wals/
> 29G     /data2/server/kudu/tserver_data/
> 29G     /data3/server/kudu/tserver_data/data/
> 27G     /data4/server/kudu/tserver_data/data/
> Actual data size:
> 9b137115cfaa427a9106c87086f41957 5041MBytes
> Kudu tserver configuration:
> --fs_wal_dir=/var/lib/kudu/tserver
> --fs_data_dirs=/var/lib/kudu/tserver
> --default_num_replicas=3
> --tserver_master_addrs=192.168.1.22:7051,1192.168.1.23:7051,192.168.1.24:7051,192.168.1.25:7051,192.168.1.26:7051
> --maintenance_manager_num_threads=4
> --block_cache_capacity_mb=10240
> --memory_limit_hard_bytes=60000000000
> --fs_wal_dir=/data1/server/kudu/tserver_wal
> --fs_data_dirs=/data2/server/kudu/tserver_data,/data3/server/kudu/tserver_data,/data4/server/kudu/tserver_data
> --fs_data_dirs_reserved_bytes=10000000000
> --log_segment_size_mb=8
> Our production environment holds 25TB of data, but it takes up 45TB of disk. Where is the extra disk space going?
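
To put the figures in the report side by side, here is a rough sketch that sums the "after" du -sh sizes for this one node and compares the total to the 5041 MB reported for the data. The helper names (parse_du_size, total_bytes) are made up for illustration, and treating du's K/M/G suffixes as powers of 1024 is an assumption. The report does not say how replicas are spread across nodes, so the ratio is only a rough indicator.

```python
# Sizes copied from the "after" du -sh output in the report.
DU_OUTPUT = """\
2.7G    /data1/server/kudu/tserver_wal/wals/
29G     /data2/server/kudu/tserver_data/
29G     /data3/server/kudu/tserver_data/data/
27G     /data4/server/kudu/tserver_data/data/
"""

# Assumption: du -h suffixes are powers of 1024.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_du_size(text):
    """Convert a du -h size such as '2.7G' into a number of bytes."""
    suffix = text[-1].upper()
    if suffix in UNITS:
        return float(text[:-1]) * UNITS[suffix]
    return float(text)  # no suffix: already a plain number


def total_bytes(du_output):
    """Sum the first column of every non-empty du output line."""
    return sum(
        parse_du_size(line.split()[0])
        for line in du_output.splitlines()
        if line.strip()
    )


on_disk = total_bytes(DU_OUTPUT)
reported = 5041 * UNITS["M"]  # "5041MBytes" from the report

print(f"on disk on this node: {on_disk / UNITS['G']:.1f} GiB")
print(f"reported data size:   {reported / UNITS['G']:.1f} GiB")
print(f"ratio:                {on_disk / reported:.1f}x")
```

The same comparison applied to the cluster-wide numbers in the report gives 260GB / 5GB for the test cluster and 45TB / 25TB for production.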



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)