Posted to user@hadoop.apache.org by J M <ce...@gmail.com> on 2020/05/14 09:19:16 UTC

Question about Hadoop/HDFS files written and maximum file sizes

Hi,

I don't have much knowledge about Hadoop/HDFS, so my question may be simple,
or not...

I have a Hadoop/HDFS environment, but my disks are not very big.

One application is writing to files, and sometimes the disks fill up because
the files grow very large.

So, my question is:

Is there any way to limit the maximum size of files written to HDFS?

I was thinking of something like:
When a file reaches a size of >= 1 GB, new data written to it would cause the
oldest data in the file to be deleted. In this way the file size would always
stay limited, like a rolled file.

How can I do this?

Regards,
Cesar Jorge

Re: Question about Hadoop/HDFS files written and maximum file sizes

Posted by Deepak Vohra <dv...@yahoo.com.INVALID>.
  
The maximum file size is not directly configurable, but other settings can effectively cap it, such as the maximum number of blocks per file, dfs.namenode.fs-limits.max-blocks-per-file. This prevents the creation of extremely large files, which can degrade performance.
<property>
  <name>dfs.namenode.fs-limits.max-blocks-per-file</name>
  <value>1048576</value>
  <description>Maximum number of blocks per file, enforced by the Namenode on
    write. This prevents the creation of extremely large files which can
    degrade performance.</description>
</property>
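
As a rough sketch (illustrative values, not a recommendation): the effective cap on a single file is approximately max-blocks-per-file multiplied by dfs.blocksize. With the default 128 MB block size, setting the limit to 8 blocks would cap each file at about 1 GB. Note that once the limit is reached the NameNode refuses to allocate further blocks and the write fails; existing data is not rolled or deleted.

<!-- Hypothetical hdfs-site.xml sketch: cap files at roughly 1 GB
     (8 blocks x 128 MB). Values are illustrative assumptions only. -->
<property>
  <name>dfs.blocksize</name>
  <value>134217728</value> <!-- 128 MB block size -->
</property>
<property>
  <name>dfs.namenode.fs-limits.max-blocks-per-file</name>
  <value>8</value> <!-- 8 x 128 MB is about 1 GB per file -->
</property>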
  Space quotas and storage type quotas may also be set. Note that quotas apply to directories, limiting the total space used under a directory tree rather than the size of any individual file.
https://hadoop.apache.org/docs/r3.0.3/hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html
https://www.informit.com/articles/article.aspx?p=2755708&seqNum=4
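
For instance, a space quota can be set on the directory the application writes to (a sketch; the path /user/app/data is just an example):

  # Limit the total space (counting replication) used under the directory to 10 GB
  hdfs dfsadmin -setSpaceQuota 10g /user/app/data

  # Optionally limit usage of a particular storage type (e.g. DISK)
  hdfs dfsadmin -setSpaceQuota 10g -storageType DISK /user/app/data

  # Check the current quota and usage
  hdfs dfs -count -q -h /user/app/data

Once the quota is exceeded, further writes into that directory fail with a quota exception; nothing is deleted automatically.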
