Posted to users@nifi.apache.org by Dennis N Brown <db...@lenovo.com> on 2021/01/12 17:12:05 UTC

RE: [External] Re: Having an issue with large files and PutS3Object

Thanks Mark,  The “Multipart Threshold” defaulted to 5 GB; I have now lowered it to 4.8 GB to see if that makes any difference, and it does not.  It seems to me that NiFi should detect the file size and initiate the multipart upload without even attempting a normal S3 PutObject.  But I’m not seeing any “multipart” messages in the error (as I have seen in other posts about multipart uploads).

The Cloudian implementation appears to use the AWS libraries, since all of the messages mention Amazon or AWS, and the Cloudian documentation also states the 5 GB limit for objects uploaded without multipart.

Regards,

Dennis N Brown

From: Mark Payne <ma...@hotmail.com>
Sent: January 12, 2021 11:53
To: users@nifi.apache.org
Subject: [External] Re: Having an issue with large files and PutS3Object

Dennis,

It appears that the PutS3Object processor looks at the size of the FlowFile and compares it to the value of the “Multipart Threshold” property. If the FlowFile is larger than that, it will use a multipart upload with the configured part size. I’m not familiar with the Cloudian implementation, but might it have different limits than S3? What do you have configured for the Multipart Threshold?
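
In rough terms, the check amounts to the following (a minimal sketch only; the class and method names here are illustrative, not NiFi’s actual source):

// Sketch of the threshold decision described above.  Names are
// illustrative, not taken from NiFi's source code.
public class MultipartThresholdSketch {
    private static final long GIB = 1024L * 1024L * 1024L;

    // Only FlowFiles strictly larger than the threshold take the multipart path.
    static boolean useMultipartUpload(long flowFileSizeBytes, long thresholdBytes) {
        return flowFileSizeBytes > thresholdBytes;
    }

    public static void main(String[] args) {
        long fileSize = 5_109_625_339L;   // size from the error log, about 4.76 GiB
        long threshold = 5L * GIB;        // the default "5 GB" setting (NiFi data sizes are binary units)
        // Prints false: the file is below the threshold, so a plain PutObject is attempted.
        System.out.println(useMultipartUpload(fileSize, threshold));
    }
}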

Thanks
-Mark

On Jan 12, 2021, at 11:32 AM, Dennis N Brown <db...@lenovo.com> wrote:

Hello,  I’m having an issue attempting to upload a large file (5.1 GB) to S3 storage (not AWS, but rather a Cloudian implementation).

From everything I’ve read, NiFi is supposed to fall back to a multipart upload when the file size is greater than the “Multipart Threshold” defined in the PutS3Object processor.  That is not happening for me; it just errors out with this message:

ERROR o.a.nifi.processors.aws.s3.PutS3Object PutS3Object[id=cd683449-d9b3-1ce2-85ae-a0d900cfd488] Failed to put StandardFlowFileRecord[uuid=74a8d054-53cb-44d7-aca1-dabd94b50781,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1610459752598-174464, container=default, section=384], offset=59300, length=5109625339],offset=0,name=6477482,size=5109625339] to Amazon S3 due to com.amazonaws.services.s3.model.AmazonS3Exception: Your proposed upload exceeds the maximum allowed object size. (Service: Amazon S3; Status Code: 400; Error Code: EntityTooLarge; Request ID: 2f967706-9745-1564-a246-0a94ef6266cb; S3 Extended Request ID: a647d24f02954de69d161d24c3e48081), S3 Extended Request ID: a647d24f02954de69d161d24c3e48081: com.amazonaws.services.s3.model.AmazonS3Exception: Your proposed upload exceeds the maximum allowed object size. (Service: Amazon S3; Status Code: 400; Error Code: EntityTooLarge; Request ID: 2f967706-9745-1564-a246-0a94ef6266cb; S3 Extended Request ID: a647d24f02954de69d161d24c3e48081), S3 Extended Request ID: a647d24f02954de69d161d24c3e48081
com.amazonaws.services.s3.model.AmazonS3Exception: Your proposed upload exceeds the maximum allowed object size. (Service: Amazon S3; Status Code: 400; Error Code: EntityTooLarge; Request ID: 2f967706-9745-1564-a246-0a94ef6266cb; S3 Extended Request ID: a647d24f02954de69d161d24c3e48081)

So my question is: is NiFi supposed to detect the large file and initiate the multipart upload itself, or is the server supposed to respond in a way that causes NiFi to react to the size limit?

Regards,

Dennis N Brown


RE: [External] Re: Having an issue with large files and PutS3Object

Posted by Dennis N Brown <db...@lenovo.com>.
Thanks so much Paul!  That did the trick!  Setting the “Multipart Threshold” to 4.6 GB in NiFi made the upload work!

Regards,

Dennis N Brown




Re: [External] Re: Having an issue with large files and PutS3Object

Posted by Paul Kelly <pk...@gmail.com>.
Correction... try dropping to 4.6 GB: 5.0 GB is 5,000,000,000 bytes, and 5,000,000,000 / 1,073,741,824 ≈ 4.65 GiB, so you need the NiFi threshold to be smaller than 4.65.


Re: [External] Re: Having an issue with large files and PutS3Object

Posted by Paul Kelly <pk...@gmail.com>.
Hi Dennis,

One thing to check... is your file 5.1 gibibytes or gigabytes?  NiFi’s and
Amazon’s multipart limits are in GiB, while Cloudian appears to use GB,
based on this GitHub thread [1].  5.1 GB is 4.75 GiB, so I’m thinking NiFi
sees your file as 4.75 GiB while your threshold is "4.8GB" (which is really
GiB), so it does not attempt a multipart upload; but 4.75 GiB = 5.1 GB,
which is above Cloudian’s single PUT operation limit, so the Cloudian
server rejects it.  Could you try dropping your limit to "4.7 GB" (really
interpreted as 4.7 GiB in NiFi) to see if that works?  That should put your
file above NiFi’s multipart threshold, which will invoke the multipart
upload functionality.

[1] https://github.com/kahing/goofys/issues/139
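
A quick way to sanity-check the unit math (a minimal sketch; the constants
are just the standard decimal and binary size definitions):

// Decimal (GB) vs. binary (GiB) conversions for the sizes in this thread.
public class UnitCheck {
    static final double GB  = 1_000_000_000.0;           // decimal gigabyte
    static final double GIB = 1024.0 * 1024.0 * 1024.0;  // binary gibibyte

    public static void main(String[] args) {
        System.out.printf("5.1 GB  = %.2f GiB%n", 5.1 * GB / GIB);  // ~4.75 GiB (the file)
        System.out.printf("5.0 GB  = %.2f GiB%n", 5.0 * GB / GIB);  // ~4.66 GiB (Cloudian's single-PUT limit)
        System.out.printf("4.8 GiB = %.2f GB%n", 4.8 * GIB / GB);   // ~5.15 GB (the first threshold tried)
    }
}

In other words, a threshold entered as "4.8 GB" is really about 5.15
decimal GB, just above the 5.1 GB file, which is why multipart never
kicked in.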

Paul


RE: [External] Re: Having an issue with large files and PutS3Object

Posted by Dennis N Brown <db...@lenovo.com>.
Mark,  Sorry… also NiFi version 1.12.1

Regards,

Dennis N Brown




RE: [External] Re: Having an issue with large files and PutS3Object

Posted by Dennis N Brown <db...@lenovo.com>.
Hi Mark,  Here is the full error and stack trace:

ERROR o.a.nifi.processors.aws.s3.PutS3Object PutS3Object[id=cd683449-d9b3-1ce2-85ae-a0d900cfd488] Failed to put StandardFlowFileRecord[uuid=74a8d054-53cb-44d7-aca1-dabd94b50781,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1610459752598-174464, container=default, section=384], offset=59300, length=5109625339],offset=0,name=6477482,size=5109625339] to Amazon S3 due to com.amazonaws.services.s3.model.AmazonS3Exception: Your proposed upload exceeds the maximum allowed object size. (Service: Amazon S3; Status Code: 400; Error Code: EntityTooLarge; Request ID: 2f967706-9745-1564-a246-0a94ef6266cb; S3 Extended Request ID: a647d24f02954de69d161d24c3e48081), S3 Extended Request ID: a647d24f02954de69d161d24c3e48081: com.amazonaws.services.s3.model.AmazonS3Exception: Your proposed upload exceeds the maximum allowed object size. (Service: Amazon S3; Status Code: 400; Error Code: EntityTooLarge; Request ID: 2f967706-9745-1564-a246-0a94ef6266cb; S3 Extended Request ID: a647d24f02954de69d161d24c3e48081), S3 Extended Request ID: a647d24f02954de69d161d24c3e48081
com.amazonaws.services.s3.model.AmazonS3Exception: Your proposed upload exceeds the maximum allowed object size. (Service: Amazon S3; Status Code: 400; Error Code: EntityTooLarge; Request ID: 2f967706-9745-1564-a246-0a94ef6266cb; S3 Extended Request ID: a647d24f02954de69d161d24c3e48081)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1712)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1367)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4926)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4872)
        at com.amazonaws.services.s3.AmazonS3Client.access$300(AmazonS3Client.java:390)
        at com.amazonaws.services.s3.AmazonS3Client$PutObjectStrategy.invokeServiceCall(AmazonS3Client.java:5806)
        at com.amazonaws.services.s3.AmazonS3Client.uploadObject(AmazonS3Client.java:1794)
        at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1754)
        at org.apache.nifi.processors.aws.s3.PutS3Object$1.process(PutS3Object.java:563)
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2324)
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2292)
        at org.apache.nifi.processors.aws.s3.PutS3Object.onTrigger(PutS3Object.java:474)
        at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
        at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1174)
        at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
        at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
        at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
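
(Note the call path in the trace: it goes through PutS3Object$1.process into AmazonS3Client.putObject, the single-object PUT, rather than any multipart API, which is consistent with NiFi deciding the file was under the configured multipart threshold.)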

Regards,

Dennis N Brown




Re: [External] Re: Having an issue with large files and PutS3Object

Posted by Mark Payne <ma...@hotmail.com>.
Dennis,

Do your logs have any stack traces in them? That would probably help to understand what’s happening pretty quickly. Also, which version of NiFi are you running?

Thanks
-Mark

