Posted to users@nifi.apache.org by Lorenzo Peder <Lo...@ds-iq.com> on 2016/09/02 21:19:33 UTC

Issue writing file (~50mb) to azure data lake with Nifi

Hi All,

We've run into an issue uploading a larger file (~50 MB) into an Azure Data Lake using a custom processor in NiFi 0.7-1.0. This custom processor has worked consistently for smaller files, but with this larger file it fails with HTTP error 404 (Not Found). Eventually only a small portion of the file was written to the data lake.
We used Fiddler to capture network traffic between NiFi and the Azure Data Lake while the processor was running, and observed HTTP status 204 (No Content).
Then we wrote a Java SDK script to upload the same file into the data lake without NiFi, and it worked successfully.
These findings lead us to believe the issue is occurring within NiFi. If someone could point us in the right direction toward resolving it, that would be greatly appreciated.

Thank you,

Lorenzo Peder
Operations Analyst, Campaign Operations & Services

425.974.1363 : Office
425.260.5027 : Mobile
www.ds-iq.com

Dynamic Shopper Intelligence

This e-mail may contain confidential or privileged information.
If you are not the intended recipient, please notify the sender immediately and then delete this message.



Re: Issue writing file (~50mb) to azure data lake with Nifi

Posted by Joe Witt <jo...@gmail.com>.
Right.  So NiFi is giving access to an InputStream.  Things like mark and
reset type operations will need to be wrapped.  If you have access to the
client code being used or a pointer to it that would be helpful in terms of
figuring out what it is doing.  I am confident this is not any sort of
issue with NiFi itself but we can still try to help find the root issue.
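[To illustrate the point about wrapping: the stream a NiFi processor reads from is a plain InputStream, and a client SDK that calls mark()/reset() on it (for example, to rewind and retry a failed chunk) needs those calls to be supported. Below is a minimal, hypothetical sketch of that wrapping; the class and method names are invented for the example and are not from the NiFi or Azure APIs.]

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class MarkResetWrapper {

    // Ensure the stream handed to a client SDK supports mark/reset by
    // wrapping it in a BufferedInputStream when it does not already.
    static InputStream markable(InputStream raw) {
        return raw.markSupported() ? raw : new BufferedInputStream(raw);
    }

    // Simulates an SDK that marks before an upload attempt, "fails", then
    // resets and retries: the retry must see the same bytes again.
    static byte[] readWithRetry(InputStream raw, int chunk) throws IOException {
        InputStream in = markable(raw);
        in.mark(chunk);                  // remember position before the attempt
        byte[] buf = new byte[chunk];
        in.read(buf, 0, chunk);          // first attempt (pretend it failed)
        in.reset();                      // rewind to the marked position
        int n = in.read(buf, 0, chunk);  // retry reads the identical bytes
        return Arrays.copyOf(buf, Math.max(n, 0));
    }

    public static void main(String[] args) throws IOException {
        InputStream raw = new ByteArrayInputStream("hello world".getBytes());
        System.out.println(new String(readWithRetry(raw, 5))); // prints "hello"
    }
}
```

[In a real processor the stream would come from the ProcessSession's read callback and be passed to the Azure client; the wrapping idea is the same.]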

thanks

On Wed, Sep 7, 2016 at 10:05 AM, Tony Kurc <tr...@gmail.com> wrote:

> I won't have access to an IDE for most of today, or an active azure
> account, so I won't be able to debug myself in short order, but I'm
> suspecting the same thing joe alluded to at this point - this being an
> issue with the client API, possibly complicated by the way NiFi allows
> access to the underlying bytes of the file.
>
> On Tue, Sep 6, 2016 at 6:21 PM, Kumiko Yada <Ku...@ds-iq.com> wrote:
>
>> Lorenzo was not clear how we tested.  I wrote the sample java program
>> using the Azure SDK, then uploaded the 50 MB file and it’s worked without
>> error.   Nifi custom processor used the same SDK code; however, it’s
>> failing when the custom processor is tried to uploaded 50 MB file.
>>
>>
>>
>> Thanks
>>
>> Kumiko
>>
>>
>>
>> *From:* Tony Kurc [mailto:trkurc@gmail.com]
>> *Sent:* Tuesday, September 6, 2016 3:02 PM
>> *To:* Kumiko Yada <Ku...@ds-iq.com>
>> *Cc:* users@nifi.apache.org; Joe Witt <jo...@gmail.com>; #Operations
>> Automation and Tools <#o...@ds-iq.com>; Kevin Verhoeven <
>> Kevin.Verhoeven@ds-iq.com>
>>
>> *Subject:* RE: Issue writing file (~50mb) to azure data lake with Nifi
>>
>>
>>
>> I apologize if I'm missing something, as I'm trying to read the code on
>> my phone, but it looks like the test script is using a different api call
>> to perform the upload - did you already test using the same call in your
>> script?
>>
>>
>>
>> On Sep 6, 2016 5:48 PM, "Kumiko Yada" <Ku...@ds-iq.com> wrote:
>>
>> Here is the code:  https://github.com/kyada1/ADL_UploadFile.
>>
>>
>>
>> I removed the following values for the security reason.
>>
>>
>>
>> final static String ADLS_ACCOUNT_NAME = "";
>> final static String RESOURCE_GROUP_NAME = "";
>> final static String LOCATION = "";
>> final static String TENANT_ID = "";
>> final static String SUBSCRIPTION_ID = "";
>> final static String CLIENT_ID = "";
>> final static String CLIENT_SECRET = "";
>>
>>
>>
>> Thanks
>>
>> Kumiko
>>
>>
>>
>> *From:* Tony Kurc [mailto:trkurc@gmail.com]
>> *Sent:* Tuesday, September 6, 2016 2:41 PM
>> *To:* Kumiko Yada <Ku...@ds-iq.com>
>> *Cc:* Joe Witt <jo...@gmail.com>; users@nifi.apache.org; #Operations
>> Automation and Tools <#o...@ds-iq.com>; Kevin Verhoeven <
>> Kevin.Verhoeven@ds-iq.com>
>> *Subject:* RE: Issue writing file (~50mb) to azure data lake with Nifi
>>
>>
>>
>> I was referring to this: "Then we wrote a java sdk script to upload this
>> same file without Nifi into the data lake and it worked successfully."
>>
>> Is that code somewhere?
>>
>>
>>
>> On Sep 6, 2016 5:38 PM, "Kumiko Yada" <Ku...@ds-iq.com> wrote:
>>
>> I didn’t add any test code.  This custom controller and processor is
>> working for a small size file.
>>
>>
>>
>> Thanks
>>
>> Kumiko
>>
>>
>>
>> *From:* Tony Kurc [mailto:trkurc@gmail.com]
>> *Sent:* Tuesday, September 6, 2016 2:32 PM
>> *To:* users@nifi.apache.org
>> *Cc:* Joe Witt <jo...@gmail.com>; #Operations Automation and Tools <#
>> opstools@ds-iq.com>; Kevin Verhoeven <Ke...@ds-iq.com>
>> *Subject:* RE: Issue writing file (~50mb) to azure data lake with Nifi
>>
>>
>>
>> I didn't see the test script that worked in the source code - did I miss
>> it, or is it not in the tree?
>>
>>
>>
>> On Sep 6, 2016 3:17 PM, "Kumiko Yada" <Ku...@ds-iq.com> wrote:
>>
>> Joe,
>>
>>
>>
>> Here is the log (there was no callstack related to this error) and code,
>> https://github.com/kyada1/dl_sdkworkaround/tree/master/nifi-
>> azure-dlstore-bundle.
>>
>>
>>
>> 2016-09-06 12:06:50,508 INFO [NiFi Web Server-19]
>> c.s.j.s.i.application.WebApplicationImpl Initiating Jersey application,
>> version 'Jersey: 1.19 02/11/2015 03:25 AM'
>>
>> 2016-09-06 12:07:00,991 INFO [StandardProcessScheduler Thread-1]
>> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
>> PutFileAzureDLStore[id=00dd95dc-0157-1000-8ab1-2de88c159b55] to run with
>> 1 threads
>>
>> 2016-09-06 12:07:01,545 INFO [Flow Service Tasks Thread-1]
>> o.a.nifi.controller.StandardFlowService Saved flow controller
>> org.apache.nifi.controller.FlowController@28f414a1 // Another save
>> pending = false
>>
>> 2016-09-06 12:07:01,904 INFO [pool-27-thread-1]
>> c.m.aad.adal4j.AuthenticationAuthority [Correlation ID:
>> 564fb5ec-643b-43e6-ab68-59f259a4843a] Instance discovery was successful
>>
>> 2016-09-06 12:08:05,988 ERROR [Timer-Driven Process Thread-1]
>> n.a.d.processors.PutFileAzureDLStore PutFileAzureDLStore[id=00dd95dc-0157-1000-8ab1-2de88c159b55]
>> File was not created: /kumiko/test/20160906120701022.txt
>> com.microsoft.azure.management.datalake.store.models.AdlsErrorException:
>> Invalid status code 404
>>
>> 2016-09-06 12:08:12,541 INFO [Provenance Maintenance Thread-1]
>> o.a.n.p.PersistentProvenanceRepository Created new Provenance Event
>> Writers for events starting with ID 618
>>
>> 2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1]
>> o.a.n.p.PersistentProvenanceRepository Successfully merged 16 journal
>> files (1 records) into single Provenance Log File .\provenance_repository\
>> 617.prov in 305 milliseconds
>>
>> 2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1]
>> o.a.n.p.PersistentProvenanceRepository Successfully Rolled over
>> Provenance Event file containing 1 records
>>
>> 2016-09-06 12:08:21,148 INFO [NiFi Web Server-28]
>> o.a.n.controller.StandardProcessorNode Stopping processor: class
>> nifi.azure.dlstore.processors.PutFileAzureDLStore
>>
>>
>>
>> Thanks
>>
>> Kumiko
>>
>>
>>
>> *From:* Joe Witt [mailto:joe.witt@gmail.com]
>> *Sent:* Friday, September 2, 2016 2:56 PM
>> *To:* users@nifi.apache.org
>> *Cc:* #Operations Automation and Tools <#o...@ds-iq.com>; Kevin
>> Verhoeven <Ke...@ds-iq.com>
>> *Subject:* Re: Issue writing file (~50mb) to azure data lake with Nifi
>>
>>
>>
>> Lorenzo
>>
>> Without seeing the code and logs it would be very difficult to help.
>> nifi has no trouble by design writing large files (GBs) to many things
>> including hdfs so the issue is probably in how this client library
>> interacts with the data stream.
>>
>>
>>
>> On Sep 2, 2016 4:19 PM, "Lorenzo Peder" <Lo...@ds-iq.com> wrote:
>>
>> Hi All,
>>
>>
>>
>> We’ve run into an issue uploading a larger file (~50Mb) into an Azure
>> Data Lake using a custom processor in nifi 0.7-1.0. This custom processor
>> has worked consistently for smaller files, but once encountered with this
>> larger file, it spits http error 404 (file not found). Eventually a minor
>> portion of the file wrote to the data lake.
>>
>> We used fiddler to capture network traffic between Nifi and the Azure
>> data lake while the processor was running and captured http error 204 (no
>> contents).
>>
>> Then we wrote a java sdk script to upload this same file without Nifi
>> into the data lake and it worked successfully.
>>
>> These findings lead us to believe that this issue is occurring within
>> Nifi, if someone could please point us in the right direction in resolving
>> this issue it would be greatly appreciated.
>>
>>
>>
>> Thank you,
>>
>>
>>
>> Lorenzo Peder
>>
>> Operations Analyst, Campaign Operations & Services
>>
>>
>>
>> 425.974.1363 : Office
>>
>> 425.260.5027 : Mobile
>>
>> www.ds-iq.com
>>
>>
>>
>> Dynamic Shopper Intelligence
>>
>>
>>
>> This e-mail may contain confidential or privileged information.
>>
>> If you are not the intended recipient, please notify the sender
>> immediately and then delete this message.
>>
>>
>>
>>
>>
>>
>

Re: Issue writing file (~50mb) to azure data lake with Nifi

Posted by Tony Kurc <tr...@gmail.com>.
I won't have access to an IDE for most of today, or an active azure
account, so I won't be able to debug myself in short order, but I'm
suspecting the same thing joe alluded to at this point - this being an
issue with the client API, possibly complicated by the way NiFi allows
access to the underlying bytes of the file.

On Tue, Sep 6, 2016 at 6:21 PM, Kumiko Yada <Ku...@ds-iq.com> wrote:

> Lorenzo was not clear how we tested.  I wrote the sample java program
> using the Azure SDK, then uploaded the 50 MB file and it’s worked without
> error.   Nifi custom processor used the same SDK code; however, it’s
> failing when the custom processor is tried to uploaded 50 MB file.
>
>
>
> Thanks
>
> Kumiko
>
>
>
> *From:* Tony Kurc [mailto:trkurc@gmail.com]
> *Sent:* Tuesday, September 6, 2016 3:02 PM
> *To:* Kumiko Yada <Ku...@ds-iq.com>
> *Cc:* users@nifi.apache.org; Joe Witt <jo...@gmail.com>; #Operations
> Automation and Tools <#o...@ds-iq.com>; Kevin Verhoeven <
> Kevin.Verhoeven@ds-iq.com>
>
> *Subject:* RE: Issue writing file (~50mb) to azure data lake with Nifi
>
>
>
> I apologize if I'm missing something, as I'm trying to read the code on my
> phone, but it looks like the test script is using a different api call to
> perform the upload - did you already test using the same call in your
> script?
>
>
>
> On Sep 6, 2016 5:48 PM, "Kumiko Yada" <Ku...@ds-iq.com> wrote:
>
> Here is the code:  https://github.com/kyada1/ADL_UploadFile.
>
>
>
> I removed the following values for the security reason.
>
>
>
> final static String ADLS_ACCOUNT_NAME = "";
> final static String RESOURCE_GROUP_NAME = "";
> final static String LOCATION = "";
> final static String TENANT_ID = "";
> final static String SUBSCRIPTION_ID = "";
> final static String CLIENT_ID = "";
> final static String CLIENT_SECRET = "";
>
>
>
> Thanks
>
> Kumiko
>
>
>
> *From:* Tony Kurc [mailto:trkurc@gmail.com]
> *Sent:* Tuesday, September 6, 2016 2:41 PM
> *To:* Kumiko Yada <Ku...@ds-iq.com>
> *Cc:* Joe Witt <jo...@gmail.com>; users@nifi.apache.org; #Operations
> Automation and Tools <#o...@ds-iq.com>; Kevin Verhoeven <
> Kevin.Verhoeven@ds-iq.com>
> *Subject:* RE: Issue writing file (~50mb) to azure data lake with Nifi
>
>
>
> I was referring to this: "Then we wrote a java sdk script to upload this
> same file without Nifi into the data lake and it worked successfully."
>
> Is that code somewhere?
>
>
>
> On Sep 6, 2016 5:38 PM, "Kumiko Yada" <Ku...@ds-iq.com> wrote:
>
> I didn’t add any test code.  This custom controller and processor is
> working for a small size file.
>
>
>
> Thanks
>
> Kumiko
>
>
>
> *From:* Tony Kurc [mailto:trkurc@gmail.com]
> *Sent:* Tuesday, September 6, 2016 2:32 PM
> *To:* users@nifi.apache.org
> *Cc:* Joe Witt <jo...@gmail.com>; #Operations Automation and Tools <#
> opstools@ds-iq.com>; Kevin Verhoeven <Ke...@ds-iq.com>
> *Subject:* RE: Issue writing file (~50mb) to azure data lake with Nifi
>
>
>
> I didn't see the test script that worked in the source code - did I miss
> it, or is it not in the tree?
>
>
>
> On Sep 6, 2016 3:17 PM, "Kumiko Yada" <Ku...@ds-iq.com> wrote:
>
> Joe,
>
>
>
> Here is the log (there was no callstack related to this error) and code,
> https://github.com/kyada1/dl_sdkworkaround/tree/master/
> nifi-azure-dlstore-bundle.
>
>
>
> 2016-09-06 12:06:50,508 INFO [NiFi Web Server-19] c.s.j.s.i.application.WebApplicationImpl
> Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
>
> 2016-09-06 12:07:00,991 INFO [StandardProcessScheduler Thread-1] o.a.n.c.s.TimerDrivenSchedulingAgent
> Scheduled PutFileAzureDLStore[id=00dd95dc-0157-1000-8ab1-2de88c159b55] to
> run with 1 threads
>
> 2016-09-06 12:07:01,545 INFO [Flow Service Tasks Thread-1]
> o.a.nifi.controller.StandardFlowService Saved flow controller
> org.apache.nifi.controller.FlowController@28f414a1 // Another save
> pending = false
>
> 2016-09-06 12:07:01,904 INFO [pool-27-thread-1] c.m.aad.adal4j.AuthenticationAuthority
> [Correlation ID: 564fb5ec-643b-43e6-ab68-59f259a4843a] Instance discovery
> was successful
>
> 2016-09-06 12:08:05,988 ERROR [Timer-Driven Process Thread-1]
> n.a.d.processors.PutFileAzureDLStore PutFileAzureDLStore[id=
> 00dd95dc-0157-1000-8ab1-2de88c159b55] File was not created: /kumiko/test/20160906120701022.txt
> com.microsoft.azure.management.datalake.store.models.AdlsErrorException:
> Invalid status code 404
>
> 2016-09-06 12:08:12,541 INFO [Provenance Maintenance Thread-1] o.a.n.p.PersistentProvenanceRepository
> Created new Provenance Event Writers for events starting with ID 618
>
> 2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1]
> o.a.n.p.PersistentProvenanceRepository Successfully merged 16 journal
> files (1 records) into single Provenance Log File
> .\provenance_repository\617.prov in 305 milliseconds
>
> 2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1]
> o.a.n.p.PersistentProvenanceRepository Successfully Rolled over
> Provenance Event file containing 1 records
>
> 2016-09-06 12:08:21,148 INFO [NiFi Web Server-28] o.a.n.controller.StandardProcessorNode
> Stopping processor: class nifi.azure.dlstore.processors.
> PutFileAzureDLStore
>
>
>
> Thanks
>
> Kumiko
>
>
>
> *From:* Joe Witt [mailto:joe.witt@gmail.com]
> *Sent:* Friday, September 2, 2016 2:56 PM
> *To:* users@nifi.apache.org
> *Cc:* #Operations Automation and Tools <#o...@ds-iq.com>; Kevin
> Verhoeven <Ke...@ds-iq.com>
> *Subject:* Re: Issue writing file (~50mb) to azure data lake with Nifi
>
>
>
> Lorenzo
>
> Without seeing the code and logs it would be very difficult to help.
> nifi has no trouble by design writing large files (GBs) to many things
> including hdfs so the issue is probably in how this client library
> interacts with the data stream.
>
>
>
> On Sep 2, 2016 4:19 PM, "Lorenzo Peder" <Lo...@ds-iq.com> wrote:
>
> Hi All,
>
>
>
> We’ve run into an issue uploading a larger file (~50Mb) into an Azure Data
> Lake using a custom processor in nifi 0.7-1.0. This custom processor has
> worked consistently for smaller files, but once encountered with this
> larger file, it spits http error 404 (file not found). Eventually a minor
> portion of the file wrote to the data lake.
>
> We used fiddler to capture network traffic between Nifi and the Azure data
> lake while the processor was running and captured http error 204 (no
> contents).
>
> Then we wrote a java sdk script to upload this same file without Nifi into
> the data lake and it worked successfully.
>
> These findings lead us to believe that this issue is occurring within
> Nifi, if someone could please point us in the right direction in resolving
> this issue it would be greatly appreciated.
>
>
>
> Thank you,
>
>
>
> Lorenzo Peder
>
> Operations Analyst, Campaign Operations & Services
>
>
>
> 425.974.1363 : Office
>
> 425.260.5027 : Mobile
>
> www.ds-iq.com
>
>
>
> Dynamic Shopper Intelligence
>
>
>
> This e-mail may contain confidential or privileged information.
>
> If you are not the intended recipient, please notify the sender
> immediately and then delete this message.
>
>
>
>
>
>

RE: Issue writing file (~50mb) to azure data lake with Nifi

Posted by Kumiko Yada <Ku...@ds-iq.com>.
Lorenzo was not clear about how we tested. I wrote a sample Java program using the Azure SDK, then uploaded the 50 MB file, and it worked without error. The NiFi custom processor uses the same SDK code; however, it fails when it tries to upload the 50 MB file.

Thanks
Kumiko

From: Tony Kurc [mailto:trkurc@gmail.com]
Sent: Tuesday, September 6, 2016 3:02 PM
To: Kumiko Yada <Ku...@ds-iq.com>
Cc: users@nifi.apache.org; Joe Witt <jo...@gmail.com>; #Operations Automation and Tools <#o...@ds-iq.com>; Kevin Verhoeven <Ke...@ds-iq.com>
Subject: RE: Issue writing file (~50mb) to azure data lake with Nifi


I apologize if I'm missing something, as I'm trying to read the code on my phone, but it looks like the test script is using a different api call to perform the upload - did you already test using the same call in your script?

On Sep 6, 2016 5:48 PM, "Kumiko Yada" <Ku...@ds-iq.com>> wrote:
Here is the code:  https://github.com/kyada1/ADL_UploadFile.

I removed the following values for the security reason.

final static String ADLS_ACCOUNT_NAME = "";
final static String RESOURCE_GROUP_NAME = "";
final static String LOCATION = "";
final static String TENANT_ID = "";
final static String SUBSCRIPTION_ID =  "";
final static String CLIENT_ID = "";
final static String CLIENT_SECRET = "";

Thanks
Kumiko

From: Tony Kurc [mailto:trkurc@gmail.com<ma...@gmail.com>]
Sent: Tuesday, September 6, 2016 2:41 PM
To: Kumiko Yada <Ku...@ds-iq.com>>
Cc: Joe Witt <jo...@gmail.com>>; users@nifi.apache.org<ma...@nifi.apache.org>; #Operations Automation and Tools <#o...@ds-iq.com>>; Kevin Verhoeven <Ke...@ds-iq.com>>
Subject: RE: Issue writing file (~50mb) to azure data lake with Nifi


I was referring to this: "Then we wrote a java sdk script to upload this same file without Nifi into the data lake and it worked successfully."

Is that code somewhere?

On Sep 6, 2016 5:38 PM, "Kumiko Yada" <Ku...@ds-iq.com>> wrote:
I didn’t add any test code.  This custom controller and processor is working for a small size file.

Thanks
Kumiko

From: Tony Kurc [mailto:trkurc@gmail.com<ma...@gmail.com>]
Sent: Tuesday, September 6, 2016 2:32 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Cc: Joe Witt <jo...@gmail.com>>; #Operations Automation and Tools <#o...@ds-iq.com>>; Kevin Verhoeven <Ke...@ds-iq.com>>
Subject: RE: Issue writing file (~50mb) to azure data lake with Nifi


I didn't see the test script that worked in the source code - did I miss it, or is it not in the tree?

On Sep 6, 2016 3:17 PM, "Kumiko Yada" <Ku...@ds-iq.com>> wrote:
Joe,

Here is the log (there was no callstack related to this error) and code, https://github.com/kyada1/dl_sdkworkaround/tree/master/nifi-azure-dlstore-bundle.

2016-09-06 12:06:50,508 INFO [NiFi Web Server-19] c.s.j.s.i.application.WebApplicationImpl Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
2016-09-06 12:07:00,991 INFO [StandardProcessScheduler Thread-1] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled PutFileAzureDLStore[id=00dd95dc-0157-1000-8ab1-2de88c159b55] to run with 1 threads
2016-09-06 12:07:01,545 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@28f414a1<ma...@28f414a1> // Another save pending = false
2016-09-06 12:07:01,904 INFO [pool-27-thread-1] c.m.aad.adal4j.AuthenticationAuthority [Correlation ID: 564fb5ec-643b-43e6-ab68-59f259a4843a] Instance discovery was successful
2016-09-06 12:08:05,988 ERROR [Timer-Driven Process Thread-1] n.a.d.processors.PutFileAzureDLStore PutFileAzureDLStore[id=00dd95dc-0157-1000-8ab1-2de88c159b55] File was not created: /kumiko/test/20160906120701022.txt com.microsoft.azure.management.datalake.store.models.AdlsErrorException: Invalid status code 404
2016-09-06 12:08:12,541 INFO [Provenance Maintenance Thread-1] o.a.n.p.PersistentProvenanceRepository Created new Provenance Event Writers for events starting with ID 618
2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1] o.a.n.p.PersistentProvenanceRepository Successfully merged 16 journal files (1 records) into single Provenance Log File .\provenance_repository\617.prov in 305 milliseconds
2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1] o.a.n.p.PersistentProvenanceRepository Successfully Rolled over Provenance Event file containing 1 records
2016-09-06 12:08:21,148 INFO [NiFi Web Server-28] o.a.n.controller.StandardProcessorNode Stopping processor: class nifi.azure.dlstore.processors.PutFileAzureDLStore

Thanks
Kumiko

From: Joe Witt [mailto:joe.witt@gmail.com<ma...@gmail.com>]
Sent: Friday, September 2, 2016 2:56 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Cc: #Operations Automation and Tools <#o...@ds-iq.com>>; Kevin Verhoeven <Ke...@ds-iq.com>>
Subject: Re: Issue writing file (~50mb) to azure data lake with Nifi


Lorenzo

Without seeing the code and logs it would be very difficult to help.   nifi has no trouble by design writing large files (GBs) to many things including hdfs so the issue is probably in how this client library interacts with the data stream.

On Sep 2, 2016 4:19 PM, "Lorenzo Peder" <Lo...@ds-iq.com>> wrote:
Hi All,

We’ve run into an issue uploading a larger file (~50Mb) into an Azure Data Lake using a custom processor in nifi 0.7-1.0. This custom processor has worked consistently for smaller files, but once encountered with this larger file, it spits http error 404 (file not found). Eventually a minor portion of the file wrote to the data lake.
We used fiddler to capture network traffic between Nifi and the Azure data lake while the processor was running and captured http error 204 (no contents).
Then we wrote a java sdk script to upload this same file without Nifi into the data lake and it worked successfully.
These findings lead us to believe that this issue is occurring within Nifi, if someone could please point us in the right direction in resolving this issue it would be greatly appreciated.

Thank you,

Lorenzo Peder
Operations Analyst, Campaign Operations & Services

425.974.1363 : Office
425.260.5027 : Mobile
www.ds-iq.com

Dynamic Shopper Intelligence

This e-mail may contain confidential or privileged information.
If you are not the intended recipient, please notify the sender immediately and then delete this message.



RE: Issue writing file (~50mb) to azure data lake with Nifi

Posted by Tony Kurc <tr...@gmail.com>.
I apologize if I'm missing something, as I'm trying to read the code on my
phone, but it looks like the test script is using a different api call to
perform the upload - did you already test using the same call in your
script?

On Sep 6, 2016 5:48 PM, "Kumiko Yada" <Ku...@ds-iq.com> wrote:

> Here is the code:  https://github.com/kyada1/ADL_UploadFile.
>
>
>
> I removed the following values for the security reason.
>
>
>
> final static String ADLS_ACCOUNT_NAME = "";
> final static String RESOURCE_GROUP_NAME = "";
> final static String LOCATION = "";
> final static String TENANT_ID = "";
> final static String SUBSCRIPTION_ID = "";
> final static String CLIENT_ID = "";
> final static String CLIENT_SECRET = "";
>
>
>
> Thanks
>
> Kumiko
>
>
>
> *From:* Tony Kurc [mailto:trkurc@gmail.com]
> *Sent:* Tuesday, September 6, 2016 2:41 PM
> *To:* Kumiko Yada <Ku...@ds-iq.com>
> *Cc:* Joe Witt <jo...@gmail.com>; users@nifi.apache.org; #Operations
> Automation and Tools <#o...@ds-iq.com>; Kevin Verhoeven <
> Kevin.Verhoeven@ds-iq.com>
> *Subject:* RE: Issue writing file (~50mb) to azure data lake with Nifi
>
>
>
> I was referring to this: "Then we wrote a java sdk script to upload this
> same file without Nifi into the data lake and it worked successfully."
>
> Is that code somewhere?
>
>
>
> On Sep 6, 2016 5:38 PM, "Kumiko Yada" <Ku...@ds-iq.com> wrote:
>
> I didn’t add any test code.  This custom controller and processor is
> working for a small size file.
>
>
>
> Thanks
>
> Kumiko
>
>
>
> *From:* Tony Kurc [mailto:trkurc@gmail.com]
> *Sent:* Tuesday, September 6, 2016 2:32 PM
> *To:* users@nifi.apache.org
> *Cc:* Joe Witt <jo...@gmail.com>; #Operations Automation and Tools <#
> opstools@ds-iq.com>; Kevin Verhoeven <Ke...@ds-iq.com>
> *Subject:* RE: Issue writing file (~50mb) to azure data lake with Nifi
>
>
>
> I didn't see the test script that worked in the source code - did I miss
> it, or is it not in the tree?
>
>
>
> On Sep 6, 2016 3:17 PM, "Kumiko Yada" <Ku...@ds-iq.com> wrote:
>
> Joe,
>
>
>
> Here is the log (there was no callstack related to this error) and code,
> https://github.com/kyada1/dl_sdkworkaround/tree/master/
> nifi-azure-dlstore-bundle.
>
>
>
> 2016-09-06 12:06:50,508 INFO [NiFi Web Server-19] c.s.j.s.i.application.WebApplicationImpl
> Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
>
> 2016-09-06 12:07:00,991 INFO [StandardProcessScheduler Thread-1] o.a.n.c.s.TimerDrivenSchedulingAgent
> Scheduled PutFileAzureDLStore[id=00dd95dc-0157-1000-8ab1-2de88c159b55] to
> run with 1 threads
>
> 2016-09-06 12:07:01,545 INFO [Flow Service Tasks Thread-1]
> o.a.nifi.controller.StandardFlowService Saved flow controller
> org.apache.nifi.controller.FlowController@28f414a1 // Another save
> pending = false
>
> 2016-09-06 12:07:01,904 INFO [pool-27-thread-1] c.m.aad.adal4j.AuthenticationAuthority
> [Correlation ID: 564fb5ec-643b-43e6-ab68-59f259a4843a] Instance discovery
> was successful
>
> 2016-09-06 12:08:05,988 ERROR [Timer-Driven Process Thread-1]
> n.a.d.processors.PutFileAzureDLStore PutFileAzureDLStore[id=
> 00dd95dc-0157-1000-8ab1-2de88c159b55] File was not created: /kumiko/test/20160906120701022.txt
> com.microsoft.azure.management.datalake.store.models.AdlsErrorException:
> Invalid status code 404
>
> 2016-09-06 12:08:12,541 INFO [Provenance Maintenance Thread-1] o.a.n.p.PersistentProvenanceRepository
> Created new Provenance Event Writers for events starting with ID 618
>
> 2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1]
> o.a.n.p.PersistentProvenanceRepository Successfully merged 16 journal
> files (1 records) into single Provenance Log File
> .\provenance_repository\617.prov in 305 milliseconds
>
> 2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1]
> o.a.n.p.PersistentProvenanceRepository Successfully Rolled over
> Provenance Event file containing 1 records
>
> 2016-09-06 12:08:21,148 INFO [NiFi Web Server-28] o.a.n.controller.StandardProcessorNode
> Stopping processor: class nifi.azure.dlstore.processors.
> PutFileAzureDLStore
>
>
>
> Thanks
>
> Kumiko
>
>
>
> *From:* Joe Witt [mailto:joe.witt@gmail.com]
> *Sent:* Friday, September 2, 2016 2:56 PM
> *To:* users@nifi.apache.org
> *Cc:* #Operations Automation and Tools <#o...@ds-iq.com>; Kevin
> Verhoeven <Ke...@ds-iq.com>
> *Subject:* Re: Issue writing file (~50mb) to azure data lake with Nifi
>
>
>
> Lorenzo
>
> Without seeing the code and logs it would be very difficult to help.
> nifi has no trouble by design writing large files (GBs) to many things
> including hdfs so the issue is probably in how this client library
> interacts with the data stream.
>
>
>
> On Sep 2, 2016 4:19 PM, "Lorenzo Peder" <Lo...@ds-iq.com> wrote:
>
> Hi All,
>
>
>
> We’ve run into an issue uploading a larger file (~50Mb) into an Azure Data
> Lake using a custom processor in nifi 0.7-1.0. This custom processor has
> worked consistently for smaller files, but once encountered with this
> larger file, it spits http error 404 (file not found). Eventually a minor
> portion of the file wrote to the data lake.
>
> We used fiddler to capture network traffic between Nifi and the Azure data
> lake while the processor was running and captured http error 204 (no
> contents).
>
> Then we wrote a java sdk script to upload this same file without Nifi into
> the data lake and it worked successfully.
>
> These findings lead us to believe that this issue is occurring within
> Nifi, if someone could please point us in the right direction in resolving
> this issue it would be greatly appreciated.
>
>
>
> Thank you,
>
>
>
> Lorenzo Peder
>
> Operations Analyst, Campaign Operations & Services
>
>
>
> 425.974.1363 : Office
>
> 425.260.5027 : Mobile
>
> www.ds-iq.com
>
>
>
> Dynamic Shopper Intelligence
>
>
>
> This e-mail may contain confidential or privileged information.
>
> If you are not the intended recipient, please notify the sender
> immediately and then delete this message.
>
>
>
>
>
>

RE: Issue writing file (~50mb) to azure data lake with Nifi

Posted by Kumiko Yada <Ku...@ds-iq.com>.
Here is the code:  https://github.com/kyada1/ADL_UploadFile.

I removed the following values for the security reason.

final static String ADLS_ACCOUNT_NAME = "";
final static String RESOURCE_GROUP_NAME = "";
final static String LOCATION = "";
final static String TENANT_ID = "";
final static String SUBSCRIPTION_ID =  "";
final static String CLIENT_ID = "";
final static String CLIENT_SECRET = "";
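[An aside on the redaction above: one way to avoid hardcoding credentials and then blanking them before sharing is to read them from the environment at startup. This is a hypothetical sketch; the environment variable names are invented for the example and are not part of the Azure SDK.]

```java
public class AdlsConfig {

    // Fetch a required setting from the environment, failing fast with a
    // clear message when it is absent instead of proceeding with "".
    static String require(String name) {
        String value = System.getenv(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        // Fallback lets the demo run even when the variable is unset.
        String tenant = System.getenv().getOrDefault("ADLS_TENANT_ID", "<unset>");
        System.out.println("ADLS_TENANT_ID = " + tenant);
    }
}
```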

Thanks
Kumiko

From: Tony Kurc [mailto:trkurc@gmail.com]
Sent: Tuesday, September 6, 2016 2:41 PM
To: Kumiko Yada <Ku...@ds-iq.com>
Cc: Joe Witt <jo...@gmail.com>; users@nifi.apache.org; #Operations Automation and Tools <#o...@ds-iq.com>; Kevin Verhoeven <Ke...@ds-iq.com>
Subject: RE: Issue writing file (~50mb) to azure data lake with Nifi


I was referring to this: "Then we wrote a java sdk script to upload this same file without Nifi into the data lake and it worked successfully."

Is that code somewhere?

On Sep 6, 2016 5:38 PM, "Kumiko Yada" <Ku...@ds-iq.com>> wrote:
I didn’t add any test code.  This custom controller and processor is working for a small size file.

Thanks
Kumiko

From: Tony Kurc [mailto:trkurc@gmail.com<ma...@gmail.com>]
Sent: Tuesday, September 6, 2016 2:32 PM
To: users@nifi.apache.org<ma...@nifi.apache.org>
Cc: Joe Witt <jo...@gmail.com>>; #Operations Automation and Tools <#o...@ds-iq.com>>; Kevin Verhoeven <Ke...@ds-iq.com>>
Subject: RE: Issue writing file (~50mb) to azure data lake with Nifi


I didn't see the test script that worked in the source code - did I miss it, or is it not in the tree?

On Sep 6, 2016 3:17 PM, "Kumiko Yada" <Ku...@ds-iq.com>> wrote:
Joe,

Here is the log (there was no callstack related to this error) and code, https://github.com/kyada1/dl_sdkworkaround/tree/master/nifi-azure-dlstore-bundle.

2016-09-06 12:06:50,508 INFO [NiFi Web Server-19] c.s.j.s.i.application.WebApplicationImpl Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
2016-09-06 12:07:00,991 INFO [StandardProcessScheduler Thread-1] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled PutFileAzureDLStore[id=00dd95dc-0157-1000-8ab1-2de88c159b55] to run with 1 threads
2016-09-06 12:07:01,545 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@28f414a1<ma...@28f414a1> // Another save pending = false
2016-09-06 12:07:01,904 INFO [pool-27-thread-1] c.m.aad.adal4j.AuthenticationAuthority [Correlation ID: 564fb5ec-643b-43e6-ab68-59f259a4843a] Instance discovery was successful
2016-09-06 12:08:05,988 ERROR [Timer-Driven Process Thread-1] n.a.d.processors.PutFileAzureDLStore PutFileAzureDLStore[id=00dd95dc-0157-1000-8ab1-2de88c159b55] File was not created: /kumiko/test/20160906120701022.txt com.microsoft.azure.management.datalake.store.models.AdlsErrorException: Invalid status code 404
2016-09-06 12:08:12,541 INFO [Provenance Maintenance Thread-1] o.a.n.p.PersistentProvenanceRepository Created new Provenance Event Writers for events starting with ID 618
2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1] o.a.n.p.PersistentProvenanceRepository Successfully merged 16 journal files (1 records) into single Provenance Log File .\provenance_repository\617.prov in 305 milliseconds
2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1] o.a.n.p.PersistentProvenanceRepository Successfully Rolled over Provenance Event file containing 1 records
2016-09-06 12:08:21,148 INFO [NiFi Web Server-28] o.a.n.controller.StandardProcessorNode Stopping processor: class nifi.azure.dlstore.processors.PutFileAzureDLStore

Thanks
Kumiko

From: Joe Witt [mailto:joe.witt@gmail.com]
Sent: Friday, September 2, 2016 2:56 PM
To: users@nifi.apache.org
Cc: #Operations Automation and Tools <#o...@ds-iq.com>; Kevin Verhoeven <Ke...@ds-iq.com>
Subject: Re: Issue writing file (~50mb) to azure data lake with Nifi


Lorenzo

Without seeing the code and logs it would be very difficult to help. NiFi, by design, has no trouble writing large files (GBs) to many destinations, including HDFS, so the issue is probably in how this client library interacts with the data stream.

On Sep 2, 2016 4:19 PM, "Lorenzo Peder" <Lo...@ds-iq.com> wrote:
Hi All,

We’ve run into an issue uploading a larger file (~50 MB) into an Azure Data Lake using a custom processor in NiFi 0.7-1.0. This custom processor has worked consistently with smaller files, but with this larger file it fails with HTTP error 404 (file not found). Eventually only a small portion of the file was written to the Data Lake.
We used Fiddler to capture the network traffic between NiFi and the Azure Data Lake while the processor was running and saw HTTP status 204 (No Content).
We then wrote a Java SDK script to upload the same file without NiFi, and it succeeded.
These findings lead us to believe the issue is occurring within NiFi. If someone could point us in the right direction toward resolving it, that would be greatly appreciated.

Thank you,

Lorenzo Peder
Operations Analyst, Campaign Operations & Services

425.974.1363 : Office
425.260.5027 : Mobile
www.ds-iq.com
Dynamic Shopper Intelligence

This e-mail may contain confidential or privileged information.
If you are not the intended recipient, please notify the sender immediately and then delete this message.



RE: Issue writing file (~50mb) to azure data lake with Nifi

Posted by Kumiko Yada <Ku...@ds-iq.com>.
I didn’t add any test code. This custom controller and processor work for smaller files.

Thanks
Kumiko

RE: Issue writing file (~50mb) to azure data lake with Nifi

Posted by Tony Kurc <tr...@gmail.com>.
I didn't see the test script that worked in the source code - did I miss
it, or is it not in the tree?

RE: Issue writing file (~50mb) to azure data lake with Nifi

Posted by Kumiko Yada <Ku...@ds-iq.com>.
Joe,

Here is the log (there was no callstack related to this error) and code, https://github.com/kyada1/dl_sdkworkaround/tree/master/nifi-azure-dlstore-bundle.

2016-09-06 12:06:50,508 INFO [NiFi Web Server-19] c.s.j.s.i.application.WebApplicationImpl Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
2016-09-06 12:07:00,991 INFO [StandardProcessScheduler Thread-1] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled PutFileAzureDLStore[id=00dd95dc-0157-1000-8ab1-2de88c159b55] to run with 1 threads
2016-09-06 12:07:01,545 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@28f414a1 // Another save pending = false
2016-09-06 12:07:01,904 INFO [pool-27-thread-1] c.m.aad.adal4j.AuthenticationAuthority [Correlation ID: 564fb5ec-643b-43e6-ab68-59f259a4843a] Instance discovery was successful
2016-09-06 12:08:05,988 ERROR [Timer-Driven Process Thread-1] n.a.d.processors.PutFileAzureDLStore PutFileAzureDLStore[id=00dd95dc-0157-1000-8ab1-2de88c159b55] File was not created: /kumiko/test/20160906120701022.txt com.microsoft.azure.management.datalake.store.models.AdlsErrorException: Invalid status code 404
2016-09-06 12:08:12,541 INFO [Provenance Maintenance Thread-1] o.a.n.p.PersistentProvenanceRepository Created new Provenance Event Writers for events starting with ID 618
2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1] o.a.n.p.PersistentProvenanceRepository Successfully merged 16 journal files (1 records) into single Provenance Log File .\provenance_repository\617.prov in 305 milliseconds
2016-09-06 12:08:12,838 INFO [Provenance Repository Rollover Thread-1] o.a.n.p.PersistentProvenanceRepository Successfully Rolled over Provenance Event file containing 1 records
2016-09-06 12:08:21,148 INFO [NiFi Web Server-28] o.a.n.controller.StandardProcessorNode Stopping processor: class nifi.azure.dlstore.processors.PutFileAzureDLStore

Thanks
Kumiko


Re: Issue writing file (~50mb) to azure data lake with Nifi

Posted by Joe Witt <jo...@gmail.com>.
Lorenzo

Without seeing the code and logs it would be very difficult to help. NiFi, by design, has no trouble writing large files (GBs) to many destinations, including HDFS, so the issue is probably in how this client library interacts with the data stream.
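As a rough illustration of the kind of stream interaction Joe is describing (and, later in the thread, the suggestion that mark/reset operations "will need to be wrapped"): NiFi hands a processor a plain InputStream over the flowfile content, and a client library that calls mark()/reset() on that stream, for example to retry or checksum a chunk, will fail unless the stream is wrapped in one that supports those operations. The sketch below is a minimal, self-contained Java example using only java.io; the class and method names are hypothetical, and no NiFi or Azure classes are involved:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class MarkResetDemo {

    // Read the first n bytes twice: once after mark(), then again after
    // reset(). A stream that does not support mark/reset (as a raw
    // content stream may not) is wrapped in a BufferedInputStream first,
    // which is the kind of wrapping a client library may silently assume.
    static String readTwice(InputStream in, int n) {
        try {
            if (!in.markSupported()) {
                in = new BufferedInputStream(in);
            }
            in.mark(8192);                     // remember current position
            byte[] first = in.readNBytes(n);
            in.reset();                        // rewind to the mark
            byte[] second = in.readNBytes(n);  // same bytes again
            return new String(first) + "|" + new String(second);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Simulate a stream that does not support mark/reset.
        InputStream raw = new FilterInputStream(
                new ByteArrayInputStream("50mb-file-content".getBytes())) {
            @Override
            public boolean markSupported() {
                return false;
            }
        };
        System.out.println(readTwice(raw, 4)); // prints "50mb|50mb"
    }
}
```

Without the BufferedInputStream wrapper, the reset() call on the raw stream would throw an IOException, which could surface from a client library as a failed or partially written upload.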

On Sep 2, 2016 4:19 PM, "Lorenzo Peder" <Lo...@ds-iq.com> wrote:

> Hi All,
>
>
>
> We’ve run into an issue uploading a larger file (~50Mb) into an Azure Data
> Lake using a custom processor in nifi 0.7-1.0. This custom processor has
> worked consistently for smaller files, but once encountered with this
> larger file, it spits http error 404 (file not found). Eventually a minor
> portion of the file wrote to the data lake.
>
> We used fiddler to capture network traffic between Nifi and the Azure data
> lake while the processor was running and captured http error 204 (no
> contents).
>
> Then we wrote a java sdk script to upload this same file without Nifi into
> the data lake and it worked successfully.
>
> These findings lead us to believe that this issue is occurring within
> Nifi, if someone could please point us in the right direction in resolving
> this issue it would be greatly appreciated.
>
>
>
> Thank you,
>
>
>
> Lorenzo Peder
>
> Operations Analyst, Campaign Operations & Services
>
>
>
> 425.974.1363 : Office
>
> 425.260.5027 : Mobile
>
> www.ds-iq.com
>
>
>
> Dynamic Shopper Intelligence
>
>
>
> This e-mail may contain confidential or privileged information.
>
> If you are not the intended recipient, please notify the sender
> immediately and then delete this message.
>
>
>
>
>