Posted to users@nifi.apache.org by Tomislav Novosel <to...@gmail.com> on 2019/10/03 06:32:24 UTC

Nifi errors - FetchFile and UnpackContent

Hi all,

I'm getting errors from the FetchFile and UnpackContent processors.
I have a pipeline that fetches zip files as they arrive continuously on a
shared network drive, with Minimum File Age set to 30 sec to avoid fetching
a file before it is completely written to disk.
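
For illustration only, a minimum-age check of this kind can be sketched
outside NiFi as follows. This is a rough analogue under stated assumptions,
not NiFi's implementation; the function name and its parameters are
hypothetical.

```python
import os
import time


def is_old_enough(path, min_age_seconds, now=None):
    """Return True if the file was last modified at least min_age_seconds ago."""
    if now is None:
        now = time.time()
    # Age is measured from the last-modified timestamp reported by the file system.
    return (now - os.path.getmtime(path)) >= min_age_seconds
```

A check like this only proves that nobody has touched the file recently. A
writer that stalls for longer than the window would still pass it, so it
cannot by itself guarantee that the file is complete.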

Sometimes I get this error from FetchFile:

FetchFile[id=c741187c-1172-1166-e752-1f79197a8029] Could not fetch file
\\avl01\ATGRZ\TestFactory\02 Dep Service\01
Processdata\Backup\dfs_atfexport\MANA38\ANA_12_BPE7347\ANA_12_BPE7347_TDL_HL_1\measurement_file.atf.zip
from file system for
StandardFlowFileRecord[uuid=e7a5e3c4-0981-4ff3-85ea-91e41f0c3c0e,claim=,offset=0,name=PEI_BPE7347_TDLHL1new_826_20191001161312.atf.zip,size=0]
because the existence of the file cannot be verified; routing to failure


And from UnpackContent sometimes I get this error:


UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d] Unable to unpack
StandardFlowFileRecord[uuid=4a019d58-fe45-4276-a161-e46cd8b1667c,claim=StandardContentClaim
[resourceClaim=StandardResourceClaim[id=1570052741201-5000,
container=default, section=904], offset=1651,
length=28417768],offset=0,name=measurement.atf.zip,size=28417768] due to
IOException thrown from
UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d]:
java.io.IOException: Truncated ZIP file; routing to failure:

org.apache.nifi.processor.exception.ProcessException: IOException thrown
from UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d]:
java.io.IOException: Truncated ZIP file


After getting this error from UnpackContent, I fetched the same file again
and unpacked it. That worked without any errors.
So what do these errors mean? I spoke to the colleagues who use these files
on the source side, and they said the files are OK and not corrupted.
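
One way to check, independently of NiFi, whether a given copy of an archive
is complete is to open it and verify every member. The sketch below uses
Python's standard zipfile module; the helper name zip_looks_complete is
hypothetical, and this is not what UnpackContent does internally.

```python
import zipfile


def zip_looks_complete(path):
    """Return True if the archive opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first bad member, or None if all are fine.
            return archive.testzip() is None
    except (zipfile.BadZipFile, OSError):
        # A copy that was cut off before its end-of-archive record was written,
        # or a file that cannot be read at all, ends up here.
        return False
```

Running this against the file on the share and against the copy that failed
would show whether the source file or the fetched copy was short at that
moment.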

Please help or give advice.

Thanks in advance.
Tom

Re: Nifi errors - FetchFile and UnpackContent

Posted by Tomislav Novosel <to...@gmail.com>.
Any more suggestions for this situation?

Thanks,
Tom


Re: Nifi errors - FetchFile and UnpackContent

Posted by Tomislav Novosel <to...@gmail.com>.
Hi Jeff,

None of this applies to my pipeline or to the FetchFile processor.
It is not a cluster; it runs on a single standalone NiFi instance.
Completion Strategy is set to None, neither Delete File nor Move File.

The only thing I can think of is that someone else is using the file at the
same time, because that shared disk is also used by other people who read
the files and do some analysis.

Could that also be the cause of the Truncated ZIP file error in the
UnpackContent processor?

I have looped the failure relationship of those processors back to
themselves so that failed flowfiles are retried.
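
Outside NiFi, the same retry idea can be sketched as a bounded loop. The
names retry, max_attempts, delay_seconds and sleep are hypothetical and only
illustrate the pattern. The cap on attempts is a deliberate choice: an
unbounded loopback would keep retrying forever on a file that is really
gone.

```python
import time


def retry(operation, max_attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts have been made."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except OSError as error:
            # Remember the failure and wait before the next attempt, if any remain.
            last_error = error
            if attempt < max_attempts:
                sleep(delay_seconds)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```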

Thanks.
Tom


Re: Nifi errors - FetchFile and UnpackContent

Posted by Jeff <jt...@gmail.com>.
Hello Tomislav,

Are these processors running in a multi-node cluster?  Is FetchFile
downstream from a ListFile processor that is scheduled to run on all nodes
versus Primary Node only?  Is FetchFile's Completion Strategy set to "Move
File" or "Delete File"?  Typically, source processors should be scheduled
to run on the primary node only. Otherwise, when several nodes read from the
same source, for example a shared network drive, each source processor
might pull the same data.  In a situation like this, the same
file could be listed by each node, and the FetchFile processor on each node
may attempt to fetch the same file.

If you set the source processor to run on Primary Node only, you can
load-balance the connection between the source processor and FetchFile to
distribute the load of fetching the files across the cluster.
