Posted to dev@nifi.apache.org by Ameer Mawia <am...@gmail.com> on 2019/07/19 18:03:57 UTC

Bug/Issue with ReplaceTextWithMapping

Guys,

It seems that the NiFi ReplaceTextWithMapping processor has a BUG with
refreshing its mapping file. We are using this functionality in PROD and
getting odd behaviour.

Our USAGE Scenario:

   - We use NiFi primarily as a TRANSFORMATION tool.
   - Our flow involves:

      1. Getting a raw CSV file.
      2. Splitting the file on a per-line basis:
         1. So from one source flowfile we may have 10000 flowfiles
         generated/split out.
      3. For each of the split flowfiles (one flowfile per line)
      we perform transformations on the attributes.
      4. Merging these flowfiles back and writing the output file.


As part of the transformation in Step #3, we do some mapping for one of the
fields in the CSV. For this we use the ReplaceTextWithMapping processor. Also
note that we update our mapping file just before starting our flow (i.e. before Step #1).

Our Issue:


   - We have noted that for the SAME key we get two DIFFERENT values in two
   different flowfiles.
   - We noted that one of the mapped values existed in an older mapping file.
   - So in essence: the ReplaceTextWithMapping processor didn't refresh its
   cache until a certain time. It thus returned the old value for a few
   flowfiles and then, once it had refreshed its cache in the meantime,
   returned the new mapped value.
   - This causes the issue.

Question:

   - Is this a known issue with  ReplaceTextWithMapping Processor?
   - If not how can I create an issue for this?
   - How can I confirm this behaviour?

Thanks,
Ameer Mawia




-- 
http://ca.linkedin.com/in/ameermawia
Toronto, ON

Re: Bug/Issue with ReplaceTextWithMapping

Posted by Ameer Mawia <am...@gmail.com>.
That makes sense. Thanks, Koji, for the prompt replies. Appreciate it.

Thank you.

On Tue, Jul 30, 2019 at 6:20 AM Koji Kawamura <ij...@gmail.com>
wrote:

> The tryLock method does not block if the lock is already held by another
> thread.
>
> https://github.com/apache/nifi/blob/f8e93186f53917b1fddbc2ae3de26b65a99b9246/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceTextWithMapping.java#L239

Re: Bug/Issue with ReplaceTextWithMapping

Posted by Koji Kawamura <ij...@gmail.com>.
The tryLock method does not block if the lock is already held by another
thread.
https://github.com/apache/nifi/blob/f8e93186f53917b1fddbc2ae3de26b65a99b9246/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceTextWithMapping.java#L239
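
A minimal sketch of the non-blocking behaviour mentioned above, using only
java.util.concurrent. The class and method names (TryLockDemo,
tryLockWhileHeldByOtherThread) are hypothetical and are not taken from the
NiFi source; the sketch only shows that ReentrantLock.tryLock() returns
false immediately when another thread holds the lock, instead of waiting.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; not part of NiFi.
class TryLockDemo {

    // Reports what tryLock() returns on the calling thread while a second
    // thread is holding the same lock.
    static boolean tryLockWhileHeldByOtherThread() {
        final ReentrantLock lock = new ReentrantLock();
        final CountDownLatch lockHeld = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                lockHeld.countDown();   // signal: the lock is now held
                release.await();        // keep holding until told to release
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        holder.start();

        try {
            lockHeld.await();
            boolean acquired = lock.tryLock();  // returns at once, never waits
            if (acquired) {
                lock.unlock();
            }
            release.countDown();
            holder.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("tryLock while held elsewhere: "
                + tryLockWhileHeldByOtherThread());
    }
}
```

A thread that gets false back simply carries on, which is why it can keep
working with whatever mapping is currently in memory.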

On Mon, Jul 29, 2019, 23:24 Ameer Mawia <am...@gmail.com> wrote:

> Adding reference link
> <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceTextWithMapping.java>(to
> the code).

Re: Bug/Issue with ReplaceTextWithMapping

Posted by Ameer Mawia <am...@gmail.com>.
Adding reference link
<https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceTextWithMapping.java>(to
the code).

On Mon, Jul 29, 2019 at 10:21 AM Ameer Mawia <am...@gmail.com> wrote:

> Thanks for reply.
>
> Hmm, that should explain the behavior we noted.
>
> But I see(here
> <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceTextWithMapping.java>)
> an instance level lock which is protecting the update Mapping method. *Shouldn't
> that eventually block other threads from accessing the old mapping?*
>
> Or maybe this locking was added later, in version 1.9 or something? We
> are using 1.8.
>
> Thanks,
> Ameer Mawia

Re: Bug/Issue with ReplaceTextWithMapping

Posted by Ameer Mawia <am...@gmail.com>.
Thanks for reply.

Hmm, that should explain the behavior we noted.

But I see an instance level lock which is protecting the update Mapping
method. *Shouldn't that eventually block other threads from accessing the
old mapping?*

Or maybe this locking was added later, in version 1.9 or something? We
are using 1.8.

Thanks,
Ameer Mawia

On Thu, Jul 25, 2019 at 3:51 AM Koji Kawamura <ij...@gmail.com>
wrote:

> Hi Ameer,
>
> Is the ReplaceTextWithMapping's 'Concurrent Tasks' set to greater than 1?
> Since ReplaceTextWithMapping only reloads on a single thread, other
> threads may use the old mapping until the loading thread completes
> refreshing the mapping definition.
>
> Thanks,
> Koji

Re: Bug/Issue with ReplaceTextWithMapping

Posted by Koji Kawamura <ij...@gmail.com>.
Hi Ameer,

Is the ReplaceTextWithMapping's 'Concurrent Tasks' set to greater than 1?
Since ReplaceTextWithMapping only reloads on a single thread, other
threads may use the old mapping until the loading thread completes
refreshing the mapping definition.

Thanks,
Koji
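
The effect described above can be sketched in isolation. This is an
illustrative pattern only, assuming a reload guarded by tryLock and a
shared in-memory map; StaleMappingDemo, refreshIfPossible and lookup are
hypothetical names, not the processor's actual code. One thread is held
in the middle of a reload while a second thread does a lookup.

```java
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch of "one thread reloads, the others keep going".
class StaleMappingDemo {
    private final ReentrantLock reloadLock = new ReentrantLock();
    private volatile Map<String, String> mapping;

    StaleMappingDemo(Map<String, String> initial) {
        this.mapping = initial;
    }

    // Reloads only if no other thread is already reloading; never waits.
    void refreshIfPossible(Supplier<Map<String, String>> loader) {
        if (reloadLock.tryLock()) {
            try {
                mapping = loader.get();
            } finally {
                reloadLock.unlock();
            }
        }
    }

    String lookup(String key) {
        return mapping.get(key);
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Returns the value seen during the reload and the value seen after it.
    static String[] run() {
        StaleMappingDemo cache = new StaleMappingDemo(Map.of("k", "old"));
        CountDownLatch loading = new CountDownLatch(1);
        CountDownLatch proceed = new CountDownLatch(1);

        Thread loader = new Thread(() -> cache.refreshIfPossible(() -> {
            loading.countDown();   // the reload has started
            await(proceed);        // simulate a slow file read
            return Map.of("k", "new");
        }));
        loader.start();
        await(loading);

        // A second task arrives mid-reload: its tryLock fails, so it skips
        // the reload and reads the mapping that is still in memory.
        cache.refreshIfPossible(() -> Map.of("k", "unexpected"));
        String during = cache.lookup("k");

        proceed.countDown();
        try {
            loader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        String after = cache.lookup("k");
        return new String[] {during, after};
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println("during=" + seen[0] + ", after=" + seen[1]);
    }
}
```

In this sketch the same key yields "old" during the reload and "new"
after it, which matches the symptom of one key mapping to two different
values within a single run.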

On Wed, Jul 24, 2019 at 4:28 AM Ameer Mawia <am...@gmail.com> wrote:
>
> Inline.
>
> On Mon, Jul 22, 2019 at 2:17 AM Koji Kawamura <ij...@gmail.com> wrote:
>>
>> Hi Ameer,
>>
>> How is ReplaceTextWithMapping 'Mapping File Refresh Interval' configured?
>
> [Ameer] It is configured to 1sec - the lowest value allowed.
>>
>> By default, it's set to '60s'. So,
>> 1. If ReplaceTextWithMapping ran with the old mapping file
>
> [Ameer] First Processing took place on Day-1. A new Mapping was dropped on Day-1, after Day-1 Processing was over.
>>
>> 2. and the mapping file was updated for the next processing
>
> [Ameer] Second Processing took place on Day-2.
> [Ameer] Here the assumption was that the CACHE would be refreshed from the new mapping file dropped a day earlier. But it didn't happen. The cache got refreshed in the middle of the flow, not at the very beginning. Thus a few flowfiles got the old value and later flowfiles got the new value.
>>
>> 3. then the flow started processing another CSV file right away line by line
>>
>> In above scenario, some lines in the CSV might get processed with the
>> old mapping file. After 60s passed from 1, some other lines may get
>> processed with the new mappings. Is that what you're seeing?
>>
> [Ameer] This is what is happening. But it shouldn't have, because the new mapping file already existed before the next processing began. It should have refreshed right at the start, as also suggested by the code of the ReplaceTextWithMapping processor.
>>
>> BTW, please avoid posting the same question to users and dev at the
>> same time. I've removed dev address.
>> [Ameer] Got it.
>> Thanks,
>> Koji

Re: Bug/Issue with ReplaceTextWithMapping

Posted by Ameer Mawia <am...@gmail.com>.
Inline.

On Mon, Jul 22, 2019 at 2:17 AM Koji Kawamura <ij...@gmail.com>
wrote:

> Hi Ameer,
>
> How is ReplaceTextWithMapping 'Mapping File Refresh Interval' configured?
>
[Ameer] It is configured to 1sec - the lowest value allowed.

> By default, it's set to '60s'. So,
> 1. If ReplaceTextWithMapping ran with the old mapping file
>
[Ameer] First Processing took place on Day-1. A new Mapping was dropped on
Day-1, after Day-1 Processing was over.

> 2. and the mapping file was updated for the next processing
>
[Ameer] Second Processing took place on Day-2.
[Ameer] Here the assumption was that the CACHE would be refreshed from the new
mapping file dropped a day earlier. But it didn't happen. The cache got
refreshed in the middle of the flow, not at the very beginning. Thus a few
flowfiles got the old value and later flowfiles got the new value.

> 3. then the flow started processing another CSV file right away line by
> line
>
> In above scenario, some lines in the CSV might get processed with the
> old mapping file. After 60s passed from 1, some other lines may get
> processed with the new mappings. Is that what you're seeing?
>
[Ameer] This is what is happening. But it shouldn't have, because the new
mapping file already existed before the next processing began. It
should have refreshed right at the start, as also suggested by the code of
the ReplaceTextWithMapping processor.

> BTW, please avoid posting the same question to users and dev at the
> same time. I've removed dev address.
[Ameer] Got it.
> Thanks,
> Koji

Re: Bug/Issue with ReplaceTextWithMapping

Posted by Koji Kawamura <ij...@gmail.com>.
Hi Ameer,

How is ReplaceTextWithMapping's 'Mapping File Refresh Interval' configured?
By default, it's set to '60s'. So, suppose:
1. ReplaceTextWithMapping ran with the old mapping file,
2. the mapping file was then updated for the next processing run,
3. and the flow started processing another CSV file right away, line by line.

In the above scenario, some lines in the CSV might get processed with the
old mapping file. Once 60s have passed since step 1, other lines may get
processed with the new mappings. Is that what you're seeing?
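
To make the timing concrete, here is a minimal Python sketch of an
interval-based refresh. It is a toy model, not NiFi's actual
implementation: the class RefreshingMapping, the simulate() helper, the
fake clock, and the sample key and values are all made up for
illustration. The only thing it assumes is the behaviour described above,
namely that the mapping is reloaded at most once per refresh interval.

```python
class RefreshingMapping:
    """Toy lookup table that re-reads its source at most once per interval."""

    def __init__(self, load, refresh_interval_s, clock):
        self._load = load                    # callable returning the current mapping as a dict
        self._interval = refresh_interval_s  # hypothetical stand-in for the refresh interval
        self._clock = clock                  # callable returning the current time in seconds
        self._mapping = load()
        self._loaded_at = clock()

    def lookup(self, key):
        now = self._clock()
        # Reload only when the interval has elapsed since the last load.
        if now - self._loaded_at >= self._interval:
            self._mapping = self._load()
            self._loaded_at = now
        # Unmapped keys are returned unchanged in this sketch.
        return self._mapping.get(key, key)


def simulate():
    mapping_file = {"CA": "Canada-old"}  # stands in for the mapping file on disk
    now = [0.0]                          # fake clock, so no real waiting is needed
    cache = RefreshingMapping(lambda: dict(mapping_file), 60.0, lambda: now[0])

    results = [cache.lookup("CA")]       # t=0s: loaded with the old file
    mapping_file["CA"] = "Canada-new"    # file is updated just before the next run
    now[0] = 10.0
    results.append(cache.lookup("CA"))   # t=10s: interval not elapsed, stale value
    now[0] = 61.0
    results.append(cache.lookup("CA"))   # t=61s: interval elapsed, file re-read
    return results


if __name__ == "__main__":
    print(simulate())
```

In this model the same key yields the old value until the interval has
elapsed and the new value afterwards, which matches the symptom of one key
mapping to two different values within a single run.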

BTW, please avoid posting the same question to users and dev at the
same time. I've removed the dev address.

Thanks,
Koji

On Sat, Jul 20, 2019 at 3:08 AM Ameer Mawia <am...@gmail.com> wrote:
>
> Correcting Typo.
>
> On Fri, Jul 19, 2019 at 2:03 PM Ameer Mawia <am...@gmail.com> wrote:
>>
>> Guys,
>>
>> It seems that the NiFi ReplaceTextWithMapping processor has a BUG with refreshing its mapping file. We are using this functionality in PROD and getting odd behaviour.
>>
>> Our USAGE Scenario:
>>
>> We use NIFI primarily as a TRANSFORMATION Tool.
>> Our flow involves:
>>
>> 1. Getting a raw CSV file.
>> 2. Splitting the file on a per-line basis, so from one source flowfile we may have 10000 flowfiles split out.
>> 3. For each of the split flowfiles (one flowfile per line), we perform a transformation on the attributes.
>> 4. We merge these flowfiles back and write the output file.
>>
>>
>> As part of the transformation in Step #3, we do some mapping for one of the fields in the CSV. For this we use the ReplaceTextWithMapping processor. Also note that we update our mapping file just before starting our flow (i.e. before Step #1).
>>
>> Our Issue:
>>
>> We have noted that for the SAME key we get two DIFFERENT values in two different flowfiles.
>> We noted that one of the mapped values existed in an older mapping file.
>> So in essence: the ReplaceTextWithMapping processor didn't refresh its cache until a certain time. It thus returned the old value for a few mappings and then, once it had refreshed its cache in the meantime, returned the new updated value.
>> And this causes the issue?
>>
>> Question:
>>
>> Is this a known issue with the ReplaceTextWithMapping processor?
>> If not, how can I create an issue for this?
>> How can I confirm this behaviour?
>>
>> Thanks,
>> Ameer Mawia
>>
>>
>>
>>
>> --
>> http://ca.linkedin.com/in/ameermawia
>> Toronto, ON
>>
>
>
> --
> http://ca.linkedin.com/in/ameermawia
> Toronto, ON
>

Re: Bug/Issue with ReplaceTextWithMapping

Posted by Ameer Mawia <am...@gmail.com>.
Correcting Typo.

On Fri, Jul 19, 2019 at 2:03 PM Ameer Mawia <am...@gmail.com> wrote:

> Guys,
>
> It seems that the NiFi ReplaceTextWithMapping processor has a BUG with
> refreshing its mapping file. We are using this functionality in PROD and
> getting odd behaviour.
>
> Our USAGE Scenario:
>
>    - We use NIFI primarily as a TRANSFORMATION Tool.
>    - Our flow involves:
>
>
>    1. Getting a raw CSV file.
>    2. Splitting the file on a per-line basis:
>       1. So from one source flowfile we may have 10000 flowfiles
>       split out.
>    3. For each of the split flowfiles (one flowfile per line), we
>    perform a transformation on the attributes.
>    4. We merge these flowfiles back and write the output file.
>
>
> As part of the transformation in Step #3, we do some mapping for one of the
> fields in the CSV. For this we use the ReplaceTextWithMapping processor.
> Also note that we update our mapping file just before starting our flow
> (i.e. before Step #1).
>
> Our Issue:
>
>
>    - We have noted that for the SAME key we get two DIFFERENT values in two
>    different flowfiles.
>    - We noted that one of the mapped values existed in an older mapping
>    file.
>    - So in essence: the ReplaceTextWithMapping processor didn't refresh its
>    cache until a certain time. It thus returned the old value for a few
>    mappings and then, once it had refreshed its cache in the meantime,
>    returned the new updated value.
>    - And this causes the issue?
>
> Question:
>
>    - Is this a known issue with the ReplaceTextWithMapping processor?
>    - If not, how can I create an issue for this?
>    - How can I confirm this behaviour?
>
> Thanks,
> Ameer Mawia
>
>
>
>
> --
> http://ca.linkedin.com/in/ameermawia
> Toronto, ON
>
>

-- 
http://ca.linkedin.com/in/ameermawia
Toronto, ON
