You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@felix.apache.org by David Jencks <da...@yahoo.com.INVALID> on 2016/06/01 00:15:43 UTC

Re: Race condition on configuration create delete?

Fabian sent me some logs off-list that helped me figure out what is going on.  I’ve opened https://issues.apache.org/jira/browse/FELIX-5270.

Meanwhile if it would not break the rest of your system I think you might avoid triggering the problem by explicitly setting the bundle location, for instance to “?” which will not tie the configuration to any bundle.  Then the initial getConfiguration from DS when it first encounters the configuration won’t trigger a CM_LOCATION_CHANGED event.

thanks!
david jencks


> On May 31, 2016, at 2:15 PM, David Jencks <da...@yahoo.com.INVALID> wrote:
> 
> Hi Fabian,
> 
> That would be the good info in case it happens again :-)
> 
> Other than the logging changes, I don’t think anything has changed here since 2.0.2.
> 
> Meanwhile I will try to think about whether there can be a race condition we don’t currently deal with.  There might be, because we do a combination of querying config admin and config event tracking.  For bundles and services, you need a Tracker to get this right.  I once started down the path of writing a config tracker, but got bogged down and convinced myself that the change count tracking would be good enough.  So thinking about it again seems like a good idea.
> 
> thanks!
> david jencks
> 
>> On May 31, 2016, at 2:04 PM, Fabian Lange <fa...@codecentric.de> wrote:
>> 
>> Hi David,
>> I made some progress. I took the current trunk and changed version to 2.0.2
>> and compiled it, so i could just drop the dependency into my instances.
>> 
>> I added ds.loglevel=debug, but I am unsure if i capture the logs you might
>> be interested in. I see stuff like:
>> 
>> 2016-05-31T23:01:48.530+0200 | DEBUG | fb-f0d164a085de) | scr
>>              | 34 - org.apache.felix.scr - 2.0.2 | LocationChanged event,
>> same targetedPID factoryPid, location now mvn:bundle, change count 2
>> 2016-05-31T23:03:19.746+0200 | DEBUG | 2a-bba852d57901) | scr
>>              | 34 - org.apache.felix.scr - 2.0.2 | configurationEvent:
>> Handling DELETED  of Configuration
>> PID=factoryPid.8c4b462e-c7ea-4fcf-9d2a-bba852d57901 for component holders
>> [[ImmediateComponentHolder:factoryPid]]
>> 
>> However the error no longer shows up. either it is random chance, or your
>> recent changes fixed it?
>> 
>> Fabian
>> 
>> 
>> On Tue, May 31, 2016 at 10:08 PM, Fabian Lange <fa...@codecentric.de>
>> wrote:
>> 
>>> Hi David,
>>> some info is in the trace:
>>> org.apache.felix.configadmin:1.8.8
>>> org.apache.felix.scr:2.0.2
>>> 
>>> Unfortunately I cannot get the trunk version into production. but I can
>>> try the debug log anyway. But first I need to figure out where I can set
>>> this
>>> 
>>> Fabian
>>> 
>>> 
>>> On Tue, May 31, 2016 at 10:04 PM, David Jencks <
>>> david_jencks@yahoo.com.invalid> wrote:
>>> 
>>>> You might know what’s in Karaf, but I don’t.  What are the versions of
>>>> config admin and DS that you are using?
>>>> 
>>>> Since you say this is occurring in production it might not be possible,
>>>> but if you can get a log with debug trace on (setting ds.loglevel=debug)
>>>> with the trunk version of DS that would be extremely helpful figuring out
>>>> what is going on…. I added some logging of the config change count since
>>>> the last release.
>>>> 
>>>> thanks
>>>> david jencks
>>>> 
>>>>> On May 31, 2016, at 12:53 PM, Fabian Lange <fa...@codecentric.de>
>>>> wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I am observing an issue in production which looks to me like a race
>>>>> condition, but I am unable to reproduce it so far.
>>>>> 
>>>>> 
>>>>> The exception is:
>>>>> java.lang.IllegalStateException: Existing Configuration with pid
>>>>> factoryPid.a5b6fe71-556c-4ce9-b09c-c19542586fc5 has had its properties
>>>> set
>>>>> to null and location changed.  We expected a delete event first.
>>>>>      at
>>>>> 
>>>> org.apache.felix.scr.impl.config.ConfigurationSupport.configurationEvent(ConfigurationSupport.java:351)[34:org.apache.felix.scr:2.0.2]
>>>>>      at
>>>>> 
>>>> org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2046)[4:org.apache.felix.configadmin:1.8.8]
>>>>>      at
>>>>> 
>>>> org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2014)[4:org.apache.felix.configadmin:1.8.8]
>>>>>      at
>>>>> 
>>>> org.apache.felix.cm.impl.UpdateThread.run0(UpdateThread.java:143)[4:org.apache.felix.configadmin:1.8.8]
>>>>>      at
>>>>> 
>>>> org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:110)[4:org.apache.felix.configadmin:1.8.8]
>>>>>      at java.lang.Thread.run(Thread.java:745)[:1.8.0_91]
>>>>> 
>>>>> 
>>>>> What is happening:
>>>>> I am creating configurations in Karaf (4.0.5) using ConfigAdmin:
>>>>> 
>>>>> Configuration configuration = configAdmin.createFactoryConfiguration(
>>>>> "factoryPid", null);
>>>>> configuration.update(dictionary);
>>>>> 
>>>>> It is possible that due to validation shortly after the same config is
>>>>> deleted:
>>>>> 
>>>>> configAdmin.listConfigurations("(service.factoryPid=factoryPid)"
>>>>> )[0].delete()
>>>>> 
>>>>> My theory is that the update is still processing while it is already
>>>>> deleted again, and then the scr listener complains.
>>>>> 
>>>>> 
>>>>> Ideas?
>>>>> Best regards,
>>>>> Fabian
>>>>> 
>>>>> --
>>>>> Fabian Lange | Performance Expert
>>>>> mobil: +49 (0) 160.3673393
>>>>> 
>>>>> codecentric AG | Merscheider Straße 1 | 42699 Solingen | Deutschland
>>>>> 
>>>>> Sitz der Gesellschaft: Solingen | HRB 25917| Amtsgericht Wuppertal
>>>>> Vorstand: Michael Hochgürtel . Mirko Novakovic . Rainer Vehns
>>>>> Aufsichtsrat: Patric Fedlmeier (Vorsitzender) . Klaus Jäger . Jürgen
>>>> Schütz
>>>> 
>>>> 
>>> 
> 


Re: Race condition on configuration create delete?

Posted by Fabian Lange <fa...@codecentric.de>.
Hi David,
many thanks for your quick analysis!

Indeed changing bundle location to "?" makes the problem go away. It works
for this application because the configuration by design is anyway just
intended for a specific bundle. The factories use unique pids.
I will keep my reproduction scenario alive as long as possible, so I can
verify a fix for felix-5270. But no need to hurry for me. I have a
workaround now

Fabin

On Wed, Jun 1, 2016 at 2:15 AM, David Jencks <david_jencks@yahoo.com.invalid
> wrote:

> Fabian sent me some logs off-list that helped me figure out what is going
> on.  I’ve opened https://issues.apache.org/jira/browse/FELIX-5270.
>
> Meanwhile if it would not break the rest of your system I think you might
> avoid triggering the problem by explicitly setting the bundle location, for
> instance to “?” which will not tie the configuration to any bundle.  Then
> the initial getConfiguration from DS when it first encounters the
> configuration won’t trigger a CM_LOCATION_CHANGED event.
>
> thanks!
> david jencks
>
>
> > On May 31, 2016, at 2:15 PM, David Jencks <da...@yahoo.com.INVALID>
> wrote:
> >
> > Hi Fabian,
> >
> > That would be the good info in case it happens again :-)
> >
> > Other than the logging changes, I don’t think anything has changed here
> since 2.0.2.
> >
> > Meanwhile I will try to think about whether there can be a race
> condition we don’t currently deal with.  There might be, because we do a
> combination of querying config admin and config event tracking.  For
> bundles and services, you need a Tracker to get this right.  I once started
> down the path of writing a config tracker, but got bogged down and
> convinced myself that the change count tracking would be good enough.  So
> thinking about it again seems like a good idea.
> >
> > thanks!
> > david jencks
> >
> >> On May 31, 2016, at 2:04 PM, Fabian Lange <fa...@codecentric.de>
> wrote:
> >>
> >> Hi David,
> >> I made some progress. I took the current trunk and changed version to
> 2.0.2
> >> and compiled it, so i could just drop the dependency into my instances.
> >>
> >> I added ds.loglevel=debug, but I am unsure if i capture the logs you
> might
> >> be interested in. I see stuff like:
> >>
> >> 2016-05-31T23:01:48.530+0200 | DEBUG | fb-f0d164a085de) | scr
> >>              | 34 - org.apache.felix.scr - 2.0.2 | LocationChanged
> event,
> >> same targetedPID factoryPid, location now mvn:bundle, change count 2
> >> 2016-05-31T23:03:19.746+0200 | DEBUG | 2a-bba852d57901) | scr
> >>              | 34 - org.apache.felix.scr - 2.0.2 | configurationEvent:
> >> Handling DELETED  of Configuration
> >> PID=factoryPid.8c4b462e-c7ea-4fcf-9d2a-bba852d57901 for component
> holders
> >> [[ImmediateComponentHolder:factoryPid]]
> >>
> >> However the error no longer shows up. either it is random chance, or
> your
> >> recent changes fixed it?
> >>
> >> Fabian
> >>
> >>
> >> On Tue, May 31, 2016 at 10:08 PM, Fabian Lange <
> fabian.lange@codecentric.de>
> >> wrote:
> >>
> >>> Hi David,
> >>> some info is in the trace:
> >>> org.apache.felix.configadmin:1.8.8
> >>> org.apache.felix.scr:2.0.2
> >>>
> >>> Unfortunately I cannot get the trunk version into production. but I can
> >>> try the debug log anyway. But first I need to figure out where I can
> set
> >>> this
> >>>
> >>> Fabian
> >>>
> >>>
> >>> On Tue, May 31, 2016 at 10:04 PM, David Jencks <
> >>> david_jencks@yahoo.com.invalid> wrote:
> >>>
> >>>> You might know what’s in Karaf, but I don’t.  What are the versions of
> >>>> config admin and DS that you are using?
> >>>>
> >>>> Since you say this is occurring in production it might not be
> possible,
> >>>> but if you can get a log with debug trace on (setting
> ds.loglevel=debug)
> >>>> with the trunk version of DS that would be extremely helpful figuring
> out
> >>>> what is going on…. I added some logging of the config change count
> since
> >>>> the last release.
> >>>>
> >>>> thanks
> >>>> david jencks
> >>>>
> >>>>> On May 31, 2016, at 12:53 PM, Fabian Lange <
> fabian.lange@codecentric.de>
> >>>> wrote:
> >>>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I am observing an issue in production which looks to me like a race
> >>>>> condition, but I am unable to reproduce it so far.
> >>>>>
> >>>>>
> >>>>> The exception is:
> >>>>> java.lang.IllegalStateException: Existing Configuration with pid
> >>>>> factoryPid.a5b6fe71-556c-4ce9-b09c-c19542586fc5 has had its
> properties
> >>>> set
> >>>>> to null and location changed.  We expected a delete event first.
> >>>>>      at
> >>>>>
> >>>>
> org.apache.felix.scr.impl.config.ConfigurationSupport.configurationEvent(ConfigurationSupport.java:351)[34:org.apache.felix.scr:2.0.2]
> >>>>>      at
> >>>>>
> >>>>
> org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2046)[4:org.apache.felix.configadmin:1.8.8]
> >>>>>      at
> >>>>>
> >>>>
> org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2014)[4:org.apache.felix.configadmin:1.8.8]
> >>>>>      at
> >>>>>
> >>>>
> org.apache.felix.cm.impl.UpdateThread.run0(UpdateThread.java:143)[4:org.apache.felix.configadmin:1.8.8]
> >>>>>      at
> >>>>>
> >>>>
> org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:110)[4:org.apache.felix.configadmin:1.8.8]
> >>>>>      at java.lang.Thread.run(Thread.java:745)[:1.8.0_91]
> >>>>>
> >>>>>
> >>>>> What is happening:
> >>>>> I am creating configurations in Karaf (4.0.5) using ConfigAdmin:
> >>>>>
> >>>>> Configuration configuration = configAdmin.createFactoryConfiguration(
> >>>>> "factoryPid", null);
> >>>>> configuration.update(dictionary);
> >>>>>
> >>>>> It is possible that due to validation shortly after the same config
> is
> >>>>> deleted:
> >>>>>
> >>>>> configAdmin.listConfigurations("(service.factoryPid=factoryPid)"
> >>>>> )[0].delete()
> >>>>>
> >>>>> My theory is that the update is still processing while it is already
> >>>>> deleted again, and then the scr listener complains.
> >>>>>
> >>>>>
> >>>>> Ideas?
> >>>>> Best regards,
> >>>>> Fabian
> >>>>>
> >>>>> --
> >>>>> Fabian Lange | Performance Expert
> >>>>> mobil: +49 (0) 160.3673393
> >>>>>
> >>>>> codecentric AG | Merscheider Straße 1 | 42699 Solingen | Deutschland
> >>>>>
> >>>>> Sitz der Gesellschaft: Solingen | HRB 25917| Amtsgericht Wuppertal
> >>>>> Vorstand: Michael Hochgürtel . Mirko Novakovic . Rainer Vehns
> >>>>> Aufsichtsrat: Patric Fedlmeier (Vorsitzender) . Klaus Jäger . Jürgen
> >>>> Schütz
> >>>>
> >>>>
> >>>
> >
>
>