Posted to dev@cassandra.apache.org by Dan LaRocque <da...@hopcount.org> on 2014/02/26 16:27:31 UTC
Synchronization needed on Schema.cfIdMap?
Hi,
I think cfIdMap in config/Schema.java may be subject to unsynchronized
access by distinct threads.
Say one thread adds a CF, maybe triggering resizes on cfIdMap's internal
tables. What guarantees that other threads calling
Schema.instance.getId concurrently with the CF addition see an
internally consistent cfIdMap? HashBiMap is not thread-safe, and
Schema's methods that touch cfIdMap have no explicit synchronization or
locking, except for clear(). I think this scenario could lead to
spurious and rare "Unknown table/cf" exceptions on reads/writes during
unrelated schema migrations in 1.2 (reworded to "Unknown keyspace/cf" in
2.0), which is how I got here in the first place.
I could be misreading the access pattern, maybe by missing external
synchronization somewhere. I brought this to the list instead of JIRA
because I'm uncertain about the problem. I'm hoping for a sanity check.
If this is actually a bug and not a misunderstanding, then a fix should
be pretty straightforward. Even though Maps.synchronizedBiMap could be
deemed unacceptable for read throughput reasons, it should be possible
to get decent reads by changing cfIdMap into a volatile reference to an
unmodifiable bimap and guarding all modifications with a single write lock.
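
To make the idea concrete, here is a minimal sketch of the volatile-reference approach, not a patch against Schema.java. It assumes a pair of plain java.util maps in place of Guava's bimap so that it compiles without dependencies, and the class name CfIdRegistry, the methods add/remove/getId/getName, and the "keyspace.cf" string key are all hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for the cfIdMap part of Schema; not Cassandra code. */
final class CfIdRegistry {

    /** Immutable forward and inverse views, always published together. */
    private static final class Snapshot {
        final Map<String, UUID> idByName;
        final Map<UUID, String> nameById;

        Snapshot(Map<String, UUID> idByName, Map<UUID, String> nameById) {
            this.idByName = Collections.unmodifiableMap(idByName);
            this.nameById = Collections.unmodifiableMap(nameById);
        }
    }

    // Readers load this reference once per call. The volatile write in the
    // writers publishes the fully built maps, so no reader sees a resize in progress.
    private volatile Snapshot snapshot =
            new Snapshot(new HashMap<String, UUID>(), new HashMap<UUID, String>());

    // Single lock that serializes all modifications.
    private final Object writeLock = new Object();

    // Simplification: a real key would be a (keyspace, cf) pair, not a joined string.
    private static String key(String keyspace, String cf) {
        return keyspace + "." + cf;
    }

    /** Lock-free read; returns null if the CF is unknown. */
    UUID getId(String keyspace, String cf) {
        return snapshot.idByName.get(key(keyspace, cf));
    }

    /** Lock-free inverse read; returns null if the id is unknown. */
    String getName(UUID id) {
        return snapshot.nameById.get(id);
    }

    /** Copies the current maps, applies the change, then swaps the reference. */
    void add(String keyspace, String cf, UUID id) {
        String name = key(keyspace, cf);
        synchronized (writeLock) {
            Snapshot current = snapshot;
            String existing = current.nameById.get(id);
            if (existing != null && !existing.equals(name))
                throw new IllegalArgumentException("id already bound to " + existing);
            Map<String, UUID> forward = new HashMap<String, UUID>(current.idByName);
            Map<UUID, String> inverse = new HashMap<UUID, String>(current.nameById);
            UUID previous = forward.put(name, id);
            if (previous != null)
                inverse.remove(previous); // keep the two views consistent
            inverse.put(id, name);
            snapshot = new Snapshot(forward, inverse);
        }
    }

    /** Removes a CF by the same copy-then-swap procedure. */
    void remove(String keyspace, String cf) {
        synchronized (writeLock) {
            Snapshot current = snapshot;
            Map<String, UUID> forward = new HashMap<String, UUID>(current.idByName);
            Map<UUID, String> inverse = new HashMap<UUID, String>(current.nameById);
            UUID removed = forward.remove(key(keyspace, cf));
            if (removed != null)
                inverse.remove(removed);
            snapshot = new Snapshot(forward, inverse);
        }
    }
}
```

The trade-off is that every schema change copies both maps, which seems acceptable if schema changes are rare compared with reads. Readers never take a lock, and a reader that races with a writer sees either the old snapshot or the new one, never a half-updated table.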
thanks,
Dan
Re: Synchronization needed on Schema.cfIdMap?
Posted by "J. Ryan Earl" <os...@jryanearl.us>.
We've seen a lot of errors that could be exactly this. Would this fit the
mold? If so, I can confirm this happens in the wild. We've gotten hundreds
of these in the last few months:
ERROR [ReadStage:58705] 2014-02-12 01:06:18,254 CassandraDaemon.java (line 192) Exception in thread Thread[ReadStage:58705,5,main]
java.lang.IllegalArgumentException: Unknown CF d2225e32-bec7-373d-bdf8-4642896f0755