Posted to dev@cassandra.apache.org by Dan LaRocque <da...@hopcount.org> on 2014/02/26 16:27:31 UTC

Synchronization needed on Schema.cfIdMap?

Hi,

I think cfIdMap in config/Schema.java may be subject to unsynchronized 
access by distinct threads.

Say one thread adds a CF, possibly triggering a resize of cfIdMap's
internal tables.  What guarantees that other threads calling
Schema.instance.getId concurrently with the CF addition see an
internally consistent cfIdMap?  HashBiMap is not thread-safe, and
Schema's methods that touch cfIdMap have no explicit synchronization or
locking, except for clear().  I think this scenario could lead to
rare, spurious "Unknown table/cf" exceptions on reads/writes during
unrelated schema migrations in 1.2 (reworded to "Unknown keyspace/cf" in
2.0), which is how I got here in the first place.

I could be misreading the access pattern, maybe by missing external 
synchronization somewhere.  I brought this to the list instead of JIRA 
because I'm uncertain about the problem.  I'm hoping for a sanity check.

If this is actually a bug and not a misunderstanding, then a fix should 
be pretty straightforward.  Even though Maps.synchronizedBiMap could be 
deemed unacceptable for read throughput reasons, it should be possible 
to get decent reads by changing cfIdMap into a volatile reference to an 
unmodifiable bimap and guarding all modifications with a single write lock.
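
As a rough illustration of the idea above (not Cassandra's actual code), the
copy-on-write pattern could look something like the sketch below.  The class
name CfIdRegistry, its method names, and the "keyspace.cf" string key are all
hypothetical, and two plain java.util maps stand in for Guava's bimap so the
sketch has no dependencies.  Readers only dereference a volatile field;
writers copy, modify, and republish under one lock.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for the cfIdMap handling in Schema; names are illustrative only.
final class CfIdRegistry {
    // Immutable snapshot: a pair of maps standing in for an unmodifiable bimap.
    private static final class Snapshot {
        final Map<String, UUID> idByName;
        final Map<UUID, String> nameById;

        Snapshot(Map<String, UUID> idByName, Map<UUID, String> nameById) {
            this.idByName = Collections.unmodifiableMap(idByName);
            this.nameById = Collections.unmodifiableMap(nameById);
        }
    }

    private final Object writeLock = new Object();

    // Readers see either the old snapshot or the new one, never a half-updated map.
    private volatile Snapshot snapshot =
            new Snapshot(new HashMap<String, UUID>(), new HashMap<UUID, String>());

    // Assumed key format; the real map is keyed differently.
    private static String key(String ksName, String cfName) {
        return ksName + "." + cfName;
    }

    // Lock-free read: one volatile load, then a lookup in an immutable map.
    public UUID getId(String ksName, String cfName) {
        return snapshot.idByName.get(key(ksName, cfName));
    }

    public String getName(UUID cfId) {
        return snapshot.nameById.get(cfId);
    }

    // Write path: copy the current snapshot, modify the copy, publish it.
    public void load(String ksName, String cfName, UUID cfId) {
        synchronized (writeLock) {
            Snapshot current = snapshot;
            String k = key(ksName, cfName);
            if (current.idByName.containsKey(k) || current.nameById.containsKey(cfId)) {
                throw new IllegalArgumentException("Already mapped: " + k + " / " + cfId);
            }
            Map<String, UUID> ids = new HashMap<String, UUID>(current.idByName);
            Map<UUID, String> names = new HashMap<UUID, String>(current.nameById);
            ids.put(k, cfId);
            names.put(cfId, k);
            snapshot = new Snapshot(ids, names);
        }
    }

    public void purge(String ksName, String cfName) {
        synchronized (writeLock) {
            Snapshot current = snapshot;
            String k = key(ksName, cfName);
            UUID cfId = current.idByName.get(k);
            if (cfId == null) {
                return; // nothing to remove
            }
            Map<String, UUID> ids = new HashMap<String, UUID>(current.idByName);
            Map<UUID, String> names = new HashMap<UUID, String>(current.nameById);
            ids.remove(k);
            names.remove(cfId);
            snapshot = new Snapshot(ids, names);
        }
    }

    public static void main(String[] args) {
        CfIdRegistry registry = new CfIdRegistry();
        UUID id = UUID.randomUUID();
        registry.load("ks1", "cf1", id);
        System.out.println(registry.getId("ks1", "cf1").equals(id));
    }
}
```

Each write copies both maps, so this trades write cost for uncontended
reads, which seems reasonable if schema changes are rare compared to
reads/writes.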

thanks,
Dan

Re: Synchronization needed on Schema.cfIdMap?

Posted by "J. Ryan Earl" <os...@jryanearl.us>.
We've seen a lot of errors that could be exactly this.  Would this fit the
mold?  If so, I can confirm this happens in the wild.  We've gotten hundreds
of these in the last few months:

ERROR [ReadStage:58705] 2014-02-12 01:06:18,254 CassandraDaemon.java (line 192) Exception in thread Thread[ReadStage:58705,5,main]
java.lang.IllegalArgumentException: Unknown CF d2225e32-bec7-373d-bdf8-4642896f0755

