Ole John Aske authored
There is a possible race condition between the schema distribution coordinator, which holds a ref-lock on its schema_object, and the binlog-injector thread (participant role), which may be late to unref-lock the same schema_object. As a result, another schema distribution operation could get a ref to the not-yet-released schema_object still held by the injector thread, instead of having to create a new schema_object as normally expected.

The SLOCK bitmap, which keeps track of which participants the coordinator is still waiting for, was set to 'all-1' when the schema_object was created. However, it is 'all-0' immediately before the injector thread is about to release (unref) it. Thus, if the coordinator managed to 're-get' this schema_object before it was released (and destructed) by the injector thread, we got a schema_object->slocks with 'all-0' instead of 'all-1' as expected. This caused a total breakdown of the schema distribution protocol.

The fix is to move the schema_object->slock 'all-1' setting from the creation of new schema_object() to the place where the schema distribution coordinator initiates waiting for schema operations to be distributed. This covers both scenarios: the one where we had to create a new schema_object, and the one where we reuse an existing not-yet-released schema_object.
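The interleaving described above can be replayed in a small single-threaded sketch. This is not the actual patch: `SchemaObject`, `SchemaRegistry`, `coordinator_start_wait`, `participant_ack`, `replay_late_release`, and the 32-bit bitmap width are all hypothetical names and simplifications chosen for illustration, and the real code runs across threads under a lock. The sketch only shows why setting the bitmap to 'all-1' at the point where the coordinator starts waiting also covers the reuse case.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Hypothetical, simplified stand-in for the real schema_object.
struct SchemaObject {
  std::uint32_t slock = 0;  // one bit per participant still being waited for
  int refs = 0;             // ref-lock count
};

constexpr std::uint32_t kAllParticipants = 0xFFFFFFFFu;  // 'all-1' (assumed width)

// Hypothetical registry: get() hands back an existing, not-yet-released
// object when there is one, and only creates a new one otherwise.
class SchemaRegistry {
 public:
  SchemaObject* get(const std::string& key) {
    auto& slot = objects_[key];
    if (!slot) slot = std::make_unique<SchemaObject>();  // slock is NOT set here
    ++slot->refs;
    return slot.get();
  }

  void release(const std::string& key) {
    auto it = objects_.find(key);
    assert(it != objects_.end() && it->second->refs > 0);
    if (--it->second->refs == 0) objects_.erase(it);  // last ref destroys it
  }

 private:
  std::map<std::string, std::unique_ptr<SchemaObject>> objects_;
};

// Coordinator: take a ref and set 'all-1' where waiting begins, so the
// bitmap is correct for both a new and a reused object.
SchemaObject* coordinator_start_wait(SchemaRegistry& reg, const std::string& key) {
  SchemaObject* obj = reg.get(key);
  obj->slock = kAllParticipants;
  return obj;
}

// Participant: clear its own bit once it has completed the operation.
void participant_ack(SchemaObject& obj, unsigned node) {
  obj.slock &= ~(1u << node);
}

struct ReuseResult {
  bool reused;          // did the second operation get the same object?
  std::uint32_t slock;  // bitmap seen when the second wait starts
};

// Replays the problematic order: the injector releases its ref late.
ReuseResult replay_late_release() {
  SchemaRegistry reg;
  const std::string key = "db/t1";  // hypothetical key

  SchemaObject* first = coordinator_start_wait(reg, key);
  SchemaObject* injector_ref = reg.get(key);  // injector holds its own ref
  for (unsigned n = 0; n < 32; ++n) participant_ack(*first, n);  // now 'all-0'
  reg.release(key);  // coordinator is done, injector has not released yet

  SchemaObject* second = coordinator_start_wait(reg, key);  // reuses the object
  ReuseResult result{second == injector_ref, second->slock};

  reg.release(key);  // late injector release
  reg.release(key);  // second coordinator release destroys the object
  return result;
}
```

In this model, setting `slock` in the `SchemaObject` constructor instead would leave the second operation starting with 'all-0', which matches the failure described in the message.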