    f027c66f
    Bug#22494024, SQL NODE FAILS TO RECONNECT AFTER NETWORK OUTAGE; BLOCKS OTHER NODES · f027c66f
    Ole John Aske authored
    Failure to complete ndb_binlog_setup() as an 'atomic' operation
    could bring other mysqlds into a state where they believed that
    this mysqld was fully participating in schema change distribution, while
    it had not yet completed its ndb_binlog_setup().
    
    This could cause a deadlock on the global MDL (Meta Data Lock)
    where the mysqld still retrying ndb_binlog_setup() needs this lock,
    while another mysqld has grabbed it as part of a schema
    change operation. The latter mysqld will then wait for the other
    mysqld to act on the schema distribution changes, which it is not yet
    able to do.
    
    This bug fix introduces 2 important changes:
    1) Introduce the Thd_ndb::option: 'TNO_ALLOW_BINLOG_SETUP'
       which is set by ndb_binlog_setup() to request additional
       'setup rights'.
       ndbcluster_create_binlog_setup() is modified to not let
       THDs without the 'TNO_ALLOW_BINLOG_SETUP' option initiate
       schema distribution.
       (When we are in state '!ndb_schema_dist_is_ready()' )
    
    2) ndb_binlog_setup() is enhanced to tear down any partially
       set up event subscriptions when it fails to complete.
       (See: remove_all_event_operations() )
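
The two changes above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual MySQL Cluster code: ThdSketch, BinlogState, create_binlog_setup(), binlog_setup(), the fail_at parameter and the bit value chosen for TNO_ALLOW_BINLOG_SETUP are all hypothetical stand-ins for Thd_ndb, the binlog/schema distribution state, ndbcluster_create_binlog_setup(), ndb_binlog_setup() and remove_all_event_operations().

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical option bit; the real value in Thd_ndb is not given here.
constexpr std::uint32_t TNO_ALLOW_BINLOG_SETUP = 1u << 0;

// Hypothetical stand-in for Thd_ndb: only carries the option flags.
struct ThdSketch {
  std::uint32_t options = 0;
  bool has_option(std::uint32_t o) const { return (options & o) != 0; }
};

// Hypothetical stand-in for the binlog / schema distribution state.
struct BinlogState {
  bool schema_dist_ready = false;          // cf. ndb_schema_dist_is_ready()
  std::vector<std::string> subscriptions;  // event subscriptions set up so far
};

// Change 1: while schema distribution is not ready, only a THD holding
// the 'setup rights' may create event subscriptions.
bool create_binlog_setup(const ThdSketch& thd, BinlogState& st,
                         const std::string& table) {
  if (!st.schema_dist_ready && !thd.has_option(TNO_ALLOW_BINLOG_SETUP)) {
    return false;  // refused: would expose a half-finished setup
  }
  st.subscriptions.push_back(table);
  return true;
}

// Change 2: the setup either completes, or tears down everything it has
// subscribed to so far. 'fail_at' simulates a failure at that table index.
bool binlog_setup(ThdSketch& thd, BinlogState& st,
                  const std::vector<std::string>& tables,
                  std::size_t fail_at) {
  thd.options |= TNO_ALLOW_BINLOG_SETUP;  // request 'setup rights'
  bool ok = true;
  for (std::size_t i = 0; i < tables.size(); ++i) {
    if (i == fail_at || !create_binlog_setup(thd, st, tables[i])) {
      ok = false;
      break;
    }
  }
  thd.options &= ~TNO_ALLOW_BINLOG_SETUP;  // drop the rights again
  if (!ok) {
    st.subscriptions.clear();  // cf. remove_all_event_operations()
    return false;              // caller is expected to retry later
  }
  st.schema_dist_ready = true;
  return true;
}
```

In this sketch a failed binlog_setup() leaves no subscriptions behind, so other participants never observe a partially set up node, and ordinary THDs are refused until a later retry succeeds.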
    
    In addition to this there are some other changes:
    - Refactoring to clean up related code.
    - Added correct counting of 'unhandled' if ndbcluster_find_all_files()
      fails in ndbcluster_create_binlog_setup(). This is required for
      ndb_binlog_setup() to correctly catch setup failures.
    - Corrected an incorrect return value when create_cluster_sys_table()
      can't create due to missing connections. (Required by 'setup')
    - Removed now-obsolete static functions previously used by 'setup'.
      They were required as 'setup' tried to handle leftovers from
      a previous failed 'setup', which should not be possible anymore.
      (Asserts added)
    - Moved ndb_schema_dist_is_ready() from ndb_schema_dist.cc ->
      ha_ndbcluster_binlog.cc. Added missing mutex.