How To : Recover the InnoDB Cluster from complete outage.

I’ve been playing with InnoDB cluster the whole weekend. on this post i’m trying to start my cluster from complete outage.

i don’t have idea which is the primary to bootstrap. using the manual way i know how to do it.
But now, i want to use MySQL Shell.

after shutdown all my 3 node then start it again. below is the current status.

+--------------------------------------+----------------+-------------+--------------+-------------+
| MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_TYPE |
+--------------------------------------+----------------+-------------+--------------+-------------+
| a1e3e0e0-2715-11e7-b2cd-0800274048c8 | 192.168.56.101 |        3306 | OFFLINE      | SECONDARY   |
+--------------------------------------+----------------+-------------+--------------+-------------+
1 row in set (0.00 sec)

root@localhost [(none)]>

+--------------------------------------+----------------+-------------+--------------+-------------+
| MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_TYPE |
+--------------------------------------+----------------+-------------+--------------+-------------+
| a293fead-2715-11e7-bdf9-080027d1779e | 192.168.56.102 |        3306 | OFFLINE      | SECONDARY   |
+--------------------------------------+----------------+-------------+--------------+-------------+
1 row in set (0.01 sec)

root@localhost [(none)]>

+--------------------------------------+----------------+-------------+--------------+-------------+
| MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_TYPE |
+--------------------------------------+----------------+-------------+--------------+-------------+
| a1c5e25e-2715-11e7-bbe4-0800273fb9a2 | 192.168.56.103 |        3306 | OFFLINE      | SECONDARY   |
+--------------------------------------+----------------+-------------+--------------+-------------+
1 row in set (0.00 sec)

root@localhost [(none)]>

On one of my router its shows below.

2017-04-23 07:07:33 ERROR   [7f154effd700] Failed connecting to metadata servers
2017-04-23 07:12:33 ERROR   [7f154effd700] Failed connecting with Metadata Server 192.168.56.103:3306: Can't connect to MySQL server on '192.168.56.103' (113) (2003)
2017-04-23 07:12:33 ERROR   [7f154effd700] Failed connecting with Metadata Server 192.168.56.101:3306: Can't connect to MySQL server on '192.168.56.101' (113) (2003)
2017-04-23 07:12:33 ERROR   [7f154effd700] Failed connecting with Metadata Server 192.168.56.102:3306: Can't connect to MySQL server on '192.168.56.102' (113) (2003)
2017-04-23 07:12:33 ERROR   [7f154effd700] Failed connecting with any of the bootstrap servers
2017-04-23 07:12:33 ERROR   [7f154effd700] Failed connecting to metadata servers

Now, using MySQL Shell let’s our cluster again.i’ll connect to node 1.

[admin@lnxgrprouter tmp]$ mysqlsh -u root -p -h 192.168.56.101
Creating a Session to 'root@192.168.56.101'
Enter password:
Classic Session successfully established. No default schema selected.
Welcome to MySQL Shell 1.0.9

Copyright (c) 2016, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help', '\h' or '\?' for help, type '\quit' or '\q' to exit.

Currently in JavaScript mode. Use \sql to switch to SQL mode and execute queries.
mysql-js> cluster.status();
ReferenceError: cluster is not defined
mysql-js> var cluster = dba.rebootClusterFromCompleteOutage();
Reconfiguring the default cluster from complete outage...

The instance '192.168.56.103:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y|N]: y

The instance '192.168.56.102:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y|N]: y

Dba.rebootClusterFromCompleteOutage: Dba.rebootClusterFromCompleteOutage: The active session instance isn't the most updated in comparison with the ONLINE instances of the Cluster's metadata. Please use the most up to date instance: '192.168.56.103:3306'. (RuntimeError)
mysql-js>

Shell is telling us to connect to node 3.

[admin@lnxgrprouter tmp]$ mysqlsh -u root -p -h 192.168.56.103
Creating a Session to 'root@192.168.56.103'
Enter password:
Classic Session successfully established. No default schema selected.
Welcome to MySQL Shell 1.0.9

Copyright (c) 2016, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help', '\h' or '\?' for help, type '\quit' or '\q' to exit.

Currently in JavaScript mode. Use \sql to switch to SQL mode and execute queries.
mysql-js> var cluster = dba.rebootClusterFromCompleteOutage();
Reconfiguring the default cluster from complete outage...

The instance '192.168.56.101:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y|N]: y

The instance '192.168.56.102:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y|N]: y


The cluster was successfully rebooted.

mysql-js>

Let’s check our cluster status.

mysql-js> cluster.status();
{
    "clusterName": "enets2_cluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "192.168.56.103:3306",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "192.168.56.101:3306": {
                "address": "192.168.56.101:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "192.168.56.102:3306": {
                "address": "192.168.56.102:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "192.168.56.103:3306": {
                "address": "192.168.56.103:3306",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    }
}
mysql-js> \sql
Switching to SQL mode... Commands end with ;
mysql-sql> 

+--------------------------------------+----------------+-------------+--------------+-------------+
| MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_TYPE |
+--------------------------------------+----------------+-------------+--------------+-------------+
| a1c5e25e-2715-11e7-bbe4-0800273fb9a2 | 192.168.56.103 |        3306 | ONLINE       | PRIMARY     |
| a1e3e0e0-2715-11e7-b2cd-0800274048c8 | 192.168.56.101 |        3306 | ONLINE       | SECONDARY   |
| a293fead-2715-11e7-bdf9-080027d1779e | 192.168.56.102 |        3306 | ONLINE       | SECONDARY   |
+--------------------------------------+----------------+-------------+--------------+-------------+
3 rows in set (0.00 sec)
mysql-sql>

Now life is easier with MySQL Shell.

Advertisements

3 thoughts on “How To : Recover the InnoDB Cluster from complete outage.

  1. Not working for me when I did my tests. Say my 3 servers are named mysql01, mysql02 and mysql03. When I issue the command to reboot the cluster from total outage on mysql01 I get:

    Dba.rebootClusterFromCompleteOutage: Dba.rebootClusterFromCompleteOutage: The active session instance isn’t the most updated in comparison with the ONLINE instances of the Cluster’s metadata. Please use the most up to date instance: ‘mysql03:3306’.

    Ok, So I perform the same operation on mysql03 and I get:

    Dba.rebootClusterFromCompleteOutage: Dba.rebootClusterFromCompleteOutage: The active session instance isn’t the most updated in comparison with the ONLINE instances of the Cluster’s metadata. Please use the most up to date instance: ‘mysql01:3306’.

    Awesome! Now what? I’ve been searching for hours and not finding anything helpful on how to overcome this and get my cluster online.

    • thanks, in my case it worked. but recently i also have same problem again when using MySQL Shell.
      i ended up using the manual way. same a i do here https://ronniethedba.wordpress.com/2017/04/22/this-member-has-more-executed-transactions-than-those-present-in-the-group/

      anyway below info form my oracle sr.

      Hello Ronnie,

      If you getCluster doesn’t work, then you should use this one:
      var cluster = dba.rebootClusterFromCompleteOutage();

      After that you can try rescan (you might need to do it for each instance you reconfigured manually):
      cluster.rescan()

      If it still doesn’t work as you want, then you can dissolve this cluster with “cluster.dissolve()” (please note, however, that it can stop Group Replication on this instance, if it was running, so be careful with it). After that you can recreate Cluster using standard steps. If you already have properly configured Group Replication, then you can use these steps:
      https://dev.mysql.com/doc/refman/5.7/en/mysql-innodb-cluster-working-with-cluster.html#mysql-innodb-cluster-working-with-group-replication

      I would like to highlight that when you start using MySQL Shell and create a cluster there, you are not suppose to configure those instances manually anymore. It’s not recommended to combine manual configuring/managing and managing cluster with MySQL Shell.

      hope it helps.

      • The weird thing is that both “dba.rebootClusterFromCompleteOutage();” and “var cluster = dba.rebootClusterFromCompleteOutage();” were failing on my nodes with the same behavior I described above. Because “var cluster = dba.rebootClusterFromCompleteOutage();” was also failing, I couldn’t do either cluster.rescan() or cluster.dissove(), because the cluster variable failed to be set. It was really weird. I gave up trying to fix it since I needed the databases back online so I’m running on a single mysql instance. But I’ll look at the other link in your reply and recreate my test to see if a combination of both allows me to recover the cluster. Thanks for the additional info.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s