Managing Percona XtraDB Cluster with Puppet

Last month I spoke at the Percona Live conference about MySQL and Puppet. There was a lot of interest in the talk, so I figured I'd write a blog post about it as well. I used the Galera module I wrote as an example in the session, so this post will be specifically about Galera.

Prerequisites

Setting up virtualbox

We use specific network settings for VirtualBox in our Vagrantfile, so we'll need to make sure VirtualBox is configured properly. Inside VirtualBox, go to Preferences -> Network -> Host-only Networks (this is on a Mac; it may be different on other host OSes). Edit vboxnet0, or add it if the list is empty. Use the following settings to make sure your VMs will use the IPs defined in the Vagrantfile:

Adapter tab:

  • IPv4 Address: 192.168.56.1
  • IPv4 Network Mask: 255.255.255.0
  • IPv6 Settings can stay on their defaults

DHCP tab:

  • tick "Enable server"
  • Server Address: 192.168.56.100
  • Server Mask: 255.255.255.0
  • Lower Address Bound: 192.168.56.101
  • Upper Address Bound: 192.168.56.254

DHCP is not strictly needed (we set static IPs in the Vagrantfile), but if you add other servers to your testing later on, it's convenient to have them in the same subnet.
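
If you prefer the command line over the GUI, roughly the same host-only network can be created with VBoxManage. This is only a sketch of the equivalent commands; double-check the flags against your VirtualBox version:

$ VBoxManage hostonlyif create
$ VBoxManage hostonlyif ipconfig vboxnet0 --ip 192.168.56.1 --netmask 255.255.255.0
$ VBoxManage dhcpserver add --ifname vboxnet0 --ip 192.168.56.100 --netmask 255.255.255.0 --lowerip 192.168.56.101 --upperip 192.168.56.254 --enable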

Getting the puppet master up and running

Now that VirtualBox is ready for action, let's grab the code and fire up the Puppet master with Vagrant:

$ git clone https://github.com/olindata/olindata-galera-puppet-demo.git olindata-galera-demo
$ cd olindata-galera-demo/vagrant
$ vagrant up master
[..Wait a few minutes, grab coffee and read the rest of this post..]

Note that the vagrant up command throws errors here and there; they are okay, as they are corrected later by the master_setup.sh script. To check that everything completed, log into the master and check which process is listening on port 8140 (it should be httpd). In addition, a puppet agent -t run should complete without problems:

$ vagrant ssh master
[vagrant@master ~]$ sudo su -
[root@master ~]# netstat -plant | grep 8140
tcp        0      0 :::8140                     :::*                        LISTEN      5305/httpd
[root@master ~]# puppet agent -t
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
Info: Loading facts in /var/lib/puppet/lib/facter/concat_basedir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/etckepper_puppet.rb
Info: Caching catalog for master.olindata.vm
Info: Applying configuration version '1397908319'
Notice: Finished catalog run in 5.30 seconds

If the output of the commands is as shown above, the Puppet master is now ready for the agents to be brought up.

Bringing up the Galera nodes

First node

The Galera Puppet module is quite nice, but it has one big caveat at the moment: bootstrapping a cluster. The problem is that when Puppet runs on a node, it has a hard time figuring out whether that node is the first node in a cluster (and thus needs to be bootstrapped) or whether it is joining an existing cluster. A solution would be to write a little script that checks all the nodes in the wsrep_cluster_address variable to see if any of them are already up, but that is neither very elegant (that's exactly the kind of logic we're trying to keep out of Puppet) nor implemented at present.
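
For the curious, such a check could look roughly like the sketch below. It is only an illustration of the idea, not part of the module, and it assumes wsrep_cluster_address can be read from /etc/mysql/my.cnf and that netcat is installed:

#!/bin/bash
# Sketch: probe the peers listed in wsrep_cluster_address on the Galera port.
# Exit 0 when a running peer is found (join it), exit 1 when none are (bootstrap).
PEERS=$(grep -oP 'wsrep_cluster_address\s*=\s*"?gcomm://\K[^"]*' /etc/mysql/my.cnf | tr ',' ' ')
for peer in $PEERS; do
  host=${peer%%:*}
  # skip our own address
  ip addr | grep -q "inet ${host}/" && continue
  if nc -z -w 2 "$host" 4567; then
    echo "running peer found at ${host}, join the existing cluster"
    exit 0
  fi
done
echo "no running peers found, bootstrap a new cluster"
exit 1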

Since most of the time we'll be adding nodes to an already existing cluster, we have opted for that to be the default in the Galera module. This in turn means that for this demo we need to bring up one VM first, bootstrap Galera on it and then bring up the other nodes. (Note: elegant solutions to this problem are welcome in the comments!)

Let's start by bringing up the VM and SSHing in as root:

$ vagrant up galera000
Bringing machine 'galera000' up with 'virtualbox' provider...
==> galera000: Importing base box 'debian-73-x64-virtualbox-puppet'...
==> galera000: Matching MAC address for NAT networking...
==> galera000: Setting the name of the VM: vagrant_galera000_1397908792529_17411
==> galera000: Fixed port collision for 22 => 2222. Now on port 2200.
==> galera000: Clearing any previously set network interfaces...
==> galera000: Preparing network interfaces based on configuration...
    galera000: Adapter 1: nat
    galera000: Adapter 2: hostonly
==> galera000: Forwarding ports...
    galera000: 22 => 2200 (adapter 1)
==> galera000: Booting VM...
==> galera000: Waiting for machine to boot. This may take a few minutes...
    galera000: SSH address: 127.0.0.1:2200
    galera000: SSH username: vagrant
    galera000: SSH auth method: private key
    galera000: Error: Connection timeout. Retrying...
==> galera000: Machine booted and ready!
==> galera000: Checking for guest additions in VM...
==> galera000: Setting hostname...
==> galera000: Configuring and enabling network interfaces...
==> galera000: Mounting shared folders...
    galera000: /vagrant => /Users/walterheck/Source/olindata-galera-demo/vagrant
==> galera000: Running provisioner: shell...
    galera000: Running: /var/folders/4x/366j5zl15b1b4z7t6l7jf6zw0000gn/T/vagrant-shell20140419-2728-1wfc0hm
stdin: is not a tty
$ vagrant ssh galera000
Linux vagrant 3.2.0-4-amd64 #1 SMP Debian 3.2.51-1 x86_64
Last login: Wed Feb  5 12:49:09 2014 from 10.0.2.2
vagrant@galera000:~$ sudo su -
root@galera000:~#

Next up, we run the Puppet agent on it. Note that since we have autosigning enabled on the Puppet master, the first run doesn't need to wait for a certificate to be signed manually. The Puppet run will have some errors, but we can live with that.
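
(For reference, open source Puppet reads its autosigning rules from autosign.conf in the confdir; a wildcard for the demo domain is the simplest form. Whether master_setup.sh does exactly this is an assumption, shown only to illustrate the mechanism.)

# /etc/puppet/autosign.conf (sketch)
*.olindata.vm

Kick off the agent run: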

root@galera000:~# puppet agent -t

In the output (too much to display here), you'll see red lines that complain about not being able to start mysql:

Error: Could not start Service[mysqld]: Execution of '/etc/init.d/mysql start' returned 1:
Error: /Stage[main]/Mysql::Server::Service/Service[mysqld]/ensure: change from stopped to running failed: Could not start Service[mysqld]: Execution of '/etc/init.d/mysql start' returned 1:
Error: Could not start Service[mysqld]: Execution of '/etc/init.d/mysql start' returned 1:
Error: /Stage[main]/Mysql::Server::Service/Service[mysqld]/ensure: change from stopped to running failed: Could not start Service[mysqld]: Execution of '/etc/init.d/mysql start' returned 1:

This is not actually true: when you check for the MySQL process after the Puppet run, it is there:

root@galera000:~# ps aux | grep mysql
root      9881  0.0  0.0   4176   440 ?        S    05:05   0:00 /bin/sh /usr/bin/mysqld_safe
mysql    10209  0.2 64.8 830292 330188 ?       Sl   05:05   0:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/lib/mysql/galera000.err --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
root     12456  0.0  0.1   7828   872 pts/0    S+   05:11   0:00 grep mysql

Let's kill MySQL first:

root@galera000:~# pkill -9ef mysql
root@galera000:~# ps aux | grep mysql
root     12475  0.0  0.1   7828   876 pts/0    S+   05:12   0:00 grep mysql

Next up, we bootstrap the cluster:

root@galera000:~# service mysql bootstrap-pxc
[....] Bootstrapping Percona XtraDB Cluster database server: mysqld[....] Please take a l[FAILt the syslog. ... failed!
 failed!

Somehow the init script thinks it failed, but it didn't. To make sure it worked, log into MySQL and check the wsrep_cluster_* status variables. The output should look something like this:

mysql> show global status like 'wsrep_cluster%';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 1                                    |
| wsrep_cluster_size       | 1                                    |
| wsrep_cluster_state_uuid | 7665992d-bc38-11e3-a2c4-9aefb5dea18a |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
4 rows in set (0.00 sec)

Now that MySQL is properly bootstrapped, we can run the Puppet agent one more time and see it complete properly now:

root@galera000:~# puppet agent -t
Info: Retrieving plugin
Info: Loading facts in /var/lib/puppet/lib/facter/etckepper_puppet.rb
Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/concat_basedir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
Info: Caching catalog for galera000.olindata.vm
Info: Applying configuration version '1398437502'
Notice: /Stage[main]/Xinetd/Service[xinetd]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Xinetd/Service[xinetd]: Unscheduling refresh on Service[xinetd]
Notice: /Stage[main]/Mcollective::Server::Config::Factsource::Yaml/File[/etc/mcollective/facts.yaml]/content:
--- /etc/mcollective/facts.yaml 2014-04-25 08:17:39.000000000 -0700
+++ /tmp/puppet-file20140425-17657-1jhcy3j  2014-04-25 08:23:25.000000000 -0700
@@ -63,7 +63,7 @@
   operatingsystemmajrelease: "7"
   operatingsystemrelease: "7.3"
   osfamily: Debian
-  path: "/usr/bin:/bin:/usr/sbin:/sbin"
+  path: "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
   physicalprocessorcount: "1"
   processor0: "Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz"
   processorcount: "1"

Info: /Stage[main]/Mcollective::Server::Config::Factsource::Yaml/File[/etc/mcollective/facts.yaml]: Filebucketed /etc/mcollective/facts.yaml to puppet with sum 3a6aabbe41f4023031295a8ac3735df3
Notice: /Stage[main]/Mcollective::Server::Config::Factsource::Yaml/File[/etc/mcollective/facts.yaml]/content: content changed '{md5}3a6aabbe41f4023031295a8ac3735df3' to '{md5}227af082af9547f423040a45afec7800'
Notice: /Stage[main]/Mysql::Server::Root_password/Mysql_user[root@localhost]/password_hash: defined 'password_hash' as '*55070223BD04C680F8BD1586E6D12989358B4B55'
Notice: /Stage[main]/Mysql::Server::Root_password/File[/root/.my.cnf]/ensure: defined content as '{md5}af3f5d93645d29f88fd907e78d53806b'
Notice: /Stage[main]/Galera::Health_check/Mysql_user[mysqlchk_user@127.0.0.1]/ensure: created
Notice: /Stage[main]/Galera/Mysql_user[sst_xtrabackup@%]/ensure: created
Notice: /Stage[main]/Galera/Mysql_grant[sst_xtrabackup@%/*.*]/privileges: privileges changed ['USAGE'] to 'CREATE TABLESPACE LOCK TABLES RELOAD REPLICATION CLIENT SUPER'
Notice: Finished catalog run in 4.65 seconds

Now that this is done, we're ready to move on to the other nodes.

Subsequent nodes

Next, we bring up the other three vagrant nodes. The output from vagrant up will look like this:

$ vagrant up galera001
Bringing machine 'galera001' up with 'virtualbox' provider...
==> galera001: Importing base box 'debian-73-x64-virtualbox-puppet'...
==> galera001: Matching MAC address for NAT networking...
==> galera001: Setting the name of the VM: vagrant_galera001_1398437027038_8689
==> galera001: Fixed port collision for 22 => 2222. Now on port 2201.
==> galera001: Clearing any previously set network interfaces...
==> galera001: Preparing network interfaces based on configuration...
    galera001: Adapter 1: nat
    galera001: Adapter 2: hostonly
==> galera001: Forwarding ports...
    galera001: 22 => 2201 (adapter 1)
==> galera001: Booting VM...
==> galera001: Waiting for machine to boot. This may take a few minutes...
    galera001: SSH address: 127.0.0.1:2201
    galera001: SSH username: vagrant
    galera001: SSH auth method: private key
    galera001: Error: Connection timeout. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
    galera001: Error: Remote connection disconnect. Retrying...
==> galera001: Machine booted and ready!
==> galera001: Checking for guest additions in VM...
==> galera001: Setting hostname...
==> galera001: Configuring and enabling network interfaces...
==> galera001: Mounting shared folders...
    galera001: /vagrant => /Users/walterheck/Source/olindata-galera-demo/vagrant
==> galera001: Running provisioner: shell...
    galera001: Running: /var/folders/4x/366j5zl15b1b4z7t6l7jf6zw0000gn/T/vagrant-shell20140425-19022-fiys2z
stdin: is not a tty

Do the same for galera002 and galera003, then log into galera001 and run puppet agent -t:

$ vagrant ssh galera001
Linux vagrant 3.2.0-4-amd64 #1 SMP Debian 3.2.51-1 x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Wed Feb  5 12:49:09 2014 from 10.0.2.2
vagrant@galera001:~$ sudo su -
root@galera001:~# puppet agent -t
Info: Creating a new SSL key for galera001.olindata.vm
Info: Caching certificate for ca
Info: csr_attributes file loading from /etc/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for galera001.olindata.vm
Info: Certificate Request fingerprint (SHA256): A2:FF:3B:6F:7C:BA:FF:5B:65:C7:36:6F:CF:D2:FD:10:50:7C:63:7E:26:F1:F5:06:54:B8:C5:E7:2D:E2:17:37
Info: Caching certificate for galera001.olindata.vm
Info: Caching certificate_revocation_list for ca
Info: Caching certificate for ca
Info: Retrieving plugin
Notice: /File[/var/lib/puppet/lib/puppet]/ensure: created
Notice: /File[/var/lib/puppet/lib/puppet/provider]/ensure: created
Notice: /File[/var/lib/puppet/lib/puppet/provider/database_user]/ensure: created
[..snip..]
Notice: /Stage[main]/Profile::Mysql::Base/Package[xtrabackup]/ensure: ensure changed 'purged' to 'latest'
Info: Class[Mcollective::Server::Config]: Scheduling refresh of Class[Mcollective::Server::Service]
Info: Class[Mcollective::Server::Service]: Scheduling refresh of Service[mcollective]
Notice: /Stage[main]/Mcollective::Server::Service/Service[mcollective]: Triggered 'refresh' from 1 events
Info: Creating state file /var/lib/puppet/state/state.yaml
Notice: Finished catalog run in 137.24 seconds

When the puppet agent run is finished, we do a similar round of pkill and service start:

root@galera001:~# pkill -9ef mysql
root@galera001:~# ps aux | grep mysql
root     12077  0.0  0.1   7828   876 pts/0    S+   08:28   0:00 grep mysql
root@galera001:~# service mysql start
[FAIL] Starting MySQL (Percona XtraDB Cluster) database server: mysqld[....] Please take a look at the syslog. ... failed!
 failed!

If you then look at the MySQL error log, you'll see something like this after a few seconds, indicating the node has joined our cluster:

root@galera001:~# tail /var/log/mysql/error.log
2014-04-25 08:29:00 12900 [Note] WSREP: inited wsrep sidno 1
2014-04-25 08:29:00 12900 [Note] WSREP: SST received: 7019fb90-cc8d-11e3-9540-1248cb76bdcb:6
2014-04-25 08:29:00 12900 [Note] WSREP: 0.0 (galera001): State transfer from 1.0 (galera000) complete.
2014-04-25 08:29:00 12900 [Note] WSREP: Shifting JOINER -> JOINED (TO: 6)
2014-04-25 08:29:00 12900 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.15-63.0'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release 25.5, wsrep_25.5.r4061
2014-04-25 08:29:00 12900 [Note] WSREP: Member 0 (galera001) synced with group.
2014-04-25 08:29:00 12900 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 6)
2014-04-25 08:29:00 12900 [Note] WSREP: Synchronized with group, ready for connections
2014-04-25 08:29:00 12900 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
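
To double-check from the SQL side, the wsrep_cluster_size status variable should have gone up as well (supply root credentials as appropriate, since this node doesn't have its /root/.my.cnf yet):

root@galera001:~# mysql -p -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"

With galera000 and galera001 both in the cluster, it should report 2.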

Next up is a little hack. There's a Galera-specific dependency issue in the mysql module: it tries to create the root user with a password before it writes that info to the /root/.my.cnf file (which the module's commands use to avoid needing a hard-coded root password). Since fixing the module is outside the scope of this article, we'll cheat a little bit. Create a /root/.my.cnf file like this:

root@galera001:~# cat .my.cnf
[client]
user=root
host=localhost
password='khbrf9339'
socket=/var/lib/mysql/mysql.sock

After that, the Puppet agent run will complete successfully:

root@galera001:~# puppet agent -t
Info: Retrieving plugin
Info: Loading facts in /var/lib/puppet/lib/facter/etckepper_puppet.rb
Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/concat_basedir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
Info: Caching catalog for galera001.olindata.vm
Info: Applying configuration version '1398437502'
Notice: /Stage[main]/Xinetd/Service[xinetd]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Xinetd/Service[xinetd]: Unscheduling refresh on Service[xinetd]
Notice: Finished catalog run in 3.04 seconds

The last step is to restart the xinetd service one more time:

root@galera001:~# service xinetd restart
[ ok ] Stopping internet superserver: xinetd.
[ ok ] Starting internet superserver: xinetd.

Now, rinse and repeat the steps for galera001 on galera002 and galera003:

  • write the .my.cnf file
  • run puppet agent -t
  • pkill mysql, then start the service manually
  • run puppet agent -t again
  • service xinetd restart

After all this is done, run puppet agent -t on all nodes one more time, especially on galera000, which has an HAProxy instance running on it that will help us load balance the connections. HAProxy is automatically configured with the Galera nodes as they come up, and a Puppet agent run takes care of this.
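
If you'd rather not log into every VM again for this, the final round can also be kicked off from the host with vagrant ssh -c (just a convenience sketch; running puppet agent -t manually on each node works exactly the same):

$ for node in galera000 galera001 galera002 galera003; do
    vagrant ssh $node -c 'sudo puppet agent -t'
  done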

HAProxy

This demo cluster comes with an HAProxy instance running on galera000. Its HTTP status page should be accessible directly from the host, giving you insight into the status of all nodes. If you did all of the above successfully, the result should look like this:

Open a browser on your host and go to: http://192.168.56.100/haproxy?stats

[screenshot: HAProxy stats page]

We have created two listeners by default, with slightly different behaviour:

1) One listener (galera_reader, port 13306) divides incoming queries round-robin over its backends. This is the one to send all SELECT queries to.
2) The second listener (galera_writer, port 13307) always directs sessions to the same server, unless that server is unavailable. This is the one to send all write traffic to.
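
The actual HAProxy configuration is generated by Puppet, but conceptually the two listeners boil down to something like the snippet below. The backend addresses follow the demo's 192.168.56.x network, and the health check on port 9200 (the usual xinetd-based clustercheck setup) is an assumption for illustration:

# galera_reader: spread read connections round-robin over all nodes,
# using the xinetd HTTP health check instead of a bare TCP check
listen galera_reader
  bind *:13306
  mode tcp
  balance roundrobin
  option httpchk
  server galera000 192.168.56.100:3306 check port 9200
  server galera001 192.168.56.101:3306 check port 9200
  server galera002 192.168.56.102:3306 check port 9200
  server galera003 192.168.56.103:3306 check port 9200

# galera_writer: backups only take over when the primary server is down
listen galera_writer
  bind *:13307
  mode tcp
  balance roundrobin
  option httpchk
  server galera000 192.168.56.100:3306 check port 9200
  server galera001 192.168.56.101:3306 check port 9200 backup
  server galera002 192.168.56.102:3306 check port 9200 backup
  server galera003 192.168.56.103:3306 check port 9200 backup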

This setup assumes your application can split reads and writes this way, which is common in applications that previously ran on classic asynchronous replication. If your app can't do this, start by sending all traffic to galera_writer, then gradually implement functionality that sends selects to galera_reader.

Note that Galera replication is synchronous, so in theory you can send your writes to any node. In practice, however, this is not so simple once concurrency goes up; that discussion is for another blog post.

Summary

You are now ready to send queries to the two ports on the HAProxy node and watch them be distributed over the Galera cluster. Feel free to play around by shutting down certain nodes, then watching them come back up.
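
As a quick smoke test from the host, repeating a SELECT @@hostname through the reader port should hit a different node each time, while the writer port keeps answering from the same one. Use whatever MySQL user your setup allows to connect remotely (the demo's root user may well be limited to localhost); the user below is a placeholder:

$ mysql -h 192.168.56.100 -P 13306 -u someuser -p -e 'SELECT @@hostname;'
$ mysql -h 192.168.56.100 -P 13306 -u someuser -p -e 'SELECT @@hostname;'
$ mysql -h 192.168.56.100 -P 13307 -u someuser -p -e 'SELECT @@hostname;'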

In a follow-up article I'll discuss the Puppet repository structure used for this demo.


Comments

Problem with galera001

Everything works for me until I got to the step "Do the same for galera002 and galera003, then log into galera001 and run puppet agent -t:"

Here's what I get:

tail /var/log/mysql/error.log
2014-07-12 20:30:18 13570 [Note] WSREP: gcomm: connecting to group 'galera', peer '192.168.56.100:,192.168.56.101:,192.168.56.102:,192.168.56.103:'
2014-07-12 20:30:18 13570 [Warning] WSREP: (031353e7-0a3e-11e4-bb69-32ccc4450148, 'tcp://0.0.0.0:4567') address 'tcp://192.168.56.101:4567' points to own listening address, blacklisting
2014-07-12 20:30:21 13570 [Warning] WSREP: no nodes coming from prim view, prim not possible
2014-07-12 20:30:21 13570 [Note] WSREP: view(view_id(NON_PRIM,031353e7-0a3e-11e4-bb69-32ccc4450148,1) memb {
031353e7-0a3e-11e4-bb69-32ccc4450148,0
} joined {
} left {
} partitioned {
})
2014-07-12 20:30:21 13570 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.5109S), skipping check
root@galera001:~# tail /var/log/mysql/error.log
2014-07-12 20:32:36 17032 [Note] WSREP: gcomm: connecting to group 'galera', peer '192.168.56.100:,192.168.56.101:,192.168.56.102:,192.168.56.103:'
2014-07-12 20:32:36 17032 [Warning] WSREP: (55496b80-0a3e-11e4-9820-53e4942871b9, 'tcp://0.0.0.0:4567') address 'tcp://192.168.56.101:4567' points to own listening address, blacklisting
2014-07-12 20:32:39 17032 [Warning] WSREP: no nodes coming from prim view, prim not possible
2014-07-12 20:32:39 17032 [Note] WSREP: view(view_id(NON_PRIM,55496b80-0a3e-11e4-9820-53e4942871b9,1) memb {
55496b80-0a3e-11e4-9820-53e4942871b9,0
} joined {
} left {
} partitioned {
})
2014-07-12 20:32:39 17032 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.51012S), skipping check
root@galera001:~# pkill -9ef mysql
root@galera001:~# ps aux | grep mysql
root 18102 0.0 0.1 7828 872 pts/0 S+ 20:34 0:00 grep mysql
root@galera001:~# service mysql start
[....] Starting MySQL (Percona XtraDB Cluster) database server: mysqld . . . . . . . . . . . . . . . . . . . . . . . . . . . . [FAIL . . . . . . .[....] The server quit without updating PID file (/var/run/mysqld/mysqld.pid). ... failed!
failed!
root@galera001:~# tail /var/log/mysql/error.log
2014-07-12 20:34:53 18972 [ERROR] WSREP: gcs connect failed: Connection timed out
2014-07-12 20:34:53 18972 [ERROR] WSREP: wsrep::connect() failed: 7
2014-07-12 20:34:53 18972 [ERROR] Aborting

2014-07-12 20:34:53 18972 [Note] WSREP: Service disconnected.
2014-07-12 20:34:54 18972 [Note] WSREP: Some threads may fail to exit.
2014-07-12 20:34:54 18972 [Note] Binlog end
2014-07-12 20:34:54 18972 [Note] /usr/sbin/mysqld: Shutdown complete

140712 20:34:54 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

Fixed

I misread your steps. Once I ran the puppet agent -t on the other two nodes and went back to the first node and restarted mysql it works as expected.

Glad you got this part to


Glad you got this part to work in the end. Were my steps unclear or did you not read properly? Aka: should I adjust anything?

.my.cnf file error

I tried creating the .my.cnf file and running puppet agent -t and I kept getting errors about the file not ending in a newline. Tried editing in vi and nano and had the same problem. Then OSX crashed. Apparently with a White Screen of Death.

Ouch. VirtualBox is some of


Ouch. VirtualBox is some of the only software I have seen crash MacOS hard. From your comments below I believe this was fixed?

Fixed .my.cnf newline error

Fixed newline error:

root@galera001:~# sed -i-e '$a\' .my.cnf
root@galera001:~# puppet agent -t

Galera cluster won't run

Now I get this when I start Percona and tail the logs:

root@galera001:~# tail /var/log/mysql/error.log
2014-07-13 06:16:30 14935 [ERROR] WSREP: gcs connect failed: Connection timed out
2014-07-13 06:16:30 14935 [ERROR] WSREP: wsrep::connect() failed: 7
2014-07-13 06:16:30 14935 [ERROR] Aborting

2014-07-13 06:16:30 14935 [Note] WSREP: Service disconnected.
2014-07-13 06:16:31 14935 [Note] WSREP: Some threads may fail to exit.
2014-07-13 06:16:31 14935 [Note] Binlog end
2014-07-13 06:16:31 14935 [Note] /usr/sbin/mysqld: Shutdown complete

Stuck - Can't bootstrap mysql on galera000

Can't get MySQL to restart or bootstrap on galera000 or any other node. Tried deleting the sock file because of the dirty shutdown. Am I getting a bootstrap error because of SSL or an SSL error because of MySQL?

root@galera000:~# service mysql bootstrap-pxc
[....] Bootstrapping Percona XtraDB Cluster database server: mysqld .[....] The server quit without updating PID file (/var/run/mys[FAILysqld.pid). ... failed!
failed!
root@galera000:~# tail /var/log/mysql/error.log
2014-07-13 08:11:18 4251 [Warning] Failed to setup SSL
2014-07-13 08:11:18 4251 [Warning] SSL error: SSL_CTX_set_default_verify_paths failed
2014-07-13 08:11:18 4251 [Note] RSA private key file not found: /var/lib/mysql//private_key.pem. Some authentication plugins will not work.
2014-07-13 08:11:18 4251 [Note] RSA public key file not found: /var/lib/mysql//public_key.pem. Some authentication plugins will not work.
2014-07-13 08:11:18 4251 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3306
2014-07-13 08:11:18 4251 [Note] - '0.0.0.0' resolves to '0.0.0.0';
2014-07-13 08:11:18 4251 [Note] Server socket created on IP: '0.0.0.0'.
2014-07-13 08:11:18 4251 [ERROR] /usr/sbin/mysqld: Can't create/write to file '/var/run/mysqld/mysqld.pid' (Errcode: 2 - No such file or directory)
2014-07-13 08:11:18 4251 [ERROR] Can't start server: can't create PID file: No such file or directory
140713 08:11:18 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
root@galera000:~#

root@galera000:~# puppet agent -t
Info: Retrieving plugin
Info: Loading facts in /var/lib/puppet/lib/facter/etckepper_puppet.rb
Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/concat_basedir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
Info: Caching catalog for galera000.olindata.vm
Info: Applying configuration version '1405259207'
Notice: /Stage[main]/Xinetd/Service[xinetd]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Xinetd/Service[xinetd]: Unscheduling refresh on Service[xinetd]
Error: Could not start Service[mysqld]: Execution of '/etc/init.d/mysql start' returned 1:
Error: /Stage[main]/Mysql::Server::Service/Service[mysqld]/ensure: change from stopped to running failed: Could not start Service[mysqld]: Execution of '/etc/init.d/mysql start' returned 1:
Error: Could not start Service[mysqld]: Execution of '/etc/init.d/mysql start' returned 1:
Error: /Stage[main]/Mysql::Server::Service/Service[mysqld]/ensure: change from stopped to running failed: Could not start Service[mysqld]: Execution of '/etc/init.d/mysql start' returned 1:
Error: Could not prefetch mysql_user provider 'mysql': Execution of '/usr/bin/mysql --defaults-extra-file=/root/.my.cnf -NBe SELECT CONCAT(User, '@',Host) AS User FROM mysql.user' returned 1: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111)

Notice: /Stage[main]/Mysql::Server::Root_password/Mysql_user[root@localhost]: Dependency Service[mysqld] has failures: true
Warning: /Stage[main]/Mysql::Server::Root_password/Mysql_user[root@localhost]: Skipping because of failed dependencies
Notice: /Stage[main]/Mysql::Server::Root_password/File[/root/.my.cnf]: Dependency Service[mysqld] has failures: true
Warning: /Stage[main]/Mysql::Server::Root_password/File[/root/.my.cnf]: Skipping because of failed dependencies
Notice: /Stage[main]/Galera::Health_check/Mysql_user[mysqlchk_user@127.0.0.1]: Dependency Service[mysqld] has failures: true
Warning: /Stage[main]/Galera::Health_check/Mysql_user[mysqlchk_user@127.0.0.1]: Skipping because of failed dependencies
Notice: /Stage[main]/Mysql::Server/Anchor[mysql::server::end]: Dependency Service[mysqld] has failures: true
Warning: /Stage[main]/Mysql::Server/Anchor[mysql::server::end]: Skipping because of failed dependencies
Notice: /Stage[main]/Galera/Mysql_user[sst_xtrabackup@%]: Dependency Service[mysqld] has failures: true
Warning: /Stage[main]/Galera/Mysql_user[sst_xtrabackup@%]: Skipping because of failed dependencies
Error: Could not prefetch mysql_grant provider 'mysql': Execution of '/usr/bin/mysql --defaults-extra-file=/root/.my.cnf -NBe SELECT CONCAT(User, '@',Host) AS User FROM mysql.user' returned 1: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111)

Notice: /Stage[main]/Galera/Mysql_grant[sst_xtrabackup@%/*.*]: Dependency Service[mysqld] has failures: true
Warning: /Stage[main]/Galera/Mysql_grant[sst_xtrabackup@%/*.*]: Skipping because of failed dependencies
Notice: Finished catalog run in 88.01 seconds

When you get errors from


When you get errors from Puppet saying it can't get MySQL to start, forget about Puppet for a bit. Focus on MySQL first, and check the logs to see why it won't start. When MySQL has problems starting, the last 10 lines of the log are usually not enough; look further up the logfile for the first error it throws after you try to start MySQL. Every error after that first one might just be a snowball effect, and solving the first error will often solve the others as well.

What we have found is that

What we have found is that MySQL will take more than the expected amount of time to start up the first time around. We have modified the init script to wait 60s when starting the service and it has resolved this exact issue.

```
diff --git a/mysql b/mysql
index 17cd8fa..b317560 100755
--- a/mysql
+++ b/mysql
@@ -129,7 +129,7 @@ case "$cmd" in
/usr/bin/mysqld_safe $WSREP_OPTS $other_args > /dev/null 2>&1 &

# 6s was reported in #352070 to be too few when using ndbcluster
- for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14; do
+ for i in `seq 1 60`; do
sleep 1
if mysqld_status check_alive nowarn ; then break; fi
log_progress_msg "."
```

Did you use puppetlabs-mysql to install Percona?

Hi,

I found your post googling for a way to install and manage Percona with the puppetlabs mysql module. The module is tagged with percona on the forge, but I have no idea how to get it to install Percona on Debian...

Could you please confirm if you are using the module to install percona with puppet and I will gladly try to dig into your code!

Thanks in advance, regards from munich

Jochen
