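Operating system: CentOS 6.6 x64. This article installs corosync, pacemaker and drbd from RPM packages, and installs mysql-5.6.29 from the binary tarball. It records an installation in which an existing base configuration was modified and then tested.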
[root@app1 soft]# vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.24 app1
192.168.0.25 app2
10.10.10.24 app1-priv
10.10.10.25 app2-priv

Note: the 10.10.10.x addresses are the heartbeat IPs, the 192.168.0.x addresses are the service IPs, and the VIP is 192.168.0.26.
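Since every later step relies on the node names resolving on both machines, it can be worth checking the hosts file before going further. The following is a minimal sketch, not part of the original procedure; the function name missing_hosts is hypothetical, and it only inspects the file it is given.

```shell
# Hypothetical helper (not from the original article): print every name
# that does NOT appear as a hostname in a hosts-format file.
# Usage: missing_hosts FILE NAME...
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Drop comments, then search the hostname columns (fields 2..NF).
        if ! awk -v n="$name" '
                { sub(/#.*/, "") }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit found ? 0 : 1 }' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

Running missing_hosts /etc/hosts app1 app2 app1-priv app2-priv on each node should print nothing when all four entries are present.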
sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config
setenforce 0
chkconfig iptables off
service iptables stop
app1:
[root@app1 ~]# ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
[root@app1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@app2
app2:
[root@app2 ~]# ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
[root@app2 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@app1
app1: /dev/sdb1 -> app2: /dev/sdb1
drbd84-utils-8.9.1-1.el6.elrepo.x86_64.rpm
kmod-drbd84-8.4.5-504.1.el6.x86_64.rpm
# rpm -ivh drbd84-utils-8.9.5-1.el6.elrepo.x86_64.rpm kmod-drbd84-8.4.5-504.1.el6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:drbd84-utils           ########################################### [ 50%]
   2:kmod-drbd84            ########################################### [100%]
Working. This may take some time ...
Done.
#
Run the following on both app1 and app2, and add it to /etc/rc.local so the module is loaded at boot.
modprobe drbd
lsmod | grep drbd
[root@app1 ~]# vi /etc/drbd.d/global_common.conf
global {
    usage-count no;
}
common {
    protocol C;
    disk {
        on-io-error detach;
        no-disk-flushes;
        no-md-flushes;
    }
    net {
        sndbuf-size 512k;
        max-buffers 8000;
        unplug-watermark 1024;
        max-epoch-size 8000;
        cram-hmac-alg "sha1";
        shared-secret "hdhwXes23sYEhart8t";
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    syncer {
        rate 300M;
        al-extents 517;
    }
}
resource data {
    on app1 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.10.10.24:7788;
        meta-disk internal;
    }
    on app2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.10.10.25:7788;
        meta-disk internal;
    }
}
Run on both app1 and app2:
# drbdadm create-md data
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
Run on both app1 and app2 (drbdadm up data can be used instead):
# service drbd start
Starting DRBD resources: [
     create res: data
   prepare disk: data
    adjust disk: data
     adjust net: data
]
..........
#
cat /proc/drbd    # or simply use the drbd-overview command
Node 1:
[root@app1 drbd.d]# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:20964116
Node 2:
[root@app2 drbd.d]# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:20964116
One of the nodes now has to be made Primary. On the node that is to become Primary, run either one of the following two commands:
drbdadm -- --overwrite-data-of-peer primary data
drbdadm primary --force data
Check the synchronization status on the primary node:
[root@app1 drbd.d]# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:1229428 nr:0 dw:0 dr:1230100 al:0 bm:0 lo:0 pe:2 ua:0 ap:0 ep:1 wo:d oos:19735828
    [>...................] sync'ed: 5.9% (19272/20472)M
    finish: 0:27:58 speed: 11,744 (11,808) K/sec
[root@app1 drbd.d]#
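The status lines above carry three fields that matter before moving on: the connection state (cs), the roles (ro) and the disk states (ds). As a rough sketch, assuming the /proc/drbd layout of DRBD 8.4 shown above, they can be pulled out with awk. The function names drbd_state and drbd_in_sync are hypothetical helpers, not DRBD tools.

```shell
# Hypothetical helper: print "cs ro ds" for one DRBD minor number,
# reading /proc/drbd-style text from standard input.
# Usage: drbd_state MINOR < /proc/drbd
drbd_state() {
    awk -v minor="$1" '
        $1 == (minor ":") {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^ro:/) ro = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
            print cs, ro, ds
        }'
}

# Hypothetical helper: succeed only when the minor is Connected and
# both disks are UpToDate, i.e. the initial sync has finished.
# Usage: drbd_in_sync MINOR < /proc/drbd
drbd_in_sync() {
    state=$(drbd_state "$1") || return 1
    case $state in
        "Connected "*" UpToDate/UpToDate") return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, drbd_in_sync 0 < /proc/drbd && echo "sync finished" could be used to wait for the initial synchronization before formatting and mounting the device.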
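A file system can only be mounted on the Primary node, and the drbd device can only be formatted after a primary node has been set. Format the device and test a manual mount:
[root@app1 ~]# mkfs.ext4 /dev/drbd0
[root@app1 ~]# mkdir -p /data
[root@app1 ~]# mount /dev/drbd0 /data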
wget
tar zxvf mysql-5.6.29-linux-glibc2.5-x86_64.tar.gz -C /usr/local
cd /usr/local/
ln -sv mysql-5.6.29-linux-glibc2.5-x86_64 mysql
groupadd mysql
useradd -g mysql -M -s /sbin/nologin mysql
chown -R mysql:mysql /usr/local/mysql
/usr/local/mysql/scripts/mysql_install_db --user=mysql --basedir=/usr/local/mysql --datadir=/data/mysql3306
cd /usr/local/mysql
cp support-files/my-default.cnf /etc/my.cnf
cp support-files/mysql.server /etc/rc.d/init.d/mysqld
chkconfig --add mysqld
ln -sf /usr/local/mysql/bin/mysql /usr/bin/mysql
ln -sf /usr/local/mysql/bin/mysqldump /usr/bin/mysqldump
ln -sf /usr/local/mysql/bin/myisamchk /usr/bin/myisamchk
ln -sf /usr/local/mysql/bin/mysqld_safe /usr/bin/mysqld_safe
Alternatively, add the MySQL bin directory to the PATH environment variable:
# vi /etc/profile
export PATH=/usr/local/mysql/bin/:$PATH
# source /etc/profile
ln -sv /usr/local/mysql/include /usr/include/mysql
echo '/usr/local/mysql/lib' > /etc/ld.so.conf.d/mysql.conf
ldconfig
vi /etc/my.cnf
[client]
port = 3306
default-character-set = utf8
socket = /tmp/mysql.sock

[mysqld]
character-set-server = utf8
collation-server = utf8_general_ci
port = 3306
socket = /tmp/mysql.sock
basedir = /usr/local/mysql
datadir = /data/mysql3306
skip-external-locking
key_buffer_size = 16M
max_allowed_packet = 1M
table_open_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
log-bin = mysql-bin
binlog_format = mixed
server-id = 1

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[myisamchk]
key_buffer_size = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout
service mysqld start
# /usr/local/mysql/bin/mysqladmin -u root password 'admin'    # set the root password
# /usr/local/mysql/bin/mysql -u root -p    # log in to test the password
# scp /etc/my.cnf app2:/etc/
[root@node1 ~]# service mysqld stop
[root@node1 data]# chkconfig mysqld off
# umount /data/
# drbdadm secondary data
# drbd-overview
0:web/0 Connected Secondary/Secondary UpToDate/UpToDate C r-----
# drbdadm primary data
# drbd-overview
0:web/0 Connected Primary/Secondary UpToDate/UpToDate C r-----
# mkdir /data
# mount /dev/drbd0 /data/
# service mysqld start
# yum install corosync pacemaker -y
Since RHEL 6.4 the distribution no longer ships crmsh, the command-line cluster configuration tool, so crmsh has to be installed separately in order to manage cluster resources.
The crmsh RPMs can be downloaded from the following address:
[root@app1 crm]# yum install python-dateutil -y
Note: python-pssh and pssh depend on the python-dateutil package.
[root@app1 crm]# rpm -ivh pssh-2.3.1-4.2.x86_64.rpm python-pssh-2.3.1-4.2.x86_64.rpm crmsh-2.1-1.6.x86_64.rpm
warning: pssh-2.3.1-4.2.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID 17280ddf: NOKEY
Preparing...                ########################################### [100%]
   1:python-pssh            ########################################### [ 33%]
   2:pssh                   ########################################### [ 67%]
   3:crmsh                  ########################################### [100%]
[root@app1 crm]#
cd /etc/corosync/
cp corosync.conf.example corosync.conf
vi /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank
totem {
    version: 2
    secauth: on
    threads: 0
    interface {
        ringnumber: 0
        bindnetaddr: 10.10.10.0
        mcastaddr: 226.94.8.8
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: no
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
amf {
    mode: disabled
}
service {
    ver: 1
    name: pacemaker
}
aisexec {
    user: root
    group: root
}
Communication between the nodes must be authenticated, which requires a shared key. Once generated, the key is saved automatically in the current directory (/etc/corosync here) under the name authkey, with permissions 400.
[root@app1 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 128).
Press keys on your keyboard to generate entropy (bits = 192).
Press keys on your keyboard to generate entropy (bits = 256).
Press keys on your keyboard to generate entropy (bits = 320).
Press keys on your keyboard to generate entropy (bits = 384).
Press keys on your keyboard to generate entropy (bits = 448).
Press keys on your keyboard to generate entropy (bits = 512).
Press keys on your keyboard to generate entropy (bits = 576).
Press keys on your keyboard to generate entropy (bits = 640).
Press keys on your keyboard to generate entropy (bits = 704).
Press keys on your keyboard to generate entropy (bits = 768).
Press keys on your keyboard to generate entropy (bits = 832).
Press keys on your keyboard to generate entropy (bits = 896).
Press keys on your keyboard to generate entropy (bits = 960).
Writing corosync key to /etc/corosync/authkey.
[root@app1 corosync]#
# scp authkey corosync.conf root@app2:/etc/corosync/
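Because corosync is configured with secauth: on, the key file has to exist on both nodes, and the text above states it should carry mode 400. A small sketch for checking that after the copy follows; the function name authkey_ok is hypothetical, and it assumes GNU stat as shipped with CentOS.

```shell
# Hypothetical helper: succeed only if FILE is a regular file with mode 400.
# Relies on GNU stat ("stat -c"), which is an assumption about the platform.
# Usage: authkey_ok /etc/corosync/authkey
authkey_ok() {
    [ -f "$1" ] && [ "$(stat -c '%a' "$1")" = "400" ]
}
```

On each node, authkey_ok /etc/corosync/authkey || echo "check authkey" flags a missing key or wrong permissions. Comparing md5sum /etc/corosync/authkey on both nodes additionally confirms the copies are identical.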
Node 1:
[root@app1 ~]# service corosync start
Starting Corosync Cluster Engine (corosync): [OK]
[root@app1 ~]# service pacemaker start
Starting Pacemaker Cluster Manager [OK]
Enable the services at boot:
chkconfig corosync on
chkconfig pacemaker on
Node 2:
[root@app2 ~]# service corosync start
Starting Corosync Cluster Engine (corosync): [OK]
[root@app2 ~]# service pacemaker start
Starting Pacemaker Cluster Manager [OK]
Enable the services at boot:
chkconfig corosync on
chkconfig pacemaker on
[root@app1 ~]# crm status
Last updated: Tue Jan 26 13:13:19 2016
Last change: Mon Jan 25 17:46:04 2016 via cibadmin on app1
Stack: classic openais (with plugin)
Current DC: app1 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
0 Resources configured

Online: [ app1 app2 ]
# netstat -tunlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address        Foreign Address   State   PID/Program name
udp        0      0 10.10.10.25:5404     0.0.0.0:*                 2828/corosync
udp        0      0 10.10.10.25:5405     0.0.0.0:*                 2828/corosync
udp        0      0 226.94.8.8:5405      0.0.0.0:*                 2828/corosync
[root@app1 corosync]# tail -f /var/log/cluster/corosync.log
The key messages to look for in the log are:
Jan 23 16:09:30 corosync [MAIN ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
Jan 23 16:09:30 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
....
Jan 23 16:09:30 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jan 23 16:09:31 corosync [TOTEM ] The network interface [10.10.10.24] is now up.
Jan 23 16:09:31 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jan 23 16:09:48 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
[root@app1 corosync]#
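Rather than reading the log by eye, the messages listed above can be checked with grep. This is only a sketch: the function name corosync_log_ok is hypothetical, and the three strings it looks for are taken from the log excerpt above, so other corosync versions may word them differently.

```shell
# Hypothetical helper: succeed if a corosync log file contains the
# start-up, config-read and membership messages quoted in this article.
# Usage: corosync_log_ok /var/log/cluster/corosync.log
corosync_log_ok() {
    log=$1
    grep -q 'started and ready to provide service' "$log" &&
    grep -q 'Successfully read main configuration file' "$log" &&
    grep -q 'a new membership was formed' "$log"
}
```

For example, corosync_log_ok /var/log/cluster/corosync.log || echo "corosync did not start cleanly" can be run on each node after starting the service.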
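Pacemaker enables STONITH by default, but the cluster configured here has no STONITH device, so STONITH has to be disabled when setting the cluster's global properties.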
# crm
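crm(live)# configure    ## enter configuration mode
crm(live)configure# property stonith-enabled=false    ## disable STONITH
crm(live)configure# property no-quorum-policy=ignore    ## action to take when quorum is lost
crm(live)configure# rsc_defaults resource-stickiness=100    ## default resource stickiness: how strongly a resource prefers to stay on its current node
crm(live)configure# verify    ## validate
crm(live)configure# commit    ## commit only after validation reports no errors
crm(live)configure# show    ## show the current configuration
node app1
node app2
property cib-bootstrap-options: \
    dc-version=1.1.11-97629de \
    cluster-infrastructure="classic openais (with plugin)" \
    expected-quorum-votes=2 \
    stonith-enabled=false \
    default-resource-stickiness=100 \
    no-quorum-policy=ignore
# Tips from experience: if verify reports an error, you can simply quit, or use edit to correct the configuration until it validates.
# crm configure edit    (edits the configuration directly)
Do not commit resources one at a time; create all resources and constraints first, then commit them together.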
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=192.168.0.26 cidr_netmask=24 nic=eth0:1 op monitor interval=30s timeout=20s on-fail=restart
crm(live)configure# verify
crm(live)configure# primitive mydrbd ocf:linbit:drbd params drbd_resource=data op monitor role=Master interval=20 timeout=30 op monitor role=Slave interval=30 timeout=30 op start timeout=240 op stop timeout=100
crm(live)configure# verify
Make drbd a master/slave resource:
crm(live)configure# ms ms_mydrbd mydrbd meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
crm(live)configure# verify
crm(live)configure# primitive mystore ocf:heartbeat:Filesystem params device=/dev/drbd0 directory=/data fstype=ext4 op start timeout=60s op stop timeout=60s op monitor interval=30s timeout=40s on-fail=restart
crm(live)configure# verify
Create a group resource that keeps vip and mystore together.
crm(live)configure# group g_service vip mystore
crm(live)configure# verify
Create a colocation constraint: the group must run on the node where drbd is master.
crm(live)configure# colocation c_g_service inf: g_service ms_mydrbd:Master
Create a colocation constraint: the mystore file system mount depends on the drbd master node.
crm(live)configure# colocation mystore_with_drbd_master inf: mystore ms_mydrbd:Master
Start-order dependency: the g_service group is started only after drbd has been promoted.
crm(live)configure# order o_g_service inf: ms_mydrbd:promote g_service:start
crm(live)configure# verify
crm(live)configure# commit
crm(live)# configure
crm(live)configure# primitive mysqld lsb:mysqld op monitor interval=20 timeout=20 on-fail=restart
Colocate the mysql service with the g_service group:
crm(live)configure# colocation mysqld_with_g_service inf: mysqld g_service
crm(live)configure# verify
crm(live)configure# show
Create the start order: the mysql service starts after the g_service group has started.
crm(live)configure# order mysqld_after_g_service mandatory: g_service mysqld
crm(live)configure# verify
crm(live)configure# show
crm(live)configure# commit
[root@app1 ~]# crm status
Last updated: Fri Apr 29 14:59:14 2016
Last change: Fri Apr 29 14:59:05 2016 via cibadmin on app1
Stack: classic openais (with plugin)
Current DC: app1 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
5 Resources configured

Online: [ app1 app2 ]

 Master/Slave Set: ms_mydrbd [mydrbd]
     Masters: [ app1 ]
     Slaves: [ app2 ]
 mysqld (lsb:mysqld): Started app1
 Resource Group: g_service
     vip (ocf::heartbeat:IPaddr): Started app1
     mystore (ocf::heartbeat:Filesystem): Started app1
[root@app1 ~]#
[root@app1 mysql]# crm node standby app1
[root@app1 ~]# crm status
Last updated: Fri Apr 29 15:12:01 2016
Last change: Fri Apr 29 15:01:49 2016 via crm_attribute on app1
Stack: classic openais (with plugin)
Current DC: app1 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
5 Resources configured

Node app1: standby
Online: [ app2 ]

 Master/Slave Set: ms_mydrbd [mydrbd]
     Masters: [ app2 ]
     Stopped: [ app1 ]
 mysqld (lsb:mysqld): Started app2
 Resource Group: g_service
     vip (ocf::heartbeat:IPaddr): Started app2
     mystore (ocf::heartbeat:Filesystem): Started app2
[root@app1 ~]#
[root@app2 ~]# mysql -uroot -padmin
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.29-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> \q
Bye
[root@app2 ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/vg_app2-lv_root    36G  5.0G   29G  16% /
tmpfs                        1004M   29M  976M   3% /dev/shm
/dev/sda1                     485M   39M  421M   9% /boot
/dev/drbd0                    5.0G  249M  4.5G   6% /data
[root@app2 ~]#
# Note: switchover tests sometimes leave warnings behind that obscure the real state. They can be cleared as shown below: clean up whichever resource is reported, then run crm status again and the state is displayed normally.
Failed actions:
    mystore_stop_0 on app1 'unknown error' (1): call=97, status=complete, last-rc-change='Tue Jan 26 14:39:21 2016', queued=6390ms, exec=0ms
[root@app1 ~]# crm resource cleanup mystore
Cleaning up mystore on app1
Cleaning up mystore on app2
Waiting for 2 replies from the CRMd.. OK
[root@app1 ~]#
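When several resources show up under Failed actions, picking the names out by hand gets tedious. The sketch below extracts them from crm status output. The function name failed_resources is hypothetical, and it assumes entries of the form name_operation_interval on node, as in the example above; operations whose own name contains an underscore would not be stripped correctly.

```shell
# Hypothetical helper: read "crm status" output on standard input and
# print each resource listed under "Failed actions:" once.
# Usage: crm status | failed_resources
failed_resources() {
    awk '
        /^Failed actions:/ {
            in_failed = 1
            sub(/^Failed actions:[ \t]*/, "")   # entry may follow on the same line
            if (NF == 0) next
        }
        in_failed && NF == 0 { in_failed = 0; next }   # blank line ends the section
        in_failed && $2 == "on" {
            name = $1
            sub(/_[a-z]+_[0-9]+$/, "", name)    # mystore_stop_0 -> mystore
            if (!seen[name]++) print name
        }'
}
```

A possible use is crm status | failed_resources | while read -r r; do crm resource cleanup "$r"; done, which runs the same cleanup command as above for every reported resource.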
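The biggest issue during a switchover is DRBD synchronization. The data lives on disk, after all, and if the two sides are not in sync the data becomes inconsistent. A switchover simulated with standby does not truly reproduce a DRBD failover. In a real failover, once drbd has been stopped on the failed node and the secondary has taken over as primary, the stopped node shows up in the Unknown state, and the data then has to be resynchronized from scratch.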
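If the Unknown state turns out to be a DRBD split brain (an assumption; the text above does not say so explicitly), the usual manual recovery in DRBD 8.4 is to discard the changes on one node and reconnect. The sketch below only prints the commands for review and runs nothing, since --discard-my-data throws away that node's changes. The function name resync_plan and the role names victim and survivor are hypothetical.

```shell
# Hypothetical helper: PRINT (do not execute) the DRBD 8.4 commands for a
# manual resync. "victim" is the node whose changes will be discarded,
# "survivor" is the node whose data is kept.
# Usage: resync_plan RESOURCE victim|survivor
resync_plan() {
    res=$1
    role=$2
    case $role in
        victim)
            echo "drbdadm disconnect $res"
            echo "drbdadm secondary $res"
            echo "drbdadm connect --discard-my-data $res"
            ;;
        survivor)
            # Only needed if the survivor is in the StandAlone state.
            echo "drbdadm connect $res"
            ;;
        *)
            echo "usage: resync_plan RESOURCE victim|survivor" >&2
            return 2
            ;;
    esac
}
```

For this setup, resync_plan data victim would be reviewed and run by hand on the node that was stopped, and resync_plan data survivor on the node that took over, after checking cat /proc/drbd on both.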