Recovering from a Fujitsu Two-Node Cluster Software Failure
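While analyzing the Fujitsu two-node cluster software, we found that syslog on both servers of the XX platform was producing no output. The last log entry on each server dated from its most recent reboot, and the syslog daemon was not running. The working assumption was that the syslog failure was preventing the PCL software from working properly: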
# ps -ef | grep syslog
root 7656 6142 0 11:18:53 pts/4 0:00 grep syslog
Tracing syslog through the Solaris SMF (Service Management Facility) revealed the problem: syslog had not started because two of the services it depends on were disabled.
# svcs -l svc:/system/system-log:default
fmri         svc:/system/system-log:default
name         system log
enabled      true
state        offline
next_state   none
state_time   Thu Apr 03 16:48:48 2014
restarter    svc:/system/svc/restarter:default
dependency   require_all/none svc:/milestone/sysconfig (online)
dependency   require_all/none svc:/system/filesystem/local (online)
dependency   optional_all/none svc:/system/filesystem/autofs (disabled)
dependency   require_all/none svc:/milestone/name-services (disabled)
dependency   require_all/none svc:/system/fjsvmadm-evhandsd (online)
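Dependencies that are not online can be picked out of the `svcs -l` output mechanically. The following is a minimal sketch, assuming the English (C locale) output layout in which each dependency line reads `dependency <grouping>/<restart_on> <fmri> (<state>)`; the function name `list_blocked_deps` is hypothetical and not part of SMF. The grouping is printed as well because, as far as smf(5) describes it, a disabled `optional_all` dependency normally counts as satisfied, so the `require_all` entry is the more likely blocker.

```shell
# list_blocked_deps (hypothetical helper): read `svcs -l` output on stdin and
# print grouping, FMRI and state for every dependency that is not online.
list_blocked_deps() {
    awk '$1 == "dependency" {
        # After the grouping, fields come in pairs: FMRI, then "(state)".
        for (i = 3; i < NF; i += 2) {
            state = $(i + 1)
            gsub(/[()]/, "", state)
            if (state != "online") print $2, $i, state
        }
    }'
}

# Usage on a live Solaris host (not run here):
#   svcs -l svc:/system/system-log:default | list_blocked_deps
```

Applied to the output above, this would report the autofs and name-services entries.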
Enable the disabled services:
# svcadm enable svc:/system/filesystem/autofs
# svcadm enable svc:/milestone/name-services
Start the syslog service again:
# svcadm enable svc:/system/system-log:default
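Because `svcadm enable` returns before the service has actually started, it can help to poll the SMF state before moving on. The following is a sketch only: `wait_online` is a hypothetical helper, the 30-second default is an arbitrary assumption, and it relies on `svcs -H -o state` printing just the state column.

```shell
# wait_online FMRI [TRIES] (hypothetical helper): poll the SMF state once per
# second until the service reports "online"; give up after TRIES attempts.
wait_online() {
    fmri=$1
    tries=${2:-30}          # assumed default, adjust as needed
    state=unknown
    while [ "$tries" -gt 0 ]; do
        # -H omits the header line, -o state prints only the state column.
        state=$(svcs -H -o state "$fmri" 2>/dev/null) || state=unknown
        if [ "$state" = "online" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "$fmri is still '$state'" >&2
    return 1
}

# Usage on a live Solaris host (not run here):
#   svcadm enable svc:/milestone/name-services
#   wait_online svc:/system/system-log:default 60
```

If the service never comes online, `svcs -x` on the same FMRI is the usual next step for an explanation.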
Check the process list; syslogd is now running:
# ps -ef | grep syslog
root 655 1 0 10:50:58 ? 0:01 /usr/sbin/syslogd
root 7656 6142 0 11:18:53 pts/4 0:00 grep syslog
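Since `ps -ef | grep syslog` always matches the grep process itself, a scripted check needs to look at the command column only. The following is a sketch under assumptions: `is_running` is a hypothetical helper, and it locates the command as the field right after the first `mm:ss` value (the TIME column in the listings above), which may not hold for every `ps` format.

```shell
# is_running NAME (hypothetical helper): read `ps -ef` output on stdin and
# succeed if some process has a command whose basename equals NAME.
is_running() {
    awk -v name="$1" '
    {
        for (i = 1; i < NF; i++) {
            if ($i ~ /^[0-9]+:[0-9][0-9]$/) {     # TIME column, e.g. 0:01
                n = split($(i + 1), parts, "/")   # command follows TIME
                if (parts[n] == name) found = 1
                break
            }
        }
    }
    END { exit (found ? 0 : 1) }'
}

# Usage on a live host (not run here):
#   ps -ef | is_running syslogd && echo "syslogd is up"
```

On Solaris, `pgrep -x syslogd` or `svcs system-log` gives the same answer more directly.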
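After that, the cluster software started automatically. The same steps were repeated on server 132, where the cluster software also started normally again. The status is as follows: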
# XXX.XXX.XXX.XXX
# XXX.XXX.XXX.XXX
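Because the status of node 2 was not visible from 132, a switchover was attempted, but it failed. This may be because the programs had originally been started by hand as root or under another account. Server 132 should be rebooted at a suitable time.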
The following log entries show PCL behaving abnormally and need follow-up:
main(1): Got SIGALRM
writemsg(2): Logging msg 'Apr 3 16:07:24 hanet: [ID 361421 user.error] WARNING: 87500: standby interface failed. (sha0)' to CONSOLE /dev/sysmsg
writemsg(9): Logging msg 'Apr 3 16:07:24 hanet: [ID 361421 user.error] WARNING: 87500: standby interface failed. (sha0)' to FILE /var/opt/FJSVmadm/evh/evh_pipe
writemsg(3): Logging msg 'Apr 3 16:07:24 hanet: [ID 361421 user.error] WARNING: 87500: standby interface failed. (sha0)' to FILE /var/adm/messages
writemsg(2): Logging msg 'Apr 3 16:07:24 hanet: [ID 960721 user.error] INFO: 88500: standby interface recovered. (sha0)' to CONSOLE /dev/sysmsg
writemsg(9): Logging msg 'Apr 3 16:07:24 hanet: [ID 960721 user.error] INFO: 88500: standby interface recovered. (sha0)' to FILE /var/opt/FJSVmadm/evh/evh_pipe
writemsg(3): Logging msg 'Apr 3 16:07:24 hanet: [ID 960721 user.error] INFO: 88500: standby interface recovered. (sha0)' to FILE /var/adm/messages
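To follow up on these entries, it may be useful to count how often each virtual interface reports a standby failure versus a recovery. A minimal sketch, assuming the message wording stays exactly as in the lines above (`standby interface failed. (shaN)` and `standby interface recovered. (shaN)`); `hanet_events` is a hypothetical helper name.

```shell
# hanet_events (hypothetical helper): read log lines on stdin and print, per
# virtual interface, the number of standby failed and recovered messages.
hanet_events() {
    awk '/hanet:/ && match($0, /(failed|recovered)\. \([^)]*\)/) {
        ev = substr($0, RSTART, RLENGTH)    # e.g. "failed. (sha0)"
        kind = ev
        sub(/\..*/, "", kind)               # "failed" or "recovered"
        iface = ev
        sub(/^[^(]*\(/, "", iface)          # drop everything up to "("
        sub(/\)$/, "", iface)               # drop the closing ")"
        count[iface, kind]++
        seen[iface] = 1
    }
    END {
        for (i in seen)
            printf "%s failed=%d recovered=%d\n", i, count[i, "failed"], count[i, "recovered"]
    }'
}

# Usage on a live host (not run here):
#   hanet_events < /var/adm/messages
```

A failure and a recovery logged in the same second, as above, would show up as matching counts, which points toward a flapping standby path rather than a permanent outage.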
# ifconfig -a
lo0: flags=2001000849
e1000g1: flags=1000863
        inet # XXX.XXX.XXX.XXX netmask ffffff80 broadcast 10.235.156.255
        ether 0:21:28:13:65:2b
# /opt/FJSVhanet/usr/sbin/dsphanet
[IPv4,Patrol]
 Name       Status   Mode CL  Device
+----------+--------+----+----+------------------------------------------------+
 sha1       Inactive  d   ON   e1000g1(ON),e1000g0(OFF)
 sha0       Active    p   OFF  sha1(ON)
[IPv6]
 Name       Status   Mode CL  Device
+----------+--------+----+----+------------------------------------------------+
WARNING: 87500: standby interface failed. (sha0)
References:
http://docs.oracle.com/cd/E19424-01/820-4809/log_syslog/index.html
http://unix.ittoolbox.com/groups/technical-functional/solaris-l/how-to-run-the-syslogd-server-on-solaris-10-2351469
http://unix.derkeiler.com/Newsgroups/comp.unix.solaris/2006-04/msg01071.html
http://www.oracle.com/technetwork/articles/servers-storage-admin/intro-smf-basics-s11-1729181.html
https://community.oracle.com/thread/1921656?tstart=0
http://www.fujitsu.com/global/services/computing/server/primequest/documents/pcl-manuals.html
http://software.fujitsu.com/jp/manual/manualfiles/m120009/j2uz7781/03enz201/j7781-f-03-02.html