RAC cluster services fail to start after OCR corruption: CRS-0704, CRS-10132: No msg for has:crs-10132 [10][60], Could not init OLR

2023-10-11 05:48


I. Environment:

 Red Hat 5.8 + Oracle 11.2.0.4 + RAC

 

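II. Problem description:

The physical disk behind the OCR_VOTE disk group, which held the OCR (Oracle Cluster Registry) and the voting disk (voting disks manage information about node membership), was damaged. After an attempted restore from the automatic backup, the cluster services could not start normally and reported the following errors: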

 

 ohasd.log:

[ohasd(18298)]CRS-0704:Oracle High Availability Service aborted due to Oracle Local Registry error [PROCL-24: Error in the messaging layer Messaging error [gipcretAddressInUse] [20]]. Details at (:OHAS00106:) in /u01/app/11.2.0/grid/log/kawjrmdb001l/ohasd/ohasd.log.
[client(18359)]CRS-10001:CRS-10132: No msg for has:crs-10132 [10][60]

 

ossd.log

2014-09-10 14:48:29.907: [  CRSOCR][2428572496] OCR context init failure.  Error: PROCL-24: Error in the messaging layer Messaging error [gipcretAddressInUse] [20]
2014-09-10 14:48:29.908: [ default][2428572496] Created alert : (:OHAS00106:) :  OLR initialization failed, error: PROCL-24: Error in the messaging layer Messaging error [gipcretAddressInUse] [20]
2014-09-10 14:48:29.908: [ default][2428572496][PANIC] OHASD exiting; Could not init OLR
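When triaging a node that will not start, it can help to confirm quickly whether a log contains this OLR failure signature. The following is a minimal sketch, not an Oracle-provided tool: the function name has_olr_init_failure is hypothetical, and the two patterns are simply copied from the log lines quoted above.

```shell
# Sketch: detect the OLR initialization failure signature in a log file.
# has_olr_init_failure is a hypothetical helper name; the patterns come
# from the log excerpt in this article.
has_olr_init_failure() {
    # $1: path to a log file such as ohasd.log
    grep -Eq 'Could not init OLR|OLR initialization failed' "$1"
}
```

It could be called as `has_olr_init_failure /u01/app/11.2.0/grid/log/kawjrmdb001l/ohasd/ohasd.log && echo "OLR failed to initialize"`, using the log path reported in the CRS-0704 message.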

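III. Analysis:

Starting with 11gR2, the OCR and voting disk can be stored in ASM disk groups. The OCR records the cluster's configuration information and the voting disk is the cluster's quorum disk, so both are critical. If the OCR or voting disk is damaged, the cluster services, including the databases, cannot start. Fortunately the clusterware backs up the OCR automatically every 4 hours, and the backup files can be listed with the cluster command ocrconfig -showbackup.

OLR: the OLR (Oracle Local Registry) resides on every node in the cluster and manages Oracle Clusterware configuration information for each particular node.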

 

IV. Solution:

1. Find the full path of the automatic backup:

$ ocrconfig -showbackup
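To pick the most recent backup from that listing without reading it by eye, a small filter can help. This is a sketch under an assumption: it expects each automatic backup line to have four whitespace-separated fields (node name, date as YYYY/MM/DD, time, and a path ending in .ocr). Check this against the real output on your system. The function name latest_ocr_backup is hypothetical.

```shell
# Sketch: print the path of the newest backup listed by "ocrconfig -showbackup".
# Assumed line layout (verify on your system):
#   <node> <YYYY/MM/DD> <HH:MM:SS> <path ending in .ocr>
# latest_ocr_backup is a hypothetical helper name.
latest_ocr_backup() {
    # Keep only four-field backup lines, put the timestamp first,
    # sort newest first, and print the path of the top entry.
    awk 'NF == 4 && $4 ~ /\.ocr$/ { print $2, $3, $4 }' |
        LC_ALL=C sort -r |
        head -n 1 |
        awk '{ print $3 }'
}
```

Usage would be `ocrconfig -showbackup | latest_ocr_backup`. Other lines, such as a message that no manual backups exist, are ignored because they do not have the four-field shape.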

2. Restore the OCR and voting disk

# crsctl stop crs -f

# /u01/app/11.2.0/grid/bin/ocrconfig -local -restore /u01/app/11.2.0/grid/cdata/kawjrmd-cluster/backup00.ocr

3. Start the cluster stack

# crsctl start crs -excl

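CRS failed to start; the errors are those shown under "Problem description" above. Note that the -local option directs ocrconfig at the OLR rather than the OCR; the OCR itself is restored without -local in step 5.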

4. Fixing the OLR initialization failure

1. Remove the OLR configuration

$GRID_HOME/crs/install/rootcrs.pl -deconfig -force

Using configuration parameter file: ./crsconfig_params
PRCR-1119 : Failed to look up CRS resources of ora.cluster_vip_net1.type type
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.
Removing Trace File Analyzer
Successfully deconfigured Oracle clusterware stack on this node

2. Run the root.sh script

# $GRID_HOME/root.sh (ignore any error messages)

./root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
CRS-2672: Attempting to start 'ora.mdnsd' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.mdnsd' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.gpnpd' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'kawjrmdb001l'
CRS-2672: Attempting to start 'ora.gipcd' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.cssdmonitor' on 'kawjrmdb001l' succeeded
CRS-2676: Start of 'ora.gipcd' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'kawjrmdb001l'
CRS-2672: Attempting to start 'ora.diskmon' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.diskmon' on 'kawjrmdb001l' succeeded
CRS-2676: Start of 'ora.cssd' on 'kawjrmdb001l' succeeded

ASM created and started successfully.

Disk Group OCR_VOTE created successfully.

clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Successful addition of voting disk a9be444f48c84facbfb04d9fbd60f955.
Successfully replaced voting disk group with +OCR_VOTE.
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a9be444f48c84facbfb04d9fbd60f955 (/dev/oracleasm/disks/OCR_VOTE) [OCR_VOTE]
Located 1 voting disk(s).
CRS-2672: Attempting to start 'ora.asm' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.asm' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.OCR_VOTE.dg' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.OCR_VOTE.dg' on 'kawjrmdb001l' succeeded
/u01/app/11.2.0/grid/bin/srvctl start nodeapps -n kawjrmdb001l ... failed
FirstNode configuration failed at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 9380.
/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed

 

3. Stop the cluster stack

# crsctl stop crs

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.crsd' on 'kawjrmdb001l'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.OCR_VOTE.dg' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.OCR_VOTE.dg' on 'kawjrmdb001l' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.asm' on 'kawjrmdb001l' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'kawjrmdb001l' has completed
CRS-2677: Stop of 'ora.crsd' on 'kawjrmdb001l' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.ctssd' on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.evmd' on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.asm' on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.evmd' on 'kawjrmdb001l' succeeded
CRS-2677: Stop of 'ora.crf' on 'kawjrmdb001l' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'kawjrmdb001l' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'kawjrmdb001l' succeeded
CRS-2677: Stop of 'ora.asm' on 'kawjrmdb001l' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'kawjrmdb001l' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.cssd' on 'kawjrmdb001l' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.drivers.acfs' on 'kawjrmdb001l' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'kawjrmdb001l' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.gpnpd' on 'kawjrmdb001l' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'kawjrmdb001l' has completed

5. Restore the OCR and voting disk

1. Start CRS in exclusive mode

# crsctl start crs -excl

CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.mdnsd' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.gpnpd' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'kawjrmdb001l'
CRS-2672: Attempting to start 'ora.gipcd' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.cssdmonitor' on 'kawjrmdb001l' succeeded
CRS-2676: Start of 'ora.gipcd' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'kawjrmdb001l'
CRS-2672: Attempting to start 'ora.diskmon' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.diskmon' on 'kawjrmdb001l' succeeded
CRS-2676: Start of 'ora.cssd' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'kawjrmdb001l'
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'kawjrmdb001l'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'kawjrmdb001l'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.drivers.acfs' on 'kawjrmdb001l' succeeded
CRS-2676: Start of 'ora.ctssd' on 'kawjrmdb001l' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.asm' on 'kawjrmdb001l' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'kawjrmdb001l'
CRS-2676: Start of 'ora.crsd' on 'kawjrmdb001l' succeeded

2. Stop the crsd process

# crsctl stop resource ora.crsd -init

CRS-2673: Attempting to stop 'ora.crsd' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.crsd' on 'kawjrmdb001l' succeeded

3. Restore the OCR from the backup

# /u01/app/11.2.0/grid/bin/ocrconfig -restore /u01/app/11.2.0/grid/cdata/kawjrmd-cluster/backup00.ocr

$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3124
         Available space (kbytes) :     258996
         ID                       :  742521882
         Device/File Name         :  +OCR_VOTE
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded
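The two closing lines of this output are the ones that matter after a restore. As a rough sketch, the check can be scripted by reading ocrcheck output on standard input and requiring both success messages. The function name ocr_is_healthy is hypothetical, and the strings are taken from the output shown above.

```shell
# Sketch: succeed only if ocrcheck output (on stdin) reports both the
# registry integrity check and the logical corruption check as succeeded.
# ocr_is_healthy is a hypothetical helper name.
ocr_is_healthy() {
    ocr_out=$(cat)
    printf '%s\n' "$ocr_out" | grep -q 'Cluster registry integrity check succeeded' &&
        printf '%s\n' "$ocr_out" | grep -q 'Logical corruption check succeeded'
}
```

Usage would be `ocrcheck | ocr_is_healthy && echo "OCR looks consistent"`. Run ocrcheck as root for this: when run as an unprivileged user the logical corruption check is bypassed, and this sketch would then report failure.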

4. Restart CRS

# crsctl stop crs -f

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.ctssd' on 'kawjrmdb001l'
CRS-2673: Attempting to stop 'ora.asm' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.ctssd' on 'kawjrmdb001l' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'kawjrmdb001l' succeeded
CRS-2677: Stop of 'ora.asm' on 'kawjrmdb001l' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'kawjrmdb001l' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.cssd' on 'kawjrmdb001l' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.drivers.acfs' on 'kawjrmdb001l' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'kawjrmdb001l' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'kawjrmdb001l'
CRS-2677: Stop of 'ora.gpnpd' on 'kawjrmdb001l' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'kawjrmdb001l' has completed
CRS-4133: Oracle High Availability Services has been stopped.

 

# crsctl start crs (run on all nodes)

$ crsctl stat res -t 

--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS      
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  OFFLINE      kawjrmdb001l                                
               ONLINE  OFFLINE      kawjrmdb002l                                
ora.LISTENER.lsnr
               ONLINE  OFFLINE      kawjrmdb001l                                
               ONLINE  OFFLINE      kawjrmdb002l                                
ora.OCR_VOTE.dg
               ONLINE  ONLINE       kawjrmdb001l                                
               ONLINE  ONLINE       kawjrmdb002l                                
ora.asm
               ONLINE  ONLINE       kawjrmdb001l             Started            
               ONLINE  ONLINE       kawjrmdb002l             Started            
ora.gsd
               OFFLINE OFFLINE      kawjrmdb001l                                
               OFFLINE OFFLINE      kawjrmdb002l                                
ora.net1.network
               ONLINE  OFFLINE      kawjrmdb001l                                
               ONLINE  OFFLINE      kawjrmdb002l                                
ora.ons
               ONLINE  OFFLINE      kawjrmdb001l                                
               ONLINE  OFFLINE      kawjrmdb002l                                
ora.registry.acfs
               ONLINE  ONLINE       kawjrmdb001l                                
               ONLINE  ONLINE       kawjrmdb002l                                
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  OFFLINE                                                  
ora.cvu
      1        ONLINE  OFFLINE                                                  
ora.filesrv.db
      1        ONLINE  OFFLINE                               Instance Shutdown  
      2        ONLINE  OFFLINE                               Instance Shutdown  
ora.fjrcpmis.db
      1        ONLINE  OFFLINE                               Instance Shutdown  
      2        ONLINE  OFFLINE                               Instance Shutdown  
ora.kawjrmdb001l.vip
      1        ONLINE  OFFLINE                                                  
ora.kawjrmdb002l.vip
      1        ONLINE  OFFLINE                                                  
ora.oc4j
      1        ONLINE  ONLINE       kawjrmdb001l                                
ora.scan1.vip
      1        ONLINE  OFFLINE                                                  
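A listing like this is long, so it can be useful to extract only the resources that are supposed to be up but are not yet. The sketch below reads `crsctl stat res -t` output on standard input and prints each resource whose TARGET is ONLINE while its STATE is OFFLINE. It assumes the tabular layout shown above (resource names starting in the first column, indented status rows). The function name offline_resources is hypothetical.

```shell
# Sketch: list resources whose TARGET is ONLINE but whose STATE is OFFLINE,
# reading "crsctl stat res -t" output on standard input.
# offline_resources is a hypothetical helper name, not an Oracle tool.
offline_resources() {
    awk '
        # A resource name starts in the first column, e.g. "ora.asm".
        /^ora\./ { res = $1; next }
        # Status rows are indented; cluster resources carry an instance number first.
        res != "" && /^[[:space:]]/ {
            i = ($1 ~ /^[0-9]+$/) ? 2 : 1
            if ($i == "ONLINE" && $(i + 1) == "OFFLINE" && !seen[res]++)
                print res
        }
    '
}
```

Usage would be `crsctl stat res -t | offline_resources`. Applied to the listing above it would report entries such as ora.DATA.dg and ora.LISTENER.lsnr, which are worth rechecking once the stack has had time to finish starting.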

 

At this point the OCR and voting disk have been restored, and the cluster services have started successfully.

 

V. Lessons learned

Keep critical devices and files redundant wherever possible, for example the OCR, voting disks, control files, and redo log files.
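One way to notice missing OCR redundancy is to count how many locations ocrcheck reports. The sketch below assumes, as in the ocrcheck output earlier in this article, that each configured location appears on a "Device/File Name" line while unused slots read "Device/File not configured". The function name ocr_location_count is hypothetical.

```shell
# Sketch: count configured OCR locations in ocrcheck output read from stdin.
# ocr_location_count is a hypothetical helper name.
ocr_location_count() {
    # grep -c prints the number of matching lines; it exits non-zero when the
    # count is 0, so "|| true" keeps the function usable under "set -e".
    grep -c 'Device/File Name' || true
}
```

For the cluster in this article, `ocrcheck | ocr_location_count` would print 1, meaning the OCR had a single location (+OCR_VOTE). A second location can be added with `ocrconfig -add` followed by another disk group, and `ocrconfig -manualbackup` takes an on-demand backup before risky changes.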

-------------------------------------------------------------------------------------------------

This article comes from my technical blog: http://blog.csdn.net/robo23

If you repost it, please cite the original link; otherwise legal action may be pursued.

 
