Oracle数据库恢复后报错ORA-600: [4194]处理

2023-11-05 08:12

本文主要是介绍Oracle数据库恢复后报错ORA-600: [4194]处理,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

Oracle数据库恢复后报错ORA-600: [4194]处理

  • 故障现象
  • 处理办法
    • 重建UNDO表空间
    • ORA-600 [4137]报错
    • 可能的扫尾工作

故障现象

现象:完成NBU带库恢复后,测试库打开后几分钟就会自己宕机挂掉。

告警日志报错如下:

Errors in file /oracle/app/diag/rdbms/ORCL_0/ORCL/trace/ORCL_smon_201857.trc  (incident=592157):
ORA-00600: internal error code, arguments: [4194], [546.27.149175], [0], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/app/diag/rdbms/ORCL_0/ORCL/incident/incdir_592157/ORCL_smon_201857_i592157.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Oct 30 09:17:09 2023
PMON (ospid: 201781): terminating the instance due to error 474
System state dump requested by (instance=1, osid=201781 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/app/diag/rdbms/ORCL_0/ORCL/trace/ORCL_diag_201796_20231030091710.trc
Errors in file /oracle/app/diag/rdbms/ORCL_0/ORCL/trace/ORCL_ora_203196.trc:
ORA-00474: SMON 进程因错误而终止
ORA-00600: 内部错误代码, 参数: [4194], [u do not have the SHARED lock on this object.], [], [], [], [], [], [], [], [], [], []
ORA-00600: 内部错误代码, 参数: [4194], [ unlock objec], [], [], [], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/ORCL_0/ORCL/trace/ORCL_ora_203196.trc:
ORA-00474: SMON 进程因错误而终止

在MOS查询ORA-00600: internal error code, argument: [4194]这个报错,得到的解释如下(Doc ID 39283.1):

A mismatch has been detected between Redo records and rollback (Undo) records.
...
This error may indicate a rollback segment corruption.
...
This may require a recovery from a database backup depending on the situation.

⭐️ 具体解决办法可以参考 Step by step to resolve ORA-600 4194 4193 4197 on database crash (Doc ID 1428786.1)。

以下是我的处理过程。

处理办法

重建UNDO表空间

检查控制文件和数据文件头中记录的最新的SCN:

idle> startup mount;SQL> col checkpoint_change# for 999999999999999
SQL> select distinct checkpoint_change#  from v$datafile;  --控制文件中记录的最后一次checkpoint时的SCNCHECKPOINT_CHANGE#
------------------1053731346332SQL> select distinct checkpoint_change# from v$datafile_header;  --数据文件头中记录的SCNCHECKPOINT_CHANGE#
------------------1053731346332

发现控制文件和数据文件头中记录的SCN是一致的,考虑重建UNDO表空间即可。

生成一个初始化参数文件:

SQL> create pfile='initORCL_new.ora' from spfile;File created.

修改pfile,修改UNDO管理为手动模式,存储在SYSTEM表空间中,并设置10513事件禁用事务恢复:

[oracle@dbhost dbs]$ cat initORCL_new.ora | grep undo
*._optimizer_undo_cost_change='11.2.0.4'
*._undo_autotune=FALSE
*.undo_retention=10800
*.undo_tablespace='UNDOTBS1'[oracle@dbhost dbs]$ vi initORCL_new.ora 
[oracle@dbhost dbs]$ cat initORCL_new.ora | grep undo
*._optimizer_undo_cost_change='11.2.0.4'
*._undo_autotune=FALSE
*.undo_management='MANUAL'
*.undo_retention=10800
*.undo_tablespace='SYSTEM'
*.event='10513 trace name context forever, level 2'

:我自己实际操作过程中没有设置10513事件,可能会导致ORA-600 [4137]报错,后面会提到。

使用pfile启动数据库:

shutdown immediate;
startup pfile='initORCL_new.ora';

不能有报错,否则要单独对报错进行处理。

创建新的UNDO表空间:

SQL> show parameter undoNAME				     TYPE		    VALUE
------------------------------------ ---------------------- ------------------------------
_optimizer_undo_cost_change	     string		    11.2.0.4
_undo_autotune			     boolean		    FALSE
undo_management 		     string		    MANUAL
undo_retention			     integer		    10800
undo_tablespace 		     string		    SYSTEMSQL> CREATE UNDO TABLESPACE UNDOTBS2;Tablespace created.SQL> alter tablespace undotbs2 add datafile;
alter tablespace undotbs2 add datafile;
alter tablespace undotbs2 add datafile;
alter tablespace undotbs2 add datafile;
alter tablespace undotbs2 add datafile;
alter tablespace undotbs2 add datafile;
alter tablespace undotbs2 add datafile;Tablespace altered.SQL> select file_name,sum(bytes)/1024/1204/1204 from dba_data_files where tablespace_name like 'UNDOTBS%' group by file_name;FILE_NAME											     SUM(BYTES)/1024/1204/1204
---------------------------------------------------------------------------------------------------- -------------------------
/oradata/ORCL_0/datafile/o1_mf_undotbs1_lmskcjmq_.dbf								    46.2942131
/oradata/ORCL_0/datafile/o1_mf_undotbs1_lmqyxqqg_.dbf								    46.2942131
/oradata/ORCL_0/datafile/o1_mf_undotbs2_lmy7b0py_.dbf								    .070639397
/oradata/ORCL_0/datafile/o1_mf_undotbs1_lmqyo3pk_.dbf								    46.2942131
/oradata/ORCL_0/datafile/o1_mf_undotbs1_lms93876_.dbf								    46.2942131
/oradata/ORCL_0/datafile/o1_mf_undotbs2_lmy7br2f_.dbf								    .070639397
/oradata/ORCL_0/datafile/o1_mf_undotbs1_lmrb0qkx_.dbf								    46.2942131
/oradata/ORCL_0/datafile/o1_mf_undotbs1_lmqytnn7_.dbf								    46.2942131
/oradata/ORCL_0/datafile/o1_mf_undotbs1_lmsb8plf_.dbf								    46.2942131
/oradata/ORCL_0/datafile/o1_mf_undotbs1_lmsl7d0o_.dbf								    46.2942131
...

再次修改pfile,将UNDO管理模式设置为自动,UNDO表空间设置为新建的UNDOTBS2:

[oracle@dbhost dbs]$ vi initORCL_new.ora
[oracle@dbhost dbs]$ cat initORCL_new.ora | grep undo
*._optimizer_undo_cost_change='11.2.0.4'
*._undo_autotune=FALSE
*.undo_management='AUTO'
*.undo_retention=10800
*.undo_tablespace='UNDOTBS2'

重启数据库:

SQL> shutdown immediate;
SQL> create spfile from pfile='initORCL_new.ora';
SQL> startup;   SQL> show parameter undoNAME				     TYPE		    VALUE
------------------------------------ ---------------------- ------------------------------
_optimizer_undo_cost_change	     string		    11.2.0.4
_undo_autotune			     boolean		    FALSE
undo_management 		     string		    AUTO
undo_retention			     integer		    10800
undo_tablespace 		     string		    UNDOTBS2

检查ALERT日志,发现新的报错ORA-600 [4137](这里可能是没有设置10513事件才会出现的报错):

[oracle@dbhost ~]$ tail -n300 /oracle/app/diag/rdbms/ORCL_0/ORCL/trace/alert_ORCL.log
...
Sweep [inc2][624461]: completed
Sweep [inc2][624460]: completed
ORACLE Instance ORCL (pid = 36) - Error 600 encountered while recovering transaction (546, 27).
Errors in file /oracle/app/diag/rdbms/ORCL_0/ORCL/trace/ORCL_smon_224148.trc  (incident=640172):
ORA-00600: internal error code, arguments: [4137], [546.27.149175], [0], [0], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORACLE Instance ORCL (pid = 36) - Error 600 encountered while recovering transaction (546, 27).
Mon Oct 30 11:14:27 2023
Sweep [inc][640172]: completed
Sweep [inc][624467]: completed

ORA-600 [4137]报错

查询MOS可知报错 ORA-600 [4137] 的解释如下:

There is a mismatch between the XID in the undo segment header and the XID in the undo block
during rollback or transaction recovery.  This would indicate a corrupted rollback segment

尝试删除旧的UNDO:

SQL> drop tablespace UNDOTBS1 including contents and datafiles;
drop tablespace UNDOTBS1 including contents and datafiles
*
ERROR at line 1:
ORA-01548: active rollback segment '_SYSSMU314_3300756365$' found, terminate dropping tablespaceSQL> select tablespace_name, status, segment_name from dba_rollback_segs where status != 'OFFLINE';TABLESPACE_NAME 					     STATUS			      SEGMENT_NAME
------------------------------------------------------------ -------------------------------- ------------------------------
SYSTEM							     ONLINE			      SYSTEM
UNDOTBS1						     PARTLY AVAILABLE		      _SYSSMU546_811175239$
UNDOTBS1						     PARTLY AVAILABLE		      _SYSSMU360_2198386275$
UNDOTBS1						     PARTLY AVAILABLE		      _SYSSMU347_654930751$
UNDOTBS1						     PARTLY AVAILABLE		      _SYSSMU314_3300756365$
UNDOTBS2						     ONLINE			      _SYSSMU1274_2513395007$
UNDOTBS2						     ONLINE			      _SYSSMU1273_1341585299$
UNDOTBS2						     ONLINE			      _SYSSMU1272_34058637$
UNDOTBS2						     ONLINE			      _SYSSMU1271_4040385653$
UNDOTBS2						     ONLINE			      _SYSSMU1270_1270536444$
UNDOTBS2						     ONLINE			      _SYSSMU1269_402143936$
UNDOTBS2						     ONLINE			      _SYSSMU1268_4100704859$
UNDOTBS2						     ONLINE			      _SYSSMU1267_2250107085$
UNDOTBS2						     ONLINE			      _SYSSMU1266_94778785$
UNDOTBS2						     ONLINE			      _SYSSMU1265_4196515074$15 rows selected.

不能删除的原因是UNDOTBS1还有未下线的段,状态为PARTLY AVAILABLE

过了一会儿数据库又宕机了,检查发现是生成了大量trace文件占满了oracle目录。可能是因为没有设置10513事件,大量事务恢复失败的日志不停地刷到trace文件中。

[oracle@dbhost trace]$ du -sh $ORACLE_BASE/diag/rdbms/ORCL_0/${ORACLE_SID}/ 
5.2G	/oracle/app/diag/rdbms/ORCL_0/ORCL/
[oracle@dbhost trace]$ du -sh $ORACLE_BASE/diag/rdbms/ORCL_0/${ORACLE_SID}/trace/ 
31G	/oracle/app/diag/rdbms/ORCL_0/ORCL/trace/
[oracle@dbhost trace]$ df -h | grep oracle
/dev/mapper/VolGroup-lv_oracle    50G   50G  848M  99% /oracle

通过隐含参数忽略UNDOTBS1中未下线的回滚段:

SQL> select tablespace_name, status, segment_name from dba_rollback_segs 
where tablespace_name='UNDOTBS1' and status != 'OFFLINE';TABLESPACE_NAME 					     STATUS			      SEGMENT_NAME
------------------------------------------------------------ -------------------------------- ------------------------------
UNDOTBS1						     PARTLY AVAILABLE		      _SYSSMU546_811175239$
UNDOTBS1						     PARTLY AVAILABLE		      _SYSSMU360_2198386275$
UNDOTBS1						     PARTLY AVAILABLE		      _SYSSMU347_654930751$
UNDOTBS1						     PARTLY AVAILABLE		      _SYSSMU314_3300756365$

修改initORCL_new.ora添加隐含参数:

*._corrupted_rollback_segments='_SYSSMU546_811175239$','_SYSSMU360_2198386275$','_SYSSMU347_654930751$','_SYSSMU314_3300756365$'

启动数据库:

SQL> create spfile from pfile='initORCL_new.ora';
SQL> startup;--确认是否已忽略
SQL> select segment_name,tablespace_name,status from dba_rollback_segs where tablespace_name='UNDOTBS1' and status != 'OFFLINE';SEGMENT_NAME		       TABLESPACE_NAME						    STATUS
------------------------------ ------------------------------------------------------------ --------------------------------
_SYSSMU314_3300756365$	       UNDOTBS1 						    NEEDS RECOVERY
_SYSSMU347_654930751$	       UNDOTBS1 						    NEEDS RECOVERY
_SYSSMU360_2198386275$	       UNDOTBS1 						    NEEDS RECOVERY
_SYSSMU546_811175239$	       UNDOTBS1 						    NEEDS RECOVERY

这里上面的SQL最好是没有任何输出,但是实际测试发现UNDO段状态变成NEEDS RECOVERY也可以删除UNDOTBS1表空间。

删除旧的UNDO表空间:

SQL> drop tablespace UNDOTBS1 including contents and datafiles;Tablespace dropped.SQL> select segment_name,tablespace_name,status from dba_rollback_segs where tablespace_name='UNDOTBS1' and status != 'OFFLINE';no row selected.

检查ALERT日志有无报错。

可能的扫尾工作

停止数据库,以便移除掉10513事件和_corrupted_rollback_segments隐含参数:

SQL> shutdown immediate;
SQL> create pfile from spfile;

移除pfile中的下列参数:

##*.event='10513 trace name context forever, level 2'
##*._corrupted_rollback_segments"='_SYSSMU546_811175239$','_SYSSMU360_2198386275$','_SYSSMU347_654930751$','_SYSSMU314_3300756365$'

重建spfile并拉起数据库:

SQL> create spfile from pfile='initORCL.ora';
SQL> startup;

检查ALERT日志有无报错。可能遇到的TEMP表空间为空的报错:

*********************************************************************
WARNING: The following temporary tablespaces contain no files.This condition can occur when a backup controlfile hasbeen restored.  It may be necessary to add files to thesetablespaces.  That can be done using the SQL statement:ALTER TABLESPACE <tablespace_name> ADD TEMPFILEAlternatively, if these temporary tablespaces are no longerneeded, then they can be dropped.Empty temporary tablespace: TEMP
*********************************************************************

为临时表空间添加临时文件即可:

SQL> alter tablespace temp add tempfile; 

REFs
【1】https://www.modb.pro/db/48609
【2】https://blog.csdn.net/sinat_36757755/article/details/130333335
【3】https://www.modb.pro/db/45428
【4】Step by step to resolve ORA-600 4194 4193 4197 on database crash (Doc ID 1428786.1)

这篇关于Oracle数据库恢复后报错ORA-600: [4194]处理的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/348549

相关文章

Python使用vllm处理多模态数据的预处理技巧

《Python使用vllm处理多模态数据的预处理技巧》本文深入探讨了在Python环境下使用vLLM处理多模态数据的预处理技巧,我们将从基础概念出发,详细讲解文本、图像、音频等多模态数据的预处理方法,... 目录1. 背景介绍1.1 目的和范围1.2 预期读者1.3 文档结构概述1.4 术语表1.4.1 核

Spring Boot @RestControllerAdvice全局异常处理最佳实践

《SpringBoot@RestControllerAdvice全局异常处理最佳实践》本文详解SpringBoot中通过@RestControllerAdvice实现全局异常处理,强调代码复用、统... 目录前言一、为什么要使用全局异常处理?二、核心注解解析1. @RestControllerAdvice2

使用Python实现可恢复式多线程下载器

《使用Python实现可恢复式多线程下载器》在数字时代,大文件下载已成为日常操作,本文将手把手教你用Python打造专业级下载器,实现断点续传,多线程加速,速度限制等功能,感兴趣的小伙伴可以了解下... 目录一、智能续传:从崩溃边缘抢救进度二、多线程加速:榨干网络带宽三、速度控制:做网络的好邻居四、终端交互

MySQL数据库中ENUM的用法是什么详解

《MySQL数据库中ENUM的用法是什么详解》ENUM是一个字符串对象,用于指定一组预定义的值,并可在创建表时使用,下面:本文主要介绍MySQL数据库中ENUM的用法是什么的相关资料,文中通过代码... 目录mysql 中 ENUM 的用法一、ENUM 的定义与语法二、ENUM 的特点三、ENUM 的用法1

Java中调用数据库存储过程的示例代码

《Java中调用数据库存储过程的示例代码》本文介绍Java通过JDBC调用数据库存储过程的方法,涵盖参数类型、执行步骤及数据库差异,需注意异常处理与资源管理,以优化性能并实现复杂业务逻辑,感兴趣的朋友... 目录一、存储过程概述二、Java调用存储过程的基本javascript步骤三、Java调用存储过程示

Go语言数据库编程GORM 的基本使用详解

《Go语言数据库编程GORM的基本使用详解》GORM是Go语言流行的ORM框架,封装database/sql,支持自动迁移、关联、事务等,提供CRUD、条件查询、钩子函数、日志等功能,简化数据库操作... 目录一、安装与初始化1. 安装 GORM 及数据库驱动2. 建立数据库连接二、定义模型结构体三、自动迁

嵌入式数据库SQLite 3配置使用讲解

《嵌入式数据库SQLite3配置使用讲解》本文强调嵌入式项目中SQLite3数据库的重要性,因其零配置、轻量级、跨平台及事务处理特性,可保障数据溯源与责任明确,详细讲解安装配置、基础语法及SQLit... 目录0、惨痛教训1、SQLite3环境配置(1)、下载安装SQLite库(2)、解压下载的文件(3)、

MySQL数据库的内嵌函数和联合查询实例代码

《MySQL数据库的内嵌函数和联合查询实例代码》联合查询是一种将多个查询结果组合在一起的方法,通常使用UNION、UNIONALL、INTERSECT和EXCEPT关键字,下面:本文主要介绍MyS... 目录一.数据库的内嵌函数1.1聚合函数COUNT([DISTINCT] expr)SUM([DISTIN

MySQL追踪数据库表更新操作来源的全面指南

《MySQL追踪数据库表更新操作来源的全面指南》本文将以一个具体问题为例,如何监测哪个IP来源对数据库表statistics_test进行了UPDATE操作,文内探讨了多种方法,并提供了详细的代码... 目录引言1. 为什么需要监控数据库更新操作2. 方法1:启用数据库审计日志(1)mysql/mariad

postgresql数据库基本操作及命令详解

《postgresql数据库基本操作及命令详解》本文介绍了PostgreSQL数据库的基础操作,包括连接、创建、查看数据库,表的增删改查、索引管理、备份恢复及退出命令,适用于数据库管理和开发实践,感兴... 目录1. 连接 PostgreSQL 数据库2. 创建数据库3. 查看当前数据库4. 查看所有数据库