Troubleshooting approach for an Oracle database stall caused by force-cr-override flush:

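08:30: the database stalled.
09:00: received the report and started troubleshooting remotely.
First, checked that the cluster database resources were healthy.
Next, connected to the database through tnsnames; the connection was normal.
CPU, memory, and I/O all looked fine (by this time the stall had already cleared).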

Current database wait events:

SQL> select event,count(*) from gv$session_wait where wait_class<>'Idle' group by event;
EVENT                                                              COUNT(*)
---------------------------------------------------------------- ----------
gc cr request                                                             7
db file sequential read                                                   3
Streams AQ: qmn coordinator waiting for slave to start                    1
force-cr-override flush                                                   3
direct path read                                                          2
PX Deq: reap credit                                                       1
enq: PS - contention                                                      1

Historical wait events:

-- Filter on sample_time itself so the window is a real time range, and match the wait class name 'Idle' exactly.
select count(*), event
  from v$active_session_history
 where sample_time >= to_timestamp('2020-11-13 08:00:00','yyyy-mm-dd hh24:mi:ss')
   and sample_time <  to_timestamp('2020-11-13 09:00:00','yyyy-mm-dd hh24:mi:ss')
   and wait_class <> 'Idle'
 group by event
 order by 1;
  COUNT(*) EVENT
---------- ----------------------------------------------------------------
.....
       182 gc current grant 2-way
       203 gc current block congested
       206 log file sync
       231 gc cr block congested
       251 gc current grant busy
       270 gcs log flush sync
       401 gc cr block busy
       524 gc current block 3-way
       530 control file sequential read
       564 log file parallel write
       691 DFS lock handle
       772 gc cr block 2-way
       823 gc current block 2-way
      1063 gc buffer busy acquire
      2153 wait for stopper event to be increased
      3325 enq: PS - contention
      3397 db file sequential read
      4640 direct path read
     25713 force-cr-override flush
    192329 wait for a undo record

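The events that stand out are wait for stopper event to be increased and wait for a undo record.
Given the very large number of wait for a undo record samples, a large transaction was probably being rolled back at the time and causing contention, so the next step is to find out why that rollback happened.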
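
To see when the rollback-related waits started and on which instance, the same ASH data can be bucketed by minute. This is only a sketch: it assumes the samples are still in memory in gv$active_session_history (otherwise dba_hist_active_sess_history would be the source), and the column aliases are my own.

```sql
-- Per-minute ASH sample counts for the two dominant events.
-- Time window and event names are taken from the incident above.
SELECT TRUNC(CAST(sample_time AS DATE), 'MI') AS sample_minute,
       inst_id,
       event,
       COUNT(*)                               AS samples
FROM   gv$active_session_history
WHERE  sample_time >= TO_TIMESTAMP('2020-11-13 08:00:00', 'yyyy-mm-dd hh24:mi:ss')
AND    sample_time <  TO_TIMESTAMP('2020-11-13 09:00:00', 'yyyy-mm-dd hh24:mi:ss')
AND    event IN ('wait for a undo record',
                 'wait for stopper event to be increased')
GROUP  BY TRUNC(CAST(sample_time AS DATE), 'MI'), inst_id, event
ORDER  BY sample_minute, inst_id, event;
```

A sharp jump in one minute on one instance would point at the moment the large transaction started rolling back.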

My Oracle Support (MOS) describes a fix for this symptom:
Database Appears to Hang Waits for "Wait for a undo record" and "Wait for stopper event to be increased" Due to Parallel Transaction Recover (Doc ID 464246.1)

alter system set fast_start_parallel_rollback = false scope=spfile;
false: disables parallel rollback
low: up to 2 x CPU count slave processes
high: up to 4 x CPU count slave processes

However, while a large transaction is being rolled back the database should show the PX Deq: Txn Recovery Start wait event, and it did not.



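As I understand the official documentation, parallel rollback speeds up recovery, but it starts many slave processes that can consume a large share of the system's CPU, so we may disable parallel rollback to reduce the impact on system performance.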
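
A quick way to confirm whether parallel transaction recovery is actually running is to look at the fast-start views. The sketch below assumes it is run while the rollback is still in progress; after the fact these views may only show already recovered transactions or nothing at all.

```sql
-- Transactions being recovered and how far each one has progressed.
SELECT usn, state, undoblocksdone, undoblockstotal, cputime
FROM   v$fast_start_transactions;

-- Parallel recovery slaves, grouped by state.
SELECT state, COUNT(*) AS slaves
FROM   v$fast_start_servers
GROUP  BY state;

-- Current setting of the parameter discussed above.
SELECT name, value
FROM   v$parameter
WHERE  name = 'fast_start_parallel_rollback';
```

If undoblocksdone barely moves between two runs while many slaves are active, that is consistent with the slaves contending with each other rather than making progress.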

Analyzing force-cr-override flush:

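I was not very familiar with the force-cr-override flush wait event, so I joined dba_hist_active_sess_history (and v$active_session_history) with dba_objects to see which objects the waits were on, first for wait for a undo record and then for force-cr-override flush.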
SELECT ASH.event, ASH.current_obj#, ASH.sample_time, OBJ.object_name
FROM   dba_hist_active_sess_history ASH, dba_objects OBJ
WHERE  ASH.event LIKE '%wait for a undo record%'
AND    ASH.sample_time BETWEEN to_date('2020-11-13 08:00:00','yyyy-mm-dd hh24:mi:ss') AND to_date('2020-11-13 09:00:00','yyyy-mm-dd hh24:mi:ss')
AND    ASH.current_obj# = OBJ.object_id
UNION
SELECT ASHS.event, ASHS.current_obj#, ASHS.sample_time, OBJ.object_name
FROM   v$active_session_history ASHS, dba_objects OBJ
WHERE  ASHS.event LIKE '%wait for a undo record%'
AND    ASHS.sample_time BETWEEN to_date('2020-11-13 08:00:00','yyyy-mm-dd hh24:mi:ss') AND to_date('2020-11-13 09:00:00','yyyy-mm-dd hh24:mi:ss')
AND    ASHS.current_obj# = OBJ.object_id
ORDER  BY sample_time DESC;

EVENT                    CURRENT_OBJ# SAMPLE_TIME                         OBJECT_NAME
------------------------ ------------ ----------------------------------- -------------------------------
wait for a undo record         153265 13-NOV-20 08.29.46.117 AM           II_TRADEDATAINFO
wait for a undo record         153266 13-NOV-20 08.29.46.117 AM           IX_II_TRADEDATAINFO_JYLXBM

SQL> select object_type,object_name,owner from dba_objects where object_name in ('II_xxxxx','IX_II_xxxx_xxx');

OBJECT_TYPE         OBJECT_NAME                              OWNER
------------------- ---------------------------------------- ------------------------------
TABLE               II_TRADEDATAINFO                         ETRACKHIS
INDEX               IX_II_TRADEDATAINFO_JYLXBM               ETRACKHIS

set linesize 290 pages 999
col event for a50
col SAMPLE_TIME for a50
col OBJECT_NAME for a40

SELECT ASH.event, ASH.current_obj#, ASH.sample_time, OBJ.object_name
FROM   dba_hist_active_sess_history ASH, dba_objects OBJ
WHERE  ASH.event LIKE '%force-cr-override flush%'
AND    ASH.sample_time BETWEEN to_date('2020-11-13 08:00:00','yyyy-mm-dd hh24:mi:ss') AND to_date('2020-11-13 09:00:00','yyyy-mm-dd hh24:mi:ss')
AND    ASH.current_obj# = OBJ.object_id
UNION
SELECT ASHS.event, ASHS.current_obj#, ASHS.sample_time, OBJ.object_name
FROM   v$active_session_history ASHS, dba_objects OBJ
WHERE  ASHS.event LIKE '%force-cr-override flush%'
AND    ASHS.sample_time BETWEEN to_date('2020-11-13 08:00:00','yyyy-mm-dd hh24:mi:ss') AND to_date('2020-11-13 09:00:00','yyyy-mm-dd hh24:mi:ss')
AND    ASHS.current_obj# = OBJ.object_id
ORDER  BY sample_time DESC;

current_obj# (object_id): the object ID of the object the session is referencing; it is only available while the session is in a wait event.

force-cr-override flush:
force-cr-override flush 153265  2020/11/13 8:40:17.238  II_TRADEDATAINFO
force-cr-override flush 153265  2020/11/13 8:40:07.228  II_TRADEDATAINFO
force-cr-override flush 153268  2020/11/13 8:40:07.228  PK_II_TRADEDATAINFO
force-cr-override flush 153265  2020/11/13 8:39:57.208  II_TRADEDATAINFO
force-cr-override flush 153268  2020/11/13 8:39:57.208  PK_II_TRADEDATAINFO
force-cr-override flush 153265  2020/11/13 8:39:47.188  II_TRADEDATAINFO
force-cr-override flush 153268  2020/11/13 8:39:47.188  PK_II_TRADEDATAINFO
force-cr-override flush 153265  2020/11/13 8:39:37.168  II_TRADEDATAINFO
force-cr-override flush 153268  2020/11/13 8:39:37.168  PK_II_TRADEDATAINFO
force-cr-override flush 153265  2020/11/13 8:39:27.158  II_TRADEDATAINFO
force-cr-override flush 153268  2020/11/13 8:39:27.158  PK_II_TRADEDATAINFO
force-cr-override flush 153265  2020/11/13 8:39:17.138  II_TRADEDATAINFO
force-cr-override flush 153268  2020/11/13 8:39:17.138  PK_II_TRADEDATAINFO
force-cr-override flush 153265  2020/11/13 8:39:07.118  II_TRADEDATAINFO
force-cr-override flush 153268  2020/11/13 8:39:07.118  PK_II_TRADEDATAINFO
force-cr-override flush 153265  2020/11/13 8:38:57.098  II_TRADEDATAINFO
force-cr-override flush 153268  2020/11/13 8:38:57.098  PK_II_TRADEDATAINFO
force-cr-override flush 153265  2020/11/13 8:38:47.084  II_TRADEDATAINFO
force-cr-override flush 153268  2020/11/13 8:38:47.084  PK_II_TRADEDATAINFO
force-cr-override flush 153265  2020/11/13 8:38:37.074  II_TRADEDATAINFO
force-cr-override flush 153268  2020/11/13 8:38:37.074  PK_II_TRADEDATAINFO

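Both force-cr-override flush and wait for a undo record point at the table II_TRADEDATAINFO and its indexes: IX_II_TRADEDATAINFO_JYLXBM for wait for a undo record and PK_II_TRADEDATAINFO for force-cr-override flush.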
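
Instead of reading raw sample rows, the same join can be aggregated so that each event, object, and SQL ID combination gets one line with a sample count. This is a sketch under the assumption that the AWR retention still covers the window; the aliases are my own, and sql_id may be NULL for background or recovery slave sessions.

```sql
-- Sample counts per event, object, and SQL ID for the incident window.
SELECT ash.event,
       obj.owner,
       obj.object_name,
       ash.sql_id,
       COUNT(*) AS samples
FROM   dba_hist_active_sess_history ash
       JOIN dba_objects obj
         ON obj.object_id = ash.current_obj#
WHERE  ash.event IN ('wait for a undo record', 'force-cr-override flush')
AND    ash.sample_time >= TO_TIMESTAMP('2020-11-13 08:00:00', 'yyyy-mm-dd hh24:mi:ss')
AND    ash.sample_time <  TO_TIMESTAMP('2020-11-13 09:00:00', 'yyyy-mm-dd hh24:mi:ss')
GROUP  BY ash.event, obj.owner, obj.object_name, ash.sql_id
ORDER  BY samples DESC;
```

The sql_id column gives a direct starting point for finding the statements that touch these objects.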

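Pulled the AWR report and found:

Searched the AWR report for the objects referenced by the wait events (located them under Physical Writes):

For db file sequential read, analyzed the Physical Reads section: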


SQL statements located from the objects above:

1:
SQL_ID: fz0r58zzc5rb4
begin ETRACKHIS.Pkg_Etrack_InterFace.Prc_Web_BargaingapplyNew(Prm_InData=>:Prm_InData, Prm_AppCode=>:Prm_AppCode, Prm_OutData=>:Prm_OutData); end;

2:
SQL_ID: 5pfu3h435sfg9
SELECT COUNT(1) FROM CI_CHARGE WHERE NVL(SFZFPB, 0)=0 AND SFJLID > 0 AND BRJZHM = :B3 AND ZZJGDM = :B2 AND FYXMHM = :B1

CI_CHARGE

Today, 8:00-9:00:
Top 10 Foreground Events by Total Wait Time
Event               Waits     Total Wait Time (sec)   Wait Avg(ms)   % DB time   Wait Class
direct path read    341,073   12.5K                   37             24.2        User I/O

Yesterday, same window 8:00-9:00:
Top 10 Foreground Events by Total Wait Time
Event               Waits     Total Wait Time (sec)   Wait Avg(ms)   % DB time   Wait Class
direct path read    341,311   2439.6                  7              4.3         User I/O

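In the same window yesterday, direct path read stayed below 5.0% of DB time (4.3%, against 24.2% today), and the average wait rose from 7 ms to 37 ms for almost the same number of waits.
The business statement behind SQL ID 5pfu3h435sfg9 also looks problematic.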
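
To back up that impression, the per-snapshot statistics of the statement can be compared across the two days. A sketch, assuming AWR snapshots exist for both days and that "yesterday" is 2020-11-12; the aliases are my own.

```sql
-- Per-snapshot workload of SQL ID 5pfu3h435sfg9 from yesterday 08:00 to today 09:00.
SELECT sn.begin_interval_time,
       st.instance_number,
       st.plan_hash_value,
       st.executions_delta,
       st.disk_reads_delta,
       ROUND(st.elapsed_time_delta / 1000000, 1) AS elapsed_sec  -- microseconds to seconds
FROM   dba_hist_sqlstat st
       JOIN dba_hist_snapshot sn
         ON  sn.snap_id         = st.snap_id
         AND sn.dbid            = st.dbid
         AND sn.instance_number = st.instance_number
WHERE  st.sql_id = '5pfu3h435sfg9'
AND    sn.begin_interval_time >= TO_TIMESTAMP('2020-11-12 08:00:00', 'yyyy-mm-dd hh24:mi:ss')
AND    sn.begin_interval_time <  TO_TIMESTAMP('2020-11-13 09:00:00', 'yyyy-mm-dd hh24:mi:ss')
ORDER  BY sn.begin_interval_time, st.instance_number;
```

A different plan_hash_value between the two days, or a large jump in disk_reads_delta per execution, would suggest the statement itself changed behavior rather than only suffering from slower I/O.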

Alert log:

Checkpoint not complete
  Current log# 1 seq# 13760 mem# 0: +DATA/orcl/onlinelog/group_1.85016.1052746729
  Current log# 1 seq# 13760 mem# 1: +DATA/orcl/onlinelog/group_1.96478.1052746731
Thread 2 advanced to log sequence 13761 (LGWR switch)
  Current log# 2 seq# 13761 mem# 0: +DATA/orcl/onlinelog/group_2.15857.1052746731
  Current log# 2 seq# 13761 mem# 1: +DATA/orcl/onlinelog/group_2.11544.1052746731
Fri Nov 13 10:59:55 2020
Archived Log entry 129464 added for thread 2 sequence 13760 ID 0x5e3416d0 dest 1:
Fri Nov 13 10:59:55 2020
LNS: Standby redo logfile selected for thread 2 sequence 13761 for destination LOG_ARCHIVE_DEST_3
Fri Nov 13 11:04:31 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Nov 13 11:04:31 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Nov 13 11:04:32 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Nov 13 11:04:32 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p046_43749.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p046_43749.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p030_43678.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p030_43678.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p014_43583.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p014_43583.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p062_43814.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p062_43814.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Fri Nov 13 11:06:56 2020
Thread 2 advanced to log sequence 13762 (LGWR switch)
  Current log# 1 seq# 13762 mem# 0: +DATA/orcl/onlinelog/group_1.85016.1052746729
  Current log# 1 seq# 13762 mem# 1: +DATA/orcl/onlinelog/group_1.96478.1052746731
Fri Nov 13 11:06:57 2020
LNS: Standby redo logfile selected for thread 2 sequence 13762 for destination LOG_ARCHIVE_DEST_3
Fri Nov 13 11:06:58 2020
Archived Log entry 129472 added for thread 2 sequence 13761 ID 0x5e3416d0 dest 1:
Fri Nov 13 11:09:52 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p046_43749.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p046_43749.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Fri Nov 13 11:09:52 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p062_43814.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p062_43814.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Fri Nov 13 11:09:53 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p030_43678.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Errors in file /oracle/app/diag/rdbms/orcl/orcl2/trace/orcl2_p030_43678.trc:
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Fri Nov 13 11:09:54 2020

Suspected a bug:
ORA-600:[kcbzwfcro_2] Reported in Alert.log (Doc ID 2085507.1)

Trace file for the ORA-600:

DDE: Problem Key 'ORA 600 [kcbzwfcro_2]' was flood controlled (0x6) (incident: 585288)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [153266], [1], [32768], [0], [], [], [], [], [], [], []
Potentially stale force-CR-override buffer found before OBJD MISMATCH check.
This issue should be investigated by both cache fusion and space layer.
BH (0x124f06b778) file#: 90 rdba: 0x1689aeeb (90/634603) class: 1 ba: 0x12477d0000
  set: 68 pool: 3 bsz: 8192 bsi: 0 sflg: 1 pwc: 0,25
  dbwrid: 3 obj: 153266 objn: 153266 tsn: 11 afn: 90 hint: f
  hash: [0x1bb76840c0,0xa4eaa3d90] lru: [0x8aee1e908,0x136ee36a28]
  ckptq: [NULL] fileq: [NULL] objq: [0x59ef5f4f0,0x114f224510] objaq: [0x2eea8d540,0xfbeb0e840]
  st: XCURRENT md: NULL fpin: 'kdgwh05: kdglfe' tch: 177 le: 0x14ef8f7418
  flags: block_written_once redo_since_read remote_transfered force_cr_override

The second ORA-600 argument, 153266, matches obj: 153266 in the buffer header, which is the index IX_II_TRADEDATAINFO_JYLXBM seen in the ASH results. However, the call stack in the trace did not match the one in the MOS note.
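
The block in the buffer header can be mapped back to its segment from the file and block numbers. A sketch only: file 90, block 634603, and object 153266 are read off the dump above, and the dba_extents lookup can be slow on a database with many extents.

```sql
-- Which segment owns file 90, block 634603 (from rdba 0x1689aeeb)?
SELECT owner, segment_name, segment_type
FROM   dba_extents
WHERE  file_id = 90
AND    634603 BETWEEN block_id AND block_id + blocks - 1;

-- Which object does 153266 (obj/objn in the dump, second ORA-600 argument) refer to?
SELECT owner, object_name, object_type, object_id, data_object_id
FROM   dba_objects
WHERE  153266 IN (object_id, data_object_id);
```

If the two queries name different segments, that would be worth attaching to a service request, since the trace itself mentions an OBJD mismatch check.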

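Tried applying the MOS workaround:

Tried restarting the instance to resolve it.

The fix itself is simple; these notes are mainly a record of the troubleshooting approach.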
