I. Cause of the problem

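On 2014/10/14 a customer reported that the crontab job that backs up the database had failed. After connecting remotely and analyzing the logs, we found that the Data Guard parameters had not been set back correctly after the disaster-recovery drill on 2014/09/13, so archived logs were no longer being cleaned up. The archived logs piled up, and the archived log backup failed because the backup filesystem ran out of space. The details follow.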

II. Log analysis

1. After logging in, the backup log shows that the datafile backup succeeded but the archived log backup failed:

including current SPFILE in backup set
channel c1: starting piece 1 at 13-OCT-14
channel c1: finished piece 1 at 13-OCT-14
piece handle=/backup/addrrman/full_ADDRPROD_20141013_14004_1 tag=TAG20141013T220005 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:01
channel c2: finished piece 1 at 13-OCT-14
piece handle=/backup/addrrman/full_ADDRPROD_20141013_14001_1 tag=TAG20141013T220005 comment=NONE
channel c2: backup set complete, elapsed time: 01:45:12
channel c3: finished piece 1 at 13-OCT-14
piece handle=/backup/addrrman/full_ADDRPROD_20141013_14002_1 tag=TAG20141013T220005 comment=NONE
channel c3: backup set complete, elapsed time: 01:46:01
Finished backup at 13-OCT-14
sql statement: alter system archive log current
... (output skipped) ...
released channel: c1
released channel: c2
released channel: c3
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on c3 channel at 10/14/2014 00:30:34
ORA-19502: write error on file "/backup/addrrman/arch_ADDRPROD_20141014_14093_1", block number 442369 (block size=512)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
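
The error stack above is easy to miss in a long nightly log. As a minimal sketch (the function name `scan_rman_log` is made up for illustration, and it assumes the backup log is a plain text file), the error lines can be pulled out like this:

```shell
# Hypothetical helper: print the RMAN-/ORA- error lines found in a backup log,
# skipping the decorative RMAN-00571 / RMAN-00569 banner lines.
# Exit status: 0 when at least one error line was printed, 1 when none.
scan_rman_log() {
    grep -E '(RMAN|ORA)-[0-9]{5}' "$1" | grep -Ev 'RMAN-005(71|69)'
}

# Possible use from the cron wrapper (the log path is an assumption):
#   if scan_rman_log /backup/addrrman/rman_bk.log; then
#       echo "backup log contains errors" >&2
#   fi
```

Because the exit status reflects whether anything was found, the cron wrapper could raise an alert the same night instead of waiting for someone to notice the failure.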

2. Checking the size of the datafile backup sets shows that the data volume has not grown sharply:

oracle@p740a:/backup/addrrman[addr11g1]$ls -ltr
total 143197088
-rw-------    1 oracle   oinstall         98 Aug 21 18:53 nohup.out
-rw-r--r--    1 oracle   oinstall       7702 Oct 13 22:00 analyze.lst
-rw-r-----    1 oracle   asmadmin 23931797504 Oct 13 23:44 full_ADDRPROD_20141013_14000_1
-rw-r-----    1 oracle   asmadmin    7847936 Oct 13 23:44 full_ADDRPROD_20141013_14003_1
-rw-r-----    1 oracle   asmadmin      98304 Oct 13 23:44 full_ADDRPROD_20141013_14004_1
-rw-r-----    1 oracle   asmadmin 23550468096 Oct 13 23:45 full_ADDRPROD_20141013_14001_1
-rw-r-----    1 oracle   asmadmin 25820962816 Oct 13 23:46 full_ADDRPROD_20141013_14002_1
-rw-r--r--    1 oracle   oinstall    2659758 Oct 14 00:34 rman_delete.log
-rw-r--r--    1 oracle   oinstall     803655 Oct 14 00:37 delete_local_std_arch.log
-rw-r--r--    1 oracle   oinstall    1210456 Oct 14 00:38 rman_bk.log
-rw-r--r--    1 oracle   oinstall        527 Oct 14 00:38 delete_cd_std_arch.log
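
To turn the listing into a single number that can be compared from night to night, the piece sizes can be summed. This is only a sketch: `sum_piece_sizes` is a hypothetical helper, and it assumes the usual `ls -l` layout with the size in the fifth column and the file name in the last one.

```shell
# Hypothetical helper: read `ls -l` output on stdin and sum the sizes
# (5th column) of the files whose name starts with the given prefix.
# Prints the total in bytes and in GiB.
sum_piece_sizes() {
    awk -v prefix="$1" '
        index($NF, prefix) == 1 { total += $5 }
        END { printf "%.0f bytes (%.1f GiB)\n", total, total / (1024 * 1024 * 1024) }
    '
}

# Example:
#   ls -l /backup/addrrman | sum_piece_sizes full_ADDRPROD_20141013
```

For the listing above, the five `full_ADDRPROD_20141013_*` pieces add up to 73311174656 bytes, about 68.3 GiB.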

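3. The archived log deletion log shows that the archived logs from 9/13 onward were not deleted because they had not been applied on all standby databases: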

RMAN-08120: WARNING: archived log not deleted, not yet applied by standby
archived log file name=+ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13079.1905.858179699 thread=1 sequence=13079
RMAN-08120: WARNING: archived log not deleted, not yet applied by standby
archived log file name=+ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13080.1618.858181499 thread=1 sequence=13080
RMAN-08120: WARNING: archived log not deleted, not yet applied by standby
archived log file name=+ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13081.1619.858182367 thread=1 sequence=13081
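
The deletion log is several megabytes, so a per-day count of the skipped logs makes the starting date of the problem obvious. A rough sketch, assuming the log keeps the two-line layout shown above (warning line, then the file name line) and that the date is the second-to-last path component; `summarize_undeleted` is a made-up name:

```shell
# Hypothetical helper: count RMAN-08120 warnings per archivelog date directory.
# Expects the two-line pattern from the deletion log: the warning line
# followed by an "archived log file name=..." line.
summarize_undeleted() {
    awk '
        /RMAN-08120/ { pending = 1; next }
        pending && /archived log file name=/ {
            n = split($0, parts, "/")   # date directory is the 2nd-to-last component
            count[parts[n - 1]]++
            pending = 0
        }
        END { for (d in count) print d, count[d] }
    ' "$1" | sort
}

# Example:
#   summarize_undeleted /backup/addrrman/rman_delete.log
```

If the earliest date in the output matches the date of a known maintenance window, as 2014_09_13 does here, that is a strong hint about where to look next.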

4. This matches the archivelog deletion policy configured in the archived log deletion script:

rman target / nocatalog log /backup/addrrman/rman_delete.log<<EOF
allocate channel for maintenance type disk connect 'sys/xxxx@addr11g1';
allocate channel for maintenance type disk connect 'sys/xxxx@addr11g2';
CONFIGURE RETENTION POLICY TO REDUNDANCY 1;
CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY; # archived logs can be deleted only after they have been applied on all standbys
crosscheck backup;
crosscheck archivelog all;
delete noprompt archivelog until time 'sysdate-7';
delete noprompt obsolete;
delete noprompt expired backup;
exit
EOF

5. Checking log_archive_dest and log_archive_dest_state reveals a LAD (log archive destination) left in the defer state:

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest                     string
log_archive_dest_1                   string      LOCATION=+ARCHDG VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=addrprod
log_archive_dest_3                   string      service=ADDRCD arch async valid_for=(ONLINE_LOGFILES,PRIMARY_ROLE) reopen=60 db_unique_name=ADDRCD
log_archive_dest_4                   string      service=ADDRPROD_STD arch async valid_for=(ONLINE_LOGFILES,PRIMARY_ROLE) reopen=60 db_unique_name=ADDRPROD_STD
log_archive_dest_state_1             string      ENABLE
log_archive_dest_state_3             string      defer
log_archive_dest_state_4             string      enable
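
Spotting the one deferred destination by eye works on a short listing, but it can also be scripted. The sketch below only parses text in the `show parameter` layout shown above (name, type, value); `find_deferred_dests` is a hypothetical helper, and how the listing is captured (for example by spooling SQL*Plus output to a file) is left as an assumption.

```shell
# Hypothetical helper: print the names of log_archive_dest_state_n parameters
# whose value is "defer" (case-insensitive). Reads `show parameter`-style
# output from the given files, or from stdin when no file is given.
find_deferred_dests() {
    awk 'tolower($1) ~ /^log_archive_dest_state_[0-9]+$/ && tolower($3) == "defer" { print $1 }' "$@"
}

# Example (params.txt is an assumed capture of
# "show parameter log_archive_dest_state"):
#   find_deferred_dests params.txt
```

Any name printed here is a destination that is still configured but not shipping redo, which is the situation that kept the deletion policy from ever being satisfied in this incident.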

III. Resolution

After clearing log_archive_dest_3 (the destination left in the defer state), manually deleting the archived logs again succeeded:

SQL> show parameter log_archive_dest_3;

NAME                                 TYPE       VALUE
------------------------------------ ---------- ------------------------------
log_archive_dest_3                   string     service=ADDRCD arch async valid_for=(ONLINE_LOGFILES,PRIMARY_ROLE) reopen=60 db_unique_name=ADDRCD
log_archive_dest_30                  string
log_archive_dest_31                  string
SQL> alter system set log_archive_dest_3='' scope=both sid='*';

System altered.

SQL> show parameter log_archive_dest_3;

NAME                                 TYPE       VALUE
------------------------------------ ---------- ------------------------------
log_archive_dest_3                   string
log_archive_dest_30                  string
log_archive_dest_31                  string
Deleting the archived logs no longer raises the warning:
RMAN> CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;
delete noprompt archivelog until time 'sysdate-7';
using target database control file instead of recovery catalog
old RMAN configuration parameters:
CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;
new RMAN configuration parameters:
CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;
new RMAN configuration parameters are successfully stored

RMAN>
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=963 instance=addr11g1 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=1717 instance=addr11g1 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=1908 instance=addr11g1 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=2189 instance=addr11g1 device type=DISK
List of Archived Log Copies for database with db_unique_name ADDRPROD
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
168624  1    13079   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13079.1905.858179699

168643  1    13080   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13080.1618.858181499

168646  1    13081   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13081.1619.858182367

168648  1    13082   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13082.1620.858182411

168656  1    13083   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13083.1625.858182901

168658  1    13084   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13084.1624.858182903

168662  1    13085   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13085.1627.858182967

168666  1    13086   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13086.1629.858184767

168670  1    13087   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13087.1631.858186569

168674  1    13088   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13088.1633.858188367

IV. Summary

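Problems caused by temporary operations that are not cleaned up properly afterward are probably not rare. This one did not cause a major outage, which of course does not mean the next one will not. So in day-to-day work we still need to approach reliability from several angles, for example: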

1). Know the system environment well enough, and know exactly how to revert each temporary operation afterward;

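2). Of course, relying on the previous point alone is not dependable. As the saying goes, the palest ink beats the best memory, so it is better to have a standardized OM;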

3). After a temporary operation is finished, run a complete check of the system.
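
One lightweight way to back up points 1) and 3) is to save a snapshot of the relevant parameters before the temporary change and compare it with a fresh snapshot once the change has supposedly been reverted. The following is a sketch under assumptions: `check_restored` and the snapshot file names are hypothetical, and each snapshot is assumed to be a text capture of something like `show parameter log_archive_dest`.

```shell
# Hypothetical helper: compare a parameter snapshot taken before a temporary
# change with one taken after the change was supposedly reverted.
# Prints the differing lines and returns non-zero when the snapshots differ.
check_restored() {
    before=$1
    after=$2
    if diff "$before" "$after"; then
        echo "OK: parameters match the pre-change snapshot"
    else
        echo "WARNING: parameters differ from the pre-change snapshot" >&2
        return 1
    fi
}

# Example (file names are assumptions):
#   check_restored lad_before_drill.txt lad_after_drill.txt
```

In this incident, such a comparison run right after the drill would have shown the changed state of log_archive_dest_state_3, provided it had been changed during the drill as the analysis suggests.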
