本文共 5531 字,大约阅读时间需要 18 分钟。
某缴费充值系统,监控告警显示ogg进程异常。同事登录系统进行查看确认ogg进程异常停止。
查看ggserr.log发现有个undo表空间的告警,然后紧接着是异常停止ABENDING
2017-07-05 10:28:27 WARNING OGG-01003 Oracle GoldenGate Delivery for Oracle, rep1upay.prm: Repositioning to rba 19306043 in seqno 110291.
2017-07-05 10:28:28 ERROR OGG-00868 Oracle GoldenGate Delivery for Oracle, rep3upay.prm: OCI Error ORA-30036: unable to extend segment by 8 in undo tab
lespace 'UNDOTBS1' (status = 30036). Updating checkpoint table ggs.rep3upay, group 'REP3UPAY', key 666611345 (0x27bbae91), SQL <UPDATE ggs.rep3upay SET last
_update_ts = sysdate, seqno = :seqno, rba = :rba, audit_ts = :audit_ts, log_csn = :log_csn, log_xid = :log_xid, log_cmplt_csn = :log_cmplt_csn, log_cmplt_xi
ds = :log_cmplt_xids, version = :version WHERE group_name = :group_name AND group_key = :key>.
2017-07-05 10:28:28 ERROR OGG-01668 Oracle GoldenGate Delivery for Oracle, rep3upay.prm: PROCESS ABENDING.
2017-07-05 10:41:17 ERROR OGG-01668 Oracle GoldenGate Delivery for Oracle, rep7upay.prm: PROCESS ABENDING.
该报错特别诡异,报错信息居然只有一个“.”并无其他任何有用信息。
undo表空间确实已经100%利用率,添加一下
Wed Jul 05 10:59:12 CST 2017
alter tablespace UNDOTBS1 add datafile '/ora/npf/undotbs01_03.dbf' size 10g autoextend off
重新启动复制进程,但是还是起不来,看来还需要跳过一些事物
GGSCI (npfdb1) 2> info rep7upay
REPLICAT REP7UPAY Last Started 2017-07-05 11:00 Status ABENDED Checkpoint Lag 00:00:25 (updated 02:27:08 ago) Log Read Checkpoint File ./dirdat/rt110273 2017-07-05 08:40:51.937520 RBA 140305324可以看到当前队列文件为110273 GGSCI (npfdb1) 3> view param rep7upay REPLICAT rep7upay SETENV (NLS_LANG = "American_America.ZHS16GBK") USERID ggs,PASSWORD ggs2013 REPORTCOUNT EVERY 30 MINUTES, RATE REPERROR DEFAULT, ABEND numfiles 5000 HANDLECOLLISIONS assumetargetdefs DISCARDFILE ./dirrpt/rep7upay.dsc, APPEND, MEGABYTES 1000 ALLOWNOOPUPDATES map UCR_TRADE_02.TD_O_SYNC_7, target UCR_TRADE_01.TD_O_SYNC_7; map UCR_TRADE_01.TL_B_PAYLOG_UNICARD, TARGET UCR_TRADE_01.TL_B_PAYLOG_UNICARD; map UCR_TRADE_01.TF_F_CUSTOM, target UCR_TRADE_01.TF_F_CUSTOM; map UCR_TRADE_01.TL_O_EMAILLOG, target UCR_TRADE_01.TL_O_EMAILLOG; map UCR_TRADE_01.TL_R_CARDQUERYLOG, target UCR_TRADE_01.TL_R_CARDQUERYLOG; map UCR_TRADE_02.TL_B_PAYLOG_UNICARD, TARGET UCR_TRADE_01.TL_B_PAYLOG_UNICARD; map UCR_TRADE_02.TL_O_EMAILLOG, target UCR_TRADE_01.TL_O_EMAILLOG; map UCR_TRADE_02.TL_R_CARDQUERYLOG, target UCR_TRADE_01.TL_R_CARDQUERYLOG; map UCR_TRADE_02.TL_B_CHECK_OFFLINE_DATA, target UCR_TRADE_01.TL_B_CHECK_OFFLINE_DATA; map UCR_TRADE_02.TL_B_OFFLINE_ORDER_REFUND, target UCR_TRADE_01.TL_B_OFFLINE_ORDER_REFUND; map UCR_TRADE_02.TL_B_DELIVER_PRECISIONRATE, target UCR_TRADE_01.TL_B_DELIVER_PRECISIONRATE; GGSCI (npfdb1) 4> exit -bash-3.2$ logdump Oracle GoldenGate Log File Dump Utility for Oracle Version 11.2.1.0 .1 OGGCORE_11.2.1.0 .1_PLATFORMS_120423.0230 Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved. Logdump 323 >open ./dirdat/rt110273 Current LogTrail is /npf03/ggs/dirdat/rt110273 Logdump 324 >position 140305324 Reading forward from RBA 140305324 Logdump 325 >n 2017/07/05 08:40:51.937.520 Insert Len 571 RBA 140305324 Name: UCR_TRADE_02.TF_B_ORDER After Image: Partition 4 G b 0000 0018 0000 0014 3137 3037 3035 3038 3430 3532 | ........170705084052 3730 3933 3331 3235 0001 000a 0000 0000 0000 0000 | 70933125............ 0019 0002 0022 0000 001e 3130 3034 3031 3130 3930 | ....."....1004011090 3431 3730 3730 3530 3834 3035 3231 3131 3239 3631 | 41707050840521112961 0003 0004 0000 3034 0004 0005 0000 3031 3700 0500 | ......04......017... 0500 0031 3631 0006 0004 ffff 0000 0007 0004 ffff | ...161.............. 0000 0008 0015 0000 3230 3137 2d30 372d 3035 3a30 | ........2017-07-05:0 Logdump 326 >n 2017/07/05 08:40:51.937.520 Insert Len 433 RBA 140306034 Name: UCR_TRADE_02.TL_B_PAYLOG After Image: Partition 4 G m 0000 0018 0000 0014 3137 3037 3035 3038 3430 3532 | ........170705084052 3730 3933 3331 3237 0001 000a 0000 0000 0000 0000 | 70933127............ 001b 0002 0004 0000 3032 0003 0003 0000 3100 0400 | ........02......1... 0300 0030 0005 000a ffff 0000 0000 0000 0000 0006 | ...0................ 0003 ffff 0000 0700 1500 0032 3031 372d 3037 2d30 | ...........2017-07-0 353a 3038 3a34 303a 3532 0008 0015 ffff 3139 3030 | 5:08:40:52......1900 2d30 312d 3031 3a30 303a 3030 3a30 3000 0900 15ff | -01-01:00:00:00..... Logdump 327 >exit -bash-3.2$ ggsci Oracle GoldenGate Command Interpreter for Oracle Version 11.2.1.0 .1 OGGCORE_11.2.1.0 .1_PLATFORMS_120423.0230_FBO Solaris, sparc, 64bit (optimized), Oracle 10g on Apr 24 2012 09:06:57 Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved. GGSCI (npfdb1) 1> alter rep7upay ,extseqno 110273 extrba 140305324REPLICAT altered.
跳过seqno为110273,RBA为140305324的事物,因为当前打开的队列文件中的UCR_TRADE_02.TF_B_ORDER、UCR_TRADE_02.TL_B_PAYLOG表并不在rep7upay的参数文件中,没有同步该表。GGSCI (npfdb1) 2> start rep7upay
Sending START request to MANAGER ... REPLICAT REP7UPAY starting GGSCI (npfdb1) 3> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING REPLICAT RUNNING REP1UPAY 00:00:03 00:00:00 REPLICAT RUNNING REP2UPAY 00:00:03 00:00:03 REPLICAT RUNNING REP3UPAY 00:00:02 00:00:00 REPLICAT RUNNING REP4UPAY 00:00:04 00:00:00 REPLICAT RUNNING REP5UPAY 00:00:03 00:00:04 REPLICAT RUNNING REP6UPAY 00:00:27 02:35:15 REPLICAT RUNNING REP7UPAY 02:28:31 00:00:03 REPLICAT ABENDED REPUPAY 00:00:30 01:13:54第二种场景:如果此时rep7upay进程还是启动失败,那么可以尝试跳过这段无用事物(因为我们打开的队列文件中现在同步的表,并不在我们7这个参数文件同步表的范围)可以执行如下命令进行跳过:
alter rep7upay ,extseqno 110273 extrba 140306034
本文转自yangjunfeng 51CTO博客,原文链接:http://blog.51cto.com/yangjunfeng/1944702
转载地址:http://tmcix.baihongyu.com/