Oracle诊断案例-Spfile案例一则

oracle

Oracle诊断案例-Spfile案例一则

 

link:

http://www.eygle.com/case/spfile.htm

情况说明:
系统:SUN Solaris8
数据库版本:9203
问题描述:工程人员报告,数据库在重新启动时无法正常启动.检查发现UNDO表空间丢失.
问题诊断及解决过程如下:

 

1. 登陆系统检查alert.log文件

检查alert.log文件是通常是我们诊断数据库问题的第一步

SunOS 5.8

login: root
Password:
Last login: Thu Apr 1 11:39:16 from 10.123.7.162
Sun Microsystems Inc. SunOS 5.8 Generic Patch October 2001
You have new mail.
# su - oracle
bash-2.03$ cd $ORACLE_BASE/admin/*/bdump
bash-2.03$ vi *.log

"alert_gzhs.log" 7438 lines, 283262 characters
Sat Feb 7 20:30:06 2004
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
SCN scheme 3
Using log_archive_dest parameter default value
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 9.2.0.3.0.
System parameters with non-default values:
processes = 150
timed_statistics = TRUE
shared_pool_size = 1157627904
large_pool_size = 16777216
java_pool_size = 637534208
control_files = /u01/oradata/gzhs/control01.ctl,
/u02/oradata/gzhs/control02.ctl,
/u03/oradata/gzhs/control03.ctl
db_block_size = 8192
db_cache_size = 2516582400
compatible = 9.2.0.0.0
log_archive_start = TRUE
log_archive_dest_1 = LOCATION=/u06/oradata/gzhs/arch
log_archive_format = %t_%s.dbf
db_file_multiblock_read_count= 16
fast_start_mttr_target = 300
undo_management = AUTO
undo_tablespace = UNDOTBS1
undo_retention = 10800
remote_login_passwordfile= EXCLUSIVE
db_domain =
instance_name = gzhs
dispatchers = (PROTOCOL=TCP) (SERVICE=gzhsXDB)
job_queue_processes = 10
hash_join_enabled = TRUE
background_dump_dest = /oracle/admin/gzhs/bdump
user_dump_dest = /oracle/admin/gzhs/udump
core_dump_dest = /oracle/admin/gzhs/cdump
sort_area_size = 524288
db_name = gzhs
open_cursors = 300
star_transformation_enabled= FALSE
query_rewrite_enabled = FALSE
pga_aggregate_target = 838860800
aq_tm_processes = 1
PMON started with pid=2
DBW0 started with pid=3
LGWR started with pid=4
CKPT started with pid=5
SMON started with pid=6
"alert_gzhs.log" 7438 lines, 283262 characters
USER: terminating instance due to error 30012
Instance terminated by USER, pid = 26433
ORA-1092 signalled during: ALTER DATABASE OPEN...
Thu Apr 1 11:11:08 2004
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
SCN scheme 3
Using log_archive_dest parameter default value
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 9.2.0.3.0.
System parameters with non-default values:
processes = 150
timed_statistics = TRUE
shared_pool_size = 1157627904
large_pool_size = 16777216
java_pool_size = 637534208
control_files = /u01/oradata/gzhs/control01.ctl, /u02/oradata/gzhs/control02.ctl, /u03/oradata/gzhs/control03.ctl
db_block_size = 8192
db_cache_size = 2516582400
compatible = 9.2.0.0.0
log_archive_start = TRUE
log_archive_dest_1 = LOCATION=/u06/oradata/gzhs/arch
log_archive_format = %t_%s.dbf
db_file_multiblock_read_count= 16
fast_start_mttr_target = 300
undo_management = AUTO
undo_tablespace = UNDOTBS1
undo_retention = 10800
remote_login_passwordfile= EXCLUSIVE
db_domain =
instance_name = gzhs
dispatchers = (PROTOCOL=TCP) (SERVICE=gzhsXDB)
job_queue_processes = 10
hash_join_enabled = TRUE
background_dump_dest = /oracle/admin/gzhs/bdump
user_dump_dest = /oracle/admin/gzhs/udump
core_dump_dest = /oracle/admin/gzhs/cdump
sort_area_size = 524288
db_name = gzhs
open_cursors = 300
star_transformation_enabled= FALSE
query_rewrite_enabled = FALSE
pga_aggregate_target = 838860800
aq_tm_processes = 1
PMON started with pid=2
DBW0 started with pid=3
LGWR started with pid=4
CKPT started with pid=5
SMON started with pid=6
RECO started with pid=7
CJQ0 started with pid=8
Thu Apr 1 11:11:13 2004
starting up 1 shared server(s) ...
QMN0 started with pid=9
Thu Apr 1 11:11:13 2004
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
ARCH: STARTING ARCH PROCESSES
ARC0 started with pid=12
ARC0: Archival started
ARC1 started with pid=13
Thu Apr 1 11:11:13 2004
ARCH: STARTING ARCH PROCESSES COMPLETE
Thu Apr 1 11:11:13 2004
ARC0: Thread not mounted
Thu Apr 1 11:11:13 2004
ARC1: Archival started
ARC1: Thread not mounted
Thu Apr 1 11:11:14 2004
ALTER DATABASE MOUNT
Thu Apr 1 11:11:18 2004
Successful mount of redo thread 1, with mount id 1088380178.
Thu Apr 1 11:11:18 2004
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE MOUNT
Thu Apr 1 11:11:27 2004
alter database open
Thu Apr 1 11:11:27 2004
Beginning crash recovery of 1 threads
Thu Apr 1 11:11:27 2004
Started first pass scan
Thu Apr 1 11:11:28 2004
Completed first pass scan
1 redo blocks read, 0 data blocks need recovery
Thu Apr 1 11:11:28 2004
Started recovery at
Thread 1: logseq 177, block 2, scn 0.33104793
Recovery of Online Redo Log: Thread 1 Group 3 Seq 177 Reading mem 0
Mem# 0 errs 0: /u01/oradata/gzhs/redo03.log
Thu Apr 1 11:11:28 2004
Completed redo application
Thu Apr 1 11:11:28 2004
Ended recovery at
Thread 1: logseq 177, block 3, scn 0.33124794
0 data blocks read, 0 data blocks written, 1 redo blocks read
Crash recovery completed successfully
Thu Apr 1 11:11:28 2004
LGWR: Primary database is in CLUSTER CONSISTENT mode
Thread 1 advanced to log sequence 178
Thread 1 opened at log sequence 178
Current log# 1 seq# 178 mem# 0: /u01/oradata/gzhs/redo01.log
Successful open of redo thread 1.
Thu Apr 1 11:11:28 2004
ARC0: Evaluating archive log 3 thread 1 sequence 177
Thu Apr 1 11:11:28 2004
ARC0: Beginning to archive log 3 thread 1 sequence 177
Creating archive destination LOG_ARCHIVE_DEST_1: '/u06/oradata/gzhs/arch/1_177.dbf'
Thu Apr 1 11:11:28 2004
SMON: enabling cache recovery
ARC0: Completed archiving log 3 thread 1 sequence 177
Thu Apr 1 11:11:28 2004
Errors in file /oracle/admin/gzhs/udump/gzhs_ora_27781.trc:
ORA-30012: \263\267\317\373\261\355\277\325\274\344 'UNDOTBS1' \262\273\264\346\324\332\273\362\300\340\320\315\262\273\325\375\310\
267
Thu Apr 1 11:11:28 2004
Error 30012 happened during db open, shutting down database
USER: terminating instance due to error 30012
Instance terminated by USER, pid = 27781
ORA-1092 signalled during: alter database open...
:q

.............

在警报日志末尾显示了数据库在Open状态因为错误而异常终止.

2. 尝试重新启动数据库

 

bash-2.03$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.3.0 - Production on 星期四 4月 1 11:43:52 2004

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

已连接到空闲例程。

SQL> startup
ORACLE 例程已经启动。

Total System Global Area 4364148184 bytes
Fixed Size 736728 bytes
Variable Size 1845493760 bytes
Database Buffers 2516582400 bytes
Redo Buffers 1335296 bytes
数据库装载完毕。
ORA-01092: ORACLE 例程终止。强行断开连接

.............

工程人员报告的问题重现.

 

3. 检查数据文件

 

bash-2.03$ cd /u01/ oradata/gzhs
bash-2.03$ ls -l
total 55702458
-rw-r----- 1 oracle dba 1073750016 Apr 1 11:44 UNDOTBS2.dbf
-rw-r----- 1 oracle dba 1073750016 Apr 1 11:44 WAP12_BILLINGDETAIL.dbf
-rw-r----- 1 oracle dba 1073750016 Apr 1 11:44 WAP12_MAIN.dbf
-rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN10.dbf
-rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN11.dbf
-rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN2.dbf
-rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN3.dbf
-rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN4.dbf
-rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN5.dbf
-rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN6.dbf
-rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN7.dbf
-rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN8.dbf
-rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN9.dbf
-rw-r----- 1 oracle dba 1073750016 Apr 1 11:44 WAP12_MVIEW.dbf
-rw-r----- 1 oracle dba 1073750016 Mar 24 17:15 WAP12_TEMP1.dbf
.........................
.............

发现存在文件UNDOTBS2.dbf

4. mount数据库,检查系统参数

 

bash-2.03$ sqlplus "/ as sysdba" SQL*Plus: Release 9.2.0.3.0 - Production on 星期四 4月 1 11:46:20 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. 已连接到空闲例程。 SQL> SQL> SQL> startup mount;
ORACLE 例程已经启动。
Total System Global Area 4364148184 bytes Fixed Size 736728 bytes Variable Size 1845493760 bytes Database Buffers 2516582400 bytes Redo Buffers 1335296 bytes 数据库装载完毕。
SQL> select name from v$datafile; NAME -------------------------------------------------------------------------------- /u01/oradata/gzhs/system01.dbf /u01/oradata/gzhs/cwmlite01.dbf /u01/oradata/gzhs/drsys01.dbf /u01/oradata/gzhs/example01.dbf /u01/oradata/gzhs/indx01.dbf /u01/oradata/gzhs/odm01.dbf /u01/oradata/gzhs/tools01.dbf /u01/oradata/gzhs/users01.dbf /u01/oradata/gzhs/xdb01.dbf ......................... /u01/oradata/gzhs/UNDOTBS2.dbf
已选择23行。
SQL> SQL> show parameter undo NAME TYPE VALUE
------------------------------------ ----------- ------------------------------ undo_management string AUTO undo_retention integer 10800 undo_suppress_errors boolean FALSE
undo_tablespace string UNDOTBS1
SQL> show parameter spfile
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
spfile string

.........................
.............

发现系统没有使用spfile,而初始化参数设置的undo表空间为UNDOTBS1

5. 检查参数文件

 

bash-2.03$ cd $ORACLE_HOME/dbs bash-2.03$ ls
init.ora initgzhs.ora initgzhs.ora.old orapwgzhs initdw.ora initgzhs.ora.hurray lkGZHS snapcf_gzhs.f bash-2.03$ vi initgzhs.ora
"initgzhs.ora" [Incomplete last line] 105 lines, 3087 characters
####################################################
# Copyright (c) 1991, 2001, 2002 by Oracle Corporation
####################################################
###########################################
# Archive
###########################################
log_archive_dest_1='LOCATION=/u06/oradata/gzhs/arch'
log_archive_format=%t_%s.dbf log_archive_start=true ###########################################
# Cache and I/O
###########################################
db_block_size=8192
db_cache_size=2516582400
db_file_multiblock_read_count=16
###########################################
# Cursors and Library Cache
###########################################
open_cursors=300
......................

###########################################
# System Managed Undo and Rollback Segments
###########################################
undo_management=AUTO
undo_retention=10800
undo_tablespace=UNDOTBS1
:q!
.............

这个设置是极其可疑的.
怀疑参数文件和实际数据库设置不符.

6. 再次检查alert文件
查找对于UNDO表空间的操作

第一部分,创建数据库时的信息:

 

Sat Feb 7 20:30:12 2004 CREATE DATABASE gzhs MAXINSTANCES 1 MAXLOGHISTORY 1 MAXLOGFILES 5 MAXLOGMEMBERS 3 MAXDATAFILES 100 DATAFILE '/u01/oradata/gzhs/system01.dbf' SIZE 500M REUSE AUTOEXTEND ON NEXT 10240K MAXSIZE UNLIMITED EXTENT MANAGEMENT LOCAL DEFAULT TEMPORARY TABLESPACE TEMP TEMPFILE '/u01/oradata/gzhs/temp01.dbf' SIZE 1000M REUSE AUTOEXTEND ON NEXT 250M MAXSIZE UNLIMITED UNDO TABLESPACE "UNDOTBS1" DATAFILE '/u01/oradata/gzhs/undotbs01.dbf' SIZE 1000M REUSE AUTOEXTEND ON NEXT 100M MAXSIZE UNLIMITED
CHARACTER SET ZHS16GBK
NATIONAL CHARACTER SET AL16UTF16
LOGFILE GROUP 1 ('/u01/oradata/gzhs/redo01.log') SIZE 256M,
GROUP 2 ('/u01/oradata/gzhs/redo02.log') SIZE 256M,
GROUP 3 ('/u01/oradata/gzhs/redo03.log') SIZE 256M
.............

注意,这也是OCP教材上提到的两种创建UNDO表空间的方式之一

第二部分,发现创建UNDOTBS2的记录信息:

 

Wed Mar 24 20:20:58 2004 /* OracleOEM */ CREATE UNDO TABLESPACE "UNDOTBS2" DATAFILE '/u01/oradata/gzhs/UNDOTBS2.dbf' SIZE 1024M AUTOEXTEND ON NEXT 100M MAXSIZE UNLIMITED Wed Mar 24 20:22:37 2004 Created Undo Segment _SYSSMU11$ Created Undo Segment _SYSSMU12$ Created Undo Segment _SYSSMU13$ Created Undo Segment _SYSSMU14$ Created Undo Segment _SYSSMU15$ Created Undo Segment _SYSSMU16$ Created Undo Segment _SYSSMU17$ Created Undo Segment _SYSSMU18$ Created Undo Segment _SYSSMU19$ Created Undo Segment _SYSSMU20$ Completed: /* OracleOEM */ CREATE UNDO TABLESPACE "UNDOTBS2"
Wed Mar 24 20:24:25 2004
Undo Segment 11 Onlined
Undo Segment 12 Onlined
Undo Segment 13 Onlined
Undo Segment 14 Onlined
Undo Segment 15 Onlined
Undo Segment 16 Onlined
Undo Segment 17 Onlined
Undo Segment 18 Onlined
Undo Segment 19 Onlined
Undo Segment 20 Onlined
Successfully onlined Undo Tablespace 15.
Undo Segment 1 Offlined
Undo Segment 2 Offlined
Undo Segment 3 Offlined
Undo Segment 4 Offlined
Undo Segment 5 Offlined
Undo Segment 6 Offlined
Undo Segment 7 Offlined
Undo Segment 8 Offlined
Undo Segment 9 Offlined
Undo Segment 10 Offlined
Undo Tablespace 1 successfully switched out.
.............

第三部分,新的UNDO表空间被应用

Wed Mar 24 20:24:25 2004
ALTER SYSTEM SET undo_tablespace='UNDOTBS2' SCOPE=MEMORY;

我们发现问题就在这里,创建了新的UNDO表空间以后,因为使用的是pfile文件,修改的只对当前实例生效,操作人员忘记了修改pfile文件.

如果使用spfile,缺省的修改范围是both,会同时修改spfile文件,就可以避免以上问题的出现.

第四部分,删除了UNDOTBS1的信息

 

Wed Mar 24 20:25:01 2004 /* OracleOEM */ DROP TABLESPACE "UNDOTBS1" INCLUDING CONTENTS AND DATAFILES CASCADE CONSTRAINTS Wed Mar 24 20:25:03 2004 Deleted file /u01/oradata/gzhs/undotbs01.dbf Completed: /* OracleOEM */ DROP TABLESPACE "UNDOTBS1" INCLUDI
.............

这样再次重新启动数据库的时候,问题出现了,pfile中定义的UNDOTBS1找不到了,而且操作实在很久以前,没人能回忆起来,甚至无法得知是什么人的操作。

7. 更改pfile,启动数据库

修改undo表空间

###########################################
# System Managed Undo and Rollback Segments
###########################################
undo_management=AUTO
undo_retention=10800
undo_tablespace=UNDOTBS2

....

bash-2.03$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.3.0 - Production on 星期四 4月 1 11:55:11 2004

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

连接到:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production
PL/SQL Release 9.2.0.3.0 - Production
CORE 9.2.0.3.0 Production
TNS for Solaris: Version 9.2.0.3.0 - Production
NLSRTL Version 9.2.0.3.0 - Production

SQL> exit
从Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production中断开
bash-2.03$

在这里我们可以看到,使用spfile可以免去手工修改pfile文件的麻烦,减少了犯错的可能。

既然Oracle9i给我们提供了这个新特性,就值得我们学习使用它.

 

 

时间: 2016-02-09

Oracle诊断案例-Spfile案例一则的相关文章

Oracle诊断案例-Job任务停止执行

oracle|执行 Oracle诊断案例-Job任务停止执行 Last Updated: Saturday, 2004-11-20 12:47 Eygle         昨天接到研发人员报告,数据库定时任务未正常执行,导致某些操作失败. 开始介入处理该事故.系统环境:SunOS DB 5.8 Generic_108528-21 sun4u sparc SUNW,Ultra-4 Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production

Oracle诊断案例-Job任务停止执行[最终版]

oracle|执行 Oracle诊断案例-Job任务停止执行 Last Updated: Friday, 2004-11-26 9:48 Eygle         昨天接到研发人员报告,数据库定时任务未正常执行,导致某些操作失败. 开始介入处理该事故.系统环境:SunOS DB 5.8 Generic_108528-21 sun4u sparc SUNW,Ultra-4 Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production 1.首

Oracle诊断案例----如何捕获问题SQL解决过度CPU消耗问题

oracle|解决|问题 Oracle诊断案例----如何捕获问题SQL解决过度CPU消耗问题 --使用vmstat,top等辅助解决Oracle数据库性能问题 Last Updated: Sunday, 2004-10-24 0:37 Eygle       问题描述:开发人员报告系统运行缓慢,影响用户访问. 1.登陆数据库主机 使用vmstat检查,发现CPU资源已经耗尽,大量任务位于运行队列: bash-2.03$ vmstat 3 procs memory page disk fault

Oracle诊断案例:Job任务停止执行

摘要: 本文通过一次Oracle Job任务异常案例诊断,分析其原因及解决过程,从内部揭示Oracle Job任务调度及内部计时机制. 问题及环境 接到研发人员报告,数据库定时任务未正常执行,导致某些操作失败. 开始介入处理该事故. 系统环境: 以下为引用的内容: SunOS DB 5.8 Generic_108528-21 sun4u sparc SUNW,Ultra-4Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production 解决过

Oracle PL/SQL入门案例实践_oracle

正在看的ORACLE教程是:Oracle PL/SQL入门案例实践. 前面已经了解了关于PL/SQL编程的基础,本文将结合一个案例来加深对这些知识点的理解. 一. 案例介绍 某数据库有两张表,是关于某公司员工资料.薪水和部门信息的,它们分别是emp表和dept表,两张表的结构如下: 要求如下: 1.按照上表结构建立相应的表,并每张表写入5组合法数据. 2.操纵相关表,使得"技术部"的员工的薪水上涨20%. 3.建立日志,追踪薪水变动情况. 4.建立测试包. 二. 案例的分析与实现 从前

Oracle诊断工具SQLT简介 (文档 ID 1677588.1、1526574.1)

Oracle诊断工具SQLT简介 (文档 ID 1677588.1.1526574.1) SQLT 使用指南 (文档 ID 1677588.1)   文档详细信息   类型: 状态: 上次主更新: 上次更新: 语言: REFERENCE PUBLISHED 2016-6-16 2016-6-16 English简体中文??? 215187.1SQLTXPLAIN (SQLT) 12.1.06 2014年1月30日 帮助诊断性能较差的 SQL 语句的工具 SQLT 概览 安全模式 安装 SQLT

Oracle诊断案例-Sql

oracle link: http://www.eygle.com/case/sql_trace_1.htm 问题描述: 这是帮助一个公司的诊断案例.应用是一个后台新闻发布系统. 症状是,通过连接访问新闻页是极其缓慢通常需要十数秒才能返回. 这种性能是用户不能忍受的. 操作系统:SunOS 5.8数据库版本:8.1.7 1.检查并跟踪数据库进程 诊断时是晚上,无用户访问在前台点击相关页面,同时进行进程跟踪 查询v$session视图,获取进程信息   SQL> select sid,serial

Oracle诊断案例-SGA与Swap之一

oracle     link: http://www.eygle.com/case/sga1.htm 案例描述: 用户报告,服务器启动一段时间以后,无法建立数据库连接重新启动几分钟以后,再次无法连接 系统无法正常使用. 1.登陆系统 SunOS 5.8 login: rootPassword: Last login: Tue Mar 23 13:56:59 from 172.16.31.41Sun Microsystems Inc. SunOS 5.8 Generic Patch Octobe

Oracle诊断案例-SGA与Swap之二

oracle     link: http://www.eygle.com/case/sga2.htm 案例描述: 这是一个大型生产系统问题出现时系统累计大量用户进程用户请求得不到及时响应,新的进程不断尝试建立连接连接数很快被用完 数据库版本:9.2.0.3操作系统:Solaris8   1.检查alert文件 日志中记录如下错误信息,说明磁盘异步IO出现问题:   WARNING: aiowait timed out 2 timesTue Aug 26 15:33:32 2003WARNING