Library Cache Lock的解决

发表于:2007-06-08来源:作者:点击数: 标签:
昨晚业务系统导入资料并重建索引时一个会话突然停滞不前,用TOAD一看,一直在等待Library Cache Lock。TOAD、OEM中都看不到此锁,会话每三秒启动一次,但每次都是等待这个锁。显然,这和数据字典有关,应该是一个索引的数据字典中的记录被锁住了,导致无法重
昨晚业务系统导入资料并重建索引时一个会话突然停滞不前,用TOAD一看,一直在等待Library Cache Lock。TOAD、OEM中都看不到此锁,会话每三秒启动一次,但每次都是等待这个锁。显然,这和数据字典有关,应该是一个索引的数据字典中的记录被锁住了,导致无法重建。可是杀光了其他ACTIVE的会话,问题仍然没有得到解决,看来是某一个被杀死的会话持有该锁,而会话尚未回滚完全,进程仍然吊死着。现在的问题就是找这个会话了。
首先想到的文档就是Oracle9i Database Reference了,找到附录A,说明如下:

This event controls the concurrency between clients of the library cache. It acquires a lock on the object handle so that either:

One client can prevent other clients from aclearcase/" target="_blank" >ccessing the same object

The client can maintain a dependency for a long time (for example, no other client can change the object)

This lock is also obtained to locate an object in the library cache.

Wait Time: 3 seconds (1 second for PMON)

Parameters:

handle address

Address of the object being loaded

lock address

Address of the load lock being used. This is not the same thing as a latch or an enqueue, it is a State Object.

mode

Indicates the data pieces of the object which need to be loaded

namespace

See 数据库/Oracle/Oracle9iR2%20Database%20Documentation/server.920/a96536/apa4.htm#970946">"namespace"数据库/Oracle/Oracle9iR2%20Database%20Documentation/server.920/a96536/apa4.htm#970946">


几乎等于什么都没说,不过lock address应该会有点用处。
转而上网搜索解决方案,终于找到一篇metalink上的文档:
Doc ID:
Note:122793.1
Subject:  HOW TO FIND THE SESSION HOLDING A LIBRARY CACHE LOCK
Type:  BULLETIN
Status:  PUBLISHED
 Content Type:  TEXT/PLAIN
Creation Date:  23-OCT-2000
Last Revision Date:  17-JUL-2002

PURPOSE
-------
 
 In some situations it may happen your session is 'hanging' and is awaiting for 
 a 'Library cache lock'. This document describes how to find the session that 
 in fact has the lock you are waiting for.
 
 
SCOPE & APPLICATION
-------------------
 
Support analysts, dba's, ..
 
 
HOW TO FIND THE SESSION HOLDING A A LIBRARY CACHE LOCK
------------------------------------------------------
 
 Common situations:
 
 * a DML operation that is hanging because the table which is accessed is currently 
   undergoing changes (ALTER TABLE). This may take quite a long time depending on 
   the size of the table and the type of the modification 
   (e.g. ALTER TABLE x MODIFY (col1 CHAR(200) on thousands of records). 
 
* The compilation of package will hang on Library Cache Lock and Library Cache Pin 
  if some users are executing any Procedure/Function defined in the same package. 
 
 In the first situation the V$LOCK view will show that the session doing the 
 'ALTER TABLE' has an exclusive DML enqueue lock on the table object (LMODE=6, 
 TYPE=TM and ID1 is the OBJECT_ID of the table). The waiting session however does 
 not show up in V$LOCK yet so in an environment with a lot of concurrent sessions 
 the V$LOCK information is insufficient to track down the culprit blocking your 
 operation.
 
METHOD 1: SYSTEMSTATE ANALYSIS
------------------------------
 
 One way of finding the session blocking you is to analyze the system state dump.
 Using the systemstate event one can create a tracefile containing detailed 
 information on every Oracle process. This information includes all the resources
 held & requested by a specific process.
 
 Whilst an operation is hanging, open a new session and launch the following
 statement:
 
 ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME SYSTEMSTATE LEVEL 8';
 
 Oracle will now create a systemstate tracefile in your USER_DUMP_DEST directory.
 Get the PID (ProcessID) of the 'hanging' session from the V$PROCESS by matching 
 PADDR from V$SESSION with ADDR from V$PROCESS:
 
 SELECT PID FROM V$PROCESS WHERE ADDR=
        (SELECT PADDR FROM V$SESSION WHERE SID=sid_of_hanging_session);
 
 The systemstate dump contains a separate section with information for each 
 process. Open the tracefile and do a search for 'PROCESS pid_from_select_stmt'.
 In the process section look up the wait event by doing a search on 'waiting for'.
 
 Example output:
 
 PROCESS 8:
   ----------------------------------------
   SO: 50050b08, type: 1, owner: 0, flag: INIT/-/-/0x00
   (process) Oracle pid=8, calls cur/top: 5007bf6c/5007bf6c, flag: (0) -
             int error: 0, call error: 0, sess error: 0, txn error 0
   (post info) last post received: 82 0 4
               last post received-location: kslpsr
               last process to post me: 5004ff08 1 2
               last post sent: 0 0 13
               last post sent-location: ksasnd
               last process posted by me: 5004ff08 1 2
     (latch info) wait_event=0 bits=0
     Process Group: DEFAULT, pseudo proc: 50058ac4
     O/S info: user: daemon, term: pts/1, ospid: 15161
     OSD pid info: 15161
     ----------------------------------------
     SO: 5005f294, type: 3, owner: 50050b08, flag: INIT/-/-/0x00
     (session) trans: 0, creator: 50050b08, flag: (41) USR/- BSY/-/-/-/-/-
               DID: 0001-0008-00000002, short-term DID: 0000-0000-00000000
               txn branch: 0
               oct: 6, prv: 0, user: 41/LC
     O/S info: user: daemon, term: pts/1, ospid: 15160, machine: goblin.forgotten.realms
               program: sqlplus@goblin.forgotten.realms">sqlplus@goblin.forgotten.realms (TNS V1-V3)
     application name: SQL*Plus, hash value=3669949024
     waiting for 'library cache lock' blocking sess=0x0 seq=253 wait_time=0
!>>              handle address=5023ef9c, lock address=5019cad4, 10*mode+namespace=15
 
 Using the 'handle address' you can look up the process that is keeping a lock
 on your resource by doing a search on the address within the same tracefile.
 
 Example output: 
 
 PROCESS 9:
   ----------------------------------------
   SO: 50050e08, type: 1, owner: 0, flag: INIT/-/-/0x00
   (process) Oracle pid=9, calls cur/top: 5007bbac/5007bbfc, flag: (0) -
             int error: 0, call error: 0, sess error: 0, txn error 0
 
   <cut> ....
  
        ----------------------------------------
         SO: 5019d5e4, type: 34, owner: 5015f65c, flag: INIT/-/-/0x00
!>>      LIBRARY OBJECT PIN: pin=5019d5e4 handle=5023ef9c mode=X lock=0
         user=5005fad4 session=5005fad4 count=1 mask=0511 savepoint=118218 flags=[00]
  
 From the output we can see that the Oracle process with PID 9 has an exclusive
 lock on the object we are trying to access. Using V$PROCESS and V$SESSION we can
 retrieve the sid,user,terminal,program,... for this process. The actual statement
 that was launched by this session is also listed in the tracefile (statements and
 other library cache objects are preceded by 'name=').
 
 
METHOD 2: EXAMINE THE X$KGLLK TABLE
-----------------------------------
 
 The X$KGLLK table (accessible only as SYS/INTERNAL) contains all the 
 library object locks (both held & requested) for all sessions and
 is more complete than the V$LOCK view although the column names don't
 always reveal their meaning.
 
 You can examine the locks requested (and held) by the waiting session 
 by looking up the session address (SADDR) in V$SESSION and doing the 
 following select:
 
 select * from x$kgllk where KGLLKSES = 'saddr_from_v$session'
 
 This will show you all the library locks held by this session where
 KGLNAOBJ contains the first 80 characters of the name of the object.
 The value in KGLLKHDL corresponds with the 'handle address' of the
 object in METHOD 1.
 
 You will see that at least one lock for the session has KGLLKREQ > 0 
 which means this is a REQUEST for a lock (thus, the session is waiting). 
 If we now match the KGLLKHDL with the handles of other sessions in 
 X$KGLLK that should give us the address of the blocking session since
 KGLLKREQ=0 for this session, meaning it HAS the lock.
 
 SELECT * FROM X$KGLLK LOCK_A 
 WHERE KGLLKREQ = 0
   AND EXISTS (SELECT LOCK_B.KGLLKHDL FROM X$KGLLK LOCK_B
               WHERE KGLLKSES = 'saddr_from_v$session' /* BLOCKED SESSION */
               AND LOCK_A.KGLLKHDL = LOCK_B.KGLLKHDL
               AND KGLLKREQ > 0);
 
 If we look a bit further we can then again match KGLLKSES with SADDR 
 in v$session to find further information on the blocking session:
 
 SELECT SID,USERNAME,TERMINAL,PROGRAM FROM V$SESSION
 WHERE SADDR in 
  (SELECT KGLLKSES FROM X$KGLLK LOCK_A 
   WHERE KGLLKREQ = 0
     AND EXISTS (SELECT LOCK_B.KGLLKHDL FROM X$KGLLK LOCK_B
                 WHERE KGLLKSES = 'saddr_from_v$session' /* BLOCKED SESSION */
                 AND LOCK_A.KGLLKHDL = LOCK_B.KGLLKHDL
                 AND KGLLKREQ > 0)
  );

 
 In the same way we can also find all the blocked sessions:
 
 SELECT SID,USERNAME,TERMINAL,PROGRAM FROM V$SESSION
 WHERE SADDR in 
  (SELECT KGLLKSES FROM X$KGLLK LOCK_A 
   WHERE KGLLKREQ > 0
     AND EXISTS (SELECT LOCK_B.KGLLKHDL FROM X$KGLLK LOCK_B
                 WHERE KGLLKSES = 'saddr_from_v$session' /* BLOCKING SESSION */
                 AND LOCK_A.KGLLKHDL = LOCK_B.KGLLKHDL
                 AND KGLLKREQ = 0)
  );
 
 
RELATED DOCUMENTS
-----------------
 
[NOTE:1020008.6] SCRIPT FULLY DECODED LOCKING SCRIPT 
[NOTE:1054939.6] COMPILATION OF PACKAGE IS HANGING ON LIBRARY CACHE LOCK 
.
开头说了半天废话,后面给出了两个解决方案,进行事件跟踪或者查询X$KGLLK表,还是后者容易一点。搜寻半天,终于问到sysdna登录的方法(权限低,没密码)。第一次用X$表,用sysdba手还有点发抖(那可是正式的业务系统),敲入命令发现运行了很久才反应,找出一个SID了,仔细一看,居然是一个后台维护终端,程序是TOAD,状态INACTIVE。询问相关人员,从未ALTER该表,看来是TOAD看表结构或者数据的时候出了问题,kill掉终于正常,索引很快建完。只是INACTIVE的会话居然也持有锁,真是奇怪。工具太好了,也是双刃剑啊。

原文转自:http://www.ltesting.net