On Database Tuning for Node 1

Published: 2007-05-25

  The situation is roughly this: we have two COMPAQ GS 160 hosts (OLTP servers) with the database installed on shared disks. Performance currently shows no major problems. The hardware was expanded beforehand (node 1 went from 8 CPUs / 11 GB to 16 CPUs / 16 GB of memory, node 2 from 8), but the database parameters were never adjusted, for reasons such as: the system had to stay stable over the holidays, security had to come first during the NPC/CPPCC sessions, and the business could not be stopped during peak hours. We plan to make the adjustments tonight. Below are our considerations; please take a look and tell us whether they are appropriate.
  
  1. Operating system status
  orabus@Ahyz1> vmstat 1 20
  Virtual Memory Statistics: (pagesize = 8192)
  procs memory pages intr cpu
  r w u act free wire fault cow zero react pin pout in sy cs us sy id
  48 959 310 961K 925K 168K 3G 353M 1G 26271 645M 0 5K 24K 22K 17 10 73
  62 951 300 960K 925K 168K 2002 266 1010 0 443 0 7K 22K 22K 80 15 5
  49 958 308 961K 925K 168K 2495 75 553 0 228 0 6K 25K 23K 83 14 3
  53 954 308 961K 924K 168K 1972 173 784 0 598 0 6K 24K 22K 81 15 4
  66 952 297 961K 924K 168K 1065 96 434 0 292 0 5K 22K 22K 80 12 8
  52 955 311 962K 923K 168K 2067 75 427 0 234 0 8K 26K 24K 82 15 3
  59 960 299 962K 923K 168K 2368 179 512 0 398 0 8K 24K 24K 82 15 3
  62 959 300 963K 923K 168K 3022 173 979 0 602 0 8K 25K 24K 82 16 2
  63 958 300 963K 922K 168K 2505 157 877 0 480 0 8K 28K 25K 80 17 3
  65 952 303 963K 922K 168K 2006 98 821 0 366 0 8K 24K 25K 78 14 8
  55 969 299 964K 921K 168K 4094 149 1464 0 457 0 7K 24K 23K 81 15 4
  58 970 298 966K 920K 168K 3634 182 1393 0 682 0 5K 25K 23K 79 13 7
  46 980 298 965K 920K 168K 1738 38 300 0 84 0 4K 24K 21K 80 14 6
  49 974 300 965K 920K 168K 1660 139 558 0 442 0 5K 23K 22K 83 13 4
  63 962 297 965K 920K 168K 1278 27 610 0 95 0 5K 25K 22K 82 13 5
  56 964 305 966K 919K 168K 2396 86 490 0 298 0 8K 24K 24K 82 14 4
  66 962 297 967K 918K 168K 2349 119 786 0 394 0 8K 26K 25K 80 15 5
  40 986 298 967K 919K 168K 1801 66 1054 0 283 0 8K 22K 23K 79 16 5
  45 969 305 967K 918K 168K 1569 95 673 0 301 0 9K 24K 24K 78 18 4
  54 968 298 967K 918K 168K 1095 20 185 0 113 0 8K 23K 26K 80 16 4
  orabus@Ahyz1>
  
  orabus@Ahyz1> top
  
  load averages: 28.47, 29.85, 30.70 11:23:08
  643 processes: 26 running, 66 waiting, 216 sleeping, 328 idle, 7 zombie
  CPU states: % user, % nice, % system, % idle
  Memory: Real: 6669M/16G act/tot Virtual: 18727M use/tot Free: 6450M
  
  PID USERNAME PRI NICE SIZE RES STATE TIME CPU COMMAND
  524288 root 0 0 20G 609M run 617.0H 81.50% kernel idle
  556369 buswork 56 0 3376M 1179K run 42:01 64.60% oracle
  733961 buswork 42 0 3376M 1155K run 42:22 61.20% oracle
  524817 buswork 42 0 3376M 1261K WAIT 43:12 48.70% oracle
  750447 orabus 52 0 3382M 7086K run 6:53 47.20% oracle
  677254 buswork 48 0 3385M 8462K sleep 35:09 41.90% oracle
  525117 buswork 48 0 3385M 8437K run 33:47 40.20% oracle
  960115 buswork 47 0 3385M 9740K run 35:30 38.40% oracle
  807149 buswork 49 0 3385M 8445K run 33:15 38.00% oracle
  654356 buswork 47 0 3377M 2056K run 31:50 37.20% oracle
  1046508 buswork 48 0 3385M 9478K run 36:23 37.00% oracle
  569891 buswork 49 0 3385M 8454K run 34:12 36.40% oracle
  587602 buswork 48 0 3385M 8740K sleep 32:46 36.40% oracle
  860992 buswork 47 0 3385M 8429K run 33:35 34.90% oracle
  667424 buswork 49 0 3377M 2088K sleep 34:09 34.40% oracle
  
  orabus@Ahyz1> vmstat -P
  
  Total Physical Memory = 16384.00 M
  = 2097152 pages
  
  Physical Memory Clusters:
  
  start_pfn end_pfn type size_pages / size_bytes
  0 504 pal 504 / 3.94M
  504 524271 os 523767 / 4091.93M
  524271 524288 pal 17 / 136.00k
  8388608 8912872 os 524264 / 4095.81M
  8912872 8912896 pal 24 / 192.00k
  16777216 17301480 os 524264 / 4095.81M
  17301480 17301504 pal 24 / 192.00k
  25165824 25690088 os 524264 / 4095.81M
  25690088 25690112 pal 24 / 192.00k
  
  Physical Memory Use:
  
  start_pfn end_pfn type size_pages / size_bytes
  504 1032 scavenge 528 / 4.12M
  1032 1963 text 931 / 7.27M
  1963 2048 scavenge 85 / 680.00k
  2048 2278 data 230 / 1.80M
  2278 2756 bss 478 / 3.73M
  2756 3007 kdebug 251 / 1.96M
  3007 3014 cfgmgmt 7 / 56.00k
  3014 3016 locks 2 / 16.00k
  3016 3032 pmap 16 / 128.00k
  3032 6695 unixtable 3663 / 28.62M
  6695 6701 logs 6 / 48.00k
  6701 15673 vmtables 8972 / 70.09M
  15673 524271 managed 508598 / 3973.42M
  524271 8388608 hole 7864337 / 61440.13M
  8388608 8388609 unixtable 1 / 8.00k
  8388609 8388612 pmap 3 / 24.00k
  8388612 8389128 scavenge 516 / 4.03M
  8389128 8390059 text 931 / 7.27M
  8390059 8398104 vmtables 8045 / 62.85M
  8398104 8912872 managed 514768 / 4021.62M
  8912872 16777216 hole 7864344 / 61440.19M
  16777216 16777217 unixtable 1 / 8.00k
  16777217 16777220 pmap 3 / 24.00k
  16777220 16777736 scavenge 516 / 4.03M
  16777736 16778667 text 931 / 7.27M
  16778667 16786712 vmtables 8045 / 62.85M
  16786712 17301480 managed 514768 / 4021.62M
  17301480 25165824 hole 7864344 / 61440.19M
  25165824 25165825 unixtable 1 / 8.00k
  25165825 25165828 pmap 3 / 24.00k
  25165828 25166344 scavenge 516 / 4.03M
  25166344 25167275 text 931 / 7.27M
  25167275 25175320 vmtables 8045 / 62.85M
  25175320 25690088 managed 514768 / 4021.62M
  ============================
  Total Physical Memory Use: 2096559 / 16379.37M
  
  Managed Pages Break Down:
  
  free pages = 870044
  active pages = 580959
  inactive pages = 207561
  wired pages = 168231
  ubc pages = 228268
  ==================
  Total = 2055063
  
  WIRED Pages Break Down:
  
  vm wired pages = 14077
  ubc wired pages = 0
  meta data pages = 62820
  malloc pages = 79557
  contig pages = 1242
  user ptepages = 2396
  kernel ptepages = 492
  free ptepages = 15
  ==================
  Total = 160599
  
  orabus@Ahyz1>
  
  orabus@Ahyz1>
  orabus@Ahyz1> sar -u 1 30
  
  OSF1 Ahyz1 V5.1 732 alpha 06May2003
  
  11:26:21 %usr %sys %wio %idle
  11:26:22 85 14 2 0
  11:26:23 83 16 1 0
  11:26:24 82 12 5 1
  11:26:25 82 13 5 0
  11:26:26 85 13 2 0
  11:26:27 85 14 1 0
  11:26:28 86 14 1 0
  11:26:29 85 15 1 0
  11:26:30 85 14 1 0
  11:26:31 84 13 3 0
  11:26:32 88 12 0 0
  11:26:33 87 11 2 0
  11:26:34 87 12 1 0
  11:26:35 86 12 2 0
  11:26:36 87 11 1 0
  11:26:37 87 12 1 0
  11:26:38 87 12 1 0
  11:26:39 86 13 1 0
  11:26:40 86 13 1 0
  11:26:41 84 14 2 0
  11:26:42 85 14 1 0
  11:26:43 81 16 3 0
  11:26:44 80 16 3 0
  11:26:45 85 13 2 0
  11:26:46 86 11 2 0
  11:26:47 84 15 1 0
  11:26:48 87 11 2 0
  11:26:49 86 12 1 0
  11:26:50 87 13 0 0
  11:26:51 86 13 0 0
  
  Average 85 13 2 0
  
  SQL/Business>select count(*),status from v$session group by status ;
  
  COUNT(*) STATUS
  ---------- --------
  89 ACTIVE
  303 INACTIVE
  4 SNIPED
  
  Elapsed: 00:00:00.31
  SQL/Business>/
  
  COUNT(*) STATUS
  ---------- --------
  91 ACTIVE
  302 INACTIVE
  4 SNIPED
  
  Elapsed: 00:00:00.03
  SQL/Business>/
  
  COUNT(*) STATUS
  ---------- --------
  92 ACTIVE
  301 INACTIVE
  4 SNIPED
  
  2. Changes we plan to make first
  Parameter                    Current value    Recommended value
  Db_block_buffers             307200           448000
  Shared_pool_size             828375040        1258291200 (1.2 GB)
  Log_buffer                   1048576          4194304
  Fast_start_io_target         307200           0
  Processes                    600              650
  Db_block_max_dirty_target    307200           0
  
  3. The statspack report is attached
  First: you want to raise Shared_pool_size to 1.2 GB? Absolutely not. Your current 800-plus MB is already too large; lower it to under 500 MB instead of setting it that high.
  
  Second, I suggest you use bind variables; you have a lot of repeated calls, e.g. insert into ba_task_query_GC_CD_t (city_code,service_kind,service_id,customer_id,account_id,user_id,total,overdue,times,first_name,identity_kind,identity_code,bus_favour_id,service_favour_id,service_group_id,serving_status, contact_address,contact_phone) values('317',9,:service_id,:customer_id,:account_id,:user_id,:tot
  
  61 1,334 3556047760 insert into ba_task_query_GC_CD_t (city_code,service_kind,service_id,customer_id,account_id,user_id,total,overdue,times,first_name,identity_kind,identity_code,bus_favour_id,service_favour_id,service_group_id,serving_status, contact_address,contact_phone) values('317',8,:service_id,:customer_id,:account_id,:user_id,:tot
  
  51 14,320 30278495 insert into ba_task_query_CZ_WZH_t (city_code,service_kind,service_id,customer_id,account_id,user_id,total,overdue,times,first_name,identity_kind,identity_code,bus_favour_id,service_favour_id,service_group_id,serving_status, contact_address,contact_phone)values('312',9,:service_id,:customer_id,:account_id,:user_id,:to
  
  begin hlr_getcommand_proc('HLR_2', 'HLR_BB21', '#', :nCount,:v_command, :v_register_number, :v_attach_business_ret, :v_carry_business_ret, :v_service_kind, :v_source_kind, :v_apply_event); end;
  
  6,163,145 1,998 3,084.7 1.8 893706115 begin hlr_getcommand_proc('HLR_2', 'HLR_BB22', '#', :nCount,:v_command, :v_register_number, :v_attach_business_ret, :v_carry_business_ret, :v_service_kind, :v_source_kind, :v_apply_event); end;
  
  5,700,511 1,818 3,135.6 1.7 2506361444 begin hlr_getcommand_proc('HLR_4', 'HLR_AQ1', '#', :nCount, :v_command, :v_register_number, :v_attach_business_ret, :v_carry_business_ret, :v_service_kind, :v_source_kind, :v_apply_event); end;
  
  5,362,724 1,620 3,310.3 1.6 4001442081 begin hlr_getcommand_proc('HLR_3', 'HLR_WH38', '#', :nCount,:v_command, :v_register_number, :v_attach_business_ret, :v_carry_business_ret, :v_service_kind, :v_source_kind, :v_apply_event); end;
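A small sketch of why the bind-variable advice matters (plain Python; the dict stands in for the shared pool's cursor cache, the statement text is abbreviated from the report above, and no real Oracle driver is involved): when literals such as 9 or 8 are baked into the SQL text, every distinct combination is a distinct statement that costs a separate hard parse and a separate shared-pool entry, while bind variables let every call share one cursor.

```python
# The dict below mimics Oracle's shared pool: one cached cursor per
# distinct SQL text. Literals create new texts; binds reuse one.

def run_with_literals(pool, city, kind):
    sql = (f"insert into ba_task_query_GC_CD_t (city_code, service_kind, ...)"
           f" values('{city}',{kind},:service_id, ...)")
    pool[sql] = pool.get(sql, 0) + 1   # new key => simulated hard parse

def run_with_binds(pool, city, kind):
    sql = ("insert into ba_task_query_GC_CD_t (city_code, service_kind, ...)"
           " values(:city,:kind,:service_id, ...)")
    pool[sql] = pool.get(sql, 0) + 1   # same text every call => one cursor

literal_pool, bind_pool = {}, {}
for kind in (8, 9):                    # the two variants seen in the statspack output
    for _ in range(100):
        run_with_literals(literal_pool, "317", kind)
        run_with_binds(bind_pool, "317", kind)

print(len(literal_pool))  # -> 2  (one cached statement per literal combination)
print(len(bind_pool))     # -> 1  (all 200 calls share a single statement)
```

With only two literal variants the difference looks small, but the report shows many such per-city, per-kind variants, each occupying its own shared-pool entry.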
  
  You could increase db_block_buffers a bit.
  
  
  On the shared_pool_size question, our reasoning is this: at least the current hit ratios are all within Oracle's recommended ranges; during business peaks there are still over 5 GB of free memory; meanwhile CPU utilization is at 80-90%, so the bottleneck is the CPU. That is why we want to enlarge the shared pool as much as possible.
  
  Also, I have always felt that parts of biti's recent "Some notes on SGA settings" deserve scrutiny, for example "12 GB of memory: shared_pool_size = 300M, data buffer = 8G". Node 1 had 11 GB of memory before the hardware upgrade, and at that time Metalink suggested trying to shrink the shared pool size, which in practice did not work.
  
  select name,value
  from v$sysstat
  where name in ('physical reads','db block gets','consistent gets');
  
  The current database buffer cache hit ratio works out as follows:
  physical reads 1249697407
  consistent gets 1.9561E+10
  db block gets 1686300261
  Buffer cache hit ratio: 1 - 1249697407/(1686300261 + 1.9561E+10) = 0.941
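That arithmetic can be sanity-checked directly (plain Python, using the v$sysstat figures above; note that consistent gets is only known to the rounded value 1.9561E+10):

```python
# Buffer cache hit ratio = 1 - physical reads / (db block gets + consistent gets)
physical_reads  = 1_249_697_407
db_block_gets   = 1_686_300_261
consistent_gets = 1.9561e10      # v$sysstat prints this rounded as 1.9561E+10

hit_ratio = 1 - physical_reads / (db_block_gets + consistent_gets)
print(round(hit_ratio, 3))       # -> 0.941
```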
  
  select gethitratio
  from v$librarycache
  where namespace = 'SQL AREA';
  Library cache hit ratio: 0.938
  
  1. Judging from the buffer cache and library cache hit ratios, the current hit ratios for db_block_buffers and shared_pool_size are not high.
  
  2. Judging from the database wait events, a fair number of buffer busy waits and latch free waits occur (see the statspack report for details).
  
                                                               Avg
                                               Total Wait     wait   Waits
  Event                           Waits  Timeouts  Time (cs)   (ms)    /txn
  ---------------------------- ------------ ---------- ----------- --------- ------
  db file sequential read 9,015,682 0 15,545,722 17 21.4
  db file scattered read 1,321,873 0 2,228,585 17 3.1
  SQL*Net message from dblink 7,629,094 0 1,560,805 2 18.1
  log file sync 277,373 557 789,234 28 0.7
  db file parallel write 444,690 0 619,435 14 1.1
  buffer busy waits 265,226 70 496,506 19 0.6
  SQL*Net more data from dblin 32,930 0 280,688 85 0.1
  latch free 936,150 597,205 260,296 3 2.2
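A quick consistency check on the table above: statspack reports Total Wait Time in centiseconds and the average column in milliseconds, so avg_ms = total_cs × 10 / waits. For db file sequential read, 15,545,722 cs × 10 / 9,015,682 ≈ 17 ms, matching the report:

```python
# Cross-check the "Avg wait (ms)" column: Total Wait Time is in
# centiseconds (cs), and 1 cs = 10 ms, so avg_ms = total_cs * 10 / waits.
events = {
    "db file sequential read": (9_015_682, 15_545_722),
    "db file scattered read":  (1_321_873,  2_228_585),
    "log file sync":           (  277_373,    789_234),
    "buffer busy waits":       (  265_226,    496_506),
}
for name, (waits, total_cs) in events.items():
    avg_ms = total_cs * 10 / waits
    print(f"{name}: {avg_ms:.0f} ms")   # 17, 17, 28, 19 -- matching the report
```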
  
  Increasing the buffer cache and shared_pool_size accordingly should improve the hit ratios and reduce the frequency of waits. The host now has 16 GB of memory; allocating about 1/3 of the total to the SGA (roughly 5 GB), we would adjust the buffer cache to 3.5 GB and the shared pool to 1.2 GB.
  Data buffer cache: 3.5 GB
  db_block_buffers = 3500*1024*1024/8192 = 448000
  
  Shared pool: 1.2 GB
  shared_pool_size = 1200*1024*1024 = 1258291200
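The sizing arithmetic can be written out as follows (plain arithmetic; the 8 KB block size matches the 8192-byte pagesize shown by vmstat above):

```python
BLOCK_SIZE = 8192   # bytes; matches the 8 KB pagesize reported by vmstat

# 3.5 GB buffer cache expressed as a block count:
db_block_buffers = 3500 * 1024 * 1024 // BLOCK_SIZE
print(db_block_buffers)          # -> 448000

# 1.2 GB shared pool expressed in bytes:
shared_pool_size = 1200 * 1024 * 1024
print(shared_pool_size)          # -> 1258291200

# Combined contribution to the SGA, in GB:
sga_gb = (db_block_buffers * BLOCK_SIZE + shared_pool_size) / 1024**3
print(round(sga_gb, 2))          # -> 4.59
```

The two parameters together account for about 4.6 GB, consistent with the "roughly 5 GB" SGA target once the other pools are added.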
  
  If CPU usage is that high, then shared_pool_size must definitely not be too large. With 12 GB of memory the SGA could easily be given 8 GB (if no other applications run on the host), and the SGA should be locked in memory. With raw devices there is no file system cache, so the OS should not consume much extra memory.
  
  There is a system currently running production business with 32 GB of memory and 14 CPUs whose data buffer cache has already reached more than 20 GB.

Original source: http://www.ltesting.net