Henry Xie 's blog: October 2012

Wednesday, October 31, 2012

A Few CHM/OS Command For CRS

CHM/OS is installed by default after CRS 11.2.0.2
By default the location is on $GI_HOME/crf/

To check CHM/OS status:
crsctl stat res –t –init

To gather last 1 hour statistics of CHM/OS

oclumon dumpnodeview -allnodes -v -last "01:00:00" > chmos_data.txt

To gather certain time of statistics
start on "2012-10-28 21:05:00"
end on "2012-10-28 21:15:00"
oclumon dumpnodeview -allnodes -v -s "2012-10-28 21:05:00" -e "2012-10-28 21:15:00"

More hints on
oclumon showobjects -h
oclumon dumpnodeview -h
oclumon manage -h
oclumon version -h
oclumon debug -h

Sunday, October 28, 2012

Steps To Change DBNAME DBID

To reset the DBID and DBNAME, here are some top level steps

alter system set cluster_database=false scope=spfile;
shutdown immediate;
startup mount;
exit;
nid TARGET=SYS DBNAME="your DB Unique Name"
sqlplus / as sysdba
create pfile=’dbid_pfile.ora’ from spfile ='spfile_dbname.ora';
exit;
Edit dbid_pfile.ora
SET *.db_name parameter =’ <DB_UNIQUE_NAME>’
SET *.cluster_database=true
Save your changes.
sqlplus / as sysdba
create spfile ='spfile_dbname_new.ora' from pfile=’dbid_pfile.ora’;
startup mount;
alter database open resetlogs;

Refer : http://docs.oracle.com/cd/B28359_01/server.111/b28319/dbnewid.htm

Thursday, October 25, 2012

Linux HugPages With SGA

Normally when your Linux OS (no matter 32bit or 64bit)has more than 16G RAM while DB SGA is more than 8G, we recommend to set hugpages for SGA.
Just remember we can't use Automatic Memory Management (AMM) and HugePages together.
The main benefit is to avoid swap in OS and save the PIN RAM,thus overall memory performance improves

The calculation is simple:

# Hugepages/bigpages values are strongly tied to SGA size.
# vm.nr_hugepages = ((1+3%)*SGA_SIZE)/2MB

# vm.hugetlb_shm_group = <oracle process group id>
# vm.hugetlb_shm_group =1001

Main Parameters When Tuning Oracle NET

DEFAULT_SDU_SIZE
RECV_BUF_SIZE
SEND_BUF_SIZE
TCP.NODELAY

The whole list of 11gR1 is here
http://docs.oracle.com/cd/B28359_01/network.111/b28317/sqlnet.htm

Wednesday, October 24, 2012

A few DataGuard Sqls

Just share the sqls I often use to monitor DataGuard

Monitor log transport sql (primary):
SELECT DB_NAME, HOSTNAME, LOG_ARCHIVED, LOG_APPLIED,APPLIED_TIME,
LOG_ARCHIVED-LOG_APPLIED LOG_GAP
FROM
(
SELECT NAME DB_NAME
FROM V$DATABASE
),
(
SELECT UPPER(SUBSTR(HOST_NAME,1,(DECODE(INSTR(HOST_NAME,'.'),0,LENGTH(HOST_NAME),
(INSTR(HOST_NAME,'.')-1))))) HOSTNAME
FROM V$INSTANCE
),
(
SELECT MAX(SEQUENCE#) LOG_ARCHIVED
FROM V$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED='YES'
),
(
SELECT MAX(SEQUENCE#) LOG_APPLIED
FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES'
),
(
SELECT TO_CHAR(MAX(COMPLETION_TIME),'DD-MON/HH24:MI') APPLIED_TIME
FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES'
);

Monitor recovery (standby)
select inst_id, process, status, count(*)
from gv$managed_standby
group by inst_id, process, status
order by inst_id, process, status;

Gap sql (standby):
set linesize 200
column max_redo_seq format a30
column max_recovery_seq format a30
column gap_seq format a30

SELECT thread,
       max(redo) || ' : ' || max(next_scn_redo) as max_redo_seq,
       max(recovery) || ' : ' ||   max(next_scn_recovery) as max_recovery_seq,
       (max(redo) - max(recovery)) || ' : ' || (max(next_scn_redo) - max(next_scn_recovery)) as gap_seq
FROM (
select thread# as thread,
       max(sequence#) as redo,
       max(next_change#) as next_scn_redo,
       0 as next_scn_recovery,
       0 as recovery
from v$archived_log
group by thread#
union all
select thread# as thread,
       0 as redo,
       0 as next_scn_redo,
       max(next_change#) as next_scn_recovery,
       max(sequence#) as recovery
from v$log_history
group by thread# )
group by thread
order by thread
/

SELECT PROTECTION_MODE, PROTECTION_LEVEL, DATABASE_ROLE FROM V$DATABASE

ALTER DATABASE REGISTER logfile '/oraarch/testdb/testdb_628319983_1_43377.arc';

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Tips on BigIP iRule to Filter Certain Urls

We have requirement to redirect certain urls to another website, here are example of how to deal with it on F5 Bigip iRule

Case 1:
Any urls contains below 3 strings will be redirected
"/content/idcplg/webdav*"   "/_dav/content/idcplg/webdav*"   "/*sscontributor=true*"

.....
else {
         switch -glob [string tolower [HTTP::uri]] {
            "/content/idcplg/webdav*" -
            "/_dav/content/idcplg/webdav*" -
            "/*sscontributor=true*" {
               HTTP::redirect "http://our-stage.testadc.com/"
               event disable all
            }
...

Case 2:
Detect if cookie wcm.contributor.mode exists and value is true, we redirect urls
.......
elseif { [HTTP::cookie value "wcm.contributor.mode"] eq "true" } {
         HTTP::redirect "http://our-stage.testadc.com/"
         event disable all
      }
......

Case 3:
Detect if cookie wcm.contributor.mode exists and value is true, we remove the cookie
.....
elseif { [HTTP::cookie value "wcm.contributor.mode"] eq "true" } {
          HTTP::cookie remove wcm.contributor.mode
                      }

Case 4
Any urls from right contains "?SSContributor=true" . It will be trimmed
As "?" is special character, we need to trim twice
.....
"/*sscontributor=true*" {
               HTTP::redirect [string trimright [string trimright [HTTP::uri] "SSContributor=true"] \?]
.....

Wednesday, October 17, 2012

A Sql To Get Sid Of Blocking Session

There are many sql to get this information. They are all good. I just share mine.

SELECT blocking_sid, num_blocked FROM ( SELECT blocking_sid, SUM(num_blocked) num_blocked
    FROM ( SELECT l.id1, l.id2, MAX(DECODE(l.block, 1, i.instance_name||'-'||l.sid,    2, i.instance_name||'-'||l.sid, 0 )) blocking_sid,
   SUM(DECODE(l.request, 0, 0, 1 )) num_blocked
   FROM gv$lock l, gv$instance i
WHERE ( l.block!= 0 OR l.request > 0 ) AND
   l.inst_id = i.inst_id
GROUP BY l.id1, l.id2)
GROUP BY blocking_sid
   ORDER BY num_blocked DESC) WHERE num_blocked != 0 ;

Tuesday, October 16, 2012

How To Get Oracle DB Object Definition

Normally we can use dbms_metadata.get_ddl to get most of object definition

examples:

SELECT dbms_metadata.get_ddl('FUNCTION', 'WHOAMI') FROM dual;

SELECT dbms_metadata.get_ddl('TABLESPACE', 'DATA') FROM dual;

select dbms_metadata.get_ddl('INDEX','REVISIONS_CLASSIDSTATE') from dual;

select dbms_metadata.get_ddl('DB_LINK','TEST.US.ORACLE.COM') from dual;
select dbms_metadata.get_ddl('USER','MONITOR') from dual;

select dbms_metadata.get_ddl('PROCEDURE','COMBINE_FOLDER_STRUCTURE') from dual;

select dbms_metadata.get_ddl('FUNCTION','TEST') from dual;

select dbms_metadata.get_ddl('TABLE','MYTEST' ) from dual;

However for Oracle Text , we need to use specific ones:
examples:

set long 500000
set pages 999
select ctx_report.create_index_script('FT_MYTEXT2') from dual;

set long 50000
set pages 9999
select dbms_metadata.get_ddl('TABLE', 'DR$FT_MYTEXT2MI') from dual;
select dbms_metadata.get_ddl('TABLE', 'DR$FT_MYTEXT2$I') from dual;

Friday, October 12, 2012

How To View Exadata Cell Offload On Enterprise Manager Grid Control

Sometimes we would like to review a SQL performance gain on Exadata. There are many ways to compare, here I would point out a view on Enterprise Manager Grid Control.

Login Enterprise Manager Grid Control --> find your DB --> click performance tab
-->click SQL monitoring -->click SQL ID you are interested in

If your DB is running on Exadata, you will see "Cell Offload Efficiency" on up right.
Move your mouse on the percentage number, you will numbers of "Bytes from disk" "Bytes returned by Exadata"

"Bytes from disk" means how much bytes DB is expecting from Disk no matter exadata or not
"Bytes returned by Exadata" means Exadata Cells return how much data to DB on interconnect layer
By smart scan,filter,projection, Cell would return a subset data thus offload much traffic to DB layer.

Keep it in mind, sometimes the metric can be negative because of EHCC, ASM mirror...etc
Don't be surprise if you see something like -32%

How To DrillDown ZFS Latency By Client

In some cases, UCM cluster will compete a shared file on ZFS shared mount point, sometimes we need to know latency when UCM nodes do a IO on ZFS shares.

Here is how we do it from ZFS UI. There are lots of metric on ZFS UI, you can play with them to suite your need.

Login ZFS UI -->click Analytics -->add statistics
-->choose "Protocol: NFSv3 operations per second broken down by latency"
-->right click the top latency -->drilldown by client or share or file name...

You can see which client , which share have the top latency....to help you dig down root cause.

see screenshot below

Wednesday, October 10, 2012

How To Backup Table/Index When It has Blocks Corruption

You should have ora600 error in your alert logs
There are many notes of how to deal with ora600 ,see note 1088018.1 68013.1 556733.1
My note here is how to backup the table/index when there is block corruption.
In this case, you can claim back blocks which are not corrupted and keep them safe.

Step 1: How to get corrupted rowid from block number:
You can get block number from alert or trace files
Use sql below to get rowid:
select rowid from ucm_search.idctext2 where DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) = 527989

Step 2: After we find the rowid of table, we can select most of rows into backup table
The sql we use is like

create table ucm_search.idctext2_bak from select * from ucm_search.idctext2 where rowid not in ('AAAuMEAAjAACA51AAB','AAAuMEAAjAACA51AAA');

Step 3: You can rename it to original name to keep system running
Sql like:
ALTER TABLE idctext2_bak rename to idctext2;

How To Handle Error When You Remove DB From CRS

This is a common error when you have multiple DB HOME. You did some DB upgrades and CRS version is higher than DB

Error like:
CRS % srvctl remove database -d TESTPRD
PRCD-1027 : Failed to retrieve database TESTPRD
PRCD-1027 : Failed to retrieve database TESTPRD
PRKP-1088 : Failed to retrieve configuration of cluster database TESTPRD
PRKR-1078 : Database TESTPRD of version 11.0.0.0.0 cannot be administered using current version of srvctl. Instead run srvctl from /u01/app/oracle/product/10.2.0.3/TESTPRD

srvctl is not working from TESTPRD home because of compatible issues

Solution is:
Use srvctl from another DB HOME (ie MYTESTDB home) to remove the TESTPRD targets

Java Installation No space Error

When you run command below:
java -d64 -jar wls1036_upgrade_generic.jar

You may hit this error like:

Fatal error encountered during self-extraction. [No space left on device]

The reason is /tmp is too small

You can't enlarge /tmp easily in some systems as it was predefined.
workaround I have is to use -Djava.io.tmpdir
java -d64 -Djava.io.tmpdir=/u02/installs/11.1.1.6_setup/WebLogic/ -jar wls1036_upgrade_generic.jar

Tuesday, October 09, 2012

Tips of Handling Oracle Text Optimization Issues

Example of Sql to run optimization

begin
ctx_output.start_log('opt_rebuild_TEST_IDCTEXT1');
ctx_output.add_event(ctx_output.event_opt_print_token);
ctx_ddl.optimize_index(idx_name=>'TEST_IDCTEXT1',optlevel=>'REBUILD');
ctx_output.end_log;
end;
/

If proc error out straight away, logs show it started and ended without rows in between, it indicates it may have shadow index, need to drop it
To drop shadown index: exec ctx_ddl.drop_shadow_index('TEST_IDCTEXT2');

Check idx_options
select idx_name, idx_status, idx_option from ctxsys.dr$index where idx_name='TEST_IDCTEXT1';
to check idx_option (this is assuming TEST_IDCTEXT1 is current index)
if idx_option=C....then
conn ctxsys
update dr$index set idx_option='CY' where idx_name='TEST_IDCTEXT1';

How to add debug information during optimziation
begin
ctx_output.start_log('opt_reb_TEST_IDCTEXT1_'||to_char(sysdate,'DD-MON-YY_HH24MISS')||'.log');
ctx_output.add_event(ctx_output.event_opt_print_token);
ctx_output.add_event(ctx_output.EVENT_DRG_DUMP_ERRORSTACK,ctx_output.DRG_DUMP_ALL_ERRORS);
ctx_ddl.optimize_index(idx_name=>'TEST_IDCTEXT1',optlevel=>'FULL');
ctx_output.end_log;
end;

How to use sql to rebuild index online
ctx_output.start_log('TEST_IDCTEXT1_Rebuild14july2012.log');
alter index UCM_SEARCH.TEST_IDCTEXT1 rebuild online parameters('replace memory 200M');
ctx_output.end_log;

Examples of Dumping Redo Blocks From Archivelogs

More details see MOS note 960780.1
Here are pure examples for reference:

ALTER SYSTEM DUMP LOGFILE '/archive/<dbname>/<instancename>/ArchiveOnLine/<dbname>_709323090_1_8601.arc' DBA MIN 28 21649 DBA MAX 28 21649;
ALTER SYSTEM DUMP LOGFILE '/archive/<dbname>/<instancename>/ArchiveOnLine/<dbname>_709323090_1_8601.arc' DBA MIN 28 144259 DBA MAX 28 144259;
ALTER SYSTEM DUMP LOGFILE '/archive/<dbname>/<instancename>/ArchiveOnLine/<dbname>_709323090_1_8601.arc' DBA MIN 28 145056 DBA MAX 28 145056;

dump file#28 block: 21649
dump file#28 block: 144259
dump file#28 block: 145056

How to check hidden init DB parameters

how to check the implicit hidden init parameters such as "_disable_cell_optimized":

Use below sql:

select a.ksppinm "Parameter", b.ksppstvl "Session Value",
c.ksppstvl "Instance Value" from x$ksppi a, x$ksppcv b, x$ksppsv c
where a.indx = b.indx and a.indx = c.indx and a.ksppinm like '%_disable_cell_optimized%';

Examples of Finding then Deleting in LInux

Delete files in linux older than 1 year:

find /u01 -type f -mtime +365 -ls -exec rm -f -- {} \;

Delete files in linux in last 3 days:

find /u01 -type f -mtime -3 -ls -exec rm -f -- {} \;

Delete folders in linux older than 1 year:

find /u01 -type d -mtime +365 -ls -exec rm -rf -- {} \;

Delete trace files in linux older than 2 days

find . -type f -name "*.tr*" -mtime +2 -ls -exec rm -f -- {} \;

Oracle DB 11gR2 AWR Global Report Generation

Before 11gR2, the awrrpt.sql under rdbms/admin only generates awr report for local instance. You have to collect awr report for each of RAC node.

In 11gR2 we have two new scripts awrgrpt.sql & awrgdrpt.sql for RAC

awrgrpt.sql -- AWR Global Report (RAC) (global report)
awrgdrpt.sql -- AWR Global Diff Report (RAC)

Plus below for your reference
spawrrac.sql -- Server Performance RAC report
awrsqrpt.sql -- Standard SQL statement Report
awrddrpt.sql -- Period diff on current instance
awrrpti.sql -- Workload Repository Report Instance (RAC)

Monday, October 08, 2012

How to ssh into Exadata Cell host

Exadata Cell Host is with Oracle Linux
Normally we can not ssh into it via ethenet network.
We build trust host in exadata DB host, we ssh into DB host first, then ssh into Cell hosts

Exadata Cell init and trace files location version 11.2.3.1.1

Exadata Cell init file: version 11.2.3.1.1
This is just an example:
/opt/oracle/cell11.2.3.1.1_LINUX.X64_120426/cellsrv/deploy/config

Exadata Cell trace files: version 11.2.3.1.1
This is just an example:
/opt/oracle/cell11.2.3.1.1_LINUX.X64_120426/log/diag/asm/cell/<cell hostname>/trace

How to update init files and gather trace:
0. offline disks
1. update cellinit.ora with event trace
2. offline cells
3. stop/start celld/check trace enable
    service celld stop
    service celld start
or
CellCLI> alter cell shutdown services all;
CellCLI> alter cell startup services all;

    cd $CELLTRACE
    view alert*log
4 online disks

A Few Commands to manage OCR & Voting Disk

use ocrcheck to see information of ocr locations

to replace or move ocr:
   touch the ocr file first
   i.e touch /oracommon1_ocr/oracrs/ocrmirror/ocr1
   as root   ./ocrconfig -replace /oracommon2_ocr/oracrs/ocr -replacement /oracommon1_ocr/oracrs/ocrmirror/ocr1
No need to bounce CRS

Manage Voting disk
No need to bounce CRS: as root:
./crsctl add css votedisk /oracommon1_vote/oracrs/vdsk1
./crsctl delete css votedisk /oracommon2_vote/oracrs/vdsk
./crsctl query css votedisk

CRS alert from Oracle Enterprise manager

We have agent installed on our RAC DB. Oracle Enterprise Manager grid control is monitoring CRS on it
The alert is like :

Cluseterware has problems on all hosts of this cluster. Verifying CRS integrity ,,Checking CRS integrity...,,ERROR: ,PRVF-4037 : CRS is not installed on any of the nodes,Verification cannot proceed,,,CRS integrity check failed,,Verification of CRS integrity was unsuccessful on all the specified nodes. ,NODE_STATUS:

We can use ocrcheck of CRS Home to verify it manually. The manual test is fine.
Then we suspect something affecting agent to run the integrity script.
It turns out we change oraInventory of /etc/oraInst.loc. It confused agent, we rollback the changes, the alert is gone. Good to know agent monitoring is tied with oraInventory.