Wednesday, March 28, 2018

WildCards To Jeopardize Execuation Plan

Symptom:

 The incorrect usage of wildcard '%'  will Jeopardize Execuation Plan

create index username_idx on mytest(lower(USERNAME));
select *  from mytest where lower(USERNAME) like 'scot%';


Execution Plan
----------------------------------------------------------
Plan hash value: 3746642705

--------------------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time       |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |           |    10 |  1280 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| MYTEST       |    10 |  1280 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN        | USERNAME_IDX |     2 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(LOWER("USERNAME") LIKE 'scot%')
       filter(LOWER("USERNAME") LIKE 'scot%')


If you put % in front of a string, it tells oracle look for the string 'scot' anywhere in the column.
It results full tablescan as normal B-tree index can't handle such search. If it is key word, we may consider to use domain index

select *  from mytest where lower(USERNAME) like '%scot%'; --full scan

Execution Plan
----------------------------------------------------------
Plan hash value: 1692938441

----------------------------------------------------------------------------
| Id  | Operation      | Name   | Rows  | Bytes | Cost (%CPU)| Time       |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |    10 |  1280 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| MYTEST |    10 |  1280 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(LOWER("USERNAME") LIKE '%scot%')


Sunday, March 25, 2018

Invalid Oracle Cluster Registry (OCR) location.

Symptom:

 We clone grid home (CRS)  from other node .Details see oracle support Doc ID 1493677.1
  Try to reconfigure GI home with new nodes. We always get below error

 ./config.sh -debug -logLEvel finest 2>&1 | tee `hostname -s`.Grid_config.log

[INS-41321] Invalid Oracle Cluster Registry (OCR) location


Diagnose:

   Compare ZFS mount options with good ones. No luck.
   Compare ZFS shares on filer side with good ones.No luck.
   Modify OS privilege, chmod.....etc .No luck.


Solution:

   Finally we found the OS kernel version of the source GI home is different from the target
2.6.39-400.298.2.el6uek.x86_64 vs 3.8.13-118.20.2.el6uek.x86_64

  The clone files must be from same kernel level source home

Monday, March 19, 2018

Ansible Temporary Files Issues with Pbrun

Symptom

   When we do a simple pwd command via pbrun from ansible , it error out

ansible xxxx.com -i ./hosts_t -u guid --become-user=oracle -k -K -b -m command -a 'pwd'

FAILED | rc=-1 >>
Failed to set permissions on the temporary files Ansible needs to create when becoming an unprivileged user (rc: 1, err: chown: changing ownership of `/tmp/ansible-tmp-1521520911.62-5031142116596/': Operation not permitted
chown: changing ownership of `/tmp/ansible-tmp-1521520911.62-5031142116596/command.py': Operation not permitted
}). For information on working around this, see https://docs.ansible.com/ansible/become.html#becoming-an-unprivileged-user


 Diagnosis:

 Find the note in github to set parameter in ansible.cfg
 allow_world_readable_tmpfiles = true
 However after that, we received another error: 

 (127, 'pbrun9.3.0-07[340105]: 3201.08 Exec of /bin/bash failed: Resource temporarily unavailable\r\n', 'Shared connection to ucmdx4007.us.oracle.com closed.\r\n')

 The error is related to OS settings . It seems pbrun has issues to run a bash and we can reproduce it via pbrun content-ansible -u pwd

  After digging around,  we found by default OS has limit for number of user's processes. The pbrun needs more than 1024 to exec bash commands.
cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

*          soft    nproc     1024
root       soft    nproc     unlimited


Solution:

comment  "*          soft    nproc     1024" in the /etc/security/limits.d/90-nproc.conf

set parameter in ansible.cfg -- allow_world_readable_tmpfiles = true


 

Sunday, March 18, 2018

Ansible How to Hide Expected Stderr Output

Symptom:

   There are some cases the SHELL scripts we run via ansible, the return code (rc) is not 0
   And it is expected to return non-zero.

   Ansible always output stderr for you which can be annoyed.  ie

TASK [run the script  to get the result] ************************************************************************************************
fatal: [acmsdbv3053.us.oracle.com]: FAILED! => {"changed": true, "cmd": ". /tmp/spectre-meltdown-checker-ansible.sh --batch text", "delta": "0:00:01.688319", "end": "2018-03-19 00:36:12.670522", "failed": true, "msg": "non-zero return code", "rc": 2, "start": "2018-03-19 00:36:10.982203", "stderr": 

 We try to hide them

Solution:

   Use  ignore_errors to continue the ansible yml file without disruption by the failed error

   Use  failed_when to be set as false. It will force the task never to be failed
   Example codes like

- name: run the script  to get the result
     shell: |
           . /tmp/spectre-meltdown-checker-ansible.sh --batch text
     register: output
     failed_when: false
     ignore_errors: True




Before set  failed_when
"changed": true,
    "cmd": ". /tmp/spectre-meltdown-checker-ansible.sh --batch text",
    "delta": "0:00:01.724031",
    "end": "2018-03-19 00:55:37.677224",
    "failed": true



After set  failed_when
"changed": true,
    "cmd": ". /tmp/spectre-meltdown-checker-ansible.sh --batch text",
    "delta": "0:00:01.966887",
    "end": "2018-03-19 00:53:10.673414",
    "failed": false,
    "failed_when_result": false, 

Thursday, March 15, 2018

Ansible replace \t or '\t' in Sqlplus Output

Symptom:

     When we run sql via sqlplus to get some output from database via ansible playbook, we get some special characters. ie \t ,example below. The format is not quite friendly.

TASK [display details of sql output] *****************************************************************************************************************************************
ok: [acmsdbv3053.us.oracle.com] => {
    "msg": [
        "",
        "NAME\t\t\t\t\t INST_ID VALUE",
        "---------------------------------------- ------- --------------------------------------------------",
        "compatible\t\t\t\t       1 11.2.0.2.0",
        "compatible\t\t\t\t       2 11.2.0.2.0",
        "noncdb_compatible\t\t\t       1 FALSE",
        "noncdb_compatible\t\t\t       2 FALSE"
    ]
}

Solution:

We use replace function to replace \t (the string)  with ' ' or we need to replace real tab with ' '

To replace replace \t (the string)  with ' ' in the output
 - name: display details of sql output
     debug: msg="{{ sqloutput.stdout_lines|replace('\\t',' ') }}"

To replace real tab with ' ' in the output
- name: display details of sql output
     debug: msg="{{ sqloutput.stdout_lines|replace('\t',' ') }}"

Wednesday, March 14, 2018

An Easy Mistake when setting AWS Security Group

 Symptom:

 When we set up your wordPress test env in AWS, follow the instructions ,put RDS and EC2 in the same security group and  VPC, ssh, http ports are open.  Both services are up and running. However php can't connect into database. Error below: 

Error establishing a database connection

This either means that the username and password information in your wp-config.php file is incorrect or we can’t contact the database server at xd1rcv07eugoakc.c6kx8udnwhvn.us-east-1.rds.amazonaws.com. This could mean your host’s database server is down.

Diagnostic:

 Check RDS mysql logs and status , make sure its up and running
 Check Httpd logs status, make sure its up and running
 Use nc test mysql port 3306 from EC2 to RDS endpoint,see if it is open
 Find port is not open
 Check the inbound rules of the security group
 ->All traffic inbound rule was removed accidentally when we config ssh port
By default services in the security group would accept all traffic within the security group

Solution:

Add  All traffic inbound rule back for the security group
ie
All traffic    ALL      ALL      sg-ag41e528(default)

Tuesday, March 13, 2018

To Run Any Sql Scripts from Sqlplus on Targets via Ansible Playbook

The main part of Yaml file is below.
 name description self-explains

 - hosts: "{{ v_host | default('test.com') }}"
   vars:
    - myparameter: "{{ v_para | default('db_unique_name') }}"
    - mysid:  "{{ v_sid | default('ORCL') }}"
    - myhost: "{{ v_host | default('test.com') }}"
    - dbhome_locl: ""
    - parameter_sql: /u01/app/ansible/repository/show_parameter.sql

   become: true
   become_method: pbrun
   become_user: oracle
   become_flags: 'content-ansible'
   gather_facts: False
   tasks:
   - name: find the location of db home for the sid
     shell: 'grep {{ mysid }} /etc/oratab |grep -v ^[#] | cut -d: -f2|head -1'
     register: dbhomeoutput
   - set_fact:
       dbhome_locl: "{{ dbhomeoutput.stdout }}"
   - debug: var=dbhome_locl

   - name: copy show parameter sql to target host /tmp
     copy:
          src: "{{ parameter_sql }}"
          dest: /tmp/ansible_sql.sql
          force: yes
          mode: 0755
   - name: run the sql to get the result
     shell: |
          export ORACLE_SID={{ mysid }}
          export ORACLE_HOME={{ dbhome_locl }}
          {{ dbhome_locl }}/bin/sqlplus -s "/ as sysdba" @/tmp/ansible_sql.sql "{{ myparameter }}"
     register: sqloutput

   - name: display details of sql output
     debug: msg="{{ sqloutput.stdout_lines }}"

   - name: delete sql file on the remote target
     file:
       path: /tmp/ansible_sql.sql
       state: absent

Monday, March 12, 2018

Ansible "mapping values are not allowed in this context" Error

Symptom:

 When we try to put 2 logic commands into 1 element, like find and copy, we hit below error
The offending line appears to be:

   - name: copy env profile script to the target
      - find:
            ^ here
exception type: <class 'yaml.scanner.ScannerError'>
exception: mapping values are not allowed in this context


The part of related yaml file are
tasks:
   - name: find and copy  env profile scripts
      - find: 
              paths: /u01/app/ansible/repository
              patterns: 'apex-ext-stg'
        register: file_to_copy
      - copy: src={{ item.path }} dest=/tmp   mode=0775
        with_items: "{{ files_to_copy.files }}" 

 Reason:

"find" and "copy" are 2 different tasks. we need to use block to group them together , or we level it up with "- name"

Solution:

- name: find and copy  env profile scripts
  block:
        - find:
               paths: /u01/app/ansible/repository
               patterns: 'test-ext-stg'
          register: file_to_copy
        - copy: src={{ item.path }} dest=/tmp  mode=0775
          with_items: "{{ files_to_copy.files }}"

or

- name: find and copy  env profile scripts part1
  find:
               paths: /u01/app/ansible/repository
               patterns: 'test-ext-stg'
  register: file_to_copy
- name: find and copy  env profile scripts part2
  copy: src={{ item.path }} dest=/tmp  mode=0775
  with_items: "{{ files_to_copy.files }}"



Sunday, March 11, 2018

Ansible export: `=': not a valid identifier" Error

Symptom:

 When we try to run sqlplus via ansible yaml file, we hit error like
"stderr": "/bin/sh: line 0: export: `=': not a valid identifier", 
 Part of our yaml file related to export is:

- name: run the sql to get the result
     shell: |
          export ORACLE_SID = {{ mysid }}
          export ORACLE_HOME = {{ dbhome_locl }}
          {{ dbhome_locl }}/bin/sqlplus  "/ as sysdba" @/tmp/ansible_sql.sql "{{ myparameter }}"
     register: sqloutput

Reason:

 Ansible is written by python. Space ' '  is the delimiter. We need to be careful to add ' '  into any our commands which pass into ansible.
 export ORACLE_SID = {{ mysid }}  ---> there is space ' ' after ORACLE_SID, the new commandline ansible generates may confuse SHELL


Solution:

          export ORACLE_SID={{ mysid }}
          export ORACLE_HOME={{ dbhome_locl }}


No space before '='

RMAN Backup Error and Corruption of DatafileCopy

Symptom:

   Run restore preview to check sanity of rman backup and hit below error

run { 
ALLOCATE CHANNEL c1_tape DEVICE TYPE sbt_tape ;
RESTORE DATABASE PREVIEW;
RELEASE CHANNEL c1_tape;
} 

RMAN-06026: some targets not found - aborting restore
RMAN-06100: no channel to restore a backup or copy of datafile 98
RMAN-06100: no channel to restore a backup or copy of datafile 48
RMAN-06100: no channel to restore a backup or copy of datafile 7
 
Solution would be simple, we just manually tape backup the missing 3 files 
 run {
   ALLOCATE CHANNEL c1_tape DEVICE TYPE sbt_tape 
   backup copy of datafile 98; 
   backup copy of datafile 48;
   backup copy of datafile 7;
   release CHANNEL c1_tape ;
Then hit datafilecopy corruption error which is outstanding 

ORA-19566: exceeded limit of 0 corrupt blocks for file 48

Solution:

It is lucky we have only datafilecpy corrupted. Not a datafile corruption. DBV shows 4 blocks corruption.
LIST COPY OF DATAFILE 48 
DELETE COPY OF DATAFILE 48
BACKUP AS COPY DATAFILE 48;
then
BACKUP DEVICE TYPE SBT_TAPE COPY OF DATAFILE  48; 

Wednesday, March 07, 2018

Datapatch CDB / PDB hits prereq checks failed

Symptom:

     When we patch PSU on CDB / PDB , we need to ./datapatch -verbose under Opatch

     It reports
      Error: prereq checks failed!
  patch 21555660: The database must be in upgrade mode

Reason:

     Patch 21555660 (Database PSU 12.1.0.2.5, Oracle JavaVM Component ) is not in place of CDB/PDBs. It needs outages to upgrade this OJVM to pass thorough.

     Check both CDB and PDBs for that as the component applies to each PDBs

    sql: select comp_name, version from  dba_registry where comp_name like   '%JAVA Virtual Machin%' and status = 'VALID';

Solution:

If it is CDB$ROOT
  alter system set cluster_database=false scope=spfile;
  startup upgrade
  rerun datapatch
  alter system set cluster_database=true scope=spfile;
  startup

If it is PDB$SEED or other PDB
    alter pluggable database <PDB>  open upgrade;
    rerun datapatch
    alter pluggable database <PDB>  close immediate;
    alter pluggable database <PDB> open;

Tuesday, March 06, 2018

Datapatch CDB / PDB hits ORA-01652 Error

Symptom:

     When we patch PSU on CDB , we need to ./datapatch -verbose under Opatch

     It reports
      DBD::Oracle::st bind_param failed: ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

Reason:

     Datapatch is expected to patch all open PDBs in the CDB which includes PDB$SEED. Our application based PDBs are with enough TEMP . Unfortunately  PDB$SEED does not have much by default

Solution:

     Login PDB$SEED ,add 500m tempfile manually.

Monday, March 05, 2018

Note on Ansible Shell

Symptom :
   Always get below when we use variable in ansible playbook. It turns out we use |  to work around it.

Error Code:
tasks:
   - name: display databases in the GI via srvctl
     shell:   "{{ srvctl_locl }}" config database
     register: dblist
   - debug: msg="{{ dblist.stdout_lines }}"


Error Stack:
The offending line appears to be:

   - name: display databases in the GI via srvctl
     shell: '{{srvctl_locl }}' config database
                               ^ here
We could be wrong, but this one looks like it might be an issue with
missing quotes.  Always quote template expression brackets when they
start a value. For instance:

    with_items:
      - {{ foo }}

Should be written as:

    with_items:
      - "{{ foo }}"

exception type: <class 'yaml.parser.ParserError'>
exception: while parsing a block mapping
  in "<unicode string>", line 16, column 6
did not find expected key
  in "<unicode string>", line 17, column 32

 
Solution:

tasks:
   - name: display databases in the GI via srvctl
     shell: |
         "{{ srvctl_locl }}" config database
     register: dblist
   - debug: msg="{{ dblist.stdout_lines }}"



Other ansible Tip:

- name: use loop control to display a specific attribute  of item ie stdout_lines ,std_err, start etc
     command: sleep 1
     with_items: "{{dbstatus.results }}"
     loop_control:
          label: "{{ item.stdout_lines }}" 

Sunday, March 04, 2018

Ansible Nested With_Items in Debug module

Use srvctl to list and display all Oracle DB instance status on the host

tasks:
   - name: display databases in the GI via srvctl
     shell: /u01/app/11.2.0.4/grid/bin/srvctl config database
     register: dblist
   - debug: msg="{{ dblist.stdout_lines }}"

   - name: display status of databases in the GI via srvctl
     shell:  /u01/app/11.2.0.4/grid/bin/srvctl status database -d  "{{ item }}"
     register: dbstatus
     with_items:
        - "{{ dblist.stdout_lines }}"
   - debug: msg="{{ item.stdout_lines }}"
     with_items:
        - "{{ dbstatus.results }}"

Wget With Proxy and Egrep

use -e for wget to use proxy

wget "https://rpmfind.net/linux/centos/7.4.1708/extras/x86_64/Packages/python2-jmespath-0.9.0-3.el7.noarch.rpm" -e  use_proxy=yes -e https_proxy=testproxy.test.com:80

egrep -v 'str one|str two' test.log

ps -few|grep pmon|egrep -v 'grep|MGM'|awk -F" " '{print $8}'|cut -d_ -f3