Thursday, April 30, 2020

Steps to implement ConfigOverride for JDBC DataSource in WebLogic via WebLogic Kubernetes Operator

Here are the steps to implement a config override for a JDBC DataSource in WebLogic via the WebLogic Kubernetes Operator.
  • For more details, please refer to https://oracle.github.io/weblogic-kubernetes-operator/userguide/managing-domains/configoverrides/
  • Create a secret for the DB connection details
    • kubectl -n test-poc-dev create secret generic dbsecret --from-literal=username=weblogic --from-literal=password=**** --from-literal=url=jdbc:oracle:thin:@(DESCRIPTION=(ENABLE=BROKEN)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=testdb.oraclevcn.com)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=testdb)))
    • kubectl -n test-poc-dev label secret dbsecret weblogic.domainUID=test-poc-dev-domain1
  • Create the DataSource module jdbc-AppDatasource when you build the Docker image
  • There are some good examples on GitHub
  • Create a configmap for jdbc-MODULENAME.xml; in this case it is jdbc-AppDatasource.xml
    • cd kubernetes/samples/scripts/create-weblogic-domain/domain-home-in-image/weblogic-domains/test-poc-dev-domain1/
    • Place 2 files, jdbc-AppDatasource.xml and version.txt, in the configoverride directory (see the override template sketch after this list)
    • kubectl -n test-poc-dev create cm test-poc-dev-domain1-override-cm --from-file ./configoverride
      kubectl -n test-poc-dev label cm test-poc-dev-domain1-override-cm weblogic.domainUID=test-poc-dev-domain1

    • Add the following in the spec section of domain.yaml
      • spec:
            ....
            configOverrides: test-poc-dev-domain1-override-cm
            configOverrideSecrets: [dbsecret]
    • Bounce the WebLogic servers to make the override effective.
    • For debugging, please refer to https://oracle.github.io/weblogic-kubernetes-operator/userguide/managing-domains/configoverrides/#debugging
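
For reference, here is a minimal sketch of what the configoverride directory can contain for this case. The element names and secret macros follow the override template format described in the operator documentation linked above, so double-check them there; version.txt is a one-line file with the override schema version (e.g. 2.0), and dbsecret is the secret created earlier.

configoverride/jdbc-AppDatasource.xml:

<?xml version='1.0' encoding='UTF-8'?>
<jdbc-data-source xmlns="http://xmlns.oracle.com/weblogic/jdbc-data-source"
                  xmlns:f="http://xmlns.oracle.com/weblogic/jdbc-data-source-fragment"
                  xmlns:s="http://xmlns.oracle.com/weblogic/situational-config">
  <name>AppDatasource</name>
  <jdbc-driver-params>
    <!-- values are pulled from the dbsecret secret when the servers start -->
    <url f:combine-mode="replace">${secret:dbsecret.url}</url>
    <properties>
      <property>
        <name>user</name>
        <value f:combine-mode="replace">${secret:dbsecret.username}</value>
      </property>
    </properties>
    <password-encrypted f:combine-mode="replace">${secret:dbsecret.password:encrypt}</password-encrypted>
  </jdbc-driver-params>
</jdbc-data-source>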

Wednesday, April 15, 2020

Error: ExternalName not working in Kubernetes

Symptom:

   We have an ExternalName Service for our DB, test-db-svc:

apiVersion: v1
kind: Service
metadata:
  name: test-db-svc
  namespace: test-stage-ns
spec:
  externalName: 10.10.10.10
  ports:
  - port: 1521
    protocol: TCP
    targetPort: 1521
  sessionAffinity: None
  type: ExternalName

   After we upgrade the Kubernetes master nodes, the DNS service stops resolving the ExternalName:

 curl -v telnet://test-db-svc:1521
* Could not resolve host: test-db-svc; Name or service not known
* Closing connection 0
curl: (6) Could not resolve host: test-db-svc; Name or service not known
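
To narrow this down, it helps to confirm what the Service currently looks like and whether cluster DNS itself is still healthy. A quick sketch, using the namespace and service names from the example above:

# check the Service definition (type and externalName value)
kubectl -n test-stage-ns get svc test-db-svc -o yaml

# confirm cluster DNS still resolves other names, e.g. the API server service
curl -v telnet://kubernetes.default.svc.cluster.local:443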

Solution:

   This is because the new version of Kubernetes doesn't support an IP address in externalName. We need to replace it with an FQDN:

apiVersion: v1
kind: Service
metadata:
  name: test-db-svc
  namespace: test-stage-ns
spec:
  externalName: testdb.testdomain.com
  ports:
  - port: 1521
    protocol: TCP
    targetPort: 1521
  sessionAffinity: None
  type: ExternalName
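
After applying the fix, two quick checks should confirm the service resolves again (run kubectl from the admin host, and curl from a pod in the test-stage-ns namespace):

# TYPE should be ExternalName and EXTERNAL-IP should show the FQDN
kubectl -n test-stage-ns get svc test-db-svc

# the name should now resolve and the port test should reach the DB listener
curl -v telnet://test-db-svc:1521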


Tuesday, April 14, 2020

Tip: use curl to test network port and DNS service in the Pod

curl is installed in most Docker images by default, so most pods have it.
We can use curl to test whether network ports are open and whether DNS is working.

Example:  To test DB service port 1521
curl -v telnet://mydb.testdb.com:1521

*   Trying 1.1.1.1:1521...
*   TCP_NODELAY set
*   connect to 1.1.1.1:1521 port 1521 failed: Connection timed out
*   Failed to connect to port 1521: Connection timed out
*  Closing connection 0
curl: (7) Failed to connect to  Connection timed out


It tells us DNS is working, as the name resolves to the IP address 1.1.1.1,
but the port is not open.

To test a URL behind a proxy, we can use the below command

curl -x http://proxy-mine:3128 -v telnet://abc.com:443
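
If you do not already have a shell inside the pod, the same test can be run through kubectl exec. A sketch with placeholder pod and namespace names:

# run the port/DNS test from inside an existing pod
kubectl -n mynamespace exec -it mypod -- curl -v telnet://mydb.testdb.com:1521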

Sunday, April 12, 2020

Oracle Non-RAC DB StatefulSet HA 1 Command Failover Test in OKE

Requirement:

OKE has very powerful Block Volume management built in. It can find, detach, and reattach block storage volumes among different worker nodes seamlessly. Here is what we are going to test.
We create an Oracle DB statefulset on OKE, imagine we have a hardware or OS issue on the worker node, and test HA failover to another worker node with only 1 command (kubectl drain).
The following happens automatically when draining the node:
  • OKE will shut down the DB pod
  • OKE will detach the PV from the worker node
  • OKE will find a new worker node in the same AD
  • OKE will attach the PV to the new worker node
  • OKE will start the DB pod on the new worker node
The DB in the statefulset is not RAC, but with the power of OKE we can fail over the DB to a new VM in just a few minutes.

Solution:

  • Create service for DB statefulset
    $ cat testsvc.yaml 
    apiVersion: v1
    kind: Service
    metadata:
      labels:
         name: oradbauto-db-service
      name: oradbauto-db-svc
    spec:
      ports:
      - port: 1521
        protocol: TCP
        targetPort: 1521
      selector:
         name: oradbauto-db-service
  • Create a DB statefulset and wait about 15 min for the DB to come fully up
    $ cat testdb.yaml 
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: oradbauto
      labels:
        app: apexords-operator
        name: oradbauto
    spec:
      selector:
         matchLabels:
            name: oradbauto-db-service
      serviceName: oradbauto-db-svc
      replicas: 1
      template:
        metadata:
            labels:
               name: oradbauto-db-service
        spec:
          securityContext:
             runAsUser: 54321
             fsGroup: 54321
          containers:
            - image: iad.ocir.io/espsnonprodint/autostg/database:19.2
              name: oradbauto
              ports:
                - containerPort: 1521
                  name: oradbauto
              volumeMounts:
                - mountPath: /opt/oracle/oradata
                  name: oradbauto-db-pv-storage
              env:
                - name: ORACLE_SID
                  value: "autocdb"
                - name: ORACLE_PDB
                  value: "autopdb"
                - name:  ORACLE_PWD
                  value: "whateverpass"
      volumeClaimTemplates:
      - metadata:
          name: oradbauto-db-pv-storage
        spec:
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 50Gi

  • Imagine we have hardware issues on this node and need to fail over to a new node
    • Before failover: check the status of the PV and Pod; the pod is running on node 1.1.1.1
    • Check if any other pods running on the node will be affected
    • We have a node ready in the same AD as the statefulset Pod
    • kubectl get pv,pvc
      kubectl get po -owide
      NAME          READY   STATUS    RESTARTS   AGE   IP            NODE      NOMINATED NODE   READINESS GATES
      oradbauto-0   1/1     Running   0          20m   10.244.3.40   1.1.1.1   <none>           <none>
    • 1 command to failover DB to new worker node
      • kubectl drain  <node name> --ignore-daemonsets --delete-local-data
      • kubectl drain  1.1.1.1    --ignore-daemonsets --delete-local-data
      • No need to update the MT connection string, as the DB service name is untouched and transparent to the new DB pod
    • After failover: check the status of the PV and Pod; the pod is now running on the new node
      • kubectl get pv,pvc
      • kubectl get pod -owide
  • The movement of the PV/PVC works with volumeClaimTemplates as well as with PVs/PVCs created via yaml files with the storage class "oci"
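
Once the original node is repaired, the standard follow-up is to make it schedulable again, since kubectl drain leaves the node cordoned (node name as in the example above):

# allow pods to be scheduled on the repaired node again
kubectl uncordon 1.1.1.1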

Monday, April 06, 2020

Tip: Helm v3.x Error: timed out waiting for the condition

Error: timed out waiting for the condition

Use --debug to get more trace information
helm  ********  --wait --debug
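
This error often simply means the release did not become ready within the default wait period. An example of raising the timeout (release and chart names here are placeholders):

helm upgrade --install myrelease ./mychart --wait --timeout 10m --debug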


Error: container has runAsNonRoot and image has non-numeric user, cannot verify user is non-root

Symptom:

When we enable Pod Security Policy in OKE (Oracle Kubernetes Engine), we only allow non-root users to run in the Pods. However, we build an application with an Oracle Linux base Docker image and use the user oracle. We still get:
Error: container has runAsNonRoot and image has non-numeric user, cannot verify user is non-root

Solution:

The error is quite clear: oracle is non-numeric, so we need to change it to a numeric UID such as 1000.
In the Dockerfile: USER oracle --> USER 1000
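
A minimal Dockerfile sketch of the change, assuming the oracle user in the image is created with UID 1000 (adjust the UID to match your image):

FROM oraclelinux:7
# ... application install steps that create the oracle user ...
# use a numeric UID instead of the user name so Kubernetes can verify non-root
# USER oracle   <-- fails the runAsNonRoot check (non-numeric)
USER 1000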