Thursday, February 20, 2020

Example of Pod Security Policy for Apache Httpd in OKE

Requirement:

As Pod Security Policy is enabled in the Kubernetes cluster, we need a PSP (Pod Security Policy) for the Apache Httpd server. For how to create an Apache Httpd docker image, please refer to the note. An Http server needs a few special settings beyond those of normal applications.
Here is a PSP example which is tested in OKE (Oracle Kubernetes Engine).

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: oke-restricted-psp
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default,runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName:  'runtime/default'
spec:
  privileged: false
  allowedCapabilities:
  - NET_BIND_SERVICE
  # Note: unlike the standard restricted policy, privilege escalation is
  # allowed here for Httpd; set this to false if your image runs without it.
  allowPrivilegeEscalation: true
  # Allow core volume types.
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    # Assume that persistentVolumes set up by the cluster admin are safe to use.
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  seLinux:
    # This policy assumes the nodes are using AppArmor rather than SELinux.
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  readOnlyRootFilesystem: false
---
# Cluster role which grants access to the restricted pod security policy
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: oke-restricted-psp-clsrole
rules:
- apiGroups:
  - policy
  resourceNames:
  - oke-restricted-psp
  resources:
  - podsecuritypolicies
  verbs:
  - use
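
To make the policy take effect, the ClusterRole must be bound to the service accounts that create the pods. A minimal sketch; the binding name and the httpd namespace are assumptions, not from the original setup:

```yaml
# Bind the PSP ClusterRole to all service accounts in the (assumed) httpd namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: oke-restricted-psp-binding    # hypothetical name
  namespace: httpd                    # hypothetical application namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: oke-restricted-psp-clsrole
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:serviceaccounts:httpd  # all service accounts in that namespace
```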

Wednesday, February 19, 2020

Error: cannot delete Pods with local storage when kubectl drain in OKE

Symptom

In the Kubernetes world, we often need to upgrade the Kubernetes master nodes and the worker nodes themselves.
In OKE (Oracle Kubernetes Engine), we follow the Master node upgrade guide and the Worker node upgrade guide.

When we run kubectl drain <node name> --ignore-daemonsets

we get:
error: cannot delete Pods with local storage (use --delete-local-data to override): monitoring/grafana-65b66797b7-d8gzv, monitoring/prometheus-adapter-8bbfdc6db-pqjck

Solution:

The error occurs because local storage ( emptyDir: {} ) is attached to the pods.
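
For reference, the kind of pod spec fragment that triggers this error looks like the following (container name, image, and mount path here are illustrative):

```yaml
# A pod template fragment with local (ephemeral) storage:
# kubectl drain refuses to evict such pods without --delete-local-data
spec:
  containers:
  - name: grafana              # illustrative container name
    image: grafana/grafana
    volumeMounts:
    - name: grafana-storage
      mountPath: /var/lib/grafana
  volumes:
  - name: grafana-storage
    emptyDir: {}               # this is the "local storage" in the error
```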

For StatefulSets

Please use volumeClaimTemplates. OKE supports automatically moving a PV/PVC to a new worker node: it detaches the block storage and reattaches it to the new worker node (assuming both nodes are in the same Availability Domain). We don't need to worry about data migration, which is excellent for StatefulSets. An example:

volumeClaimTemplates:
  - metadata:
      name: prometheus-storage
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi
      selector:
        matchLabels:
          app: prometheus
      storageClassName: oci
      volumeMode: Filesystem
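
In the StatefulSet's pod template, the claim created from the template above is mounted by name, e.g. (the image and mount path are assumptions):

```yaml
containers:
- name: prometheus
  image: prom/prometheus       # illustrative image
  volumeMounts:
  - name: prometheus-storage   # matches volumeClaimTemplates metadata.name
    mountPath: /prometheus     # assumed data directory
```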

For Deployments

Because a deployment is stateless, we can delete the local data:
kubectl drain <node name> --ignore-daemonsets --delete-local-data
If the deployment has a PV/PVC attached, then, as with a StatefulSet, OKE automatically moves the PV/PVC to a new worker node: it detaches the block storage and reattaches it to the new worker node (assuming both nodes are in the same Availability Domain). We don't need to worry about data migration.

Tuesday, February 18, 2020

Tip: A few commands to triage Kubernetes Network related Issues

# TCP traceroute to a host and port
tcptraceroute 100.95.96.3 53

# Test a TCP port
nc -vz 147.154.4.38 53

# Test a UDP port
nc -vz -u 147.154.4.38 53

# Capture traffic on interface ens3 for a given port
tcpdump -ni ens3 port 31530

# Dump logs of all kube-dns/CoreDNS pods
for p in $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name); do kubectl logs --namespace=kube-system $p; done

Thursday, February 13, 2020

Tip: Calico BGP IP AUTODETECTION Issue and Troubleshoot in OKE

Symptoms:

  We follow the Calico doc to deploy a Calico instance. Afterwards, the Calico nodes are never ready, though the pods are up and running.

Error in logs:
Unable to open configuration file /etc/calico/confd/config/bird6.cfg: No such file or directory 

Verify commands
# ./calico-node -bird-ready
2020-02-13 20:27:38.757 [INFO][5132] health.go 114: Number of node(s) with BGP peering established = 0
calico/node is not ready: BIRD is not ready: BGP not established with 10.244.16.1,10.244.15.1

Solutions:

     One possible reason is that there are multiple network interfaces on the host, and Calico may choose the wrong one with its auto-detection method. For details, refer to
https://docs.projectcalico.org/networking/ip-autodetection#change-the-autodetection-method
     It can be solved by setting an environment variable on the calico-node DaemonSet. On OKE VMs with Oracle Linux, the primary network interface is ens*, i.e., ens3.
     kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=ens*
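
Equivalently, the variable can be set directly in the calico-node DaemonSet manifest; a sketch of the relevant container fragment:

```yaml
# Fragment of the calico-node DaemonSet pod template
spec:
  template:
    spec:
      containers:
      - name: calico-node
        env:
        - name: IP_AUTODETECTION_METHOD
          value: "interface=ens*"   # match the primary interface on OKE Oracle Linux VMs
```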

Wednesday, February 12, 2020

Cross Namespace Ingress Usage Example in OKE

Requirement:

      The normal use case is to create the ingress in the application namespace, where the application services and TLS certificates/keys sit.
      In the enterprise world, the security team is not comfortable storing TLS private keys in the application namespace; they need to be stored securely in the namespace of the ingress controller. In this case, we need to create the ingress in the "ingress controller" namespace instead of the application namespace, and we need a way to let that ingress point to services in the application namespace (cross-namespace services). Below is how we can achieve that in OKE (Oracle Kubernetes Engine).

Solution:

  • Create TLS secrets in the ingress controller namespace. Refer to the doc
    • $ openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -keyout tls.key -out tls.crt  -config req.conf -extensions 'v3_req'
      
      req.conf:
      [req]
      distinguished_name = ingress_tls_prometheus_test
      x509_extensions = v3_req
      prompt = no
      [ingress_tls_prometheus_test]
      C = US
      ST = VA
      L = NY
      O = BAR
      OU = BAR
      CN = www.bar.com
      [v3_req]
      keyUsage = keyEncipherment, dataEncipherment
      extendedKeyUsage = serverAuth
      subjectAltName = @alt_names
      [alt_names]
      DNS.1 = prometheus.bar.com
      DNS.2 = grafana.bar.com
      DNS.3 = alertmanager.bar.com
    • kubectl create secret tls tls-prometheus-test  --key tls.key --cert tls.crt -n ingress-nginx
  • The key to using services in different namespaces is ExternalName. It works in OKE but may not work with other cloud providers. An ExternalName example:
    • apiVersion: v1
      kind: Service
      metadata:
        name: prometheus-k8s-svc
        namespace: ingress-nginx
      spec:
        externalName: prometheus-k8s.monitoring.svc.cluster.local
        ports:
        - port: 9090
          protocol: TCP
          targetPort: 9090
        type: ExternalName
  • Create ingress in ingress controller namespace.
    • apiVersion: extensions/v1beta1
      kind: Ingress
      metadata:
        name: prometheus-ingress
        namespace: ingress-nginx
        annotations:
          kubernetes.io/ingress.class: "nginx"
      spec:
        tls:
        - hosts:
          - prometheus.bar.com
          - grafana.bar.com
          - alertmanager.bar.com
          secretName: tls-prometheus-test
        rules:
        - host: prometheus.bar.com
          http:
            paths:
            - path: /
              backend:
                serviceName: prometheus-k8s-svc
                servicePort: 9090
        - host: grafana.bar.com
          http:
            paths:
            - path: /
              backend:
                serviceName: grafana-svc
                servicePort: 3000
        - host: alertmanager.bar.com
          http:
            paths:
            - path: /
              backend:
                serviceName: alertmanager-main-svc
                servicePort: 9093