Sunday, August 30, 2020

Tip: A few commands to debug issues with kubelet

sudo systemctl status -l kubelet
kubectl describe node <name>
sudo journalctl -u kubelet | grep ready
sudo systemctl restart docker
kubectl cluster-info dump          # dump detailed cluster info
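In addition to the commands above, kubelet exposes a local health endpoint that can be probed directly from the node. This is a sketch assuming the default healthz port 10248; adjust if your kubelet is configured differently.

```shell
# Probe kubelet's local health endpoint on the node itself
# (10248 is the default --healthz-port; it only listens on localhost)
curl -sf http://localhost:10248/healthz && echo "kubelet healthy"
```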

Tip: Impersonate users on kubectl

We can impersonate users with the --as= and the --as-group= flags.

kubectl auth can-i create pods --as=me
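Combining both flags lets you check group-based RBAC rules as well. The user and group names below are hypothetical placeholders.

```shell
# Check whether a hypothetical user in a hypothetical group could create pods
kubectl auth can-i create pods --as=jane --as-group=developers

# Impersonation works on ordinary commands too, e.g. list pods as that user
kubectl get pods --as=jane
```

Note that your own account needs the "impersonate" RBAC verb on users and groups for these flags to work.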

Monday, August 17, 2020

Tip: Remove Linux files with special characters

ls -ltr

-rw-rw-r-- 1 henryxie henryxie    0 Apr 22 12:14 --header

-rw-rw-r-- 1 henryxie henryxie    0 Apr 22 12:14 -d


To remove these two files, use "--" to tell rm that no more options follow:

rm -v -- "-d"

rm -v -- "--header"
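The whole sequence can be reproduced safely in a scratch directory; "--" marks the end of options, so everything after it is treated as a filename rather than a flag.

```shell
# Reproduce and clean up files whose names start with a dash
cd "$(mktemp -d)"
touch -- "-d" "--header"      # create the awkwardly named files
rm -v -- "-d" "--header"      # "--" stops option parsing, so these are filenames
ls -A                         # directory is empty again
```

Prefixing the name with "./" (e.g. rm ./-d) works as well, since the argument then no longer begins with a dash.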

Wednesday, August 05, 2020

Tip: Pods are not created when a Deployment is created

Symptom:

  We have a normal deployment that had been working fine. When we tested it on a new Kubernetes cluster, the deployment was created, but no pod was created, and there were no warning or error messages.
 "kubectl describe deployment" does not show any hints. The pod security policy check is good, and the RBAC privilege check is good.

OldReplicaSets:    <none>
NewReplicaSet:     livesqlstg-admin-678df959b4 (0/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  16s   deployment-controller  Scaled up replica set livesqlstg-admin-678df959b4 to 1

Solution:

  The reason is that a resource quota is enforced on the namespace:
spec:
  hard:
    configmaps: "10"
    limits.cpu: "10"
    limits.memory: 20Gi
    persistentvolumeclaims: "10"
    ....

With a quota in place, every container needs a resources section in the deployment yaml file, e.g.
      resources:
        requests:
          memory: "10Gi"
          cpu: "1"
        limits:
          memory: "10Gi"
          cpu: "1"
 It would be good if Kubernetes surfaced a clearer warning to users in this situation.
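When pods silently fail to appear, the rejection is usually recorded as an event on the ReplicaSet rather than on the Deployment, so it is worth describing the ReplicaSet directly. The names below are placeholders, in the same style the document already uses.

```shell
# Quota rejections show up as FailedCreate events on the ReplicaSet,
# not on the Deployment itself
kubectl describe replicaset <replicaset-name> -n <namespace>

# Review the quota limits and current usage in the namespace
kubectl describe resourcequota -n <namespace>
```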

Wednesday, July 29, 2020

Tip: No route to host issues in Kubernetes Pods

Symptom:

    We see intermittent network issues in OKE (Oracle Kubernetes Engine). Ingress controller pods have difficulty reaching other services. When we use curl to test the network port, we get an error like the one below:
 
$ curl -v telnet://10.244.97.24:9090
* Expire in 0 ms for 6 (transfer 0x560b9cdd7dd0)
*   Trying 10.244.97.24...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x560b9cdd7dd0)
* connect to 10.244.97.24 port 9090 failed: No route to host

Solution:

   There are quite a few possible reasons for this error; I cover more of them in another blog post.
   In this case, it was related to firewall ports.
  By default, the network team opened all ingress and egress ports within the worker nodes' subnet, which means there is no firewall among the worker nodes. However, the security rules were set as stateful. Because the Kubernetes overlay network depends heavily on UDP, which is stateless, the ports need to be opened with stateless rules.
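A quick way to sanity-check UDP reachability between two worker nodes is netcat. This is a sketch assuming a flannel VXLAN overlay on its default port 8472; substitute your CNI's port (e.g. 4789 for standard VXLAN) and a real node IP. Note that UDP probes are inherently unreliable: nc reports success unless it receives an ICMP port-unreachable reply, so treat a failure as the meaningful signal.

```shell
# Probe the overlay-network UDP port on another worker node
# (8472 is flannel's default VXLAN port; adjust for your CNI)
nc -z -v -u -w 3 <other-node-ip> 8472
```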