Sunday, May 23, 2021

Tip: Can't find docker networking namespace via ip netns list

Symptom:

    On Ubuntu, we start a Docker container and try to find its network namespace via "ip netns list". The output is empty.

Reason:

   By default, Docker records its network namespaces under /var/run/docker/netns, while "ip netns list" only checks /var/run/netns.

Workaround:  

 Stop all containers, then run "rm -rf /var/run/netns" followed by "ln -s /var/run/docker/netns /var/run/netns".
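
The workaround can be sketched as a small shell function. This is only a minimal sketch, assuming it runs as root with all containers stopped. The function name link_docker_netns and its optional path arguments are hypothetical, added so the steps can be tried on scratch directories first.

```shell
# Hypothetical helper: point the directory that "ip netns list" reads
# at the directory where Docker keeps its namespace files.
# The defaults are the two paths from the tip.
link_docker_netns() (
  src="${1:-/var/run/docker/netns}"
  dst="${2:-/var/run/netns}"

  # Refuse to continue if the Docker directory is missing.
  [ -d "$src" ] || { echo "missing: $src" >&2; exit 1; }

  # Replace whatever is at the destination with a symlink.
  rm -rf "$dst"
  ln -s "$src" "$dst"
)

# Usage (as root, with all containers stopped):
#   link_docker_netns
#   ip netns list
```

The body runs in a subshell, so the early exit only ends the function and the variables do not leak into the calling shell.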

Tip:

To find the netns ID of a container, use

docker ps ---> find the container ID

docker inspect <container ID> | grep netns
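
Instead of grepping the full inspect output, the namespace path can also be read from a single field. This is a minimal sketch, assuming the Docker CLI is on PATH and the container is running. container_netns is a hypothetical helper name, and .NetworkSettings.SandboxKey is the inspect field that normally holds the path under /var/run/docker/netns.

```shell
# Hypothetical helper: print the network namespace path of a container.
# $1 is the container ID or name as shown by "docker ps".
container_netns() {
  docker inspect --format '{{.NetworkSettings.SandboxKey}}' "$1"
}

# Usage:
#   container_netns <container ID>
#   sudo nsenter --net="$(container_netns <container ID>)" ip addr
```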

Thursday, May 13, 2021

Tip: Bind Error when running multiple schedulers in K8S

Error details: 

I0530 09:25:29.097683       1 serving.go:331] Generated self-signed cert in-memory

failed to create listener: failed to listen on 127.0.0.1:10259: listen tcp 127.0.0.1:10259: bind: address already in use

Reason:

     This happens because the default scheduler is already running on the same node and is listening on 127.0.0.1:10259. Moving the second scheduler to another node fixes it.
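
To confirm that the port is already taken on a node before moving the second scheduler, a check like the following may help. This is a minimal sketch, assuming ss from iproute2 is available on the node. port_in_use is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if some TCP socket is listening on port $1.
port_in_use() {
  # -H: no header, -l: listening sockets only, -t: TCP, -n: numeric ports
  ss -Hltn | awk -v port="$1" '
    # Column 4 is the local address:port pair.
    $4 ~ (":" port "$") { found = 1 }
    END { exit !found }'
}

# Usage:
#   port_in_use 10259 && echo "10259 is already bound on this node"
```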


Thursday, April 22, 2021

Tip: curl: (23) Failed writing body

Symptom: 

When we run 

curl -sSL -o /usr/local/bin/argocd https://github.com/argoproj/argo-cd/releases/download/$VERSION/argocd-linux-amd64

We get the error

curl: (23) Failed writing body (0 != 1369)

Reason:

 It is because "/usr/local/bin/argocd" is in the /usr/local/bin directory, which is owned by root, while we run curl as a normal user.

To fix it, change "/usr/local/bin/argocd" to "/tmp/argocd".
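
A quick check before downloading makes the cause visible. This is a minimal sketch. can_write_to is a hypothetical helper, and the fallback to /tmp/argocd simply follows the fix described here.

```shell
# Hypothetical helper: succeed only if the current user may create
# a file at the given path, i.e. its parent directory is writable.
can_write_to() {
  [ -w "$(dirname "$1")" ]
}

# Usage, with $VERSION set as in the curl command from this tip:
#   dest=/usr/local/bin/argocd
#   can_write_to "$dest" || dest=/tmp/argocd
#   curl -sSL -o "$dest" "https://github.com/argoproj/argo-cd/releases/download/$VERSION/argocd-linux-amd64"
```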


Wednesday, April 14, 2021

Tip: git can't communicate with github after unset http.proxy

Symptom:

    We used to access GitHub through an HTTP proxy, and it worked fine. We then removed the proxy setting via "git config --global -e" and used "git config --global -l" to confirm it was gone.

   However, git still can't communicate with GitHub. The error looks like

 kex_exchange_identification: Connection closed by remote host
 fatal: Could not read from remote repository

Reason:

   It is because we use SSH to communicate with GitHub, and there are leftover HTTP proxy settings in the ~/.ssh/config file:

Host=github.com

ProxyCommand=socat - PROXY:<proxy-server>:%h:%p,proxyport=80

Removing them fixes the issue.
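
To spot leftover proxy settings of this kind, the SSH client configuration can be searched directly. This is a minimal sketch. find_ssh_proxy is a hypothetical helper name, and it only looks at the single file it is given, so settings pulled in from other files would not show up.

```shell
# Hypothetical helper: list ProxyCommand lines in an ssh config file.
# $1 is the file to search; it defaults to ~/.ssh/config.
find_ssh_proxy() {
  # -n: show line numbers, -i: ignore case (ssh keywords are case-insensitive)
  grep -n -i -E '^[[:space:]]*ProxyCommand' "${1:-$HOME/.ssh/config}"
}

# Usage:
#   find_ssh_proxy || echo "no ProxyCommand found"
```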


Tuesday, April 13, 2021

Tip: When OPA Gatekeeper is stuck

Symptom:

    All kubectl commands, such as kubectl get pod, hang.

    Initially, we thought it was a Kubernetes control plane issue, but the cloud provider confirmed that the control plane had communication issues with the webhook.

Solution:

  It turned out that OPA Gatekeeper was stuck, which caused the webhook issues with the control plane.

Workaround:

1. Delete the webhook

kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io gatekeeper-validating-webhook-configuration

2. This stabilizes communication with the control plane

3. Delete and redeploy the OPA Gatekeeper deployment