Tuesday, October 23, 2018

Turn Off Checksum Offload For K8S with Oracle UEK4 Kernel

Symptom:

     We created a K8S cluster in Oracle OCI by following the Oracle documentation. The mysql server and service and the phpadmin server and service were created fine. However, Pods can't communicate with other Pods. We created a debug container (refer blog here) with network tools and attached it to the network stack of the phpadmin pod. From there we can't reach the mysql port: nc -vz <ip> 3306 times out, while ping <mysql ip> works fine.
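If nc is not available in the debug container, the same check can be sketched with bash alone. This is only a rough sketch: the function names tcp_port_open and diagnose are made up for illustration, and it assumes bash with /dev/tcp support plus the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are ours, not from any tool).

# Return 0 if a TCP connection to host:port succeeds within the timeout (seconds).
tcp_port_open() {
  local host=$1 port=$2 secs=${3:-3}
  # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Print ICMP and TCP reachability side by side for one host and port.
diagnose() {
  local host=$1 port=$2 icmp=fail tcp=fail
  if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then icmp=ok; fi
  if tcp_port_open "$host" "$port"; then tcp=ok; fi
  echo "icmp=$icmp tcp=$tcp"
}
```

Running diagnose <mysql ip> 3306 from the debug container should report icmp=ok together with tcp=fail when the symptom described above is present.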

Solution:

   Digging deeper, we see that the docker0 network interface (ip addr) still has its original IP address (172.17.*.*) and not an address from the flannel network we created when we initialized K8S (192.168.*.*). It means the docker daemon is not working with the flannel network and is not associated properly with the flannel CNI.
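The docker0 check can be scripted on each node. Below is a minimal sketch, assuming the pod network is 192.168.0.0/16 (adjust it to the CIDR you used at init time). The helper names ip_to_int, ip_in_cidr and check_docker0 are ours, purely for illustration.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if IPv4 address $1 falls inside CIDR $2, else 1.
ip_in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/}
  local mask=$(( bits == 0 ? 0 : (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}

# Report whether docker0 has an address inside the given pod network.
# $1 is the pod CIDR (assumed value, e.g. 192.168.0.0/16).
check_docker0() {
  local cidr=$1 addr
  # Field 4 of "ip -o -4 addr show" is "<address>/<prefix>"
  addr=$(ip -o -4 addr show docker0 2>/dev/null | awk '{print $4}' | cut -d/ -f1 | head -n1)
  if [[ -z "$addr" ]]; then
    echo "docker0 has no IPv4 address"
  elif ip_in_cidr "$addr" "$cidr"; then
    echo "docker0 $addr is inside $cidr"
  else
    echo "docker0 $addr is NOT inside $cidr"
  fi
}
```

On an affected node, check_docker0 192.168.0.0/16 would print the "NOT inside" line, because docker0 still carries its 172.17.*.* address.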

   By default, they should be associated. It turns out the problem is related to the Broadcom driver with the UEK4 kernel.
Refer: github doc (see terraform-kubernetes-installer)
The snippet below comes from a Terraform template, where $${...} is the escape for a literal ${...}. When running it directly in bash, write ${BROADCOM_DRIVER} instead.
## Disable TX checksum offloading so we don't break VXLAN
######################################
BROADCOM_DRIVER=$(lsmod | grep bnxt_en | awk '{print $1}')
if [[ -n "$${BROADCOM_DRIVER}" ]]; then
   echo "Disabling hardware TX checksum offloading"
   ethtool --offload $(ip -o -4 route show to default | awk '{print $5}') tx off
fi

So we need to turn off TX checksum offload and bounce K8S.
Here are the steps (run on all K8S nodes):
#ethtool --offload $(ip -o -4 route show to default | awk '{print $5}') tx off
Actual changes:
tx-checksumming: off
       tx-checksum-ipv4: off
       tx-checksum-ipv6: off
tcp-segmentation-offload: off
       tx-tcp-segmentation: off [requested on]
       tx-tcp6-segmentation: off
#kubeadm-setup.sh stop
#kubeadm-setup.sh restart
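To confirm the setting took effect on a node, the offload state can be read back with ethtool --show-offload. Here is a small sketch: tx_checksum_state and default_iface_tx_state are hypothetical helper names, and the parsing assumes the usual "tx-checksumming: on|off" line in ethtool's output.

```shell
#!/usr/bin/env bash
# Read `ethtool --show-offload` output on stdin and print the
# tx-checksumming state ("on" or "off"), ignoring suffixes like "[fixed]".
tx_checksum_state() {
  awk -F': *' '$1 == "tx-checksumming" { split($2, f, " "); print f[1]; exit }'
}

# Print the TX checksum state of the interface that carries the default route.
default_iface_tx_state() {
  local iface
  iface=$(ip -o -4 route show to default | awk '{print $5}' | head -n1)
  ethtool --show-offload "$iface" | tx_checksum_state
}
```

After the steps above, default_iface_tx_state should print off on every K8S node. Note that a setting changed with ethtool at runtime normally does not survive a reboot, so it may need to be reapplied at boot time.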
