Home Kubernetes Environment

Overview

The goal was to set up a bare metal Kubernetes cluster at home to experiment with some of the features of Kubernetes, and to have a more reasonable development environment configured for personal projects.

Objectives

1) To learn to use Kubernetes for container orchestration (and learn what container orchestration even means)

2) To learn to streamline the process of deploying a new server.

3) To set up a more expandable system for maintaining home projects.

Tech and Tools Used

  • Kubernetes

  • Containerd

  • Helm

  • Calico

  • Traefik

  • Cert manager

  • MetalLB

  • nfs-subdir-external-provisioner

  • Cloudflare

  • VirtualBox

  • Windows Terminal

  • VSCode

  • Ansible

  • Ubuntu

  • Raspberry Pi

Current Hardware Specs

  • Motherbrain
    • AMD CPU
    • 4 GB DDR3 RAM
    • Built-in display
    • 250 GB SSD
  • Ridley
    • Raspberry Pi 4 Model B
    • 8 GB RAM
    • 500 GB SSD
  • Chozo
    • Intel CPU
    • 16 GB DDR3 RAM
    • 1 TB HDD
    • 4 TB HDD RAID 1 (for NFS)
    • Controls UPS for cluster power
  • Luminoth
    • Raspberry Pi 4 Model B
    • 8 GB RAM
    • 1 TB SSD
  • Zebes
    • Google Mesh Router
  • SR388
    • TP-Link ER605 Omada Router
  • Adam
    • 4 GB RAM
    • Bridged Network Adapter
    • 32 GB Virtual Disk

Network Map

Planning and Design

Over the course of a year, the design of my Kubernetes cluster evolved significantly. Initially, the cluster started with a single server. As the project progressed, I expanded the infrastructure to two servers, then four, and finally segmented the entire cluster into its own dedicated network.

Initial plans

Initially, my plan was to host a single-server Kubernetes cluster using old computer hardware I had at home. I thought simply getting Kubernetes up and running would suffice. However, I quickly realized that Kubernetes is far more complex than Docker, which I was already familiar with, and it required much more thought and preparation for the setup.

At the time, I was very familiar with Ubuntu, particularly using the CLI and SSH for access. However, I wasn't familiar with some advanced security options or with using the Windows Terminal for SSH. I typically used PuTTY but wanted to avoid installing additional software. As I worked with the built-in Windows SSH client, it proved to be a good decision thanks to its integration with VSCode, which I used heavily.

It took several attempts to configure and harden the server as desired, but eventually, I had it set up and was ready to move on to the next step: transforming it into a Kubernetes cluster. I began searching online and quickly realized that installing Kubernetes was not as straightforward as installing a simple package. The complexity was overwhelming, so I sought alternative learning methods.

I reached out to friends for help, advice, and resources. One friend responded and provided some lab guides for starting a cluster. Although these guides were outdated, they formed the foundation of my future plans and designs for the cluster. The guides covered the installation of containerd for containerization and Kubernetes with kubeadm, along with the setup for Calico. Despite their outdated nature, these guides helped move me along to getting something running.

After several attempts at installation and thorough documentation of my process, I reached my first major roadblock: I had a functioning control plane but no worker nodes.

Cluster Expansion

To expand the cluster, I needed additional servers. Financial constraints prevented me from purchasing a proper server rack, so I reached out to friends for old, unused hardware that had enough power to function as additional nodes. After resolving some hardware faults and purchasing a replacement drive, I managed to set up an old HP Pavilion All-In-One with Ubuntu Server.

By this time, I had optimized my setup process, creating a list of defaults for the initial setup and a set of commands to expedite the installation. However, I encountered a significant issue: with only two nodes, my cluster lacked redundancy. Kubernetes is designed to ensure maximum availability and integrity of hosted services, and with only two nodes, any failure would render the cluster inoperable.

In search of a cost-effective solution, I considered using Raspberry Pis, which are inexpensive ARM-based boards capable of running various workloads. I purchased two Raspberry Pis and integrated them into the cluster. The first Pi was added as a worker node, and the second as an untainted control plane to handle some worker loads. Although this setup was far from ideal, with fewer control planes than recommended, it provided a temporary solution.

During this expansion, I encountered various issues, notably with load balancing. MetalLB, the load balancer for a bare metal cluster, can operate using BGP requests or Layer 2 ARP requests. My Google Wi-Fi Mesh router was incompatible with MetalLB's port forwarding requirements, as it ignores any extraneous ARP requests made from a device with a static IP address.

To resolve this, I purchased a TP-Link router without Wi-Fi capabilities. I connected the WAN port to the primary Google Wi-Fi Mesh access point, connected three servers directly to the router's LAN ports, and used a switch to connect the remaining server. This setup provided additional ports for other devices, including a laptop for configuration.

I configured the WAN port with a static IP address from the Google Wi-Fi Mesh router, including IPv6 support. After confirming an external internet connection, I updated the router firmware. For the LAN, I moved the network to a non-default IP range and assigned static addresses to each server. I enabled remote logins from the home LAN, changed the router access port, and updated the login credentials.

However, this network change disrupted the cluster, necessitating its recreation multiple times. Despite these setbacks, I ultimately achieved a functional Kubernetes cluster within its own network segment, with limited remote and external access.

Hardware and Software Setup

You can view my notes in full here.

Initial Setup

The initial server setup varies slightly based on the hardware used. For Raspberry Pi servers, I utilize imaging software to install Ubuntu onto an external drive, which then serves as the boot device. For other servers, I typically download the Ubuntu ISO file and use Rufus to create a bootable flash drive, offering more customization during the installation process.

During the installation, I follow these steps:

English
Update to new Installer
Done (Keyboard)
Done (type of Install)
Done (Internet)
Done (Proxy)
Done (Mirror)
Custom storage layout (Done)

The storage configuration depends on the specific server requirements. I generally use Btrfs for data that needs preservation, such as RAID arrays, and XFS for other partitions:

rule of thumb: Use Btrfs for data that needs preservation, xfs for partitions that aren't set sizes, ext4 for its journaling
local: remaining xfs /

Done (Storage Configuration)

Next, I configure user accounts, ensuring not to reuse account details across servers. For initial setup, temporary passwords may be used to expedite the process:

Your Name: Rob Castagno
Your server's name: <hostname>
Pick a username: <username>
Choose a password: <pass>
Confirm your password: <pass>

Done (User config)

The final steps involve configuring OpenSSH:

Install OpenSSH server, import GitHub ssh keys
Done (SSH Setup)

Done (Skip snaps)
Reboot Now (remove install drive)

Post-installation, the server runs OpenSSH out of the box. I assign a static IP address to the server via the router settings to ensure it remains consistent on the LAN. Then, I connect remotely to start the hardening process.

Server Hardening

There are a few things necessary to start hardening the system.

# Setting the timezone and updating the system
sudo timedatectl set-timezone America/New_York
sudo apt update && sudo apt upgrade -y

# Disabling swap as it's not utilized by Kubernetes
sudo swapoff -a

# Installing necessary packages for server hardening and other tasks
sudo apt install -y libpam-google-authenticator fail2ban curl apt-transport-https git wget gnupg2 software-properties-common lsb-release ca-certificates uidmap

# Setting up fail2ban config
sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local

Following the updates, the process to enhance server security includes configuring two-factor authentication (2FA) using Google Authenticator, which adds a layer of security and enables safer passwordless login methods.

Configure Google Authenticator for 2FA:

google-authenticator
y  # Make the token time-based
<scan QR Code and enter response code>
<Grab emergency codes>
y  # Update the .google_authenticator file
y  # Disallow multiple uses of the same authentication token
n  # Increase the time window for authentication
y  # Enable rate-limiting

Additionally, I enhance security by disabling the root user account, which prevents direct root logins, a critical security practice.

Disable Root User: Open the shadow file and replace the root password field (likely *) with an ! to disable the account.

sudo nano /etc/shadow

These measures significantly fortify the server against unauthorized access and ensure that the setup is robust for hosting a Kubernetes environment.
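For reference, the same lock can be applied with a single command instead of editing /etc/shadow by hand; passwd -l simply prepends the ! to the password field:

# Lock the root account (equivalent to adding ! in /etc/shadow)
sudo passwd -l root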

OpenSSH Configuration

The OpenSSH setup on my servers incorporates several security measures to enhance the robustness of remote access. Below is a detailed configuration aiming to strike a balance between security and functionality.

Edit the SSH server config:

sudo nano /etc/ssh/sshd_config

# Setting the SSH port to a non-default to reduce common attack vectors
Port 22222

# Specifying the use of SSH Protocol 2 for security
Protocol 2

# Restricting which users can log in via SSH
AllowUsers <username>

# Requiring both public key and a second factor authentication for most users
AuthenticationMethods publickey,keyboard-interactive
ChallengeResponseAuthentication yes
KbdInteractiveAuthentication yes

# Setting a five-minute grace period for login attempts
LoginGraceTime 5m

# Disallowing root login for enhanced security
PermitRootLogin no

# Setting maximum authentication tries to five
MaxAuthTries 5

# Enabling public key authentication and disabling password-only login
PubkeyAuthentication yes
PasswordAuthentication no

# Ensuring no empty passwords are permitted
PermitEmptyPasswords no

# Disabling Kerberos and GSSAPI authentication for simplicity and to reduce attack surface
KerberosAuthentication no
GSSAPIAuthentication no

# Special configuration for certain trusted connections, simplifying authentication to publickey only for specified IP range
Match Address <IP range>/<subnet> User <username>
      AuthenticationMethods publickey
      KbdInteractiveAuthentication no

# Ensuring default authentication methods apply universally except for specified cases
Match User <username>@*
      AuthenticationMethods publickey,keyboard-interactive
      KbdInteractiveAuthentication yes

# Allowing new users to log in with a password initially to set up their 2FA and SSH keys
# Uncomment and customize the following lines as needed for new user setup
# Match User <New account>
#      UsePAMSubsystem yes
#      AuthenticationMethods password
#      PasswordAuthentication yes

This configuration ensures that every connection is secured using both a public key and 2FA, except for connections from specific trusted local addresses, which require only the public key. This dual requirement maximizes security while accommodating practical needs for automation tasks and initial setups for new users.

Configure PAM for SSH:

To align with the security policies, we adjust the PAM (Pluggable Authentication Modules) settings to integrate with the new SSH configuration, particularly focusing on enhancing authentication mechanisms.

Edit the PAM SSH configuration to disable common-auth and enable Google Authenticator:

sudo nano /etc/pam.d/sshd

# Disable the inclusion of common-auth
#@include common-auth
...
# Enforce Google Authenticator requirement
auth required pam_google_authenticator.so

Setting Up PAM for New User Logins:

To facilitate the setup of new user accounts, a dedicated PAM configuration is used. This configuration is only active during the account creation phase to simplify the login process for setup purposes.

Create a new PAM configuration for handling new user logins:

sudo nano /etc/pam.d/sshd-newusers

#%PAM-1.0
auth [success=1 default=ignore] pam_succeed_if.so uid >= 1000 quiet
auth required pam_google_authenticator.so
auth [success=1 default=ignore] pam_unix.so nullok_secure
auth requisite pam_deny.so
auth required pam_permit.so

Final Configurations and Rebooting:

To ensure Fail2Ban aligns with our modified SSH settings, particularly the SSH port change, the configuration is updated. Configure Fail2Ban:

sudo nano /etc/fail2ban/jail.local
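As a rough sketch, the relevant part of jail.local might look like the following, using the non-default SSH port chosen earlier (the retry and ban values here are illustrative, not my exact settings):

[sshd]
enabled = true
port = 22222
maxretry = 5
bantime = 3600   # seconds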

Disabling swap space is also required, since the kubelet will not run with swap enabled by default. Disable Swap Space:

sudo nano /etc/fstab
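Concretely, this just means commenting out the swap entry so the earlier swapoff persists across reboots; on a default Ubuntu Server install the line typically looks like this (the exact device name may differ):

# /swap.img   none    swap    sw      0       0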

Finally, a system reboot or service restart is necessary to ensure all configurations take effect:

sudo reboot   # or: sudo systemctl restart sshd

Setting Up NFS Server

For storage solutions within the Kubernetes cluster, setting up an NFS server is critical. It requires minimal configuration but is vital for persistent data storage across nodes.

Installing and Configuring NFS Server:

sudo apt install nfs-kernel-server -y
sudo nano /etc/exports

/storage  <KUBERNETES_NODE_IP>(rw,sync,no_root_squash,no_subtree_check)
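After saving the exports file, the export table needs to be reloaded and the service enabled; a quick sketch of the usual follow-up commands:

# Re-export the shares and make sure the NFS service starts on boot
sudo exportfs -ra
sudo systemctl enable --now nfs-kernel-server

# Verify the export list
sudo exportfs -v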

Windows Terminal Setup

Configuring Windows SSH Client for Remote Connections

Step 1: SSH Key Generation and Deployment

Generate an ED25519 SSH key to secure remote SSH connections. This key is uploaded to the remote server to allow for key-based authentication, enhancing security over traditional password-based methods. Generate ED25519 Key and Upload to Remote Server:

# Generate ED25519 SSH Key
ssh-keygen -t ed25519 -o

# Upload the public key to the remote server's authorized_keys
Get-Content \path\to\<key-name>.pub | ssh user@remote_server "cat >> .ssh/authorized_keys"
# Follow prompts to type user password

Step 2: Configure SSH Client

Simplify SSH connections by configuring the SSH client to use profiles stored in a configuration file. This setup allows quick connections using predefined profiles. Add a New Host in SSH Configuration:

C:\Users\<username>\.ssh\config

Host <hostname>
  HostName <IP Address>
  Port <Port>
  User <Username>
  IdentityFile <Private Key File Location>

Step 3: Installation of OpenSSH Client and Server

Install and update the Windows OpenSSH client and server components to enable advanced SSH features and functionalities. Install OpenSSH Client and Server:

# Check for available OpenSSH packages and install
Get-WindowsCapability -Online | Where-Object Name -like 'OpenSSH*'
Add-WindowsCapability -Online -Name OpenSSH.Client~~~~0.0.1.0
Add-WindowsCapability -Online -Name OpenSSH.Server~~~~0.0.1.0

# If an installation error occurs, use the following command:
Add-WindowsCapability -Online -Name "Msix.PackagingTool.Driver~~~~0.0.1.0"

Setting Up Windows OpenSSH Authentication Agent

Enable and configure the OpenSSH Authentication Agent to manage SSH keys and allow for automated key management, facilitating smoother and more secure logins. Enable and Configure OpenSSH Authentication Agent:

Go to Services > OpenSSH Authentication Agent
Right click > Properties
Set Startup Type to Automatic
Start the Service

# Add the SSH key to the agent for easier connection management
ssh-add /path/to/<key-name>
# Note: This is less secure than requiring a password for each key use, hence 2FA is recommended.

Creating Profiles in Windows Terminal

Customize the Windows Terminal by creating profiles for each remote connection. This customization enhances productivity and usability by allowing quick access to frequently used connections. Create and Customize Profiles in Windows Terminal:

# Access settings in Windows Terminal and add a new profile
Open up settings
Add a new profile
Configure it and save.

# Further customize the profile by editing the JSON configuration
Click Open JSON and tweak the profile to include:
{
    "name": "<name>",
    "tabTitle": "<Tab Title>",
    "commandline": "ssh <Profile>",
    "icon": "<emoji or file>"
},

Starting the Kubernetes Cluster

The initial setup of the Kubernetes cluster involves configuring the system, installing a container runtime, and setting up Kubernetes itself.

Installing Prerequisites

The configuration of the system's kernel and network settings is crucial for supporting Kubernetes functionalities such as overlay networks and network bridging. Switch to root and configure system settings:

# Gain root access
sudo -i

# Load necessary modules for Kubernetes
modprobe overlay
modprobe br_netfilter

# Update kernel settings to enable network traffic handling by iptables and enable IP forwarding
cat << EOF | tee /etc/sysctl.d/kubernetes.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
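To make these settings stick, the module loads can be persisted and the sysctl values applied immediately; a small addition along these lines (the filename is my own choice):

# Persist the kernel modules across reboots
cat << EOF | tee /etc/modules-load.d/kubernetes.conf
overlay
br_netfilter
EOF

# Apply the new sysctl settings without rebooting
sysctl --system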

Setting Up Container Runtime

Containerd is chosen over Docker due to its lower overhead, making it suitable for a production Kubernetes environment. Install containerd:

# Create directory for Docker GPG key and add it to the system
mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

# Add Docker repository for installing containerd
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install containerd from Docker repo
apt-get update && apt-get install containerd.io -y

# Configure containerd and set systemd as the cgroup driver
containerd config default | tee /etc/containerd/config.toml
sed -e 's/SystemdCgroup = false/SystemdCgroup = true/g' -i /etc/containerd/config.toml
systemctl restart containerd

Installing Kubernetes

Kubernetes installation includes the kubeadm tool for cluster creation, the kubelet daemon that runs on all cluster machines, and the kubectl command-line tool for cluster management. Add Kubernetes repository and install packages:

# Add Kubernetes repository
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

# Install Kubernetes components
apt-get update && apt-get install -y kubeadm=1.30.1-1.1 kubelet=1.30.1-1.1 kubectl=1.30.1-1.1   # version suffix matches the pkgs.k8s.io package revisions
apt-mark hold kubelet kubeadm kubectl

Hosts File Configuration

To ensure all network traffic intended for the cluster is properly routed, update the hosts file. Update hosts file for network routing:

# Add the host IP address in the hosts file and associate it with k8scp
nano /etc/hosts
# Add line: <Host IP Address> k8scp

Initializing the Cluster

Properly initializing the Kubernetes cluster involves setting up a primary control node and configuring various network and control settings through a kubeadm configuration file.

To ensure the cluster initializes correctly with the desired network settings and control plane configurations, we create a kubeadm configuration file specifying network ranges, DNS settings, and the control plane endpoint.

nano kubeadm-config.yaml

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  serviceSubnet: "10.96.0.0/12" # Default range for services
  podSubnet: "10.244.0.0/16" # Custom range for Calico; ensure it doesn't conflict with your network
  dnsDomain: "cluster.local" # Default DNS domain; change if you have a specific need
controlPlaneEndpoint: "k8scp:6443" # Your control plane endpoint; matches your hosts file configuration
apiServer:
  extraArgs:
    advertise-address: "<Primary Node IP>" # Replace with your control plane's static IP
    allow-privileged: "true"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration # kubelet settings go in their own document rather than inside ClusterConfiguration
resolvConf: "/run/systemd/resolve/resolv.conf"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"

Enable IP Forwarding: Ensuring IP forwarding is enabled is crucial for the proper routing of traffic within the cluster.

echo 1 > /proc/sys/net/ipv4/ip_forward

Initialize the Cluster with kubeadm: Begin the cluster initialization using the previously created configuration file. This process sets up the primary control plane and outputs commands for joining other nodes.

kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out

Joining a Node to the Cluster

Joining additional nodes to the Kubernetes cluster is a critical step in scaling the cluster's capacity and capabilities. This includes both control plane nodes, which manage the cluster, and worker nodes, which run the applications.

To integrate a new node, follow these steps to use kubeadm join with the appropriate parameters to securely authenticate and join the node to the cluster.

# Join a node to the cluster (drop --control-plane and --certificate-key when joining a worker)
kubeadm join k8scp:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<token sha256 hash> \
  --control-plane \
  --certificate-key <certificate key> | tee kubeadm-join.out

If the connected node is a worker, it should be good to go, but if it's a control plane, there's one more necessary step to perform. We need to run the commands provided by the kubeadm output. These are as follows:

# Exit root user and configure kubectl for standard user access
exit
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

That's all there is to joining a node to the cluster, although there are still additional things we can do. For example, if we want a control plane to be able to schedule workloads, we can run kubectl taint nodes <hostname> node-role.kubernetes.io/control-plane:NoSchedule- to remove the NoSchedule taint from the node. We can also install Helm on our nodes, which lets us use Helm charts to deploy Kubernetes applications.

curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
sudo apt-get install apt-transport-https --yes
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update && sudo apt-get install helm

Setting up Cluster Networking

Configuring the networking within a Kubernetes cluster, particularly in a bare-metal environment, involves the installation and management of a Container Network Interface (CNI) plugin. Calico is selected as the CNI due to its robust feature set and simplicity in setup and integration.

Installing the CNI (Calico)

Install the Tigera Operator: The Tigera Operator is used to automate the installation and management of Calico, simplifying the process of applying configurations and updates.

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/tigera-operator.yaml

Download and Deploy Calico Custom Resources: Custom resources provide the necessary configurations specific to how Calico operates within the cluster.

curl https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/custom-resources.yaml -O
kubectl create -f custom-resources.yaml
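Before moving on, it's worth watching the Calico pods come up; once everything in the calico-system namespace reports Running, the CNI is ready:

watch kubectl get pods -n calico-system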

Update Calico: Updates to Calico are handled through the Tigera Operator, which can dynamically apply new configurations and updates to the CNI.

kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v<version>/manifests/tigera-operator.yaml

Setting up Load Balancing with MetalLB

MetalLB provides a network load balancing solution for Kubernetes clusters not running on cloud providers that offer load balancers. The setup involves configuring kube-proxy to use IPVS with strictARP enabled, which is crucial for handling ARP requests properly, a method used by MetalLB for load balancing.

Configure kube-proxy for MetalLB: To ensure proper load balancing, kube-proxy must be configured to use IPVS mode with strictARP enabled. This configuration prevents issues related to ARP requests.

kubectl edit configmap -n kube-system kube-proxy

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  strictARP: true

Install MetalLB: MetalLB is installed by applying its manifest file, which sets up the necessary components within the Kubernetes cluster.

kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.5/config/manifests/metallb-native.yaml

Create IP Address Pool and L2 Advertisement: After installing MetalLB, define an IP address pool and an L2 advertisement to manage how IP addresses are assigned and advertised within the cluster.

---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: <name>
  namespace: metallb-system
spec:
  addresses:
  - <IP Range>
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: <name>
  namespace: metallb-system

This configuration sets up MetalLB to handle external access to services within the cluster by providing a range of IP addresses that can be dynamically assigned to services and advertising those addresses using Layer 2 protocols.
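As a quick illustration of the result, any Service of type LoadBalancer should now receive an address from the pool automatically. The names and selector below are hypothetical placeholders, not one of my actual services:

---
apiVersion: v1
kind: Service
metadata:
  name: demo-web                 # hypothetical service name
spec:
  type: LoadBalancer             # MetalLB assigns an external IP from the pool
  selector:
    app: demo-web                # assumes a Deployment labeled app: demo-web
  ports:
    - port: 80
      targetPort: 80

Running kubectl get svc demo-web should then show an EXTERNAL-IP taken from the configured range.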

Dynamic Provisioning with an NFS Server

Kubernetes by default does not support NFS servers as a dynamic storage backend. However, using a Helm chart called nfs-subdir-external-provisioner, we can integrate NFS into Kubernetes to enable dynamic volume provisioning. This allows for automatic volume provisioning whenever a new pod requests storage, eliminating the need for manual volume setup. Deploy NFS Subdirectory External Provisioner via Helm: This Helm chart deploys a pod that handles the dynamic provisioning of storage volumes from an NFS server, integrating it seamlessly into your Kubernetes environment.

helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/

# Deploy the Helm chart with configuration tailored for your NFS server setup
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--set nfs.server=<nfs IP> \
--set nfs.path=/<path on nfs> \
--set storageClass.defaultClass=true \
--set storageClass.reclaimPolicy=Retain \
--set storageClass.accessModes=ReadWriteMany \
-n nfs-subdir-external-provisioner --create-namespace

This setup configures the NFS server as a backend for Kubernetes, specifying the NFS server IP, the path on the NFS server to use for storage, and various storage class parameters like the default class, reclaim policy, and access modes. This configuration ensures that storage is dynamically provisioned and managed effectively within your Kubernetes cluster.
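A simple way to sanity-check the provisioner is to create a PersistentVolumeClaim against it. The claim name below is hypothetical, and nfs-client is the chart's default storage class name (since it was marked as the default class, storageClassName could also be omitted):

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim               # hypothetical claim name
spec:
  storageClassName: nfs-client   # chart default; omit to fall back to the default class
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

The claim should bind within a few seconds, and a matching subdirectory should appear on the NFS share.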

Automatic SSL Certificates with Cert Manager

Securing applications with SSL is crucial, especially for services exposed to the internet. Cert Manager automates the creation and renewal of SSL certificates within Kubernetes, minimizing downtime associated with manual renewals. Install Cert Manager:

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.15.0/cert-manager.yaml

Create an Issuer and Secret: To issue certificates automatically, set up an Issuer resource that uses the ACME DNS-01 challenge provider, in this case Cloudflare.

---
apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-api-token-secret
  namespace: traefik
type: Opaque
stringData:
  api-token: <API-Token>
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: cloudflare-issuer
  namespace: traefik
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <email>
    privateKeySecretRef:
      name: cloudflare-key
    solvers:
      - dns01:
          cloudflare:
            email: <email>
            apiTokenSecretRef:
              name: cloudflare-api-token-secret
              key: api-token

Create a Certificate: Once an Issuer is in place, you can automatically generate certificates for services that require them.

---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-template-com
  namespace: traefik
spec:
  secretName: wildcard-template-com-tls
  dnsNames:
    - "template.com"
    - "*.template.com"
  issuerRef:
    name: cloudflare-issuer
    kind: Issuer

Configuring Ingress Controller with Traefik

Traefik simplifies the deployment of services by managing inbound routing from external requests to internal services based on predefined rules. Install Traefik using Helm:

# Add Traefik Helm repository and update
helm repo add traefik https://traefik.github.io/charts
helm repo update

# Install Traefik to handle Ingress within Kubernetes
helm install traefik traefik/traefik --namespace traefik --create-namespace

Configure Ingress Routes: To route traffic to your services, configure IngressRoutes with Traefik.

---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: template-com-tls
  namespace: <namespace>
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`<domain>.template.com`)
    kind: Rule
    middlewares:
    services:
    - name: <name>
      port: <port>

Deployment and Updating with Ansible

With the Kubernetes cluster operational, the next challenge is managing updates and configurations efficiently. Ansible provides a robust solution for automating not only the deployment of services within Kubernetes but also the setup and upgrading of the Kubernetes infrastructure itself. You can check out all of my Ansible resources here.

Creating a Virtual Machine

To facilitate the management of the Kubernetes cluster, I set up a dedicated Ubuntu Virtual Machine using VirtualBox on my workstation. This machine is configured with a bridged network connection, ensuring it appears as another device on my network and can interact with other nodes seamlessly.

# Note for VM setup:
- OS: Ubuntu
- Network: Bridged connection
- Use: Ansible playbooks execution
- Security: Basic hardening applied
- State: Savestate created post-setup

Ansible was installed on this virtual machine to manage Kubernetes configurations and updates efficiently.

sudo apt update && sudo apt install ansible -y

Connecting the Virtual Machine to the Cluster

It is crucial that the virtual machine has SSH keys configured for each node in the Kubernetes cluster, enabling seamless access for management tasks.

# Ensuring SSH keys are set up for each node
# Note: Replace `<user>` and `<node-ip>` with appropriate values
ssh-keygen -t ed25519
ssh-copy-id <user>@<node-ip>

After setting up SSH keys, I verified the connectivity to each node from the virtual machine. Subsequent to this verification, I proceeded to configure Ansible inventories and playbooks, which are crucial for automating tasks across the cluster.

# Commands to verify SSH connectivity and initial Ansible playbook setup
ssh <user>@<node-ip> 'echo connected successfully'

Creating the Inventory

Inventory Setup: Ansible manages configurations across various hosts by using an inventory file that specifies how to access these hosts. This file is essential for identifying and categorizing hosts based on their roles and characteristics. Inventory Structure:

# hosts.yml - Lists all hosts by name, address, and ports
# groups.yml - Categorizes hosts into groups based on node type, processor type, and security hardening stage

This method not only allows precise targeting of configurations to specific groups but also facilitates hierarchical management where playbooks can be executed on parent groups affecting all relevant child groups.
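As a rough sketch of that layout (addresses and group names here are illustrative, reusing node names from the hardware list rather than copying my real files):

# hosts.yml (sketch)
all:
  hosts:
    motherbrain:
      ansible_host: <IP Address>
      ansible_port: 22222        # matches the non-default SSH port set earlier
      ansible_user: <username>
    ridley:
      ansible_host: <IP Address>
      ansible_port: 22222
      ansible_user: <username>

# groups.yml (sketch)
all:
  children:
    kubernetes_nodes:            # parent group; matches group_vars/kubernetes_nodes.yml
      children:
        control_planes:
          hosts:
            motherbrain:
        workers:
          hosts:
            ridley: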

Writing Playbooks

Playbook Creation: The process of creating Ansible playbooks involves defining the tasks and configurations needed to automate cluster operations. This includes setting up directories for storing variables that are applicable to groups of hosts or individual hosts.

# Directory structure for playbooks and variables
/playbooks
  /group_vars
    all.yml  # Global variables
    kubernetes_nodes.yml  # Variables for Kubernetes node group
  /host_vars
    node1.yml  # Variables specific to node1

Using AI Assistance: Leveraging tools like ChatGPT can accelerate playbook development by translating command sequences into Ansible YAML format. However, this approach requires careful review and testing to ensure the playbooks perform as intended.

Testing Playbooks:

ansible-playbook -i hosts.yml playbook.yml --check

This command allows administrators to perform a dry run of playbooks, verifying the actions without making any changes to the actual environment. This step is crucial for maintaining system integrity and avoiding disruptions during updates.
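For illustration, a playbook in this style might look like the sketch below; the group name kubernetes_nodes matches the group_vars example above, and the single task is a placeholder rather than one of my actual playbooks:

# update-nodes.yml (sketch)
---
- name: Update and upgrade all cluster nodes
  hosts: kubernetes_nodes
  become: true
  tasks:
    - name: Update the apt cache and apply upgrades
      ansible.builtin.apt:
        update_cache: yes
        upgrade: dist

Running it with the --check flag shown above previews the changes before applying them for real.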

Future ideas for expansion

Enhanced Hosting and Scalability

My plans include diversifying the hosted services within the Kubernetes cluster. The aim is to host a variety of applications, including websites, databases, video game servers, and bots. Increasing the number of nodes within the cluster is a priority to enhance resource availability and reduce the risk of service disruptions due to hardware failures.

Infrastructure Improvements

To further improve reliability and performance:

  • Uninterruptible Power Supplies (UPS): Implement UPS systems to maintain server uptime and develop an automated shutdown process for extended power and internet outages.
  • Dedicated NFS Server: Transition to a dedicated NFS server to offload storage responsibilities from worker nodes, potentially utilizing RAID 5 on SSDs for enhanced data integrity and speed.
  • Server Updates: Automate server updates to facilitate maintenance while minimizing downtime, ensuring continuous service availability.

Security Enhancements

Security remains a cornerstone of infrastructure management:

  • Honeypots and Intrusion Detection Systems (IDS): Incorporate honeypots and IDS/IPS systems to bolster security measures, deterring attacks and monitoring for suspicious activities.
  • VPN Requirements: Mandate VPN usage for remote server access to safeguard against unauthorized entry and data breaches.

Reflecting on Overkill

While some enhancements may exceed the project's initial scope—such as extensive security measures or advanced server configurations—they reflect a commitment to leveraging technology to its fullest potential, aligning with personal interests and ongoing learning in server management and cybersecurity.

Outcomes and Learnings

Through this project, I've gained a solid understanding of Kubernetes fundamentals and developed a robust process for troubleshooting and resolving issues within the cluster. While I continue to enhance my expertise and often consult resources like ChatGPT and Stack Overflow for complex challenges, my capability to comprehend Kubernetes documentation and its core functionalities has significantly improved. Working on a bare metal cluster, distinct from more conventional cloud-based environments, has not only provided a unique learning experience but also deepened my understanding of the infrastructural adaptabilities that cloud environments offer. This journey has been immensely valuable, offering practical insights and reinforcing my skills in container orchestration, server deployment, and system scalability—key aspects that align with my initial objectives to establish a dynamic development environment for personal projects.