Cluster Deployment

Now that the hardware has been wired and the cluster has been configured, the cluster can be deployed.

Deploy the cluster

Run the following command as your administrative user (Not root). During the deployment process, you will first be prompted to connect the cluster node out of band management links. You will then be prompted to verify the firmware configuration of each node.

farosctl install-plan cluster

Voila!

This install plan is the equivalent to running all of the following commands.

# Apply configuration to the router components.
#   - Gateway, Firewall, DNS, DHCP, and NTP
farosctl apply router

# Create necessary virtual machines. Currently just the bootstrap.
farosctl create machines

# Configue cluster nodes in network services.
#   - DNS, DHCP, and Cockpit
farosctl apply host-records

#
# It is now safe to connect the management interfaces to the network
#

# Wait for maangement interfaces to become available.
farosctl wait-for management-interfaces

# Wait for firmware settings to be verified.
farosctl wait-for firmware-config

# Create the cluster load balancer.
farosctl create load-balancer

# Create the installation sources and machine config repositories.
farosctl create install-repos

# Create the OpenShift cluster.
# This will boot all of the cluster nodes, install Red Hat CoreOS on each
# node, and install OpenShift.
farosctl create cluster

Observing Installation Progress

During the install process, the current action being taken is printed to the screen. However, once the OpenShift cluster begins bootstrapping and installing, the actions become long running and dont provide much feedback. During these phases, it is possible to get a detailed status of the progress being made on the install.

watch farosctl get cluster-upgrade-status

Important

The status percentage at the top of this display will routinely report nuisance errors during cluster installation. These may be safely ignored. If the install process itself reports an error, that should be observed.

Adding Application Nodes to the Cluster

Once the first step of the install has completed and the cluster control plane is available, the application nodes can be added to the cluster. To do this run the following command:

farosctl apply app-nodes

Public DNS records

This step is not required, but will usually be desired. If you would like to reach the cluster from an external machine, you will need to create a few public DNS records. The content of the records will be slightly difference depending on the gatway routing method used (NAT or PAT). Use the following command to determine what your records should be:

farosctl get public-dns-records

For a NATed network, you will get output like this:

; Public DNS records for faros.site zone.
bastion.edge.example.com.   IN  A   192.168.5.16
api.edge.example.com.       IN  A   192.168.8.2
*.apps.edge.example.com.    IN  A   192.168.8.2

For a PATed network, you will get output like this:

; Public DNS records for faros.site zone.
bastion.edge.example.com.   IN  A   192.168.5.16
api.edge.example.com.       IN  A   192.168.5.16
*.apps.edge.example.com.    IN  A   192.168.5.16

Connecting to the cluster

The cluster credentials are displayed at the end of the cluster deployment process. If you need to retreive them again, use the following command:

farosctl get credentials

This will return the cluster API domain, the cluster console domain, and the kubeadmin password in a human readable format.

The following command will give you the login information in the form of an oc login command. Run this command and then execute the string it returns to quickly login to the cluster as kubeadmin.

/bin/bash -c $(farosctl get oc-login)

Node auth certificates

Kubernetes uses OpenSSL certificates for internode authentication. The initial set of these certificates are only valid for 24 hours. At that point, they are automatically rotated by the cluster. Every subsequent set of certificates is valid for 30 days. To avoid issues, do not shutdown the cluster in the first 24 hours. After that, do not leave the cluster powered off for longer than 30 days.

Debugging deployment

If issues are encountered during the cluster deployment process, add a -v flag to the farosctl command for increased verbosity. Adding more v’s will increase the verbosity futher.

If the installlation times out waiting for the cluster nodes to start provisioning, connect to the nodes’ management interfaces and ensure they have PXE booted. This is typically indicative of the boot order not being properly set. If the nodes are failing to PXE boot, ensure that their MAC address has been properly set in the farosctl config interface. The syslog on the bastion node is also a good source for verifying MAC addresses while they are DHCP booting.

If the nodes have PXE booted and CoreOS has been installed, watch the nodes’ consoles as they boot for errors. If there are errors about certificate verification errors, either the cluster’s bootstrap CA hs expired of the system’s hardware clock is wrong. Veryify the hardware clock in the system’s settings. To generate a noot boostrap CA certificate, recreate the install repos.

farosctl create install-repos

If the installation times out waiting for the bootstrapping to complete, the bootstrap node will likely have the most informative logs. To get the bootstrap logs, ssh to the bootstrap node and monitor the bootkube service.

farosctl ssh bootstrap
journalctl -b -f -u bootkube.service

If the installer times out waiting for the cluster install to complete, you will need to log into the cluster and determine which services were unable to come up healthy.

Restarting deployment

After a failed cluster deployment, the deployment can be easily restarted without starting over. First, fix the issue that caused the deployment to fail. If a change to the cluster configuration was required, first re-apply the configuration settings.

farosctl apply

Then, regenerate the ignition files and CA certificates used for the deployment.

farosctl create install-repos

Finally, restart the cluster deployment.

farosctl create cluster