Objective: In this article, we will walk through running a distributed, production-quality database setup managed by Rancher and characterized by stable persistence. We will use Stateful Sets with a Kubernetes cluster in Rancher for the purpose of deploying a stateful distributed Cassandra database.
Pre-requisites: We assume that you have a Kubernetes cluster provisioned with a cloud provider. Consult the Rancher resource if you would like to create a K8s cluster in Amazon EC2 using Rancher 2.0.
Databases are business-critical entities and data loss or leak leads to major operational risk scenarios in any organization. A single operational or architectural failure can lead to significant loss of time and resources and this necessitates failover systems or procedures to mitigate a loss scenario. Prior to migrating a database architecture to Kubernetes, it is essential to complete a cost-benefit analysis of running a database cluster on a container architecture versus bare metal, including the potential pitfalls of doing so by evaluating disaster recovery requirements for Recovery Time Objective (RTO) and Recovery Point Objective (RPO). This is especially true in data-sensitive applications that require true high availability, geographic separation for scale and redundancy and low latency in application recovery. In the following walk-thru, we will analyze the various options that are available in Rancher High Availability and Kubernetes in order to design a production quality database.
A. Drawbacks of Container Architectures for Stateful Systems
Containers deployed in a Kubernetes-like cluster are naturally stateless and ephemeral, meaning they do not maintain a fixed identity and they lose and forget data in case of error or restart. In designing a distributed database environment that provides high availability and fault tolerance, the stateless architecture of Kubernetes presents a challenge as both replication and scale out requires state to be maintained for the following: (1) Storage; (2) Identity; (3) Sessions; and (4) Cluster Role.
Consider our containerized database application and we can immediately start to see challenges in going with a stateless architecture as our application is required to fulfill a set of requirements:
Our database is required to store Data and Transactions in files that are persistent and exclusive to each database container;
Each container in the database application is required to maintain a fixed identity as a database node in order that we may route traffic to it by either name, address or index;
Database client sessions are required to maintain state to ensure read-write transactions are terminated prior to state change for consistency and to ensure that state transformations survive failure for durability; and
Each database node requires a persistent role in its database cluster, such as master, replica or shard unless changed by an application-specific event and as necessitated by schema changes.
Transient solutions to these challenges may be to attach a
PersistentVolume to our Kubernetes pods that has a lifecycle independent of any individual pod that uses it. However,
PersistentVolume does not provide a consistent assignment of roles to cluster nodes, i.e. parent, child or seed nodes. The cluster does not guarantee that database states are maintained throughout the application lifecycle, and specifically, that new containers will be created with nondeterministic random names and pods can be scheduled to be started, terminated or scaled at any time and in any order. So our challenge remains.
B. Advantages of Kubernetes for a Distributed Database Deployment
Given the challenges of deploying a distributed database in a Kubernetes cluster, is it even worth the effort? There are a plethora of advantages and possibilities that Kubernetes opens up, including managing numerous database services together with common automated operations to support their healthy lifecycle with recoverability, reliability and scalability. Database clusters may be deployed at a fraction of the time and cost needed to deploy bare metal clusters, even in a virtualized environment.
Stateful Sets provides a way forward from the challenges outlined in the previous section. With Stateful Sets introduced in the 1.5 release, Kubernetes now implements Storage and Identity stateful qualities. The following is ensured:
- Each pod has a persistent volume attached, with a persistent link from pod to storage, solving storage state issue from (A);
- Each pod starts in the same order and terminates in reverse order, solving sessions state issue from (A);
- Each pod has a unique and determinable name, address and ordinal index assigned solving identity and cluster role issue from (A).
C. Deploying Stateful Set Pod with Headless Service
Note: We will use the kubectl service in this section. Consult the Rancher resource here on deploying the kubectl service using Rancher.
Stateful Set Pods require a headless service to manage the network identity of the Pods. Essentially, a headless service has a non-defined Cluster IP address, meaning that no cluster IP is defined on the service. Instead, the service definition has a selector and when the service is accessed, DNS is configured to return multiple address records or addresses. At this point, service fqdn gets mapped to all IPs of all the pod IPs behind that service with the same selector.
Let’s create a Headless Service for Cassandra using this template:
$ kubectl create -f cassandra-service.yml service "cassandra" created
Use get svc to list the attributes of the cassandra service.
$ kubectl get svc cassandra NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE cassandra None <none> 9042/TCP 10s
And describe svc to list the attributes of the cassandra service with verbose output.
$ kubectl describe svc cassandra Name: cassandra Namespace: default Labels: app=cassandra Annotations: <none> Selector: app=cassandra Type: ClusterIP IP: None Port: <unset> 9042/TCP TargetPort: 9042/TCP Endpoints: <none> Session Affinity: None Events: <none>
D. Creating Storage Classes for Persistent Volumes
In Rancher, we can use a variety of options to manage our persistent storage through native Kubernetes API resources,
PersistentVolumeClaim. Storage classes in Kubernetes tells us which storage classes are supported by our cluster. We can use dynamic provisioning for our persistent storage to automatically create and attach volumes to pods. For example, the following storage class will specify AWS as its storage provider and use type
gp2 and availability zone
storage-class.yml kind: StorageClass apiVersion: storage.k8s.io/v1beta1 metadata: name: standard provisioner: kubernetes.io/aws-ebs parameters: type: gp2 zone: us-west-2a
It is also possible to create a new Storage Class, if needed such as:
Kubectl create -f azure-stgcls.yml Storageclass “stgcls” created
Upon creation of a
PersistentVolumeClaim is initiated for the
StatefulSet pod based on its Storage Class. With dynamic provisioning, the
PersistentVolume is dynamically provisioned for the pod according to the Storage Class that was requested in the
You can manually create the persistent volumes via Static Provisioning. You can 了解更多 about Static Provisioning here.
Note: For static provisioning, it is a requirement to have the same number of Persistent Volumes as the number of Cassandra nodes in the Cassandra server.
E. Creating Stateful Sets
We can now create the StatefulSet which will provided our desired properties of ordered deployment and termination, unique network names and stateful processing. We invoke the following command and start a single Cassandra server:
$ kubectl create -f cassandra-statefulset.yml
F. Validating Stateful Set
We then invoke the following command to validate if the Stateful Set has been deployed in the Cassandra server.
$ kubectl get statefulsets NAME DESIRED CURRENT AGE cassandra 1 1 2h
The values under
CURRENT should be equivalent once the Stateful Set has been created. Invoke get pods to view an ordinal listing of the Pods created by the Stateful Set.
$ kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE cassandra-0 1/1 Running 0 1m 172.xxx.xxx.xxx 169.xxx.xxx.xxx
During node creation, you can perform a
nodetool status to check if the Cassandra node is up.
$ kubectl exec -ti cassandra-0 -- nodetool status Datacenter: DC1 =============== Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns (effective) Host ID Rack UN 172.xxx.xxx.xxx 109.28 KB 256 100.0% 6402e90e-7996-4ee2-bb8c-37217eb2c9ec Rack1
G. Scaling Stateful Set
Invoke the scale command to increase or decrease the size of the Stateful Set by replicating the setup in (F) x number of times. In the example, below we replicate with a value of
x = 3.
$ kubectl scale --replicas=3 statefulset/cassandra
get statefulsets to validate if the Stateful Sets have been deployed in the Cassandra server.
$ kubectl get statefulsets NAME DESIRED CURRENT AGE cassandra 3 3 2h
Invoke get pods again to view an ordinal listing of the Pods created by the Stateful Set. Note that as the Cassandra pods deploy, they are created in a sequential fashion.
$ kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE cassandra-0 1/1 Running 0 13m 172.xxx.xxx.xxx 169.xxx.xxx.xxx cassandra-1 1/1 Running 0 38m 172.xxx.xxx.xxx 169.xxx.xxx.xxx cassandra-2 1/1 Running 0 38m 172.xxx.xxx.xxx 169.xxx.xxx.xxx
We can perform a nodetool status check after 5 minutes to verify that the Cassandra nodes have joined and formed a Cassandra cluster.
$ kubectl exec -ti cassandra-0 -- nodetool status Datacenter: DC1 =============== Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns (effective) Host ID Rack UN 172.xxx.xxx.xxx 103.25 KiB 256 68.7% 633ae787-3080-40e8-83cc-d31b62f53582 Rack1 UN 172.xxx.xxx.xxx 108.62 KiB 256 63.5% e95fc385-826e-47f5-a46b-f375532607a3 Rack1 UN 172.xxx.xxx.xxx 177.38 KiB 256 67.8% 66bd8253-3c58-4be4-83ad-3e1c3b334dfd Rack1
We can perform a host of database operations by invoking
CQL once the status of our nodes in
nodetool changes to
H. Invoking CQL for database access and operations
Once we see a status of
U/N we can access the Cassandra container by invoking
kubectl exec -it cassandra-0 cqlsh Connected to Cassandra at 127.0.0.1:9042. [cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4] Use HELP for help. cqlsh> describe tables Keyspace system_traces ---------------------- events sessions Keyspace system_schema ---------------------- tables triggers views keyspaces dropped_columns functions aggregates indexes types columns Keyspace system_auth -------------------- resource_role_permissons_index role_permissions role_members roles Keyspace system --------------- available_ranges peers batchlog transferred_ranges batches compaction_history size_estimates hints prepared_statements sstable_activity built_views "IndexInfo" peer_events range_xfers views_builds_in_progress paxos local Keyspace system_distributed --------------------------- repair_history view_build_status parent_repair_history
I. Moving Forward: Using Cassandra as a Persistence Layer for a High-Availability Stateless Database Service
In the foregoing exercise, we deployed a Cassandra service in a K8s cluster and provisioned persistent storage via
PersistentVolume. We then used
StatefulSets to endow our Cassandra cluster with stateful processing properties and scaled our cluster to additional nodes. We are now able to use a CQL schema for database access and operations in our Cassandra cluster. The advantage of a CQL schema is the ease with which we can use natural types and fluent APIs that makes for seamless data modeling especially in solutions involving scaling and time series data models, such as fraud detection solutions. In addition, CQL leverages partition and clustering keys which increases speed of operation in data modeling scenarios.
In the next sequence in this series, we will explore how we can use Cassandra as our persistence layer in a Database-as-a-Microservice or a stateless database by leveraging the unique architectural properties of Cassandra and using the Rancher toolset as our starting point. We will then analyze the operational performance and latency of our Cassandra-driven stateless database application and evaluate its usefulness in designing high-availability services with low latency between the edge and the cloud.
By combining Cassandra with a microservices architecture, we can explore alternatives to stateful databases, both in-memory SQL databases (such as SAP HANA) prone to poor latency /ifor read/write transactions and HTAP workloads as well as NoSQL databases that are slow in performing advanced analytics that require multi-table queries or complex filters. In parallel, a stateless architecture can deliver improvements on issues that stateful databases face arising from memory exceptions, both due to in-memory indexes in SQL databases and high memory usage in multi-model NoSQL databases. Improvements on both these fronts will deliver better operational performance for massively scaled queries and time-series modeling.
Hisham is a consulting Enterprise Solutions Architect with experience in leveraging container technologies to solve infrastructure problems and deploy applications faster and with higher levels of security, performance and reliability. Recently, Hisham has been leveraging containers and cloud-native architecture for a variety of middleware applications to deploy complex and mission-critical services across the enterprise. Prior to entering the consulting world, Hisham worked at Aon Hewitt, Lexmark and ADP in software implementation and technical support.