17  Amazon Relational Database Service (Amazon RDS)

Now that we’ve covered various data storage types, let’s discuss managing relationships between different data types, which can quickly become a challenge. For instance, consider an e-commerce business that needs to manage customer information, order details, and product inventories. Maintaining relationships between these different data types is crucial but can become complex as the business grows.

17.1 Relational databases

To address this complexity, it’s best to use a relational database management system (RDBMS). An RDBMS stores data in a structured way that relates different pieces of data to one another. For example, customer information can be stored in a customer table, while physical addresses can be stored in an address table. These tables can be related via a common attribute, such as a customer ID, allowing for efficient querying and data management.

SQL (Structured Query Language) is commonly used to query relational databases. Popular database systems supported by AWS include MySQL, PostgreSQL, Oracle, and Microsoft SQL Server. Many businesses run these databases on-premises, but they can be moved to the cloud for greater flexibility and scalability.

17.2 Moving databases to the cloud

One method for moving on-premises databases to the cloud is called Lift-and-Shift, which involves migrating your database to run on Amazon EC2. This allows you to maintain control over variables like the operating system, memory, CPU, and storage capacity, similar to your on-premises environment. Standard migration practices or tools like the Database Migration Service can facilitate this process.

17.3 Amazon RDS: A managed service

A more managed option for running databases in the cloud is Amazon Relational Database Service (Amazon RDS). Amazon RDS supports major database engines and offers numerous benefits, including automated patching, backups, redundancy, failover, and disaster recovery. These features reduce the administrative burden on database administrators, allowing them to focus on solving business problems rather than maintaining databases.

In a relational database, data is stored in a way that relates it to other pieces of data, making it easy to understand, consistent, and scalable. For example, a coffee shop owner can write an SQL query to identify all customers whose most frequently purchased drink is a medium latte.

17.4 Example of data in a relational database

Consider a relational database for an online bookstore. It might include the following tables:

  • Customers: Stores customer information like name, email, and contact details.
  • Books: Stores details about books like title, author, ISBN, and price.
  • Orders: Stores order information, linking customers to the books they purchased, including order date and quantity.

These tables are related through keys, allowing efficient data retrieval and management.

17.5 Amazon RDS

Amazon Relational Database Service (Amazon RDS) enables you to run relational databases in the AWS Cloud. It automates tasks such as hardware provisioning, database setup, patching, and backups, freeing up time to focus on using data to innovate your applications. Amazon RDS can be integrated with other services, such as AWS Lambda, to query your database from a serverless application.

Amazon RDS provides various security options, including encryption at rest (protecting data while it is stored) and encryption in transit (protecting data while it is being sent and received).

17.5.1 Amazon RDS database engines

Amazon RDS supports six database engines, each optimized for different aspects like memory, performance, or input/output (I/O). The supported database engines are:

  • Amazon Aurora
  • PostgreSQL
  • MySQL
  • MariaDB
  • Oracle Database
  • Microsoft SQL Server

By using Amazon RDS, you can leverage the power of these database engines with the added benefits of AWS’s managed services, ensuring efficient and secure data management.

17.5.2 Amazon Aurora

Amazon Aurora is an enterprise-class relational database that is compatible with MySQL and PostgreSQL. It offers significant performance improvements, being up to five times faster than standard MySQL databases and up to three times faster than standard PostgreSQL databases.

17.5.3 Key benefits of Amazon Aurora

  1. Performance:
    • Amazon Aurora is designed to deliver high performance with low latency. It achieves this by optimizing the database engine and reducing unnecessary input/output (I/O) operations. This results in faster query processing and improved overall performance compared to traditional MySQL and PostgreSQL databases.
  2. Cost efficiency:
    • Aurora helps reduce database costs by minimizing unnecessary I/O operations. This means you get more efficient use of your database resources, leading to cost savings. Additionally, Aurora’s pay-as-you-go pricing model ensures that you only pay for the resources you actually use.
  3. High availability:
    • Aurora is built for high availability and reliability. It replicates six copies of your data across three Availability Zones. This multi-AZ replication ensures that your data is always available, even in the event of an infrastructure failure in one of the Availability Zones. Additionally, Aurora continuously backs up your data to Amazon S3, providing an extra layer of data protection.
  4. Scalability:
    • Amazon Aurora can automatically scale storage as needed, up to 128 terabytes per database instance. This means you don’t have to worry about running out of storage space or manually provisioning additional storage. Aurora also supports read replicas, which allows you to distribute read traffic and scale read operations independently from write operations.
  5. Security:
    • Aurora offers robust security features, including encryption at rest and in transit, network isolation using Amazon Virtual Private Cloud (VPC), and fine-grained access control with AWS Identity and Access Management (IAM). These features help ensure that your data is secure and compliant with industry standards.

17.5.4 Use cases for Amazon Aurora

  • High-performance applications: If your applications require high transaction throughput and low latency, Aurora is an excellent choice. Examples include e-commerce platforms, financial applications, and online gaming.
  • Scalable web applications: Aurora’s ability to automatically scale storage and support read replicas makes it ideal for web applications that experience variable traffic patterns and need to handle large amounts of read traffic.
  • Data warehousing and analytics: Aurora’s high performance and scalability make it suitable for data warehousing and analytics workloads, where fast query processing and the ability to handle large datasets are critical.