Migrating Databases from Heroku Postgres to AWS
Heroku Postgres is powerful in its simplicity. Yet more and more dev teams are migrating to AWS. Let’s look at why, and how to get started.
Many teams host their relational data storage on Heroku. However, over time, a significant number find themselves migrating their Postgres workloads over to AWS. In this article, I'll discuss why many teams are making this jump. I'll also detail the steps to migrate from Heroku Postgres to AWS, some obstacles you might face, and some tricks for easing your migration path.
Why migrate from Heroku Postgres to AWS?
Heroku is the go-to platform for many development teams just beginning their journey into the cloud. I touched on some of the reasons why in my last post on Heroku.
First, Heroku is simple to use and understand. AWS tends to overwhelm users with features. Heroku's features are clean and often much easier to use than their AWS counterparts.
Second, Heroku's pricing model is straightforward. AWS pricing for individual products often involves multiple factors like usage and data transfer rates. This complexity makes it hard for teams to predict exactly how much they'll spend month over month. By contrast, Heroku offers all-in-one pricing packages that make cloud spend predictable and easy to estimate.
Heroku brings these upsides to its Postgres hosting offering as well. So it's not surprising that many teams go with Heroku for their initial launch.
But, as I also discussed last time, using Heroku has significant downsides as well. When it comes to Postgres, Heroku falls short on several fronts:
Speed. As I discussed previously, Heroku's pre-packaged offerings generally don't provide as much compute capacity as is available on a cloud provider such as AWS.
Pricing. While Heroku's pricing is easier to calculate, it tends to be more expensive than what you can get with a carefully managed offering on a cloud provider. (That's not surprising - Heroku is using AWS itself and simply charging an overhead for its ease of use!)
Security. Heroku Postgres instances are publicly reachable on the Internet. You can create much more secure architectures directly on AWS. At TinyStacks, for example, we create all Amazon RDS instances for our customers' stacks in a private subnet for enhanced security.
Regional availability. AWS supports spinning up compute in 20+ regions around the world. Heroku offers two "regions": The US and Europe. That makes it less attractive for teams servicing a global user base.
Data residency. Dozens of countries have enacted laws that require some or all data on their citizens to be kept within the country's borders. With its limited regional reach and services (like Heroku Postgres) that rely on storage in US data centers, Heroku can't meet many - if any - of those requirements.
Permissions. Some operations in PostgreSQL require root, or superuser, access. Heroku limits you to an access level "one step below the superuser", in their own words. This means that some PostgreSQL operations - such as creating new types and functions - aren't allowed. Unfortunately, there is no way to resolve this, as Heroku does not grant superuser access to PostgreSQL on any of its plans.
Given these drawbacks, it's little wonder some teams who started small on Heroku would be looking to jump ship to AWS.
How to migrate from Heroku Postgres to AWS
Fortunately, migrating from Heroku Postgres to AWS is a fairly painless process, although it does involve some coordination.
Prerequisites
For this walk-through, I set up a Heroku app with a PostgreSQL database and loaded some sample data. The data set I used is food data from the United States Department of Agriculture.
Setting this up is outside of the scope of this article. If you're interested in detailed instructions, check out my post on Hashnode.
Install Postgres tools
To migrate data from one Postgres instance to another, you'll need access to the Postgres command line tools. These tools should be automatically available if you are on a Linux Ubuntu system. For other operating systems, see the instructions for your OS on the PostgreSQL Web site.
Note that, if you're using Windows, the PostgreSQL installer may not add its binaries directory to your system PATH. I had to add C:\Program Files\PostgreSQL\14\bin to my PATH variable in order to use the PostgreSQL tools.
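Before going further, it can help to confirm that the client tools are actually reachable from your shell. Here's a minimal sketch for a POSIX shell (on Windows, that means something like Git Bash or WSL). The helper name missing_tools is my own invention, not part of PostgreSQL.

```shell
# Print the name of every listed command that is not found on PATH.
# missing_tools is a hypothetical helper name, not a PostgreSQL utility.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# The three client tools this walk-through relies on.
missing=$(missing_tools psql pg_dump pg_restore)
if [ -n "$missing" ]; then
  echo "Not found on PATH:"
  echo "$missing"
else
  echo "All PostgreSQL client tools found."
fi
```

If anything is reported missing, install the PostgreSQL client package for your OS or adjust your PATH before continuing.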
Create your AWS Postgres database using Amazon RDS
The first step is creating your target database. This means creating an Amazon RDS database instance.
For this walk-through, I followed the instructions for creating a PostgreSQL instance on AWS's web site. I used the following options during creation:
- Select PostgreSQL as the engine.
- Create a new VPC/DB subnet group and select Public Access.
- For VPC security group, select Create new.
This creates a database that is exposed to the Internet. This isn't the world's most secure setup but will work for demonstration purposes. If this were a production database, we would put the database in a private subnet in a VPC and use a bastion host to connect to it. (Note that you can easily get this more secure setup with your app and Postgres database by using TinyStacks. I talk about that a little bit more below.)
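I used the AWS console, but if you prefer the command line, a roughly equivalent setup with the AWS CLI might look like the sketch below. The instance identifier, instance class, storage size, and password are all assumptions of mine rather than values from this walk-through. The script only prints the commands unless you set DRY_RUN=0, since creating an instance costs money.

```shell
# Dry-run helper: print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Hypothetical values; adjust them to your own environment.
DB_ID="${DB_ID:-heroku-migration-demo}"
DB_PASSWORD="${DB_PASSWORD:-change-me}"

create_demo_instance() {
  # Create a small, publicly accessible PostgreSQL instance (demo only).
  run aws rds create-db-instance \
    --db-instance-identifier "$DB_ID" \
    --engine postgres \
    --db-instance-class db.t3.micro \
    --allocated-storage 20 \
    --master-username postgres \
    --master-user-password "$DB_PASSWORD" \
    --publicly-accessible
  # Block until the instance is ready to accept connections.
  run aws rds wait db-instance-available --db-instance-identifier "$DB_ID"
}

create_demo_instance
```

As with the console setup, this produces an Internet-facing database, so treat it as a demonstration rather than a production configuration.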
Once the instance was created, I verified that I could connect to my database using the psql command.
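A connection check along the following lines is one way to do this. Treat it as a sketch: the endpoint shown is a made-up placeholder, and the postgres user name assumes the master user name used elsewhere in this article. Commands are only printed unless you set DRY_RUN=0.

```shell
# Dry-run helper: print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Hypothetical endpoint; copy the real one from the RDS console.
RDS_HOST="${RDS_HOST:-mydb.example.us-east-1.rds.amazonaws.com}"

check_connection() {
  # pg_isready reports whether the server is accepting connections.
  run pg_isready -h "$RDS_HOST" -p 5432
  # Running a trivial query confirms that authentication works as well.
  run psql -h "$RDS_HOST" -U postgres -c "SELECT version();"
}

check_connection
```

If pg_isready succeeds but psql times out or is refused, the security group attached to the instance is a likely place to look.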
Stop writes to your Heroku Postgres database
You can't migrate a database if your applications are actively writing data to it. (Well, you can, but the results might be...messy.) You will need either to take your application offline briefly or put your app into read-only mode while you cut over to your Amazon RDS database.
Heroku users can do this easily by switching their app to maintenance mode. In maintenance mode, users who visit your app will see a message saying that the application is temporarily offline.
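From the command line, that step might look like the sketch below. The app name is a placeholder. As far as I know, maintenance mode only blocks web traffic and leaves background dynos running, so the sketch also scales down a process type named worker. That name is an assumption; use whatever your Procfile defines. Commands are only printed unless you set DRY_RUN=0.

```shell
# Dry-run helper: print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

APP_NAME="${APP_NAME:-my-heroku-app}"   # hypothetical app name

stop_writes() {
  # Show a maintenance page instead of routing web requests to the app.
  run heroku maintenance:on -a "$APP_NAME"
  # Scale background dynos to zero so they cannot write either.
  # "worker" is an assumed process type name.
  run heroku ps:scale worker=0 -a "$APP_NAME"
}

stop_writes
```

Running heroku maintenance:off and scaling the workers back up reverses both steps once the cutover is complete.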
Export your data from Heroku Postgres
Next, you'll need to export your data from the Heroku Postgres instance.
There are several ways to do this. Heroku themselves advise one of two methods:
- For data sets under 20GB, users can utilize the Heroku Postgres backup feature to generate a backup file.
- For larger data sets, Heroku recommends creating a short-lived fork of your database and generating a dump file using pg_dump.
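I didn't need the fork route for this walkthrough, but a rough sketch of it follows. Everything specific here is an assumption: the standard-0 plan, the app name, and the FORK_URL placeholder (in practice you'd read the fork's connection URL from your app's config vars). The exact placement of the --fork flag may also differ between Heroku CLI versions, so check heroku addons:create --help first. Commands are only printed unless you set DRY_RUN=0.

```shell
# Dry-run helper: print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

APP_NAME="${APP_NAME:-my-heroku-app}"    # hypothetical app name
FORK_PLAN="${FORK_PLAN:-standard-0}"     # assumed plan for the fork
# Placeholder; use the fork's real connection URL from your config vars.
FORK_URL="${FORK_URL:-postgres://user:pass@fork-host:5432/dbname}"

dump_from_fork() {
  # Provision a fork of the app's primary database.
  run heroku addons:create "heroku-postgresql:$FORK_PLAN" --fork DATABASE_URL -a "$APP_NAME"
  # Wait until the fork has finished provisioning.
  run heroku pg:wait -a "$APP_NAME"
  # Write a custom-format archive without Heroku-specific owners or ACLs.
  run pg_dump -Fc --no-acl --no-owner -f fork.dump "$FORK_URL"
}

dump_from_fork
```

Remember to remove the fork afterward so that you aren't billed for a database you no longer need.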
For this walkthrough, my data was small enough that I created a backup. From my Heroku application directory, I ran the following command:
heroku pg:backups:capture
This created a backup. Once that was done, I downloaded the backup file using the following command:
heroku pg:backups:download
This downloaded a file named latest.dump to my Heroku application directory.
Import your data into Amazon RDS
Next, you'll need to import your data into Amazon RDS. Before you do, make sure your Postgres database is configured correctly. AWS has some great documentation on how to configure your Amazon RDS Postgres instance to ensure import occurs smoothly.
You'll also need a database to import into. For this walkthrough, I created a database named usda. To do that, I connected to my PostgreSQL instance on Amazon RDS using the following command:
psql -h <database-name>.rds.amazonaws.com -U postgres
After entering my password, I created the database like so:
CREATE DATABASE usda;
After that, import your data. For a backup, you can do this using the Postgres pg_restore command. I used the following command:
pg_restore -h <database-name>.rds.amazonaws.com -U postgres -d usda --no-acl --no-owner latest.dump
The command options do the following:
- -h: The hostname of your Amazon RDS instance.
- -U: The username with which to connect.
- -d: The name of the database I created in the previous step.
- --no-acl and --no-owner: The dump file contains access control list (ACL) and table ownership information that won't exist in your new database, as these are specific to Heroku's environment. Specifying these flags ignores these permissions assignments.
- latest.dump: The export file we downloaded earlier.
For a dump created using pg_dump, you can just redirect the dump file into the database from the command line. See the Postgres site for more details.
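If you aren't sure which kind of file you have, one heuristic is to look at its first bytes: custom-format archives (the kind pg_restore reads) begin with the characters PGDMP, while a plain SQL script is just text. The sketch below uses that check to pick the restore tool. The endpoint is a made-up placeholder and the function names are my own. Commands are only printed unless you set DRY_RUN=0.

```shell
# Dry-run helper: print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Hypothetical endpoint; usda is the database created earlier.
RDS_HOST="${RDS_HOST:-mydb.example.us-east-1.rds.amazonaws.com}"
DB_NAME="${DB_NAME:-usda}"

# Custom-format archives begin with the five bytes "PGDMP".
is_custom_dump() {
  [ "$(head -c 5 "$1" 2>/dev/null)" = "PGDMP" ]
}

restore_dump() {
  dump_file="$1"
  if is_custom_dump "$dump_file"; then
    run pg_restore -h "$RDS_HOST" -U postgres -d "$DB_NAME" \
      --no-acl --no-owner "$dump_file"
  else
    # Plain SQL scripts are replayed by psql; -f reads the script file.
    run psql -h "$RDS_HOST" -U postgres -d "$DB_NAME" -f "$dump_file"
  fi
}

# Only attempt a restore if the backup file is actually present.
if [ -f latest.dump ]; then
  restore_dump latest.dump
fi
```

Passing the script with -f has the same effect as redirecting it into psql with <, and it makes the dry-run output easier to read.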
To verify that the import was successful, reconnect to your Amazon RDS PostgreSQL instance, this time specifying the database name:
psql -h <database-name>.rds.amazonaws.com -U postgres -d usda
You can now query your data like we did on Heroku and you should see the same results:
SELECT * from food_des;
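Eyeballing query results works for a small data set. For a slightly more systematic check, you could compare row counts between the two databases. The sketch below assumes two environment variables of my own naming, SOURCE_URL and TARGET_URL, that hold the connection URLs for Heroku Postgres and Amazon RDS respectively. It does nothing if they aren't set.

```shell
# Report whether two row counts agree; returns non-zero on a mismatch.
compare_counts() {
  if [ "$1" = "$2" ]; then
    echo "OK: both databases report $1 rows"
  else
    echo "MISMATCH: source has $1 rows, target has $2 rows"
    return 1
  fi
}

# Count the rows of one table. $1 is a connection URL, $2 a table name.
# -A (unaligned) and -t (tuples only) make psql print the bare number.
count_rows() {
  psql "$1" -At -c "SELECT count(*) FROM $2;"
}

# SOURCE_URL and TARGET_URL are assumed variables; skip the check if unset.
if [ -n "${SOURCE_URL:-}" ] && [ -n "${TARGET_URL:-}" ]; then
  compare_counts "$(count_rows "$SOURCE_URL" food_des)" \
                 "$(count_rows "$TARGET_URL" food_des)"
fi
```

Matching counts don't prove that every row is identical, but a mismatch is a quick signal that the import needs another look.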
If you want a more visual representation of your data, you can use a tool like pgAdmin. Follow the instructions from AWS on how to connect pgAdmin to your RDS instance. Once connected, find your database in the pgAdmin browser. Under Schemas->Tables, right-click food_des and then select View/Edit Data->All Rows.
Redirect Heroku Postgres calls to your Amazon RDS instance
Next, you'll need to tell your app running on Heroku to connect to the database hosted on AWS rather than the Heroku Postgres endpoint. This requires a small configuration change that Heroku documents on their Web site. The broad steps are as follows:
- Download the Amazon RDS CA certificate from AWS's S3 bucket.
- Add the certificate to the Git repo connected to your Heroku app.
- Re-deploy your app.
- Set the Heroku app configuration to connect to your Amazon RDS-hosted PostgreSQL instance:
heroku config:set DATABASE_URL="postgres://username:password@hostname/dbname?sslrootcert=config/amazon-rds-ca-cert.pem" -a <app_id>
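- Require SSL for all connections to your Amazon RDS instance. On Amazon RDS for PostgreSQL, you do this by setting the rds.force_ssl parameter to 1 in the instance's DB parameter group. (PostgreSQL has no equivalent of MySQL's GRANT ... REQUIRE SSL statement.)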
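Pulled together for a PostgreSQL target, these steps might look like the sketch below. Several things here are assumptions on my part: the app name, the parameter group name (it must be a custom group, since default groups can't be modified), the placeholder credentials, and the choice of AWS's global CA bundle as the certificate. Committing the certificate to your repo and re-deploying are left out. Commands are only printed unless you set DRY_RUN=0.

```shell
# Dry-run helper: print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

APP_NAME="${APP_NAME:-my-heroku-app}"          # hypothetical app name
PARAM_GROUP="${PARAM_GROUP:-my-pg-params}"     # hypothetical custom group
CERT_PATH="config/amazon-rds-ca-cert.pem"

# Assemble a PostgreSQL connection URL: user, password, host, database.
# Special characters in the password must already be percent-encoded.
build_database_url() {
  printf 'postgres://%s:%s@%s/%s?sslrootcert=%s\n' \
    "$1" "$2" "$3" "$4" "$CERT_PATH"
}

cut_over() {
  # Download the RDS certificate bundle into the app's config directory.
  run curl -fsSL -o "$CERT_PATH" \
    https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem
  # Point the app at the Amazon RDS database (placeholder credentials).
  run heroku config:set \
    "DATABASE_URL=$(build_database_url username password hostname dbname)" \
    -a "$APP_NAME"
  # Reject non-SSL connections; applied at the next reboot in this form.
  run aws rds modify-db-parameter-group \
    --db-parameter-group-name "$PARAM_GROUP" \
    --parameters "ParameterName=rds.force_ssl,ParameterValue=1,ApplyMethod=pending-reboot"
}

cut_over
```

Whether your app honors the sslrootcert setting in the URL depends on the database driver it uses, so test the connection before you bring the app back online.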
Take your Heroku app out of maintenance mode
Finally, take your Heroku app out of maintenance mode so it starts serving traffic again.
Congratulations! You've successfully migrated your PostgreSQL database from Heroku to AWS!
Challenges of migrating to AWS
Diving straight into using AWS can be a challenge for teams that have been using Heroku exclusively. As I discussed, AWS, with its multitude of features, can be intimidating for users who are new to the cloud. Even creating an Amazon RDS instance presents you with a dizzying array of options and technical terms you may not be familiar with.
Many people who migrate to AWS also often struggle to contain costs. Hidden charges, such as data transfer rates, can cause your AWS bill to bloat quickly.
How TinyStacks can help
There are many benefits to moving from Heroku to AWS. These include speed, scalability, security, features, global reach, and cost.
The problem is that a large gulf exists between the two services. Jumping from Heroku to AWS requires expertise that many teams don't have. And in many cases, their existing staff members don't have the time to develop that expertise. They already have their hands full creating new features for end users!
We created TinyStacks to address this gulf. Using TinyStacks, you can deploy your app just like you would on Heroku. Our dashboard makes it easy to create and manage a full publishing pipeline for your applications.
Even better? We host your app on an AWS account that you own. And we constantly optimize our system to host our customers' code at the lowest possible cost. You get all of the scalability, security, and flexibility of running on AWS without having to become an AWS expert.
If you're ready to make the leap from Heroku to AWS, contact us today!