Setup Continuous Archiving and Point-in-Time Recovery (PITR) with PostgreSQL DB.

Setup Continuous Archiving and Point-in-Time Recovery (PITR) with PostgreSQL DB.

PostgreSQL is a powerful open-source relational database management system. It is widely used in various industries and applications due to its reliability, security, and scalability. One of its most useful features is Point-In-Time-Recovery (PITR), which allows you to recover your database to a specific point in time. PITR is essential in disaster recovery scenarios because it can minimize data loss.

Here are the steps to set up PITR with PostgreSQL:

  • Enable Archiving In the postgresql.conf file, set the archive_mode configuration parameter to on. This will enable archiving.

  • Configure Archive Command
    In the postgresql.conf file, set the archive_command configuration parameter to a command that will copy the WAL files to a safe location. For example, you can use the following command:

# PostgreSQL Version
PostgreSQL 13.10

# directory structure, I used in this tutorial
/u01    - PostgreSQL Data Directory Mount
/u02    - Wal Archives Mount
/backup - Backups Mount

# data dir
/u01/h1_dev_db/data

# create directory for wal_archives
mkdir -p /u02/hl_dev_db/wal_archive
chown postgres:postgres -R /u02/hl_dev_db/wal_archive

vim /u01/h1_dev_db/data/postgresql.conf
...
archive_mode = on
archive_command = 'cp %p /u02/hl_dev_db/wal_archive/%f'
...

This command will copy the WAL file to /u02/hl_dev_db/wal_archive/ directory.


Test PITR

  • Create a Full Backup Take a base backup of your database using the following command:
export DATESTAMP=$(date +%Y-%m-%d)

mkdir -p /backup/hl_dev_db_full_bkp_$DATESTAMP

chown postgres:postgres -R /backup 

# Create full backup using pg_basebackup
pg_basebackup -D /backup/hl_dev_db_full_bkp_2023-03-09 -Ft -z -P -Xs

# -F format = TAR ( t )
# -z compression = gzip
# -P Progress
# -Xs wal method = stream

This will create a backup folder at /backup/hl_dev_db_full_bkp_2023–03–09.

  • Generate Dummy Data
# Get Current Timestamp 
psql -U postgres -c "SELECT CURRENT_TIMESTAMP;"
        current_timestamp         
----------------------------------
 2023-03-08 15:43:15.644345+05:30

# NOTE: We started the work on db at 15:43

# Generate Some Data 
---dummy_data_01.sh
set -e
set -x 

psql -v ON_ERROR_STOP=1 -U postgres <<-EOSQL
  /* employees table */
  CREATE TABLE EMPLOYEES (id SERIAL PRIMARY KEY, first_name VARCHAR(50), last_name VARCHAR(50), email VARCHAR(50), mobile_no BIGINT, date_of_birth DATE);

  /* random series function */
  CREATE FUNCTION get_random_string() RETURNS TEXT LANGUAGE SQL AS \$\$ SELECT STRING_AGG ( SUBSTR ( 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', CEIL (RANDOM() * 52)::integer, 1), '') FROM GENERATE_SERIES(1, 10)\$\$;

  /* insert record function */
  CREATE FUNCTION insert_record() RETURNS VOID LANGUAGE PLPGSQL AS \$\$DECLARE first_name TEXT= INITCAP(get_random_string());
  DECLARE last_name TEXT= INITCAP(get_random_string());
  DECLARE email TEXT= LOWER(CONCAT(first_name, '.', last_name, '@gmail.com'));
  DECLARE mobile_no BIGINT=CAST(1000000000 + FLOOR(RANDOM() * 9000000000) AS BIGINT);
  DECLARE date_of_birth DATE= CAST( NOW() - INTERVAL '100 year' * RANDOM() AS DATE);

  BEGIN
    INSERT INTO EMPLOYEES (first_name, last_name, email, mobile_no, date_of_birth) VALUES (first_name, last_name, email, mobile_no, date_of_birth);
  END;

  \$\$;

  /* generate 1000000 rows of data */
  SELECT insert_record() FROM GENERATE_SERIES(1, 1000000);
EOSQL
---

bash dummy_data_01.sh

# Get Counts
psql -U postgres -c "SELECT COUNT(*) from EMPLOYEES;"
count  
---------
 1000000
(1 row)

# DB SIZE
psql -U postgres -c "SELECT pg_size_pretty( pg_database_size('postgres') );"
133MB 

# Relation SIZE
psql -U postgres -c "SELECT pg_size_pretty( pg_total_relation_size('employees'));"
126MB

psql -U postgres -c "SELECT CURRENT_TIMESTAMP;"
        current_timestamp         
----------------------------------
 2023-03-08 15:44:13.541959+05:30

# NOTE: At 15:45 we only have employees table with size 126MB

# Now create a new database and data to it

psql -U postgres -c "create database test_db;"

# dummy_data_02
psql -U postgres test_db -c "CREATE TABLE peoples (
    id BIGINT GENERATED ALWAYS AS IDENTITY,
    PRIMARY KEY(id),
    hash_firstname TEXT NOT NULL,
    hash_lastname TEXT NOT NULL,
    gender VARCHAR(6) NOT NULL CHECK (gender IN ('male', 'female'))
);
INSERT INTO peoples (hash_firstname, hash_lastname, gender)
SELECT md5(RANDOM()::TEXT), md5(RANDOM()::TEXT), CASE WHEN RANDOM() < 0.5 THEN 'male' ELSE 'female' END FROM generate_series(1, 1000000);"

psql -U postgres test_db -c "SELECT COUNT(*) from peoples;"
  count  
---------
 1000000
(1 row)

psql -U postgres -c "SELECT CURRENT_TIME;"
     current_time      
-----------------------
 15:46:05.012252+05:30
(1 row)

Name       |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |  Size   | Tablespace |                Description                 
-----------+----------+----------+-------------+-------------+-----------------------+---------+------------+--------------------------------------------
 test_db   | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 |                       | 137 MB  | pg_default | 
 postgres  | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 |                       | 133 MB  | pg_default | default administrative connection database
 template0 | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 | =c/postgres          +| 7777 kB | pg_default | unmodifiable empty database
           |          |          |             |             | postgres=CTc/postgres |         |            | 
 template1 | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 | =c/postgres          +| 7777 kB | pg_default | default template for new databases
           |          |          |             |             | postgres=CTc/postgres |         |            | 
(4 rows)
  • Database Recovery

CASE1: Restore the Recovery Target Time

# create new pg_data dir
mkdir -p /u01/hl_dev_db_recovery/data
chown postgres:postgres -R /u01/hl_dev_db_recovery/data
chmod 0750 -R /u01/hl_dev_db_recovery/data

# Untar the backup
tar -xvf /backup/hl_dev_db_full_bkp_2023-03-09/base.tar.gz -C /u01/hl_dev_db_recovery/data/
tar -xvf /backup/hl_dev_db_full_bkp_2023-03-09/pg_wal.tar.gz -C /u01/hl_dev_db_recovery/data/pg_wal/

# For Instance, you want to restore the database for Timestamp 15:45
# at 15:45 : we only have employees table in postgres db

# add the recovery settings to postgresql
# Adding recovery settings 
vim /u01/hl_dev_db_recovery/data/postgresql.conf 
---
restore_command = 'cp /u02/hl_dev_db/wal_archive/%f "%p"'
recovery_target_time = '2023-03-08 15:45:00'
---

# start the database 
pg_ctl start -D /u01/hl_dev_db_recovery/data -w -t 300 -l logfile

# check logs for recovery process
tail -200f /u01/hl_dev_db_recovery/data/log/postgresql-Thu.log

# check database
psql -U postgres -c "\l+"

Name       |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |  Size   | Tablespace |                Description                 
-----------+----------+----------+-------------+-------------+-----------------------+---------+------------+--------------------------------------------
 postgres  | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 |                       | 133 MB  | pg_default | default administrative connection database
 template0 | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 | =c/postgres          +| 7777 kB | pg_default | unmodifiable empty database
           |          |          |             |             | postgres=CTc/postgres |         |            | 
 template1 | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 | =c/postgres          +| 7777 kB | pg_default | default template for new databases
           |          |          |             |             | postgres=CTc/postgres |         |            | 
(4 rows)

# Get Counts
psql -U postgres -c "SELECT COUNT(*) from EMPLOYEES;"
count  
---------
 1000000
(1 row)

# DB SIZE
psql -U postgres -c "SELECT pg_size_pretty( pg_database_size('postgres') );"
133MB 

# Relation SIZE
psql -U postgres -c "SELECT pg_size_pretty( pg_total_relation_size('employees'));"
126MB

CASE2 : Restore to Recovery Target

# stop database
pg_ctl stop -D /u01/hl_dev_db_recovery/data

# change time in postgresql.conf
vim /u01/hl_dev_db_recovery/data/postgresql.conf 
---
recovery_target = 'immediate'
---

# start the database 
pg_ctl start -D /u01/hl_dev_db_recovery/data -w -t 300 -l logfile

# check logs for recovery process
tail -200f /u01/hl_dev_db_recovery/data/log/postgresql-Thu.log

# check counts
psql -U postgres test_db -c "SELECT COUNT(*) from peoples;"
  count  
---------
 1000000
(1 row)

# check database
psql -U postgres -c "\l+"
Name       |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |  Size   | Tablespace |                Description                 
-----------+----------+----------+-------------+-------------+-----------------------+---------+------------+--------------------------------------------
 test_db   | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 |                       | 137 MB  | pg_default | 
 postgres  | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 |                       | 133 MB  | pg_default | default administrative connection database
 template0 | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 | =c/postgres          +| 7777 kB | pg_default | unmodifiable empty database
           |          |          |             |             | postgres=CTc/postgres |         |            | 
 template1 | postgres | UTF8     | en_IN.UTF-8 | en_IN.UTF-8 | =c/postgres          +| 7777 kB | pg_default | default template for new databases
           |          |          |             |             | postgres=CTc/postgres |         |            | 
(4 rows)

That’s it! You have now set up PITR with PostgreSQL. This feature is very helpful in case of a disaster. You can recover your database to a specific point in time and minimize the data loss.


Find something useful? Hold down the 👏 to support and help others find this article. Thanks for reading!!

Follow me on Twitter @akash_Rajvanshi