PostgreSQL

From Segfault
Jump to navigation Jump to search

Installation

    dnf install postgresql-server postgresql postgresql-contrib        # Fedora
apt-get install postgresql postgresql-client                           # Debian, Ubuntu
  pkgin install postgresql15-server                                    # NetBSD (adjust version number as needed)[1]

The data directory needs to be created too:

sudo -u postgres initdb -D /var/lib/postgres/data                        # Any
doas -u pgsql sh -c "PATH=/usr/pkg/bin initdb -D /var/lib/postgres/data" # NetBSD
sudo postgresql-setup --initdb --unit postgresql                         # Fedora

User creation

After the installation, we need add a database user too.

System users

  1. Create a (Unix) system user.
  2. Use the predefined postgres user to connect to the database.
  3. Create a database user within the database.
$ sudo useradd -m -U -s /bin/bash pguser                           # --create-home, --user-group
$ sudo -u postgres LESS="-X" psql                                  # Don't clear the screen after using the pager

Database users

We can also create database-only users, with no relation to system users:

$ sudo -i -u postgres psql

postgres=# create user dummy;
postgres=# \password   dummy                                       # Avoid leaking the password into the command history[2]

postgres=# create database testdb owner dummy;

With that, we should have something like this:

postgres=# select usename, usesysid, usecreatedb, usesuper, passwd from pg_shadow;
 usename  | usesysid | usecreatedb | usesuper |               passwd[3]
----------+----------+-------------+----------+-------------------------------------
 postgres |       10 | t           | t        | 
 someuser |   112244 | f           | f        | md5a4d80eac9ab26a4a2da04125bc2c096a
 dummy    |    16384 | f           | t        | SCRAM-SHA-256$4096:oFQff[....]

postgres=# select datname, datcollate, datctype, datistemplate, datallowconn, dattablespace from pg_database;
  datname  | datcollate  |  datctype   | datistemplate | datallowconn | dattablespace 
-----------+-------------+-------------+---------------+--------------+---------------
 template1 | en_US.UTF-8 | en_US.UTF-8 | t             | t            |          1663
 template0 | en_US.UTF-8 | en_US.UTF-8 | t             | f            |          1663
 postgres  | en_US.UTF-8 | en_US.UTF-8 | f             | t            |          1663
 testdb    | en_US.UTF-8 | en_US.UTF-8 | f             | t            |          1663

Let's connect to our newly created database:

$ sudo -u pguser psql -h localhost -p 5432 testdb
testdb=> \x on \t on
Expanded display is on.
Tuples only is on.

testdb=> show server_version;
server_version | 9.4.4

testdb=> \conninfo 
You are connected to database "testdb" as user "pguser" on host "localhost" at port "5432".
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)

Configuration

Access

With PostreSQL, authentication appears to be a mix of system and application level methods.

For some reason, pg_dump would not work:

$ pg_dump -U pguser -W testdb
Password: 
pg_dump: [archiver (db)] connection to database "testdb" failed: FATAL:  Peer authentication failed for user "pguser"

The logfile had:

pguser@testdb LOG:  provided user name (pguser) and authenticated user name (root) do not match
pguser@testdb FATAL:  Peer authentication failed for user "pguser"
pguser@testdb DETAIL:  Connection matched pg_hba.conf line 90: \
                "local   all             all                                     peer"

So, for local connections, the peer authentication[4] is used:

$ cat /etc/postgresql/9.4/main/pg_hba.conf
[...]
# "local" is for Unix domain socket connections only
local   all             all                                     trust

# IPv4 local connections:
host    all             all             127.0.0.1/32            scram-sha-256

Using pg_dump over the network works:

$ pg_dump -U pguser -h localhost -W testdb | wc -l
Password: 
213

We could also use our database user to run pg_dump:

$ sudo -u pguser pg_dump testdb | wc -l
213

Let's change the local authentication method to password-based:

$ cat /etc/postgresql/9.4/main/pg_hba.conf
[...]
# "local" is for Unix domain socket connections only
local   all             all                                     scram-sha-256

Reload and test:

$ pkill -1 -u pguser postgres
$ pg_dump -U pguser -W testdb | wc -l
Password: 
213

PGROOT

If needed, edit postgresql.service for a different database root:

[Service]
Environment=PGROOT=/data/postgres

Usage

Connect to database postgres, then switch to database testdb

$ sudo -u pguser LESS="-X" psql postgres

postgres=# \c testdb 
You are now connected to database "testdb" as user "pguser".

Create a table, add some rows:

testdb=> create table users (id serial primary key not null, name text not null, email text not null);
CREATE TABLE

This created a table, and a sequence too for our id field:

testdb=> \d
             List of relations
Schema |     Name     |   Type   |  Owner   
-------+--------------+----------+----------
public | users        | table    | postgres
public | users_id_seq | sequence | postgres
(2 rows)

testdb=> insert into users (name, email) values ('alice', 'alice@example.org');
INSERT 0 1

testdb=> insert into users (name, email) values ('bob', 'bob@example.org');
INSERT 0 1

testdb=> select * from users;
 id | name  |       email       
----+-------+-------------------
  1 | alice | alice@example.org
  2 | bob   | bob@example.org
(2 rows)

Again, with different output formatting:

testdb=# \x on \t on
Expanded display is on.
Tuples only is on.

testdb=# select * from users;
id    | 1
name  | alice
email | alice@example.org
------+------------------
id    | 2
name  | bob
email | bob@example.org

Populate with random data:[5]

create table random as select s, md5(random()::text) as one, md5(random()::text) as two from generate_Series(1,5) s;

Display database sizes:

select pg_database.datname, pg_size_pretty(pg_database_size(pg_database.datname)) AS size from pg_database order by size;

Display table sizes, across all databases:[6]

SELECT nspname || '.' || relname AS "relation", pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
 FROM pg_class C
 LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
 WHERE nspname NOT IN ('pg_catalog', 'information_schema')
   AND C.relkind <> 'i'
   AND nspname !~ '^pg_toast'
 ORDER BY pg_total_relation_size(C.oid) DESC
 LIMIT 5;

List databases, tables, indexes:

postgres=# \l
                         List of databases
  Name    | Owner | Encoding  | Collate | Ctype | Access privileges 
----------+-------+-----------+---------+-------+-------------------
postgres  | pgsql | SQL_ASCII | C       | C     | 
usda      | pgsql | SQL_ASCII | C       | C     | 

postgres=# \c usda
You are now connected to database "usda" as user "pgsql".

usda=# \d
        List of relations
Schema |   Name   | Type  | Owner 
-------+----------+-------+-------
public | data_src | table | pgsql
public | datsrcln | table | pgsql

usda=# \di
                     List of relations
Schema |          Name           | Type  | Owner |  Table   
-------+-------------------------+-------+-------+----------
public | data_src_pkey           | index | pgsql | data_src
public | datsrcln_datasrc_id_idx | index | pgsql | datsrcln
public | datsrcln_pkey           | index | pgsql | datsrcln
public | deriv_cd_pkey           | index | pgsql | deriv_cd

Backup, Restore

To backup one database:

pg_dump -U pguser testdb > testdb.sql

Backup roles and tablespaces also:

pg_dumpall -U pguser --globals > pg_globals.sql

To dump all databases, use pg_dumpall:

sudo -u postgres pg_dumpall > pg_all.sql

Let's drop the database and restore the backup:

$ psql -U pguser postgres -c "drop database testdb;"
DROP DATABASE

$ psql -U pguser postgres -c "create database testdb;"
CREATE DATABASE

$ psql -U pguser testdb -f testdb.sql                        # We could also use < instead of -f
$ psql -U pguser testdb -c "select * from users;"
Password for user pguser: 
 id | name  |       email       
----+-------+-------------------
  1 | alice | alice@example.org
  2 | bob   | bob@example.org
(2 rows)

Upgrade

Major Upgrades

Upgrading PostgreSQL (changing major versions[7]) appears to be a big deal, but the Arch Linux wiki has (of course) a neat procedure[8] on how to do this:

Stop postgresql, then install the new version, but keep the old binaries around:

systemctl stop postgresql.service
pacman -S postgresql postgresql-libs postgresql-old-upgrade

Move the old data directory out of the way and generate a new one:

export PGDIR=/var/lib/postgres

mv -iv ${PGDIR}/data{,.old}
mkdir  -p -m0700            ${PGDIR}/{data,tmp}
chown -cR postgres:postgres ${PGDIR}/{data,tmp}

Change to the postgresql user and:

su - postgres
export PGDIR=/var/lib/postgres

cd ${PGDIR}/tmp
initdb -D ${PGDIR}/data

Start the upgrade, providing both the path to old (v15) and new binaries and data directories:

pg_upgrade -b /opt/pgsql-15/bin/ -B /usr/bin/ -d ${PGDIR}/data.old -D ${PGDIR}/data

Note: if this fails with "The source cluster was not shut down cleanly" we have to start, and then properly stop the server again:

/opt/pgsql-15/bin/pg_ctl start -D ${PGDIR}/data.old
/opt/pgsql-15/bin/pg_ctl stop  -D ${PGDIR}/data.old

Once the upgrade is complete, we can start again and run some cleanup tasks:

sudo systemctl start postgresql.service

vacuumdb --all --analyze-in-stages
./delete_old_cluster.sh

Collation version mismatch

Even minor upgrades (e.g. 16.1-4 to 16.1-5) happened to have caused the following, for the second time now:

$ sudo -u postgres psql 
WARNING:  database "postgres" has a collation version mismatch
DETAIL:  The database was created using collation version 2.38, but the operating system provides version 2.39.
HINT:  Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
psql (16.1)
Type "help" for help.

This is even mentioned in the documentation and can be remedied as described:

$ for d in $(sudo -u postgres psql -l | awk '/libc/ {print $1}'); do
   echo "### ${d}"
   psql -U postgres -d "${d}" -c "REINDEX DATABASE ${d}" -c "ALTER DATABASE ${d} REFRESH COLLATION VERSION"
   echo
done

### nextcloud
REINDEX
NOTICE:  changing version from 2.38 to 2.39
ALTER DATABASE
[...]

With that, these messages should be gone.

Links

References

  1. pkgsrc
  2. CREATE USER — define a new database role
  3. Password Authentication
  4. Client Authentication: Authentication Methods
  5. SQL Populate table with random data
  6. PostgreSQL: How to show table sizes
  7. PostgreSQL 10 Released: "This release also marks the change of the versioning scheme for PostgreSQL to a "x.y" format. This means the next minor release of PostgreSQL will be 10.1 and the next major release will be 11."
  8. Upgrading PostgreSQL