OmniPITR - omnipitr-restore

USAGE

/some/path/omnipitr/bin/omnipitr-restore [options] %f %p

Options:

--data-dir (-D)

Where PostgreSQL datadir is located (path) (defaults to current working directory).

--source (-s)

Where omnipitr-restore can find wal segments to use.

Check "Source specification" for more details.

--recovery-delay (-w)

Delay when recovering wal segments (in seconds).

This is primarily used to keep a safety window before something like an accidental DELETE FROM main_table is applied on the slave database.

--finish-trigger (-f)

Name of file to watch for existence - if it exists, recovery process will stop, and PostgreSQL slave will become fully functional.

Check "Finishing recovery" section for more details.

--remove-unneeded (-r) [optional name]

Makes omnipitr-restore remove unneeded wal segments. These are not simply the segments that were already passed to PostgreSQL - omnipitr-restore checks the last redo segment to make sure removal is safe.

This option takes an optional value: the last segment still needed in the walarchive (i.e. everything earlier than it can be removed). It is designed this way so that you can put:

omnipitr-restore ... -r %r ...

as your restore_command in recovery.conf. %r works in PostgreSQL since 8.3 version, so if you're using older Pg, you can't use it (though you can still use -r without value, and let omnipitr-restore find out what it can delete on its own).

--remove-before (-rb)

If --remove-unneeded is enabled, this option makes sure that segment removal will happen also before returning segment to PostgreSQL.

If there are many unapplied WAL segments in the archive, then without --remove-before old, obsolete files will be removed only after PostgreSQL catches up with replication. With --remove-before, old files are removed while it is still catching up.

--remove-at-a-time (-rt)

When removing old segments, remove at most that many segments before re-entering loop to check for signals and/or new segment availability for Postgres.

Defaults to 3.

--removal-pause-trigger (-p)

Name of file to watch for existence. If it exists, omnipitr-restore will not remove unneeded wal segments, regardless of the --remove-unneeded option. This provides a way to make backups on the slave.

--pre-removal-processing (-h)

If given, argument will be treated as shell command to run when any segment will be removed from archive.

If the hook will finish without errors - segment will be removed. If there will be errors - removal procedure will be postponed, and after some time, it will be retried.

There will be one extra parameter attached, which will be name of the segment file to be processed (prepared in such a way that it will be relative to current working directory).

Passed segment will always be uncompressed.
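
As an illustration only, a hook could look like the sketch below. The function name keep_segment and the destination directory argument are assumptions made for this example and are not part of OmniPITR. The only things taken from the description above are that the segment path arrives as the last argument, and that a non-zero exit status postpones removal.

```shell
# Hypothetical pre-removal hook: copy the segment to permanent storage.
# $1 - destination directory (an assumption, supplied in the hook command line)
# $2 - segment path, appended by omnipitr-restore as the last argument
keep_segment() {
    dest_dir=$1
    segment=$2

    # Fail (so that removal is postponed and retried) if the segment is missing.
    [ -f "$segment" ] || return 1

    # Fail as well if the copy does not succeed.
    cp -- "$segment" "$dest_dir/" || return 1
}
```

Saved as a script that ends with keep_segment "$@" (for example under the hypothetical path /usr/local/bin/keep-segment.sh), it could be passed as -h "/usr/local/bin/keep-segment.sh /mnt/permanent".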

--temp-dir (-t)

Where to create temporary files (defaults to /tmp or $TMPDIR environment variable location). This is only used when using pre-removal-processing.

--log (-l)

Name of logfile (actually a template, as it supports strftime(3) markers). Unfortunately, because PostgreSQL itself uses %x markers in restore_command, % markers cannot be used directly. Instead, any occurrence of the ^ character in the log file name will first be changed to %, and the result will then be passed to strftime.

Please note that on some systems (Solaris for example) default shell treats ^ as special character, which requires you to quote the log filename (if it contains ^ character). So you'd better write it as:

--log '/var/log/omnipitr-^Y-^m-^d.log'
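
To preview which file name a given template will produce, the substitution can be imitated in the shell. This is only a sketch of the rule described above (change ^ to %, then expand the strftime markers, here using date(1)); expand_log_template is a hypothetical helper name, and this is not how omnipitr-restore itself is implemented.

```shell
# Hypothetical helper: show the file name a --log template would expand to today.
expand_log_template() {
    # Change every ^ into %, as described for the --log option.
    fmt=$(printf '%s' "$1" | tr '^' '%')

    # Let date(1) expand the resulting strftime markers.
    date "+$fmt"
}

# Example call (prints a name like /var/log/omnipitr-YYYY-MM-DD.log for the current day):
# expand_log_template '/var/log/omnipitr-^Y-^m-^d.log'
```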

--streaming-replication (-sr)

This option should be used if you're setting up a streaming replication slave. It causes omnipitr-restore to die as soon as it is called for a WAL segment that is not in the walarchive.

--pid-file

Name of file to use for pidfile. If it is specified, then only one copy of omnipitr-restore (with this pidfile) can run at the same time.

Trying to run second copy of omnipitr-restore will result in an error.

--verbose (-v)

Log verbosely what is happening.

--gzip-path (-gp)

Full path to gzip program - in case you can't set proper PATH environment variable.

--bzip2-path (-bp)

Full path to bzip2 program - in case you can't set proper PATH environment variable.

--lzma-path (-lp)

Full path to lzma program - in case you can't set proper PATH environment variable.

--pgcontroldata-path (-pp)

Full path to pg_controldata program - in case you can't set proper PATH environment variable.

--error-pgcontroldata (-ep)

Sets handler for errors with pgcontroldata. Possible options:

  • ignore - warn in logs, but nothing else to be done - after some time, recheck if pg_controldata works

  • hang - enter infinite loop, waiting for admin interaction, but not finishing recovery

  • break - breaks recovery, and returns error status to PostgreSQL (default)

Please check ERRORS section below for more details.

--version (-V)

Prints version of omnipitr-restore, and exits.

--help (-?)

Prints this manual, and exits.

--config-file (--config / --cfg)

Loads options from config file.

Format of the file is very simple - each line is treated as argument with optional value.

Examples:

--verbose
--host 127.0.0.1
-h=127.0.0.1
--host=127.0.0.1

Note that values do not need to be quoted in the config file - the value always extends to the end of the line (trailing spaces will be removed). So if you wanted, for example, to set magic-option to "/mnt/badly named directory", you would need to quote it when setting it from the command line:

/some/omnipitr/program --magic-option="/mnt/badly named directory"

but not in config:

--magic-option=/mnt/badly named directory

Empty lines, and comment lines (starting with #) are ignored.
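
The rules above can be illustrated with a small shell sketch that reads such a file and prints each option with its value. It assumes that the option name ends at the first space or = sign, as in the examples above; parse_omnipitr_config is a hypothetical helper written for this illustration and is not OmniPITR's actual parser.

```shell
# Hypothetical helper: print "option<TAB>value" for every effective config line.
parse_omnipitr_config() {
    while IFS= read -r line || [ -n "$line" ]; do
        # Trailing spaces are removed from every line.
        line=$(printf '%s' "$line" | sed 's/[[:space:]]*$//')

        # Empty lines and comment lines (starting with #) are ignored.
        case "$line" in
            ''|'#'*) continue ;;
        esac

        case "$line" in
            *=*|*' '*)
                # Option name ends at the first "=" or space;
                # the value is everything after it, up to the end of the line.
                opt=${line%%[= ]*}
                val=${line#*[= ]}
                ;;
            *)
                # Option without a value, e.g. --verbose
                opt=$line
                val=
                ;;
        esac

        printf '%s\t%s\n' "$opt" "$val"
    done < "$1"
}
```

Run against a file containing the line --magic-option=/mnt/badly named directory, the sketch reports the whole path, spaces included, as the value.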

DESCRIPTION

Call to omnipitr-restore should be in restore_command variable in recovery.conf.

Which options should be given depends only on installation, but generally you will need at least:

  • --data-dir

    PostgreSQL "%p" passed file path is relative to DATADIR, so it is required to know it.

  • --log

    to make sure that information is logged someplace about archiving progress

  • --source

    to specify where to load WAL segments from

If you'll specify more than 1 destination, you will also need to specify --state-dir

Of course you can provide many --dst-local or many --dst-remote or any mix of these.

Generally omnipitr-restore will try to deliver WAL segment to all destinations, and will fail if any of them will not accept new segment.

Segments will be transferred to destinations in this order:

1. All local destinations, in order provided in command line
2. All remote destinations, in order provided in command line

In case any destination will fail, omnipitr-restore will save state (which destinations it delivered the file to) and return error to PostgreSQL - which will cause PostgreSQL to call omnipitr-restore again for the same WAL segment after some time.

State directory will be cleared after every successful file send, so it should stay small in size (expect 1 file of under 500 bytes).

When constructing command line to put in restore_command PostgreSQL GUC, please remember that while providing "%p" "%f" will work, omnipitr-restore requires only "%p"

Source specification

If the wal segments are compressed you have to prefix source path with compression type followed by '=' sign.

Allowed compression types:

  • gzip

    Decompresses with gzip program, used file extension is .gz

  • bzip2

    Decompresses with bzip2 program, used file extension is .bz2

  • lzma

    Decompresses with lzma program, used file extension is .lzma

If you want to pass any extra arguments to compression program, you can either:

  • make a wrapper

    Write a program/script that will be named in the same way your actual compression program is named, but adding some parameters to call

  • use environment variables

    All of supported compression programs use environment variables:

    • gzip - GZIP

    • bzip2 - BZIP2

    • lzma - XZ_OPT

    For details - please consult the manual of your chosen compression tool.
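
To make the prefix rule concrete, the sketch below splits a source specification into compression type, path and expected file extension, using only the three types listed above. describe_source is a hypothetical helper name used for illustration; it is not part of OmniPITR.

```shell
# Hypothetical helper: print "<compression> <path> <extension>" for a --source value.
describe_source() {
    spec=$1
    case "$spec" in
        gzip=*)  printf 'gzip %s .gz\n'   "${spec#gzip=}" ;;
        bzip2=*) printf 'bzip2 %s .bz2\n' "${spec#bzip2=}" ;;
        lzma=*)  printf 'lzma %s .lzma\n' "${spec#lzma=}" ;;
        # No prefix: segments are assumed to be stored uncompressed.
        *)       printf 'none %s -\n'     "$spec" ;;
    esac
}

# Example calls:
# describe_source gzip=/mnt/wal_restore/
# describe_source /mnt/wal_restore/
```

With gzip-compressed segments the option would then be written as -s gzip=/mnt/wal_restore/.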

Finishing recovery

There are 2 ways omnipitr-restore can finish recovery, and there are 2 separate ways to signal it that it should finish.

First, the finishing procedures:

  • smart

    In this mode omnipitr-restore will feed all available WAL segments to PostgreSQL (without any --recovery-delay induced delay), and then finish restoration process.

  • immediate

    In this mode omnipitr-restore will skip all pending WAL segments, and make PostgreSQL finish recovery immediately.

    This can be useful after a really bad query has been run (think: TRUNCATE users), when you want to prevent this change from being replicated to the slave.

Now, omnipitr-restore can be signaled into finishing recovery in 2 ways, one of which is optional.

  • trigger file

    This one is optional. If you will use --finish-trigger switch, omnipitr-restore will look for this file, and if it exists - it will start finishing.

    If the file exists, and contains string "NOW" (without quotation characters, but with optional new line character "\n"), omnipitr-restore will enter "immediate finish" procedure. If the content is different, or the file is empty - it will proceed in smart finish mode.

    After OmniPITR will finish recovery, and PostgreSQL will enter normal mode of working, it's strongly advised to remove this file.

  • system signal

    This one works always, regardless of --finish-trigger switch. Generally you can send system signals (kill) to omnipitr-restore to make it go to finish recovery procedure.

    Only 1 signal is supported:

    • SIGUSR1

      makes the finish immediate

    It is currently not possible to force 'smart' finishing by signal, due to the fact that omnipitr-restore is restarted after every segment.
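
Both ways of requesting a finish can be sketched in the shell. The helper name request_finish is an assumption made for this example; the trigger file path must be the same one that was passed to --finish-trigger.

```shell
# Hypothetical helper: create the finish trigger file.
# $1 - trigger file path (the one given to --finish-trigger)
# $2 - "smart" (default) or "immediate"
request_finish() {
    trigger=$1
    mode=${2:-smart}
    case "$mode" in
        # A file containing NOW requests the immediate finish procedure.
        immediate) printf 'NOW\n' > "$trigger" ;;
        # An empty file (or any other content) requests a smart finish.
        smart)     : > "$trigger" ;;
        *)         echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# Signal alternative (immediate finish only), where $pid is the process id
# of the running omnipitr-restore:
# kill -USR1 "$pid"
```

For example, request_finish /tmp/finish.trigger immediate writes NOW into the trigger file used in the EXAMPLES section. Remember to remove the file once PostgreSQL is working normally.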

Segment removal

omnipitr-restore will automatically remove segments that are no longer necessary.

To make it happen, it will periodically run pg_controldata program, and check name of last segment required for redo.

If pre-removal-processing is defined, it will be called before actual removal.

omnipitr-restore will remove segments chronologically - oldest segments first.

One useful idea for pre-removal-processing, is using omnipitr-archive for processing - to send xlog segments to some permanent storage place.
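
The removal order can be approximated by hand with the sketch below. It assumes that files in the archive are named with the usual 24 hexadecimal digit WAL segment name (optionally followed by a compression extension), that all of them belong to one timeline, and that the name of the last redo segment is already known (recent pg_controldata versions print it in the "Latest checkpoint's REDO WAL file" line). list_removable_segments is a hypothetical helper, not something omnipitr-restore provides, and it only lists candidates without deleting anything.

```shell
# Hypothetical helper: list archive files older than the last redo segment,
# oldest first.
# $1 - wal archive directory
# $2 - name of the last segment still required for redo
list_removable_segments() {
    archive_dir=$1
    redo_segment=$2

    for path in "$archive_dir"/*; do
        [ -f "$path" ] || continue
        name=${path##*/}
        base=${name%%.*}          # strip .gz / .bz2 / .lzma if present

        # Skip anything that does not look like a WAL segment name.
        case "$base" in
            *[!0-9A-F]*) continue ;;
        esac
        [ "${#base}" -eq 24 ] || continue

        # Within one timeline, string order of names matches segment order.
        if [ "$base" \< "$redo_segment" ]; then
            printf '%s\n' "$name"
        fi
    done | sort
}
```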

ERRORS

pg_controldata

Sometimes, for not yet known reason, pg_controldata fails (doesn't print anything, exits with status -1).

In such a situation, omnipitr-restore used to die too, with an error, which had one bad consequence: it made PostgreSQL assume that it should stop recovery, go directly into standalone mode, and start accepting connections.

Because this behaviour is undesirable, there is now a switch (--error-pgcontroldata) to change what happens in case of such an error.

It has 3 possible values.

break

"Break" means break recovery, and return to PostgreSQL error. This is default behaviour to make it backward compatible.

ignore

With this error handling, omnipitr-restore will simply ignore all errors from pg_controldata - i.e. it will print information about the error to logs, but it will not change anything - it will still try to work on the next wal segments, and after 5 minutes it will retry running pg_controldata.

This is the most transparent setting, but also kind of dangerous. It means that if there is a non-permanent problem that makes pg_controldata fail only some of the time, it might simply get overlooked, because replication will keep working just fine.

And this can mean that the real source of the error can propagate and do more harm.

hang

With "hang" error handling, in case of pg_controldata failure, omnipitr-restore will basically hang - i.e. enter infinite loop, not doing anything.

Of course before entering infinite loop, it will log information about the problem.

While this might seem pointless, it has two benefits:

  • It will not cause slave server to become standalone

  • It is easily detectable, as long as you're checking size of wal archive directory, or wal replication lag, or any other metric about replication - as replication will be 100% stopped.

To recover from a hung recovery, you have to restart PostgreSQL, i.e. run this sequence of commands (or something similar depending on what you're using to start/stop your PostgreSQL):

pg_ctl -D /path/to/pgdata -m fast stop
pg_ctl -D /path/to/pgdata start

Of course when next pg_controldata error will happen - it will hang again.
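
One way to notice a hung omnipitr-restore is to watch the number of files waiting in the wal archive directory. The following is a minimal sketch of such a check; archive_backlog_exceeds is an illustrative name, and the threshold is an arbitrary value that the administrator has to choose for the given installation.

```shell
# Hypothetical check: succeed if the wal archive holds more files than allowed.
# $1 - wal archive directory (whatever was given to --source)
# $2 - maximum acceptable number of files
archive_backlog_exceeds() {
    dir=$1
    limit=$2

    # Count regular files; strip the padding some wc implementations add.
    count=$(find "$dir" -type f | wc -l | tr -d '[:space:]')

    [ "$count" -gt "$limit" ]
}

# Example use in a monitoring script:
# if archive_backlog_exceeds /mnt/wal_restore 500; then
#     echo "WAL backlog too large - is omnipitr-restore hung?" >&2
# fi
```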

EXAMPLES

Minimal setup:

restore_command='/.../omnipitr-restore -l /var/log/omnipitr/restore.log -s /mnt/wal_restore/ %f %p'

Minimal setup, but with defined finish trigger and recovery delay (5 minutes):

restore_command='/.../omnipitr-restore -D /mnt/data/ -l /var/log/omnipitr/restore.log -s /mnt/wal_restore/ -w 300 -f /tmp/finish.trigger %f %p'

Setup as above, but with pause trigger defined for doing backups-on-slave and removing unneeded segments, working on PostgreSQL 8.3 (or newer) and using its "%r" to denote what can be removed:

restore_command='/.../omnipitr-restore -D /mnt/data/ -l /var/log/omnipitr/restore.log -s /mnt/wal_restore/ -w 300 -f /tmp/finish.trigger -r %r -p /tmp/pause.trigger %f %p'

Minimal setup, but with backing up segments to remote server:

restore_command='/.../omnipitr-restore -l /var/log/omnipitr/restore.log -s /mnt/wal_restore/ -h "/.../omnipitr-archive --force-data-dir -l /var/log/omnipitr/archive.log -dr bzip2=rsync://backup/postgres/xlogs/" %f %p'

The OmniPITR project is Copyright (c) 2009-2013 OmniTI. All rights reserved.