Extensions
- psl 1.0.0
- Canonicalize domain names using the public suffix list
psl 1.0.0
This extension contains a single PostgreSQL function, registered_domain(), that uses the Public Suffix List to return the registered domain within which a hostname exists.
Installation
To build it, do this:
make
make install
make installcheck
If you encounter an error such as:
"Makefile", line 8: Need an operator
You need to use GNU make, which may well be installed on your system as gmake:
gmake
gmake install
gmake installcheck
If you encounter an error such as:
make: pg_config: Command not found
Be sure that you have pg_config installed and in your path. If you used a package management system such as RPM to install PostgreSQL, be sure that the -devel package is also installed. If necessary, tell the build process where to find it:
env PG_CONFIG=/path/to/pg_config sh -c 'make && make install && make installcheck'
If you encounter an error such as:
ERROR: must be owner of database regression
You need to run the test suite as a superuser, such as the default "postgres" superuser:
make installcheck PGUSER=postgres
Once psl is installed you can add it to a database by running, as a superuser:
CREATE EXTENSION psl;
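To confirm that the extension was added to the current database, one option is to query the standard pg_extension system catalog. This is a minimal sketch; the version reported depends on the build you installed.

```sql
-- List the psl extension, if it is installed in the current database.
-- pg_extension is a standard PostgreSQL system catalog.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'psl';
```

If the query returns no rows, CREATE EXTENSION has not been run in this database.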
Psl uses a compiled-in copy of the public suffix list, with no way to dynamically update it after it has been built. A snapshot is included in the distributed source, but you can update that to the latest version by running make fetch before building.
Usage
registered_domain() will return the enclosing domain for any hostname, folded to lower case.
For a registered domain it will return the domain itself. For a top-level domain or a hostname without periods it will return null.
As a special case, if passed an apparently correct hostname with a top-level domain it doesn't recognize, it will return the final two components of the hostname.
steve=# select registered_domain('foo.bar.blighty.com');
registered_domain
-------------------
blighty.com
(1 row)
steve=# select registered_domain('blighty.co.uk');
registered_domain
-------------------
blighty.co.uk
(1 row)
steve=# select registered_domain('www.blighty.co.uk');
registered_domain
-------------------
blighty.co.uk
(1 row)
steve=# select registered_domain('co.uk');
registered_domain
-------------------

(1 row)
steve=# select registered_domain('co.uk.ie');
registered_domain
-------------------
uk.ie
(1 row)
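Beyond one-off calls, the function can be applied to a column. The following is a sketch only: the seen_hosts table and its hostname column are hypothetical, and it assumes registered_domain() accepts a text argument, as the literal calls above suggest.

```sql
-- Hypothetical table of hostnames; not part of psl.
CREATE TEMP TABLE seen_hosts (hostname text);

INSERT INTO seen_hosts VALUES
  ('foo.bar.blighty.com'),
  ('www.blighty.co.uk'),
  ('blighty.co.uk'),
  ('co.uk');

-- Count hostnames per registered domain. Hostnames with no registered
-- domain (such as the bare public suffix 'co.uk') land in the NULL group.
SELECT registered_domain(hostname) AS domain,
       count(*) AS hosts
FROM seen_hosts
GROUP BY 1
ORDER BY hosts DESC, domain;
```

Because the function returns null for top-level domains and for hostnames without periods, queries that need to exclude those rows can add a WHERE registered_domain(hostname) IS NOT NULL filter.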
Bugs
The upstream code from regdom-libs is broken: the PHP code that preprocesses the PSL into a form the C code can read fails with errors. Attempting to fix that produces a corrupt embedded PSL, which in turn causes segmentation faults (SEGV). The preprocessing code needs to be either fixed or replaced.
Copyright and License
Copyright 2018 Steve Atkins
This module is free software; you can redistribute it and/or modify it under the PostgreSQL License.
The core functionality is from regdom-libs, code released under the Apache license.
Test vectors were taken from libpsl, under the MIT license.