dblink Functions

A newer version of this documentation is available. Use the version menu above to view the most up-to-date release of the Greenplum 5.x documentation.

dblink Functions

The dblink module supports connections to other Greenplum Database databases from within a database session. These databases may reside on the same database host, or on a remote host.

Greenplum Database supports dblink connections between databases in Greenplum Database installations with the same major version number. dblink may also connect to other Greenplum Database installations that use compatible libpq libraries.

dblink is intended for database users to perform short ad hoc queries in other databases. dblink is not intended for use as a replacement for external tables or for administrative tools such as gptransfer.

Limitations

In this release of Greenplum Database, statements that modify table data cannot use named or implicit dblink connections. Instead, you must provide the connection string directly in the dblink() function. For example:
gpadmin=# CREATE TABLE testdbllocal (a int, b text) DISTRIBUTED BY (a);
CREATE TABLE
gpadmin=# INSERT INTO testdbllocal select * FROM dblink('dbname=postgres', 'SELECT * FROM testdblink') AS dbltab(id int, product text);
INSERT 0 2
The Greenplum Database version of dblink disables the following asynchronous functions:
  • dblink_send_query()
  • dblink_is_busy()
  • dblink_get_result()

See dblink in the PostgreSQL documentation for more information about individual functions.

Using dblink

The following procedure identifies the basic steps for configuring and using dblink in Greenplum Database. Refer to dblink in the PostgreSQL documentation for more information about individual dblink functions.

Note: You must specify both a hostname and a password to connect with dblink as a non-superuser.
  1. Begin by creating a sample table to query using the dblink functions. These commands create a small table in the postgres database, which you will later query from the gpadmin database using dblink:
    $ psql -d postgres
    psql (8.3.23)
    Type "help" for help.
    
    postgres=# CREATE TABLE testdblink (a int, b text) DISTRIBUTED BY (a);
    CREATE TABLE
    postgres=# INSERT INTO testdblink VALUES (1, 'Cheese');
    INSERT 0 1
    postgres=# INSERT INTO testdblink VALUES (2, 'Fish');
    INSERT 0 1
    postgres=# \q
    $
  2. Log into a different database (gpadmin in this example) and install the dblink functions if they are not already available. You install the dblink functions using the $GPHOME/share/postgresql/contrib/dblink.sql script:
    $ psql -d gpadmin
    psql (8.3.23)
    Type "help" for help.
    
    gpadmin=# \i /usr/local/greenplum-db/share/postgresql/contrib/dblink.sql
    SET
    CREATE FUNCTION
    CREATE FUNCTION
    CREATE FUNCTION
    CREATE FUNCTION
    REVOKE
    REVOKE
    CREATE FUNCTION
    CREATE FUNCTION
    ...
  3. Use the dblink_connect() function to create both implicit and named connections to other databases. The connection string that you provide should be a libpq-style keyword/value string. For example, to create a named connection to the postgres database on the local Greenplum Database system:
    gpadmin=# SELECT dblink_connect('mylocalconn', 'dbname=postgres');
     dblink_connect
    ----------------
     OK
    (1 row)
    To make a connection to a remote database system, simply include host and port information in the connection string. For example, to create an implicit dblink connection to a remote system:
    gpadmin=# SELECT dblink_connect('host=remotehost port=5432 dbname=postgres');
    Note: You must specify both a hostname and a password in the connection string to connect as a non-superuser.
  4. Use the basic dblink() function to query a database using a configured connection. Keep in mind that this function returns a record type, so you must assign the columns returned in the dblink() query. For example, the following command uses the named connection to query the table you created in Step 1:
    gpadmin=# SELECT * FROM dblink('mylocalconn', 'SELECT * FROM testdblink') AS dbltab(id int, product text);
     id | product
    ----+---------
      1 | Cheese
      2 | Fish
    (2 rows)