Configuring the Streaming Server for Encryption and Authentication

GPSS supports authenticating with Kerberos to obtain both Kafka and Greenplum Database credentials. GPSS also supports using SSL to encrypt communication between Kafka and GPSS, between the GPSS client and server, and on the data channel between GPSS/gpkafka and Greenplum.

Configuring gpss and gpkafka for SSL-Encrypted Communications with Kafka

If your Kafka cluster (version 0.9 or newer) is configured to use SSL encryption, you must configure GPSS to use this encryption method when communicating with Kafka. You perform this configuration at both the GPSS service instance and client levels.

  1. Create Kafka client keys for the gpss or gpkafka instance.
  2. Specify the location of the GPSS client certificates via Kafka properties in the PROPERTIES block of the gpkafka.yaml load configuration file. For example:
    PROPERTIES:
        security.protocol: SSL
        ssl.ca.location: /path/to/cert/kafka-ca.crt
        ssl.certificate.location: /path/to/cert/gpssclient.crt
        ssl.key.location: /path/to/cert/gpssclient.key
    
  3. If you are using the gpsscli subcommands to load data, ensure that the ListenAddress:Host that you specify for the GPSS server matches the common name (CN) in the certificate.

Configuring gpss and gpkafka for SSL-Encrypted Communications with Greenplum

There are two communication channels between GPSS and Greenplum Database: a control channel and a data channel. GPSS supports SSL encryption only on the data channel to Greenplum, and uses the gpfdists protocol for encrypted communications.

If your Greenplum Database cluster is configured to use SSL, you must configure GPSS to use this encryption method for the data channel when it communicates with Greenplum.

  1. Create GPSS keys for the gpfdist protocol instance.
  2. Configure the gpfdist protocol to use SSL encryption to Greenplum by providing a Gpfdist:Certificate block in the GPSS configuration file, and identify the file system location of the SSL certificates. Sample gpss.json or gpfdistconfig.json excerpt:
    "Gpfdist": {
        "Host": "127.0.0.1",
        "Port": 5001,
        "Certificate": {
            "CertFile": "/home/gpadmin/cert/gpss.crt",
            "KeyFile": "/home/gpadmin/cert/gpss.key",
            "CAFile": "/home/gpadmin/cert/root_client.crt"
        }
    }
  3. If you are using gpkafka to load data, ensure that the Gpfdist:Host that you specify matches the common name (CN) in the certificate.

Configuring gpss and gpsscli for Encrypted gRPC Communications

GPSS supports encrypting communications between the gpsscli client and gpss server.

To use encrypted gRPC on connections between gpsscli and gpss, you must create server and client keys, and supply them in the configuration files that you pass to the respective commands.

  1. Create server keys for the gpss server instance.
  2. Create client keys for the gpsscli client.
  3. Configure the gpss service instance to use SSL encryption to the client by providing a ListenAddress:Certificate block in the gpss.json GPSS configuration file. The properties in this block should identify the file system location of the SSL server keys. Sample gpss.json excerpt:
    "ListenAddress": {
        "Host": "",
        "Port": 5019,
        "Certificate": {
            "CertFile": "/home/gpadmin/cert/gpss.crt",
            "KeyFile": "/home/gpadmin/cert/gpss.key",
            "CAFile": "/home/gpadmin/cert/root_cli.crt"
        }
    }
  4. Configure the GPSS client to use SSL encryption to the server by specifying the client keys in the ListenAddress:Certificate block of a GPSS configuration file that you provide to the gpsscli subcommand via the --config gpsscliconfig.json option. Sample gpsscliconfig.json excerpt:
    "ListenAddress": {
        "Host": "",
        "Port": 5019,
        "Certificate": {
            "CertFile": "/home/gpadmin/cert/gpsscli.crt",
            "KeyFile": "/home/gpadmin/cert/gpsscli.key",
            "CAFile": "/home/gpadmin/cert/root.crt"
        }
    }

If you encrypt communications between the GPSS client and server, but you want to disable certificate verification, specify the --no-check-ca option when you run the gpsscli subcommand.
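
The Certificate blocks in the gpss.json and gpsscliconfig.json samples can be sanity-checked before you start gpss or run a gpsscli subcommand. The following is a minimal sketch, not part of GPSS: the helper name check_certificate_blocks is hypothetical, and it assumes that all three properties shown in the samples (CertFile, KeyFile, CAFile) should be set and should point to existing files whenever a Certificate block is present.

```python
import json
from pathlib import Path
from typing import Callable, List

# Property names taken from the sample Certificate blocks in this topic.
CERT_KEYS = ("CertFile", "KeyFile", "CAFile")


def check_certificate_blocks(
    config: dict,
    sections=("ListenAddress", "Gpfdist"),
    is_file: Callable[[str], bool] = lambda p: Path(p).is_file(),
) -> List[str]:
    """Return a list of problems found in the Certificate blocks of a parsed config.

    Hypothetical helper. Assumption: every property in CERT_KEYS must be set
    and must name an existing file when a Certificate block is present.
    """
    problems = []
    for section in sections:
        cert = config.get(section, {}).get("Certificate")
        if cert is None:
            continue  # no Certificate block: nothing to check for this section
        for key in CERT_KEYS:
            path = cert.get(key)
            if not path:
                problems.append(f"{section}:Certificate:{key} is not set")
            elif not is_file(path):
                problems.append(f"{section}:Certificate:{key} file not found: {path}")
    return problems


# Example: parse an excerpt shaped like the gpss.json sample and report problems.
sample = json.loads("""
{
    "ListenAddress": {
        "Host": "",
        "Port": 5019,
        "Certificate": {
            "CertFile": "/home/gpadmin/cert/gpss.crt",
            "KeyFile": "/home/gpadmin/cert/gpss.key",
            "CAFile": "/home/gpadmin/cert/root_cli.crt"
        }
    }
}
""")
for problem in check_certificate_blocks(sample):
    print(problem)
```

The is_file parameter exists only so that the check can be exercised without real certificate files; in normal use, leave it at its default. This sketch does not verify that a certificate's common name matches the configured Host.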

Configuring gpss and gpkafka for Kerberos Authentication to Greenplum

If Kerberos authentication is enabled for Greenplum Database, you must configure gpss to authenticate with Kerberos.

GPSS uses a Kerberos ticket and the USER name specified in the load configuration file to connect to Greenplum Database.

  1. Create a Kerberos principal for each Greenplum Database user that will use GPSS to load data into Greenplum.
  2. Specify the principal name in the load configuration file USER property value.
  3. Generate a Kerberos ticket for this principal before you submit a load job with the gpsscli submit, gpsscli load, or gpkafka load commands.
Note: If your Greenplum Database Kerberos service name is not the default (postgres), set the PGKRBSRVNAME environment variable to the correct service name before you start the gpss service instance or run gpkafka load.
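
The ticket and service-name requirements above can be checked from a wrapper script before a load job is submitted. The following is a rough sketch under stated assumptions: the function names are hypothetical, the ticket check relies on the MIT Kerberos klist -s option (which exits with status 0 when the credentials cache is readable and unexpired), and gpdb_krb is a made-up service name used only for illustration.

```python
import os
import subprocess
from typing import Dict, Optional


def has_valid_ticket(run=subprocess.run) -> bool:
    """Return True if `klist -s` reports a usable Kerberos credentials cache.

    Assumption: MIT Kerberos klist, where -s runs silently and signals the
    result through the exit status.
    """
    try:
        return run(["klist", "-s"]).returncode == 0
    except FileNotFoundError:
        return False  # klist is not installed or not on the PATH


def load_environment(service_name: str = "postgres",
                     base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Build the environment for starting gpss or running gpkafka load.

    PGKRBSRVNAME is set only when the Greenplum Kerberos service name
    differs from the default (postgres).
    """
    env = dict(os.environ if base_env is None else base_env)
    if service_name != "postgres":
        env["PGKRBSRVNAME"] = service_name
    return env


# Example with a hypothetical non-default service name and an empty base environment.
env = load_environment("gpdb_krb", base_env={})
print(env)
```

A wrapper could call has_valid_ticket() first and stop with a message if it returns False, then pass the dictionary returned by load_environment() as the env argument of subprocess.run when it starts the gpss service instance or runs gpkafka load.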

Configuring gpss for Kerberos Authentication to Kafka

If your Kafka cluster (version 0.9 or newer) is configured for Kerberos authentication, you must configure GPSS to use this authentication method. You perform this configuration at both the gpss service instance level and the GPSS client level.

GPSS is a Kafka client. You must create a Kerberos principal for the gpss server instance accessing Kafka, and generate a keytab file for this principal. By default, GPSS runs kinit using this principal and keytab to generate the Kerberos ticket.

You must set certain Kafka properties in your load configuration file to use Kerberos user authentication to Kafka. The following table identifies keywords and values that you can add to the PROPERTIES block in your gpkafka.yaml load configuration file:

| Keyword | Value |
|---|---|
| security.protocol | The Kafka security protocol. Obtain the value from the Kafka server server.properties configuration file. GPSS supports the SASL_SSL (Kerberos and SSL) and SASL_PLAINTEXT (Kerberos, no SSL) protocols. |
| sasl.kerberos.keytab | The absolute path to the GPSS or user Kerberos keytab file for Kafka on the local system. |
| sasl.kerberos.kinit.cmd | The Kerberos kinit command string. If this property is not specified, GPSS uses the default value as described in librdkafka Global configuration properties when it runs the kinit command. If you do not want GPSS to run kinit, set the sasl.kerberos.kinit.cmd property to an empty value ("") or no value. |
| sasl.kerberos.principal | The GPSS or user Kerberos service principal name; typically of the format <name>@<realm> or <primary>/<instance>@<realm>. |
| sasl.kerberos.service.name | The Kafka Kerberos principal name. Obtain the value from the Kafka server server.properties configuration file. The default Kafka Kerberos service name is kafka. |

For example:

PROPERTIES:
    security.protocol: SASL_PLAINTEXT
    sasl.kerberos.service.name: kafka
    sasl.kerberos.keytab: /var/kerberos/krb5kdc/gpss.keytab
    sasl.kerberos.principal: gpss/localhost@REALM.COM
    sasl.kerberos.kinit.cmd: 

If you are accessing Kafka using both Kerberos authentication and SSL encryption, you must also specify the Kafka SSL properties identified in Configuring gpss and gpkafka for SSL-Encrypted Communications with Kafka.
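
The rules above (the two supported protocols, and the extra SSL properties needed with SASL_SSL) can be expressed as a small check over the PROPERTIES block once it has been loaded into a dictionary. This is an illustrative sketch only: the function name check_kerberos_properties is hypothetical, and treating sasl.kerberos.keytab and sasl.kerberos.principal as required is an assumption of this sketch, not something GPSS enforces as far as this topic states.

```python
from typing import Dict, List

# Protocols that GPSS supports for Kerberos authentication to Kafka.
SUPPORTED_PROTOCOLS = ("SASL_SSL", "SASL_PLAINTEXT")
# Assumption: this sketch treats the keytab and principal as required.
KERBEROS_KEYS = ("sasl.kerberos.keytab", "sasl.kerberos.principal")
# SSL properties from the Kafka SSL section of this topic.
SSL_KEYS = ("ssl.ca.location", "ssl.certificate.location", "ssl.key.location")


def check_kerberos_properties(props: Dict[str, str]) -> List[str]:
    """Return a list of problems in a Kafka PROPERTIES mapping (hypothetical helper)."""
    problems = []
    protocol = props.get("security.protocol")
    if protocol not in SUPPORTED_PROTOCOLS:
        problems.append(
            f"security.protocol must be one of {', '.join(SUPPORTED_PROTOCOLS)}; got {protocol!r}"
        )
    for key in KERBEROS_KEYS:
        if not props.get(key):
            problems.append(f"{key} is not set")
    if protocol == "SASL_SSL":
        # Kerberos plus SSL: the Kafka SSL properties must also be present.
        for key in SSL_KEYS:
            if not props.get(key):
                problems.append(f"{key} is required with SASL_SSL")
    return problems


# The PROPERTIES example from this section, written as a Python dictionary.
example = {
    "security.protocol": "SASL_PLAINTEXT",
    "sasl.kerberos.service.name": "kafka",
    "sasl.kerberos.keytab": "/var/kerberos/krb5kdc/gpss.keytab",
    "sasl.kerberos.principal": "gpss/localhost@REALM.COM",
    "sasl.kerberos.kinit.cmd": "",
}
print(check_kerberos_properties(example))  # prints []
```

Changing security.protocol to SASL_SSL in the example without adding the three ssl.* properties makes the function report each missing SSL property.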