Configuring the Greenplum Stream Server for Encryption and Authentication
Configuring the Greenplum Stream Server for Encryption and Authentication
GPSS supports authenticating with Kerberos to obtain both Kafka and Greenplum Database credentials. GPSS also supports using SSL to encrypt communication between Kafka and GPSS, and on the data channel between GPSS and Greenplum.
Configuring gpss for SSL Encryption to Kafka
If your Kafka version 0.9 and newer cluster is configured to use SSL encryption, you must configure GPSS to use this encryption method when communicating with Kafka. You perform this configuration at both the GPSS service instance and client levels.
- Create client keys for the gpss service instance.
- Configure the gpss service instance to use SSL encryption
to Kafka by providing a Certificate block in the GPSS
configuration file that identifies the file system location of the
SSL certificates. Sample gpss.json excerpt:
"Certificate": { "CertFile": "/home/gpadmin/cert/multiCA/server.crt", "KeyFile": "/home/gpadmin/cert/multiCA/server.key", "CAFile": "/home/gpadmin/cert/rootCA.pem" }
- The gpkafka.yaml load configuration file KAFKA:INPUT:SOURCE:ENCRYPTION property governs GPSS's use of encrypted communication to Kafka. You must set this property to true before you submit the job to identify that you want to use SSL encryption.
Configuring gpss for SSL Encryption to Greenplum
There are two communication channels between GPSS and Greenplum Database: a control channel and a data channel. GPSS supports SSL encryption only on the data channel to Greenplum.
If your Greenplum Database cluster is configured to use SSL, you must configure GPSS to use this encryption method for the data channel when it communicates with Greenplum.
- Create client keys for the gpss service instance.
- Configure the gpss service instance to use SSL encryption
to Greenplum by providing a Gpfdist:Encryption block in the
GPSS configuration file that identifies the file system location of the
SSL certificates. Sample gpss.json excerpt:
"Gpfdist": { "Host": "127.0.0.1", "Port": 5001, "Encryption": { "CertFile": "/home/gpadmin/cert/gpfdists/server.crt", "KeyFile": "/home/gpadmin/cert/gpfdists/server.key", "CAFile": "/home/gpadmin/cert/gpfdists/root.crt" } }
Configuring gpss for Kerberos Authentication to Greenplum
If Kerberos authentication is enabled for Greenplum Database, you must configure gpss to authenticate with Kerberos.
GPSS uses a kerberos ticket, and the USER name specified in the load configuration file, to connect to Greenplum Database.
- Create a Kerberos principal for each Greenplum Database user that will use GPSS to load data into Greenplum.
- Specify the principal name in the load configuration file USER property value.
- Generate a Kerberos ticket for this principal before you submit a load job with the gpsscli submit, gpsscli load, or gpkafka load commands.
Configuring gpss for Kerberos Authentication to Kafka
If your Kafka version 0.9 and newer cluster is configured for Kerberos authentication, you must configure GPSS to use this authentication method. You perform this configuration at both the gpss service instance level and the GPSS client level.
GPSS is a Kafka client. You must create a Kerberos principal for the gpss server instance accessing Kafka, and generate a keytab file for this principal. By default, GPSS runs kinit using this principal and keytab to generate the Kerberos ticket.
You must set certain Kafka properties in your load configuration file to use Kerberos user authentication to Kafka. The following table identifies keywords and values that you can add to the PROPERTIES block in your gpkafka.yaml load configuration file:
Keyword | Value |
---|---|
security.protocol | The Kafka security protocol. Obtain the value from the Kafka server server.properties configuration file. GPSS supports the SASL_SSL (Kerberos and SSL) and SASL_PLAINTEXT (Kerberos, no SSL) protocols. |
sasl.kerberos.keytab | The absolute path to the GPSS or user Kerberos keytab file for Kafka on the local system. |
sasl.kerberos.kinit.cmd | The Kerberos kinit command string. If this property is not specified, GPSS uses the default value as described in librdkafka Global configuration properties when it runs the kinit command. If you do not want GPSS to run kinit, set the sasl.kerberos.kinit.cmd property to an empty value ("") or no value. |
sasl.kerberos.principal | The GPSS or user Kerberos service principal name; typically of the format <name>@<realm> or <primary>/<instance>@<realm>. |
sasl.kerberos.service.name | The Kafka Kerberos principal name. Obtain the value from the Kafka server server.properties configuration file. The default Kafka Kerberos service name is kafka. |
For example:
PROPERTIES: security.protocol: SASL_PLAINTEXT sasl.kerberos.service.name: kafka sasl.kerberos.keytab: /var/kerberos/krb5kdc/gpss.keytab sasl.kerberos.principal: gpss/localhost@REALM.COM sasl.kerberos.kinit.cmd: