Understanding Kafka Message Offset Management

As a Kafka consumer, GPSS must track the progress of each load operation by recording the message offsets that it has consumed.

Legacy Consumer

The default behaviour of GPSS is that of a legacy Kafka consumer; it always stores the message offset for each load job in a history table in Greenplum Database.

High-Level Consumer

GPSS can also act as a high-level consumer when you specify a consumer group using the group.id Kafka client configuration property. High-level consumers take advantage of Kafka broker-based offset management. When the enable.auto.commit Kafka client property is also enabled (the default), GPSS automatically commits offsets to the Kafka broker by group. This allows you to monitor the Kafka consumed offset directly from the broker.

Recall that you specify Kafka client properties in the PROPERTIES (version 2) or rdkafka_prop (version 3 (Beta)) block of the load configuration file. For example:

PROPERTIES:
  group.id: gpss

Or,

rdkafka_prop:
  group.id: gpss
  enable.auto.commit: false

When acting as a high-level consumer, GPSS uses the CONSISTENCY (version 2) or consistency (version 3 (Beta)) load configuration file property, together with the enable.auto.commit client property, to govern how it manages offsets. The CONSISTENCY/consistency setting identifies how, when (before commit, after commit, or never), and where (history table, broker, both, or nowhere) GPSS writes the offset.

GPSS supports the following CONSISTENCY settings:

CONSISTENCY: { strong | at-least | at-most | none }

Summary

The following table summarizes the offset commit behaviour of GPSS:

| Consistency Value | Legacy Consumer | High-Level Consumer |
|---|---|---|
| strong [or empty] | GPSS stores offsets in a history table. | GPSS stores offsets in both a history table and the Kafka broker. |
| at-least | GPSS stores offsets in a history table. | GPSS stores offsets in the Kafka broker before Commit(). |
| at-most | GPSS stores offsets in a history table. | GPSS stores offsets in the broker after Commit(). |
| none | GPSS stores offsets in a history table. | When enable.auto.commit=true, GPSS stores offsets in the broker automatically. When enable.auto.commit=false, GPSS does not store offsets anywhere. |
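
The summary can also be read as a small decision function. The following Python sketch is illustrative only and is not part of GPSS: the function name offset_stores, the store labels "history_table" and "broker", and the way property values are parsed are all assumptions made for this example. It models only where offsets end up, not when they are written relative to Commit().

```python
from typing import FrozenSet, Mapping, Optional

# Consistency values listed in this topic; an empty value is treated as "strong".
VALID_CONSISTENCY = ("strong", "at-least", "at-most", "none")


def offset_stores(consistency: Optional[str],
                  properties: Mapping[str, object]) -> FrozenSet[str]:
    """Return where offsets are stored, following the summary table.

    Hypothetical helper: 'properties' stands in for the Kafka client
    properties block (PROPERTIES or rdkafka_prop) of a load configuration.
    """
    value = consistency or "strong"
    if value not in VALID_CONSISTENCY:
        raise ValueError(f"unsupported consistency value: {value!r}")

    # GPSS acts as a high-level consumer only when group.id is specified.
    high_level = bool(properties.get("group.id"))
    if not high_level:
        # Legacy consumer: always the history table, whatever the setting.
        return frozenset({"history_table"})

    if value == "strong":
        return frozenset({"history_table", "broker"})
    if value in ("at-least", "at-most"):
        # Timing relative to Commit() is deliberately not modeled here.
        return frozenset({"broker"})

    # value == "none": the outcome depends on enable.auto.commit,
    # which this topic describes as enabled by default.
    auto_commit = str(properties.get("enable.auto.commit", True)).lower() == "true"
    return frozenset({"broker"}) if auto_commit else frozenset()


if __name__ == "__main__":
    props = {"group.id": "gpss", "enable.auto.commit": False}
    print(sorted(offset_stores("none", props)))    # no stores
    print(sorted(offset_stores("strong", props)))  # both stores
```

For instance, a configuration that sets group.id: gpss and enable.auto.commit: false maps to an empty set under none, which corresponds to the case in which GPSS does not store offsets anywhere.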