Building a Connector

A newer version of this documentation is available. Click here to view the most up-to-date release of the Greenplum 5.x documentation.

This section describes considerations for building a connector, including selecting an IDE and/or build tool, setting up the PXF development environment, and identifying compile-time dependencies.

Note: The PXF developer documentation includes a set of exercises that expand on the topics covered. These exercises operate on a built-in PXF connecter named Demo, an example connector to the local file system. You will extend a local copy of the Demo connector throughout exercises in this guide.

Identifying Connector Compile-Time Dependencies

You must identify and satisfy the compile-time dependencies that your connector or plug-in has on external components. For example, the PXF HDFS connector utilizes the Hadoop Java API. This connector has compile-time dependencies on several classes in the com.apache.hadoop package. These dependencies are fulfilled via a number of JAR files provided in a Hadoop client installation.

You can set up the build environment for your connector to retrieve compile-time dependencies from maven or other remote repositories. Your connector may also have compile-time dependencies that you fulfill via a JAR file local to your build system.

All PXF connectors have a compile-time dependency on the pxf-api-<version>.jar file. Your connector may also have compile-time dependencies on other PXF jar files if you extend any of the PXF classes implemented in those JAR files.

PXF JAR files are not currently available from a remote repository. You must copy the JAR file(s) from your Greenplum Database installation to your development system before you build your connector.

PXF JAR files are available from the following directory in your Greenplum Database installation:

$GPHOME/pxf/lib

Setting up the PXF Development Environment

You can develop with the PXF SDK on your operating system of choice and with the IDE or build environment of your choice.

You must install the Java Development Kit on your development system to develop with the PXF SDK. You must also obtain the PXF API JAR file, and the JAR file(s) for any PXF built-in connectors whose plug-in classes you will extend.

Prerequisites

Before setting up your PXF development environment, ensure that you have:

  • Access to a system on which you can develop Java code.
  • Administrative access to your development system.
  • Secure shell access to the Greenplum Database master host to copy files.

Procedure

Perform the following procedure to set up your PXF development environment. This procedure assumes a Linux-based development system.

  1. Create a work directory. For example:

    user@devsystem$ mkdir pxf_dev
    user@devsystem$ cd pxf_dev
    user@devsystem$ export PXFDEV_BASE=`pwd`
    

    Exercises in this guide reference your work directory. You may consider adding $PXFDEV_BASE to your .bash_profile or equivalent shell initialization script.

  2. If not already present on your development system, install Java Development Kit version 1.7 or 1.8. You must have superuser permissions to install operating system packages. For example, to install the JDK on a CentOS development system:

    root@devsystem$ sudo yum install java-1.8.0-openjdk-1.8.0*
    
  3. Obtain the PXF API JAR file pxf-api-<version>.jar and copy it to your work directory. You can copy this file from your Greenplum Database installation. For example:

    user@devsystem$ cd $PXFDEV_BASE
    user@devsystem$ scp gpuser@gpmaster:/usr/local/greenplum-db/pxf/lib/pxf-api-<version>.jar .
    
  4. Copy any other PXF JAR files that you require. For example:

    user@devsystem$ scp gpuser@gpmaster:/usr/local/greenplum-db/pxf/lib/pxf-hdfs-<version>.jar .
    

Example: Building the Demo Connector JAR file

In this exercise, you create a local copy of the Demo connector and use the build tool gradle to build your local copy. You may choose to use an IDE or a different, equivalent, build tool.

About the Demo Connector

The Demo connector supports read and write operations on text format files stored on the local file system. The Demo connector read operation currently returns static data. The Demo connector fully supports writing to the local file system.

Package and Class Names

Your local copy of the Demo connector source code will reside in a packaged named org.greenplum.pxf.example.demo.

The Demo connector plug-in classes:

Class Name Description
DemoFragmenter Template implementation of the Fragmenter class. Returns static fragment information.
DemoAccessor Template implementation of the ReadAccessor interface. Returns static data.
DemoTextResolver Implements ReadResolver and WriteResolver interfaces. Deserializes and serializes text format data.
DemoFileWritableAccessor Implements the WriteAccessor interface. Writes text format data to the local file system.

Compile-Time Dependencies

The Demo connector has compile-time dependencies on the pxf-api-<version>.jar file and the Apache Commons Logging JAR file, commons-logging.jar.

When you build the Demo connector in this exercise, you will satisfy these compile-time dependencies via a local file and a maven repository.

Prerequisites

Before building the Demo connector, ensure that you have:

  • Set up your development environment as described in an earlier topic.
  • Installed gradle on your development system. Refer to Gradle Build Tool Installation for instructions.

Procedure

Perform the following procedure to create a local copy of the Demo connector source code, update package names, configure compile-time dependencies, and use gradle to build the connector.

  1. Download the PXF Demo connector source code. You can obtain the PXF source code from the Apache HAWQ (incubating) incubator-hawq github repository. For example:

    user@devsystem$ cd $PXFDEV_BASE
    user@devsystem$ git clone https://github.com/apache/incubator-hawq.git
    

    The clone operation creates a directory named incubator-hawq/ in your current working directory.

  2. Create a project directory for your copy of the source code and navigate to that directory. For example:

    user@devsystem$ mkdir demo_example
    user@devsystem$ cd demo_example
    
  3. Create a libs directory for dependent packages, and copy the PXF API JAR file you previously downloaded to libs/. For example:

    user@devsystem$ mkdir libs
    user@devsystem$ cp $PXFDEV_BASE/pxf-api-<version>.jar libs/
    
  4. The source code for the PXF Demo connector is located in the incubator-hawq/pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/examples directory of the repository you cloned in Step 1. Copy this code to your work area. For example:

    user@devsystem$ mkdir -p src/main/java/org/greenplum/pxf/example/demo
    user@devsystem$ cd src/main/java/org/greenplum/pxf/example/demo
    user@devsystem$ cp $PXFDEV_BASE/incubator-hawq/pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/examples/* .
    
  5. The original PXF Demo connector resides in the org.apache.hawq.pxf.api.examples package. Your Demo connector resides in a package named org.greenplum.pxf.example.demo. Update the package name in your local copy of the Demo connector source code. You can edit the files, run a script, etc. For example:

    user@devsystem$ sed -i.bak s/"org.apache.hawq.pxf.api.examples"/"org.greenplum.pxf.example.demo"/g *.java
    

    The sed command above creates a backup of each file. Remove the backup files. For example:

    user@devsystem$ rm *.bak
    
  6. Initialize a gradle Java library project for your Demo connector. For example:

    user@devsystem$ cd $PXFDEV_BASE/demo_example
    user@devsystem$ gradle init --type java-library
    

    This command generates build configuration files and scripts. Before building your Demo connector project, you must customize the build.gradle and settings.gradle files.

  7. Gradle uses the settings.gradle file rootProject.name setting for the base name of the built Java library JAR file. The default rootProject.name setting value is the base name of current working directory. Edit the settings.gradle file and supply a custom root project name for your Demo connector. For example:

    user@devsystem$ vi settings.gradle
    
    rootProject.name = 'my-demo-connector'
    
  8. Gradle uses the build.gradle file to identify the compile time dependencies for a project and other configuration. Your Demo connector depends on the PXF API JAR file (available locally) and the commons-logging.jar file (available from a maven repository). Edit your build.gradle file to supply these dependencies. For example:

    user@devsystem$ vi build.gradle
    

    Search for the repositories block, and add the bolded text to identify the location of the PXF API JAR file. Recall that you copied this file to the libs/ directory in Step 3. For example:

    repositories {
        // Use 'jcenter' for resolving your dependencies.
        // You can declare any Maven/Ivy/file repository here.
        jcenter()
        flatDir {
          dirs './libs'
        }
    }
    

    Search for the dependencies block and add the bolded text to identify the pxf-api-<version>.jar and common-logging-<version>.jar file as dependencies of your gradle project. For example:

    dependencies {
        ...
        testCompile 'junit:junit:4.12'
    
        compile 'commons-logging:commons-logging:1.1.3'
        compile 'org.apache.hawq.pxf.api:pxf-api:3.3.0.0'
    }
    
  9. Build your connector JAR file. For example:

    user@devsystem$ ./gradlew build
    

    gradle builds your code and writes the built my-demo-connector.jar JAR file to the build/libs directory.

  10. Locate your connector JAR file, and note this location:

    user@devsystem$ ls build/libs
    my-demo-connector.jar
    

    You will deploy and test your Demo connector JAR file in an upcoming exercise.