Introduction

1. Introducing Spring Cloud Data Flow for Nomad

This project provides support for orchestrating long-running (streaming) and short-lived (task/batch) data microservices on HashiCorp Nomad.

2. Spring Cloud Data Flow

Spring Cloud Data Flow is a cloud-native orchestration service for composable data microservices on modern runtimes. With Spring Cloud Data Flow, developers can create and orchestrate data pipelines for common use cases such as data ingest, real-time analytics, and data import/export.

The Spring Cloud Data Flow architecture consists of a server that deploys Streams and Tasks. Streams are defined using a DSL or visually through the browser-based designer UI. Streams are based on the Spring Cloud Stream programming model, while Tasks are based on the Spring Cloud Task programming model. The sections below provide more information about creating your own custom Streams and Tasks.

For more details about the core architecture components and the supported features, please review Spring Cloud Data Flow’s core reference guide. There are several samples available for reference.

3. Spring Cloud Stream

Spring Cloud Stream is a framework for building message-driven microservice applications. Spring Cloud Stream builds upon Spring Boot to create standalone, production-grade Spring applications, and uses Spring Integration to provide connectivity to message brokers. It provides opinionated configuration of middleware from several vendors, introducing the concepts of persistent publish-subscribe semantics, consumer groups, and partitions.

For more details about the core framework components and the supported features, please review Spring Cloud Stream’s reference guide.

There’s a rich ecosystem of Spring Cloud Stream Application-Starters that can be used either as standalone data microservice applications or in Spring Cloud Data Flow. For convenience, we have generated RabbitMQ and Apache Kafka variants of these application-starters, which are available from Maven Repo and Docker Hub as Maven artifacts and Docker images, respectively.

Do you have a requirement to develop custom applications? No problem. Refer to this guide to create custom stream applications. There are several samples available for reference.
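
To give a sense of the programming model, below is a minimal, illustrative sketch of a custom processor application using the annotation-based Spring Cloud Stream API. The class name and transformation are placeholders for this guide and are not part of any starter.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Processor;
import org.springframework.messaging.handler.annotation.SendTo;

// Illustrative processor: receives a payload from the input channel,
// transforms it and sends the result to the output channel.
@SpringBootApplication
@EnableBinding(Processor.class)
public class UppercaseProcessorApplication {

    @StreamListener(Processor.INPUT)
    @SendTo(Processor.OUTPUT)
    public String transform(String payload) {
        return payload.toUpperCase();
    }

    public static void main(String[] args) {
        SpringApplication.run(UppercaseProcessorApplication.class, args);
    }
}

Once built and published to a Maven repository or Docker registry, such an app can be registered and used in a stream definition just like the out-of-the-box starters.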

4. Spring Cloud Task

Spring Cloud Task makes it easy to create short-lived microservices. We provide capabilities that allow short-lived JVM processes to be executed on demand in a production environment.
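
As an illustration only (not taken from this project), a minimal Spring Cloud Task application can be as simple as the following sketch:

import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.task.configuration.EnableTask;
import org.springframework.context.annotation.Bean;

// Illustrative task: runs once, prints a message and exits;
// Spring Cloud Task records the execution (start/end time, exit code).
@SpringBootApplication
@EnableTask
public class SimpleTaskApplication {

    @Bean
    public CommandLineRunner runner() {
        return args -> System.out.println("Hello from a short-lived task");
    }

    public static void main(String[] args) {
        SpringApplication.run(SimpleTaskApplication.class, args);
    }
}

When launched via Data Flow, the recorded execution shows up in task execution list, as demonstrated in the Getting Started section.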

For more details about the core framework components and the supported features, please review Spring Cloud Task’s reference guide.

There’s a rich ecosystem of Spring Cloud Task Application-Starters that can be used either as standalone data microservice applications or in Spring Cloud Data Flow. For convenience, the generated application-starters are available for use from Maven Repo. There are several samples available for reference.

Features

The Data Flow Server for Nomad includes the following highlighted features.

5. Support for Maven and Docker resources

Nomad supports both Java and Docker drivers that the Data Flow server can utilise to support apps registered as Maven and Docker resources.

For example, both the below app registrations (via the Data Flow Shell) are valid and supported:

dataflow:>app register --name http-mvn --type source --uri maven://org.springframework.cloud.stream.app:http-source-rabbit:1.1.0.RELEASE
dataflow:>app register --name http-docker --type source --uri docker:springcloudstream/http-source-rabbit:1.1.0.RELEASE

See the Getting Started section for examples of deploying both Docker and Maven resource types.

6. Docker volume support

Docker volume support was added in Nomad 0.5, which allows the Data Flow Server for Nomad to support defining Docker volumes, both as deployer properties and as app deployment properties.

Volumes defined at deployer level will be added to all deployed apps. This is handy for common shared folders that should be available to all apps.

Below is an example of volumes defined as a server deployer property:

spring.cloud.deployer.nomad:
  volumes: /opt/data:/data,/opt/config:/config

where volumes are defined as a comma separated list in the form of host_path:container_path.

Below is an example of the deployment property variation of defining volumes (via the Data Flow Shell):

dataflow:>stream create --name test --definition "time | file"
Created new stream 'test'

dataflow:>stream deploy test --properties "app.file.spring.cloud.deployer.nomad.volumes=/opt/data:/data,/opt/config:/config"

See the Nomad Docker volume documentation for more details.

7. Ephemeral disks

The new ephemeral_disk stanza added in Nomad 0.5 is supported, allowing you to configure an ephemeral disk with app deployment properties.
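
For example, an app’s ephemeral disk size and migration behaviour could be set via deployment properties (a sketch, reusing the test stream with a file sink from the volume example above; see the Deployment Properties section for the full list):

dataflow:>stream deploy test --properties "app.file.spring.cloud.deployer.nomad.ephemeralDisk.size=500,app.file.spring.cloud.deployer.nomad.ephemeralDisk.migrate=true"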

8. App status derived from Consul

The default states for a Nomad job are not really indicative of a healthy application. For instance, an app may appear to be running based on the Nomad Job/Allocation state but may in fact not be healthy, as determined by invoking the app’s health endpoint (e.g. /health). For a more accurate app status, the Data Flow Server for Nomad can now include the Consul (if available) health check status of the registered app’s Service.

This feature is implemented using Spring Cloud Consul.
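
The health check path and interval are configurable through the server deployer properties described in the Configuration section, for example (an illustrative sketch only):

spring.cloud.deployer.nomad:
  checkHttpPath: /management/health
  checkInterval: 60000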

Getting Started

9. Deploying Streams on Nomad

The following guide assumes that you have a Nomad 0.5+ cluster available. If you do not have a Nomad cluster available, see the next section, which describes running a local Nomad cluster for development/testing; otherwise continue to Installing the Data Flow Server for Nomad.

9.1. A local Nomad cluster with Vagrant Hashistack

There are a few ways to stand up a local Nomad cluster on your machine for testing. For the purpose of this guide, the hashistack-vagrant project will be used.

The hashistack-vagrant VM is configured by default with 2048 MB of memory and 2 CPUs. If you run into issues with job allocations failing because of resource starvation, you can tweak the memory and CPU configuration in the Vagrantfile.

Please see the Resource Allocations section in the spring-cloud-dataflow-server-nomad project Wiki for more information.

9.1.1. Installation and Getting Started

Follow the Quickstart section of the hashistack-vagrant README.
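
The quickstart roughly boils down to the following commands (shown here as a sketch; the hashistack-vagrant README is the authoritative reference):

$ cd hashistack-vagrant
$ vagrant up
$ vagrant ssh
vagrant@hashistack:~$ tmuxp load full-hashistack.yml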

Once you have successfully started the Vagrant VM and the "hashistack" (with tmuxp load full-hashistack.yml), make sure you can use the nomad client to query the local instance:

vagrant@hashistack:~$ nomad status
No running jobs

You could also install the nomad binary locally and connect to the Nomad client running inside the VM. For example, on a Mac you could install Nomad with Homebrew and add the --address option to your commands:

$ brew install nomad
...
$ nomad status --address=http://172.16.0.2:4646
ID    Type     Priority  Status
scdf  service  50        running

9.2. Installing the Data Flow Server for Nomad

To install a Data Flow Server and supporting infrastructure components to Nomad, we will use the Job specification provided in the src/etc/nomad/ directory of the project’s GitHub repository.

This Job requires the Docker driver.

This job specification includes the following tasks:

  • Spring Cloud Data Flow Server for Nomad

  • MySQL - as the datasource for the Data Flow Server

  • Redis - for analytics support

  • Kafka - as the Spring Cloud Stream default binder implementation

If you are not using the hashistack-vagrant VM, please adjust the region and datacenters values in the scdf.nomad job specification file accordingly.
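
For reference, those values appear at the top of the job specification and look roughly as follows (an illustrative snippet, not the complete file):

job "scdf" {
  region      = "global"
  datacenters = ["dc1"]
  type        = "service"
  ...
}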

Next, using the SSH session to the VM, run the job with:

vagrant@hashistack:~$ nomad run https://raw.githubusercontent.com/donovanmuller/spring-cloud-dataflow-server-nomad/v1.1.0.RELEASE/src/etc/nomad/scdf.nomad
==> Monitoring evaluation "67f078e6"
    Evaluation triggered by job "scdf"
    Allocation "966c4cbd" created: node "c99cf24d", group "scdfr"
    Allocation "966c4cbd" status changed: "pending" -> "running"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "67f078e6" finished with status "complete"

Allow some time for the Docker images to be pulled and all containers to be started. You can verify that all tasks have been successfully started by checking the corresponding allocation status:

vagrant@hashistack:~$ nomad alloc-status 966c4cbd
ID                 = 966c4cbd
Eval ID            = 04164728
Name               = scdf.scdf[0]
Node ID            = 2dd82384
Job ID             = scdf
Client Status      = running
Client Description = <none>
Created At         = 12/13/16 09:15:20 UTC

Task "kafka" is "running"
Task Resources
CPU         Memory           Disk  IOPS  Addresses
13/500 MHz  506 MiB/512 MiB  0 B   0     kafka: 10.0.2.15:9092

Recent Events:
Time                   Type        Description
12/13/16 09:15:47 UTC  Started     Task started by client
12/13/16 09:15:36 UTC  Restarting  Task restarting in 10.338625681s
12/13/16 09:15:36 UTC  Terminated  Exit Code: 1, Exit Message: "Docker container exited with non-zero exit code: 1"
12/13/16 09:15:35 UTC  Started     Task started by client
12/13/16 09:15:24 UTC  Restarting  Task restarting in 10.118221449s
12/13/16 09:15:24 UTC  Terminated  Exit Code: 1, Exit Message: "Docker container exited with non-zero exit code: 1"
12/13/16 09:15:22 UTC  Started     Task started by client
12/13/16 09:15:20 UTC  Received    Task received by client

Task "mysql" is "running"
Task Resources
CPU        Memory           Disk  IOPS  Addresses
2/500 MHz  115 MiB/128 MiB  0 B   0     db: 10.0.2.15:3306

Recent Events:
Time                   Type      Description
12/13/16 09:15:22 UTC  Started   Task started by client
12/13/16 09:15:20 UTC  Received  Task received by client

Task "redis" is "running"
Task Resources
CPU        Memory          Disk  IOPS  Addresses
2/256 MHz  6.2 MiB/64 MiB  0 B   0     redis: 10.0.2.15:6379

Recent Events:
Time                   Type      Description
12/13/16 09:15:22 UTC  Started   Task started by client
12/13/16 09:15:20 UTC  Received  Task received by client

Task "scdf-server" is "running"
Task Resources
CPU        Memory           Disk  IOPS  Addresses
3/500 MHz  304 MiB/384 MiB  0 B   0     http: 10.0.2.15:9393

Recent Events:
Time                   Type        Description
12/13/16 09:15:42 UTC  Started     Task started by client
...

Task "zookeeper" is "running"
Task Resources
CPU        Memory          Disk  IOPS  Addresses
3/500 MHz  84 MiB/128 MiB  0 B   0     zookeeper: 10.0.2.15:2181
                                       follower: 10.0.2.15:2888
                                       leader: 10.0.2.15:3888

...

or alternatively check the health status of all services using the Consul UI:

Data Flow Server and components up

If you are using a local nomad binary, you can reference the remote scdf.nomad file directly.

$ nomad run --address=http://172.16.0.2:4646 https://raw.githubusercontent.com/donovanmuller/spring-cloud-dataflow-server-nomad/v1.1.0.RELEASE/src/etc/nomad/scdf.nomad
...

9.3. Download and run the Spring Cloud Data Flow Shell

Download and run the Shell, targeting the Data Flow Server exposed via a Fabio route.

$ wget http://repo.spring.io/release/org/springframework/cloud/spring-cloud-dataflow-shell/1.1.0.RELEASE/spring-cloud-dataflow-shell-1.1.0.RELEASE.jar
$ java -jar spring-cloud-dataflow-shell-1.1.0.RELEASE.jar --dataflow.uri=http://scdf-server.hashistack.vagrant/

  ____                              ____ _                __
 / ___| _ __  _ __(_)_ __   __ _   / ___| | ___  _   _  __| |
 \___ \| '_ \| '__| | '_ \ / _` | | |   | |/ _ \| | | |/ _` |
  ___) | |_) | |  | | | | | (_| | | |___| | (_) | |_| | (_| |
 |____/| .__/|_|  |_|_| |_|\__, |  \____|_|\___/ \__,_|\__,_|
  ____ |_|    _          __|___/                 __________
 |  _ \  __ _| |_ __ _  |  ___| | _____      __  \ \ \ \ \ \
 | | | |/ _` | __/ _` | | |_  | |/ _ \ \ /\ / /   \ \ \ \ \ \
 | |_| | (_| | || (_| | |  _| | | (_) \ V  V /    / / / / / /
 |____/ \__,_|\__\__,_| |_|   |_|\___/ \_/\_/    /_/_/_/_/_/

1.1.0.RELEASE

Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>

9.4. Registering Stream applications with Docker resource

Now register all the out-of-the-box stream applications built with the Kafka binder, using the Docker resource type, in bulk with the following command.

For more details, review how to register applications.
dataflow:>app import --uri http://bit.ly/stream-applications-kafka-docker
Successfully registered applications: [source.tcp, sink.jdbc, source.http, sink.rabbit, source.rabbit, source.ftp, sink.gpfdist, processor.transform, source.loggregator, source.sftp, processor.filter, sink.cassandra, processor.groovy-filter, sink.router, source.trigger, sink.hdfs-dataset, processor.splitter, source.load-generator, processor.tcp-client, source.time, source.gemfire, source.twitterstream, sink.tcp, source.jdbc, sink.field-value-counter, sink.redis-pubsub, sink.hdfs, processor.bridge, processor.pmml, processor.httpclient, source.s3, sink.ftp, sink.log, sink.gemfire, sink.aggregate-counter, sink.throughput, source.triggertask, sink.s3, source.gemfire-cq, source.jms, source.tcp-client, processor.scriptable-transform, sink.counter, sink.websocket, source.mongodb, source.mail, processor.groovy-transform, source.syslog]

9.5. Deploy a simple stream in the shell

Create a simple ticktock stream definition and deploy it immediately using the following command:

dataflow:>stream create --name ticktock --definition "time | log" --deploy
Created new stream 'ticktock'
Deployment request has been sent

Verify the deployed apps by checking the status of the apps using the Shell:

ticktock stream deployed

To verify that the stream is working as expected, tail the logs of the ticktock-log using nomad:

vagrant@hashistack:~$ nomad logs 71f7aba1
...
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 14:49:59
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 14:50:01
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 14:50:02
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 14:50:03
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 14:50:04
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 14:50:05
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 14:50:06
...

9.6. Registering Stream applications with Maven resource

The Data Flow Server for Nomad also supports apps registered with a Maven resource URI in addition to the Docker resource type. Using the ticktock stream example above, we will create a similar stream definition but using the Maven resource versions of the apps.

For this example we will register the apps individually using the following command:

dataflow:>app register --type source --name time-mvn --uri maven://org.springframework.cloud.stream.app:time-source-kafka:1.1.0.RELEASE
Successfully registered application 'source:time-mvn'
dataflow:>app register --type sink --name log-mvn --uri maven://org.springframework.cloud.stream.app:log-sink-kafka:1.1.0.RELEASE
Successfully registered application 'sink:log-mvn'
We couldn’t bulk import the Maven version of the apps as we did for the Docker versions because the app names would conflict, as the names defined in the bulk import files are the same across resource types. Hence we register the Maven apps with a -mvn suffix.

9.7. Deploy a simple stream in the shell

Create a simple ticktock-mvn stream definition and deploy it immediately using the following command:

dataflow:>stream create --name ticktock-mvn --definition "time-mvn | log-mvn" --deploy
Created new stream 'ticktock-mvn'
Deployment request has been sent
There could be a slight delay once the above command is issued. This is due to the Maven artifacts being resolved and cached locally. Depending on the size of the artifacts, this could take some time.

To verify that the stream is working as expected, tail the logs of the ticktock-mvn-log-mvn using nomad:

$ nomad logs -f 3f474cc7
...
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 18:34:23
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 18:34:25
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 18:34:26
...  INFO 1 --- [afka-listener-1] log-sink                                 : 11/29/16 18:34:27

10. Deploying Tasks on Nomad

Deploying Task applications using the Data Flow Server for Nomad is a similar affair to deploying Stream apps. Therefore, for brevity, only the Maven resource version of the task will be shown as an example.

10.1. Registering Task application with Maven resource

This time we will bulk import the Task applications, as we do not have any Docker resource versions imported that would cause naming conflicts. Import all Maven task applications with the following command:

dataflow:>app import --uri http://bit.ly/1-0-1-GA-task-applications-maven

10.2. Launch a simple task in the shell

Let’s create a simple task definition and launch it.

dataflow:>task create task1 --definition "timestamp"
dataflow:>task launch task1

Verify that the task executed successfully by executing these commands:

dataflow:>task list
╔═════════╤═══════════════╤═══════════╗
║Task Name│Task Definition│Task Status║
╠═════════╪═══════════════╪═══════════╣
║task1    │timestamp      │complete   ║
╚═════════╧═══════════════╧═══════════╝

dataflow:>task execution list
╔═════════╤══╤═════════════════════════════╤═════════════════════════════╤═════════╗
║Task Name│ID│         Start Time          │          End Time           │Exit Code║
╠═════════╪══╪═════════════════════════════╪═════════════════════════════╪═════════╣
║task1    │1 │Tue Dec 13 15:34:01 SAST 2016│Tue Dec 13 15:34:01 SAST 2016│0        ║
╚═════════╧══╧═════════════════════════════╧═════════════════════════════╧═════════╝

You can also view the task execution status by using the Data Flow Server UI.

10.2.1. Cleanup completed tasks

If you want to delete the task definition created by this example, execute the following:

dataflow:>task destroy --name task1

Configuration

The Data Flow Server for Nomad supports all the common configuration options. See NomadDeployerProperties for the supported deployer configuration items.

11. Maven Configuration

The Maven configuration is important for resolving Maven app artifacts. The following example configures a remote Maven repository named spring:

maven:
  remote-repositories.spring:
    url: http://repo.spring.io/libs-snapshot
    auth:
      username:
      password:
More configuration options can be seen in the Configure Maven Properties section in the Data Flow reference documentation.

Server Implementation

12. Server Properties

The following properties can be used to configure the Data Flow Nomad Server.

Name Usage Example Description

API Hostname/IP

spring.cloud.deployer.nomad.nomadHost=nomad-client.cluster.com

The hostname/IP address where a Nomad client is listening. Default is localhost

API Port

spring.cloud.deployer.nomad.nomadPort=14646

The port where a Nomad client is listening. Default is 4646

Region

spring.cloud.deployer.nomad.region=eu-west

The region to deploy apps into. Defaults to global. See here

Datacenters

spring.cloud.deployer.nomad.datacenters=dc1,dc2

A comma separated list of datacenters that should be targeted for deployment. Default value is dc1. See here

Priority

spring.cloud.deployer.nomad.priority=25

The default job priority. Default value is 50. See here

Environment Variables

spring.cloud.deployer.nomad.environmentVariables=ENV_VAR_1=test

Common environment variables to set for any deployed app.

Expose app via Fabio

spring.cloud.deployer.nomad.exposeViaFabio=true

Flag to indicate whether an app should be exposed via Fabio

HTTP Health Check - Path

spring.cloud.deployer.nomad.checkHttpPath=/management/health

The path of the HTTP endpoint which Consul (if available) will query to determine the health of the app

HTTP Health Check - Interval

spring.cloud.deployer.nomad.checkInterval=60000

This indicates the frequency of the health checks that Consul will perform. Specified in milliseconds. See here

HTTP Health Check - Timeout

spring.cloud.deployer.nomad.checkTimeout=14646

This indicates how long Consul will wait for a health check query to succeed. Specified in milliseconds. See here

Resource - CPU

spring.cloud.deployer.nomad.resources.cpu=500

The CPU required in MHz. Default is 1000MHz

Resource - Memory

spring.cloud.deployer.nomad.resources.memory=1024

The memory required in MB. Default is 512MB

Resource - Network

spring.cloud.deployer.nomad.resources.networkMBits=100

The network bandwidth required, in MBits

Ephemeral Disk - Sticky

spring.cloud.deployer.nomad.resources.ephemeralDisk.sticky=false

Specifies that Nomad should make a best-effort attempt to place the updated allocation on the same machine. See here

Ephemeral Disk - Migrate

spring.cloud.deployer.nomad.ephemeralDisk.migrate=false

Specifies that the Nomad client should make a best-effort attempt to migrate the ephemeral disk data to the new node if the allocation cannot be placed on the previous node

Ephemeral Disk - Size

spring.cloud.deployer.nomad.ephemeralDisk.size=500

Specifies the size of the ephemeral disk in MB

Logging - Max Files

spring.cloud.deployer.nomad.loggingMaxFiles=2

The maximum number of rotated files Nomad will retain. The default is 1

Logging - Max File Size

spring.cloud.deployer.nomad.loggingMaxFileSize=20

The size of each rotated file, specified in MB. The default is 10MB

Restart Policy - Delay

spring.cloud.deployer.nomad.restartPolicyDelay=10000

A duration to wait before restarting a task. Specified in milliseconds. Default is 30000 milliseconds (30 seconds). See here

Restart Policy - Interval

spring.cloud.deployer.nomad.restartPolicyInterval=5000

The interval begins when the first task starts and ensures that only the configured number of restart attempts happen within it. Specified in milliseconds. Default is 120000 milliseconds (2 minutes). See here

Restart Policy - Attempts

spring.cloud.deployer.nomad.restartPolicyAttempts=1

Attempts is the number of restarts allowed within an Interval. Default is 3 attempts within the configured interval. See here

Restart Policy - Mode

spring.cloud.deployer.nomad.restartPolicyMode=fail

Mode is given as a string and controls the behavior when the task fails more than the allowed number of attempts within an interval. Default value is "delay". See here

Docker - Entrypoint Style

spring.cloud.deployer.nomad.entryPointStyle=shell

Entry point style used for the Docker image. This determines how application properties are passed into the container

Docker - Volumes

spring.cloud.deployer.nomad.volumes=/var/test:/test

A comma separated list of host_path:container_path values. See here

Maven - Artifact Destination

spring.cloud.deployer.nomad.artifactDestination=local/app

The destination (path) where artifacts will be downloaded by default. Default value is local. See here

Maven - Java Options

spring.cloud.deployer.nomad.javaOpts=-Xms64m,-Xmx128m

A comma separated list of default Java options to pass to the JVM. Only applicable to the Maven resource deployer implementation. See here for reference

Maven - Deployer Scheme

spring.cloud.deployer.nomad.deployerScheme=https

The URI scheme that the deployer server is running on. When deploying Maven resource based apps, the artifact source URL includes the server’s host and port. This property value is used when constructing the source URL. See here

Maven - Deployer Host

spring.cloud.deployer.nomad.deployerHost=scdf-nomad.cluster.com

The resolvable hostname or IP address that the deployer server is running on. When deploying Maven resource based apps, the artifact source URL includes the server’s host and port. This property value is used when constructing the source URL. See here

Maven - Deployer Port

spring.cloud.deployer.nomad.deployerPort=443

The port that the deployer server is listening on. When deploying Maven resource based apps, the artifact source URL includes the server’s host and port. This property value is used when constructing the source URL. See here

Maven - Minimum Java Version

spring.cloud.deployer.nomad.minimumJavaVersion=1.7

If set, the allocated node must support at least this version of a Java runtime environment. E.g. '1.8' for a minimum of a Java 8 JRE/JDK. See here
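
Taken together, a typical server configuration might set a handful of these properties, for example (an illustrative sketch only, combining values from the table above):

spring.cloud.deployer.nomad:
  nomadHost: nomad-client.cluster.com
  nomadPort: 4646
  datacenters: dc1,dc2
  resources:
    cpu: 500
    memory: 512
  exposeViaFabio: true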

Deployment Properties

The following deployment properties are supported by the Data Flow Server for Nomad. These properties are passed as deployment properties when deploying streams or tasks. Below is an example of deploying a stream definition:

dataflow:>stream create --name test --definition "time | custom | log"
Created new stream 'test'

dataflow:>stream deploy test --properties "app.custom.spring.cloud.deployer.nomad.job.priority=75"
Deployment request has been sent for stream 'test'

Note the deployment property app.custom.spring.cloud.deployer.nomad.job.priority=75.

13. Supported Deployment Properties

Name Usage Example Description

Job Priority

spring.cloud.deployer.nomad.job.priority=75

Job priority. See here

Fabio - Expose flag

spring.cloud.deployer.nomad.fabio.expose=true

A flag to indicate whether the tags/labels that enable Fabio to configure routing are added or not. Specify a value of true to enable adding the URL prefix tags. See here

Fabio - Hostname

spring.cloud.deployer.nomad.fabio.route.hostname=testapp

The hostname that will be exposed via Fabio. If no hostname is provided, the deploymentId will be used. See here

Resources - CPU

spring.cloud.deployer.nomad.cpu=500

The CPU required in MHz

Resources - Memory

spring.cloud.deployer.nomad.memory=1024

The memory required in MB

Environment Variables

spring.cloud.deployer.nomad.environmentVariables=ENV_VAR_1=ENV_VAL_1

Environment variables passed at deployment time. This is to cater for adding variables like JAVA_OPTS to supported deployer types

Docker - Entrypoint Style

spring.cloud.deployer.nomad.entryPointStyle=shell

Entry point style used for the Docker image. This determines how application properties are passed into the container

Docker - Volumes

spring.cloud.deployer.nomad.volumes=/var/test:/test

A comma separated list of host_path:container_path values. See here

Meta

spring.cloud.deployer.nomad.meta=streamVersion=1.0.0,streamDescription=A test stream

An optional comma separated list of meta key/value pairs to add at the Job level

Ephemeral Disk - Sticky

spring.cloud.deployer.nomad.resources.ephemeralDisk.sticky=false

Specifies that Nomad should make a best-effort attempt to place the updated allocation on the same machine. See here

Ephemeral Disk - Migrate

spring.cloud.deployer.nomad.ephemeralDisk.migrate=false

Specifies that the Nomad client should make a best-effort attempt to migrate the ephemeral disk data to the new node if the allocation cannot be placed on the previous node

Ephemeral Disk - Size

spring.cloud.deployer.nomad.ephemeralDisk.size=500

Specifies the size of the ephemeral disk in MB

Maven - Java Options

spring.cloud.deployer.nomad.javaOpts=-Xms64m,-Xmx128m

A comma separated list of default Java options to pass to the JVM. Only applicable to the Maven resource deployer implementation. See here for reference
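
As with the server properties, several deployment properties can be combined in a single deploy command. For example (a sketch, reusing the test stream and custom app from the example above):

dataflow:>stream deploy test --properties "app.custom.spring.cloud.deployer.nomad.cpu=500,app.custom.spring.cloud.deployer.nomad.memory=1024,app.custom.spring.cloud.deployer.nomad.fabio.expose=true"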

‘How-to’ guides

This section provides answers to some common ‘how do I do that…​’ type of questions that often arise when using Spring Cloud Data Flow.

14. Deploying Custom Stream App as a Maven Resource

This section walks you through deploying a simple Spring Cloud Stream based application, packaged as a Maven artifact, with Nomad. The source code for this app is available in the following GitHub repository.

This guide assumes that you have gone through the Getting Started section and are using a local hashistack-vagrant environment. Adjust the steps accordingly if you are using an existing Nomad cluster.

14.1. Deploy a Nexus Repository

For Nomad to deploy the Maven artifact, it must be able to resolve and download the custom app’s Jar artifact. This means that the custom app must be deployed to an accessible Maven repository.

We will deploy a Nexus job specification to which we can deploy our custom application. Deploying the Nexus image is trivial thanks to a provided Nexus job specification available here.

Using nomad, run the Nexus job with:

vagrant@hashistack:~$ nomad run https://raw.githubusercontent.com/donovanmuller/spring-cloud-dataflow-server-nomad/v1.1.0.RELEASE/src/etc/nomad/nexus.nomad
...

If you are using a local nomad binary, you can reference the remote nexus.nomad file directly.

$ nomad run --address=http://172.16.0.2:4646 https://raw.githubusercontent.com/donovanmuller/spring-cloud-dataflow-server-nomad/v1.1.0.RELEASE/src/etc/nomad/nexus.nomad
...

Wait for the Nexus image to be pulled and the deployment to be successful. Once the task is successfully running you should be able to access the Nexus UI at nexus.hashistack.vagrant.

The default credentials for Nexus are admin/admin123 or deployment/deployment123

14.2. Configuring the Data Flow Server for Nomad

We need to configure the Data Flow Server to use this new Nexus instance as a remote Maven repository. If you have an existing deployment from the Getting Started section, you will have to change its configuration.
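
Concretely, the Maven configuration (see the Maven Configuration section) needs a remote repository entry pointing at the Nexus snapshots repository, along these lines (a sketch only; the scdf-with-nexus.nomad job specification mentioned below is the source of truth):

maven:
  remote-repositories.nexus:
    url: http://nexus.hashistack.vagrant/content/repositories/snapshots
    auth:
      username: deployment
      password: deployment123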

You could edit the scdf.nomad job specification and rerun the job, but possibly the easiest way to include the Nexus configuration is to remove the existing scdf job and run the scdf-with-nexus.nomad job specification.

vagrant@hashistack:~$ nomad stop scdf
...
vagrant@hashistack:~$ nomad run https://raw.githubusercontent.com/donovanmuller/spring-cloud-dataflow-server-nomad/v1.1.0.RELEASE/src/etc/nomad/scdf-with-nexus.nomad
...

Wait for the job and all tasks to be in a running state. You can verify that everything including Nexus has been started by checking Consul:

nexus

14.3. Cloning and Deploying the App

The next step is to deploy our custom app into the Nexus instance. First, though, we need to clone the custom app source.

$ git clone https://github.com/donovanmuller/timezone-processor-kafka.git
$ cd timezone-processor-kafka

Next we deploy the application into our Nexus repository with:

$ ./mvnw -s .settings.xml deploy -Dnexus.url=http://nexus.hashistack.vagrant/content/repositories/snapshots
...
Uploading: http://nexus.hashistack.vagrant/content/repositories/snapshots/io/switchbit/timezone-processor-kafka/maven-metadata.xml
Uploaded: http://nexus.hashistack.vagrant/content/repositories/snapshots/io/switchbit/timezone-processor-kafka/maven-metadata.xml (294 B at 9.3 KB/sec)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11.171 s
[INFO] Finished at: 2016-12-13T15:39:43+02:00
[INFO] Final Memory: 25M/326M
[INFO] ------------------------------------------------------------------------
Substitute the value for -Dnexus.url with the URL matching your Nexus instance.

14.4. Deploying the Stream

Now that our custom app is ready, let’s register it with the Data Flow Server. Using the Data Flow Shell, targeted to our Nomad instance, register the timezone app with:

dataflow:>app register --name timezone --type processor --uri maven://io.switchbit:timezone-processor-kafka:1.0-SNAPSHOT
Successfully registered application 'processor:timezone'

The assumption is that the out-of-the-box apps have been imported previously as part of the Getting Started section. If the apps are not imported, import them now with:

dataflow:>app import --uri http://bit.ly/stream-applications-kafka-docker

It does not really matter whether the Docker or Maven out-of-the-box apps are registered.

Now we can define a stream using our timezone processor with:

dataflow:>stream create --name timezoney --definition "time | timezone | log"
Created new stream 'timezoney'

and deploy it with:

dataflow:>stream deploy timezoney --properties "app.timezone.timezone=Africa/Johannesburg"
Deployment request has been sent for stream 'timezoney'
The provided deployment property (app.timezone.timezone=Africa/Johannesburg) specifies the timezone to which the input times should be converted.

Check the jobs created with:

vagrant@hashistack:~$ nomad status
ID                  Type     Priority  Status
nexus               service  50        running
scdf-nexus          service  50        running
timezoney-log       service  50        running
timezoney-time      service  50        running
timezoney-timezone  service  50        running

or check the Consul health checks for the timezoney-* stream apps:

timezoney stream deployed

View both the timezoney-timezone-0 and timezoney-log-0 apps for the expected log outputs.

vagrant@hashistack:~$ nomad logs -f 35c70aec
...
...  INFO 17275 --- [afka-consumer-1] o.s.c.s.b.k.KafkaMessageChannelBinder$1  : partitions revoked:[timezoney.time-0]
...  INFO 17275 --- [afka-consumer-1] o.s.c.s.b.k.KafkaMessageChannelBinder$1  : partitions assigned:[timezoney.time-0]
...  INFO 17275 --- [afka-listener-2] io.switchbit.TimezoneProcessor           : Converting time '12/13/16 20:01:29' to timezone: 'Africa/Johannesburg'
...  INFO 17275 --- [afka-listener-2] io.switchbit.TimezoneProcessor           : Converting time '12/13/16 20:01:30' to timezone: 'Africa/Johannesburg'
...  INFO 17275 --- [afka-listener-2] io.switchbit.TimezoneProcessor           : Converting time '12/13/16 20:01:31' to timezone: 'Africa/Johannesburg'
...  INFO 17275 --- [afka-listener-2] io.switchbit.TimezoneProcessor           : Converting time '12/13/16 20:01:32' to timezone: 'Africa/Johannesburg'
...  INFO 17275 --- [afka-listener-2] io.switchbit.TimezoneProcessor           : Converting time '12/13/16 20:01:33' to timezone: 'Africa/Johannesburg'

vagrant@hashistack:~$ nomad logs -f 047ef240
...
...  INFO 1 --- [afka-listener-1] log-sink                                 : 12/13/16 22:03:17
...  INFO 1 --- [afka-listener-1] log-sink                                 : 12/13/16 22:03:18
...  INFO 1 --- [afka-listener-1] log-sink                                 : 12/13/16 22:03:19
...  INFO 1 --- [afka-listener-1] log-sink                                 : 12/13/16 22:03:20
...  INFO 1 --- [afka-listener-1] log-sink                                 : 12/13/16 22:03:21

Once you’re done, destroy the stream with:

dataflow:>stream destroy timezoney
Destroyed stream 'timezoney'