ARMHF Docker Swarm cluster with Consul+Registrator

This blog post introduces the practices and configuration of a Docker Swarm cluster on BeagleBone Black based testbeds. The goal is to orchestrate the different nodes connected to the BeagleBone Blacks: firmware updates, configuration coordination, and data concentration. To achieve this, we use Consul as the backend and Registrator for automatic service registration. The combination of Swarm, Consul and Registrator turns out to be a killer setup for this kind of embedded, distributed system.

Introduction to Consul

Consul is a distributed, highly available, datacenter-aware, service discovery and configuration system. It can be used to present services and nodes in a flexible and powerful interface that allows clients to always have an up-to-date view of the infrastructure they are a part of.

Consul provides many different features that are used to provide consistent and available information about your infrastructure. These include service and node discovery mechanisms, a tagging system, health checks, consensus-based leader election, a system-wide key/value store, and more. By leveraging Consul within an ARMHF cluster of boards like the BeagleBone Black or Raspberry Pi, we can easily build a sophisticated level of awareness into our applications and services.

There are a few neat properties of Consul (a quick taste of its interfaces follows the list):

  • Service discovery
  • Failure detection
  • Key/Value store
  • Bonus: supports multiple datacenters
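To make these features concrete, here is a quick taste of the two interfaces every Consul agent exposes: the HTTP API on port 8500 and the DNS interface on port 8600. This is only an illustrative sketch; the address is the first supernode used later in this post, and rpi-mqtt is a service we will register further down:

$ curl http://192.168.1.146:8500/v1/agent/members         # who is in the cluster
$ curl http://192.168.1.146:8500/v1/kv/interval?raw       # read a key from the K/V store
$ dig @192.168.1.146 -p 8600 rpi-mqtt.service.consul SRV  # resolve a service via DNS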

Introduction to Registrator

Registrator is a single, host-level service you run as a Docker container. It watches for new containers, inspects them for service information, and registers them with a service registry. It also deregisters them when the container dies. It has a pluggable registry system, meaning it can work with a number of service discovery systems. Currently it supports Consul and etcd.

There are a few neat properties of Registrator:

  • Automatic
  • Uses environment variables as generic metadata to define the services (see the sketch after this list).
  • The metadata Registrator uses could become a common interface for automatic service registration beyond Registrator and even beyond Docker.
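As a concrete example of that metadata, Registrator reads SERVICE_* environment variables from a container and turns them into the registered service definition. A minimal sketch, using the MQTT image that appears later in this post; the name and tags are made up for illustration:

$ docker run -d -p 1883:1883 \
    -e "SERVICE_NAME=mqtt-broker" \
    -e "SERVICE_TAGS=testbed,supernode1" \
    vlabakje/rpi-mqtt

With a Registrator instance watching the Docker socket on that host, this container shows up in Consul as a service named mqtt-broker carrying those tags, instead of the default name derived from the image.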

Hardware Configuration

[Photo: the testbed, three BeagleBone Blacks connected to an OpenWrt router over Ethernet]

As you can see in the picture above, we have three BeagleBone Blacks connected to an OpenWrt router via Ethernet cables. Two of them have a CC3200 LaunchPad attached. In the real production environment every BeagleBone Black will have its own CC3200, but for now I replaced one with a USB-UART bridge that mimics the data sent by the CC3200, just to demonstrate the infrastructure.

Setting up the Consul servers

Currently we are running Consul on BeagleBone Black boards with CC3200s connected to them. The idea is to build a dense Wi-Fi testbed for a sniffer project. Following the recommendation of the Consul developers, we should run at least 3 servers in a cluster to tolerate server failures. Thus we have 3 hosts, each running one or more Docker containers with our CC3200 services and each running a Consul agent.

Okay, here are the commands:

(BeagleBoneBlack)ubuntu@localhost:~$ docker run -d \
    -p 192.168.1.146:8300:8300 \
    -p 192.168.1.146:8301:8301 -p 192.168.1.146:8301:8301/udp \
    -p 192.168.1.146:8302:8302 -p 192.168.1.146:8302:8302/udp \
    -p 192.168.1.146:8400:8400 \
    -p 192.168.1.146:8500:8500 \
    -p 192.168.1.146:8600:8600/udp \
    --name supernode1 -h supernode1 \
    sheenhx/consul -advertise 192.168.1.146

(BeagleBoneBlack)ubuntu@localhost:~$ docker run -d \
    -p 192.168.1.115:8300:8300 \
    -p 192.168.1.115:8301:8301 -p 192.168.1.115:8301:8301/udp \
    -p 192.168.1.115:8302:8302 -p 192.168.1.115:8302:8302/udp \
    -p 192.168.1.115:8400:8400 \
    -p 192.168.1.115:8500:8500 \
    -p 192.168.1.115:8600:8600/udp \
    --name supernode2 -h supernode2 \
    sheenhx/consul -advertise 192.168.1.115 -join 192.168.1.146

(BeagleBoneBlack)ubuntu@localhost:~$ docker run -d \
    -p 192.168.1.184:8300:8300 \
    -p 192.168.1.184:8301:8301 -p 192.168.1.184:8301:8301/udp \
    -p 192.168.1.184:8302:8302 -p 192.168.1.184:8302:8302/udp \
    -p 192.168.1.184:8400:8400 \
    -p 192.168.1.184:8500:8500 \
    -p 192.168.1.184:8600:8600/udp \
    --name supernode3 -h supernode3 \
    sheenhx/consul -advertise 192.168.1.184 -join 192.168.1.146

 

On each node, we expose the Consul ports, give the container a numbered name so it acts as a supernode, and, most importantly, join the cluster by pointing at the first node's IP address, 192.168.1.146.
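Before moving on, it is worth verifying that the three servers actually found each other. Assuming the consul binary is on the image's PATH (it usually is in Consul images), something like the following should list all three supernodes, a leader and the raft peers:

$ docker exec supernode1 consul members
$ curl http://192.168.1.146:8500/v1/status/leader
$ curl http://192.168.1.146:8500/v1/status/peers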

I composed the Dockerfile for this Consul container myself, since there is currently no image for the ARMHF platform. The image contains the JSON configuration file for the server, so it can only act as a server; I will release a client version for ARMHF later.
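For reference, the server-side JSON configuration baked into such an image boils down to a handful of keys. The snippet below is only a sketch reconstructed from the agent output shown next; the file path is illustrative and my actual file may differ:

$ cat /config/server.json
{
  "server": true,
  "bootstrap_expect": 3,
  "datacenter": "twist_wifi1",
  "data_dir": "/var/consul",
  "client_addr": "0.0.0.0"
}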

Once you have all the servers running, check them with:

$ docker logs supernode2
==> WARNING: Expect Mode enabled, expecting 3 servers
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Joining cluster...
Join completed. Synced with 1 initial agents
==> Consul agent running!
Node name: 'supernode2'
Datacenter: 'twist_wifi1'
Server: true (bootstrap: false)
Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
Cluster Addr: 192.168.1.115 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>

==> Log data will now stream in as it occurs:

2016/04/03 15:27:57 [INFO] raft: Node at 192.168.1.115:8300 [Follower] entering Follower state
2016/04/03 15:27:57 [INFO] serf: EventMemberJoin: supernode2 192.168.1.115
2016/04/03 15:27:57 [INFO] serf: EventMemberJoin: supernode2.twist_wifi1 192.168.1.115
2016/04/03 15:27:57 [INFO] consul: adding LAN server supernode2 (Addr: 192.168.1.115:8300) (DC: twist_wifi1)
2016/04/03 15:27:57 [INFO] consul: adding WAN server supernode2.twist_wifi1 (Addr: 192.168.1.115:8300) (DC: twist_wifi1)
2016/04/03 15:27:57 [INFO] agent: (LAN) joining: [192.168.1.146]
2016/04/03 15:27:57 [INFO] serf: EventMemberJoin: supernode1 192.168.1.146
2016/04/03 15:27:57 [INFO] consul: adding LAN server supernode1 (Addr: 192.168.1.146:8300) (DC: twist_wifi1)
2016/04/03 15:27:57 [INFO] serf: EventMemberJoin: supernode3 192.168.1.184
2016/04/03 15:27:57 [INFO] consul: Attempting bootstrap with nodes: [192.168.1.115:8300 192.168.1.146:8300 192.168.1.184:8300]
2016/04/03 15:27:57 [INFO] consul: adding LAN server supernode3 (Addr: 192.168.1.184:8300) (DC: twist_wifi1)
2016/04/03 15:27:57 [INFO] agent: (LAN) joined: 1 Err: <nil>
2016/04/03 15:27:57 [ERR] agent: failed to sync remote state: No cluster leader
2016/04/03 15:27:58 [INFO] consul: New leader elected: supernode3
2016/04/03 15:28:00 [INFO] agent: Synced service 'consul'
==> Failed to check for updates: Get https://checkpoint-api.hashicorp.com/v1/check/consul?arch=arm&os=linux&signature=c4206bd4-2603-41bb-8685-ad6c2feb0cbc&version=0.6.4: x509: failed to load system roots and no roots provided
2016/04/03 15:32:59 [WARN] memberlist: Was able to reach supernode3 via TCP but not UDP, network may be misconfigured and not allowing bidirectional UDP
2016/04/03 19:42:50 [INFO] raft: Duplicate RequestVote for same term: 2
2016/04/03 19:42:50 [INFO] consul: New leader elected: supernode3
2016/04/03 19:42:51 [INFO] consul: New leader elected: supernode3
2016/04/03 19:43:50 [INFO] consul: New leader elected: supernode3
2016/04/03 19:43:51 [ERR] agent: coordinate update error: rpc error: No cluster leader
2016/04/03 19:43:52 [WARN] raft: Rejecting vote request from 192.168.1.146:8300 since we have a leader: 192.168.1.184:8300
2016/04/03 19:43:52 [INFO] consul: New leader elected: supernode1
2016/04/04 10:10:02 [INFO] consul.fsm: snapshot created in 359.436µs
2016/04/04 10:10:02 [INFO] raft: Starting snapshot up to 8195
2016/04/04 10:10:02 [INFO] snapshot: Creating new snapshot at /var/consul/raft/snapshots/5-8195-1459764602702.tmp
2016/04/04 10:10:02 [INFO] raft: Snapshot to 8195 complete

I ran it for two days. As we can see, supernode3 was elected leader initially, and later the leadership moved to supernode1. This all happens automatically.

Here comes the problem: we will run different containers on each node, all exposing different ports and having different names. We also want to check their health regularly and, if something goes wrong, respawn them immediately. How can we do that?

Luckily, we have registrator.

Setting up Registrator for automatic service registration

Registrator sits there quietly, watching for new containers started on the same host it runs on, extracting service information from them and registering those containers with your service discovery solution. It also watches for containers that are stopped (or simply die) and deregisters them. Additionally, it supports pluggable service discovery mechanisms, so you are not restricted to any particular solution.

Let’s run the docker image I compiled for ARMHF:

$ docker run -d \
-v /var/run/docker.sock:/tmp/docker.sock \
--name registrator -h registrator \
sheenhx/armhf-registrator:latest consul://192.168.1.115:8500

$ docker logs registrator
2016/04/04 21:00:49 Starting registrator v7 ...
2016/04/04 21:00:49 Using consul adapter: consul://192.168.1.115:8500
2016/04/04 21:00:49 Connecting to backend (0/0)
2016/04/04 21:00:49 consul: current leader 192.168.1.146:8300
2016/04/04 21:00:49 Listening for Docker events ...
2016/04/04 21:00:49 Syncing services on 2 containers
2016/04/04 21:00:49 ignored: 6b814ce92196 no published ports
2016/04/04 21:00:49 ignored: e24064b4d1ef port 8600 not published on host
2016/04/04 21:00:49 added: e24064b4d1ef registrator:supernode2:8600:udp
2016/04/04 21:00:50 added: e24064b4d1ef registrator:supernode2:8300
2016/04/04 21:00:50 added: e24064b4d1ef registrator:supernode2:8301
2016/04/04 21:00:50 added: e24064b4d1ef registrator:supernode2:8301:udp
2016/04/04 21:00:50 added: e24064b4d1ef registrator:supernode2:8302
2016/04/04 21:00:50 added: e24064b4d1ef registrator:supernode2:8400
2016/04/04 21:00:51 added: e24064b4d1ef registrator:supernode2:8500
2016/04/04 21:00:51 added: e24064b4d1ef registrator:supernode2:8302:udp

 

Then we will have all the services registered on the consul backend:

[Consul UI: all the Consul services registered on the backend]
Let's run an MQTT container on supernode1, then take a look at the service backend.

 

supernode1$ docker run -d -p 1883:1883 --name mqtt1 vlabakje/rpi-mqtt

 

Then check the backend:

[Consul UI: the rpi-mqtt service registered on supernode1]

 

It is there! Magic, huh?

Let’s it another time on supernode2, then query the Consul from OpenWrt:

supernode2$ docker run -d -p 1883:1883 --name mqtt2 vlabakje/rpi-mqtt

 

root@OpenWrt:~# curl 192.168.1.146:8500/v1/catalog/service/rpi-mqtt
[
    {"Node":"supernode1",
     "Address":"192.168.1.146",
     "ServiceID":"registrator:mqtt1:1883",
     "ServiceName":"rpi-mqtt",
     "ServiceTags":[],
     "ServiceAddress":"",
     "ServicePort":1883,
     "ServiceEnableTagOverride":false,
     "CreateIndex":13123,
     "ModifyIndex":13123},

    {"Node":"supernode2",
     "Address":"192.168.1.115",
     "ServiceID":"registrator:mqtt1:1883",
     "ServiceName":"rpi-mqtt",
     "ServiceTags":[],
     "ServiceAddress":"",
     "ServicePort":1883,
     "ServiceEnableTagOverride":false,
     "CreateIndex":13136,
     "ModifyIndex":13136}
]

 

As we can see from the query, we could also add more attributes, such as tags and a service address.
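For instance, Registrator picks tags up from the same SERVICE_* environment variables mentioned earlier, and newer Registrator releases can also register Consul health checks from SERVICE_CHECK_* variables, which is one way to cover the health-monitoring requirement from above (I have not verified the check variables on the v7 build used here). A sketch, with made-up name and tags:

supernode1$ docker run -d -p 1883:1883 \
    -e "SERVICE_NAME=mqtt" \
    -e "SERVICE_TAGS=broker,supernode1" \
    vlabakje/rpi-mqtt

root@OpenWrt:~# curl "http://192.168.1.146:8500/v1/catalog/service/mqtt?tag=broker"

The ?tag= filter on the catalog endpoint then returns only the instances carrying that tag.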

 

Serial Concentrator

Now that service discovery and registration are working, we can use the check functionality to ensure the high availability of the ARMHF cluster. We still need to exploit the most important functionality of Consul for our purposes: the K/V store.

Ideally, all the end nodes are connected to a BeagleBone Black via a USB cable, and the data is transmitted over UART. That is why we need a serial concentrator: it extracts the data from the USB port and publishes it to the server.
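At its lowest level, pulling the measurements off the board is nothing more than reading that UART. A rough shell equivalent of what the concentrator container does internally; the device node and the 115200 baud rate are assumptions for illustration:

$ stty -F /dev/ttyUSB0 115200 raw -echo   # configure the serial port
$ cat /dev/ttyUSB0                        # dump whatever the CC3200 sends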

At the same time, we would also like to make sure that all the nodes stay synchronized with the same configuration.

Let’s think about the normal cluster without consul and registrator: Usually the commands are send by the data channel that we use for the concentrator,  which causes a potential problem that if  someone wants to check the current configuration, we should send the query via UART. This takes a lot of time and has a high change of failure.

To deal with this problem, we store all the configuration values, such as the Wi-Fi channel and Wi-Fi mode, in the key/value store. When a configuration change is applied, we can then see which node has not picked it up and where the error occurred.

 

Type the following command on any of the nodes:

$ docker run -d --env-file ./env.list --device /dev/ttyUSB:/dev/ttyUSB --link supernodeX:consul --name [MAC] sheenhx/armhf-concentrator

 

We use --device to map the local UART port into the container. For the CC3200, you need to enumerate the UART port manually on the BeagleBone with:

 $ modprobe ftdi_sio
 $ echo 0451 c32a > /sys/bus/usb-serial/drivers/ftdi_sio/new_id
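After loading the module and adding the vendor/product ID, it is worth checking that the kernel actually created a device node; something along these lines should show a ttyUSB entry:

$ dmesg | grep -i ftdi
$ ls -l /dev/ttyUSB*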

 

We can use the following HTTP APIs to test this concentrator:

$ curl -X PUT -d '1000' http://192.168.1.146:8500/v1/kv/interval?flags=1     # change the packet-collection interval

$ curl -X PUT -d 'START' http://192.168.1.146:8500/v1/kv/status/all?flags=1  # restart the concentrator
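Reading the values back is just a GET, and the concentrator presumably notices the "K/V changed!" events in its log below through Consul blocking queries: you pass the index from a previous response (the X-Consul-Index header) as ?index= and the request hangs until the key changes or the wait time expires. A sketch, with an illustrative index value:

$ curl http://192.168.1.146:8500/v1/kv/interval?raw                     # plain value
$ curl "http://192.168.1.146:8500/v1/kv/interval?index=13123&wait=30s"  # blocks until the key changes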

 

And the corresponding logs for the concentrator:

$ docker logs 0033
K/V changed!
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
INFO:root:Sending command: CFG
INFO:root:Command result: OK
INFO:root:Sending command: CFG+INTVL=10
INFO:root:Command result: OK
INFO:root:True
INFO:root:True
INFO:root:True
K/V changed!
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
K/V changed!
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
INFO:root:Sending command: RESTART
INFO:root:Command result: OK
INFO:root:update Consul K/V Status
INFO:root:True
K/V changed!
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1

 

Here comes the backend UI:

[Consul UI: the K/V store showing the updated channel and mode values]

The channel and mode values are all updated. We can also curl the server and use the JSON response in later programming logic.
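For that programming logic, keep in mind that the K/V HTTP API returns the Value field base64-encoded. A quick way to get at the plain value on any box with jq installed (jq is an assumption here, not part of the testbed setup):

$ curl -s http://192.168.1.146:8500/v1/kv/status/all | jq -r '.[0].Value' | base64 -d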

 

Docker Swarm

Finally, Docker Swarm is running on the ARMHF BeagleBone Blacks:

(BeagleBoneBlack)ubuntu@localhost:~$ docker -H :4000 info
Containers: 24
 Running: 14
 Paused: 0
 Stopped: 10
Images: 33
Server Version: swarm/1.2.3
Role: primary
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 5
 localhost.localdomain: 192.168.1.88:2375
  └ ID: P4XZ:IBAD:X57O:TEWM:PC2M:AQZK:4L35:Y5RE:ULED:PFXR:ESFR:7DO5
  └ Status: Healthy
  └ Containers: 4
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 514.8 MiB
  └ Labels: executiondriver=, kernelversion=3.19.0-66-generic, operatingsystem=, storagedriver=aufs
  └ UpdatedAt: 2016-09-10T10:57:49Z
  └ ServerVersion: 1.11.2
 localhost.localdomain: 192.168.1.90:2375
  └ ID: MFM7:ZKT4:KOKS:ETCS:O5QV:H5SJ:MOKD:IEOW:WAPO:IUMU:VSTP:7QS7
  └ Status: Healthy
  └ Containers: 3
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 514.8 MiB
  └ Labels: executiondriver=, kernelversion=3.19.0-66-generic, operatingsystem=Ubuntu 15.04, storagedriver=aufs
  └ UpdatedAt: 2016-09-10T10:58:00Z
  └ ServerVersion: 1.11.2
 localhost.localdomain: 192.168.1.89:2375
  └ ID: CI6Q:ZO32:5GAG:ZWZC:PLRP:EU5Q:WP6J:E3NQ:EFPT:2RB4:3NG4:X6KF
  └ Status: Healthy
  └ Containers: 3
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 514.8 MiB
  └ Labels: executiondriver=, kernelversion=3.19.0-66-generic, operatingsystem=, storagedriver=aufs
  └ UpdatedAt: 2016-09-10T10:58:09Z
  └ ServerVersion: 1.11.2
 localhost.localdomain: 192.168.1.92:2375
  └ ID: GAY6:AH2V:NPLS:4N42:VYTF:YGPC:EKS3:VZQJ:DKBJ:EDMN:PATD:URFD
  └ Status: Healthy
  └ Containers: 11
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 514.8 MiB
  └ Labels: executiondriver=, kernelversion=3.19.0-66-generic, operatingsystem=Ubuntu 15.04, storagedriver=aufs
  └ UpdatedAt: 2016-09-10T10:58:16Z
  └ ServerVersion: 1.11.2
 localhost.localdomain: 192.168.1.87:2375
  └ ID: ZDD7:KRGJ:LL7W:CHTB:E6IH:YNAV:6EK2:2QMP:NHK4:65KG:CK7F:DC3W
  └ Status: Healthy
  └ Containers: 3
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 514.8 MiB
  └ Labels: executiondriver=, kernelversion=3.19.0-66-generic, operatingsystem=, storagedriver=aufs
  └ UpdatedAt: 2016-09-10T10:57:53Z
  └ ServerVersion: 1.11.2
Plugins: 
 Volume: 
 Network: 
Kernel Version: 3.19.0-66-generic
Operating System: linux
Architecture: arm
CPUs: 5
Total Memory: 2.514 GiB
Name: 70156ddad663
Docker Root Dir: 
Debug mode (client): false
Debug mode (server): false
WARNING: No kernel memory limit support

To set this up, just follow the instructions at https://docs.docker.com/swarm/install-manual/.

But on Ubuntu Snappy, we need to open the remote API port via the systemd configuration:

(BeagleBoneBlack)ubuntu@localhost:~$ cat /var/lib/apps/docker/current/etc/docker.conf
# Docker systemd conf

DOCKER_OPTIONS="-H tcp://0.0.0.0:2375"

Then restart the Docker daemon:

(BeagleBoneBlack)ubuntu@localhost:~$ sudo systemctl restart docker_docker-daemon_1.11.2
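A quick sanity check that the remote API is actually reachable after the restart; run it from any machine on the LAN, the IP being one of the swarm nodes listed above:

$ docker -H tcp://192.168.1.87:2375 version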

Use the ARMHF build of Swarm:

$ docker run -d -p 4000:4000 hypriot/rpi-swarm manage -H :4000 --replication --advertise 192.168.1.92:4000 consul://192.168.1.87:8500
$ docker run -d hypriot/rpi-swarm join --advertise=192.168.1.87:2375 consul://192.168.1.87:8500
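From here on, the Swarm manager endpoint behaves like a single big Docker host; with the spread strategy shown in the info output above, new containers land on the node with the fewest containers. A sketch using the MQTT image from earlier:

$ docker -H :4000 run -d -p 1883:1883 vlabakje/rpi-mqtt
$ docker -H :4000 ps   # shows which node the container was scheduled on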

Congrats!

Written by Sheen

A lead hardware engineer based in Berlin, with six years' experience in wireless hardware design and embedded programming. He holds two Chinese patents and has built several award-winning IoT products. He now focuses on bridging hardware innovation between the rest of the world and Shenzhen.
