Docker Pull Resolver Problem

Say you're running Docker in swarm mode and your CI (e.g. a GitLab runner) runs as a service in that swarm.
Next to your CI service, you're also running a registry inside the swarm.

When your runner enters the CD stage, it will probably want to run docker push, and it might end up like this:

docker push registry:5000/some_shit
Using default tag: latest
Error response from daemon: Get http://registry:5000/v2/: dial tcp: lookup registry on no such host

Because you're clever, you attach to the runner service and start debugging like this:

# system configuration
cat /etc/resolv.conf 
search eu-central-1.compute.internal
options timeout:2 attempts:5 ndots:0

# resolves correctly
dig registry

; <<>> DiG 9.11.2 <<>> registry
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33867
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;registry.			IN	A
registry.		600	IN	A
;; Query time: 0 msec
;; WHEN: Wed Jan 24 10:48:31 UTC 2018
;; MSG SIZE  rcvd: 58

# other programs behave correctly too
curl http://registry:5000/v2/_catalog

And now you don't understand the world anymore.

The solution to the puzzle lies in how you started your CI service.

docker service create \
--mount type=bind,source=/var/run/docker.sock,destination=/var/run/docker.sock \

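So the docker client in your runner is talking to the Docker daemon of your host system. That daemon is the one that actually performs the push, and it is not attached to the swarm network, so it cannot resolve service names like registry the way your container can.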
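
A quick way to check for this situation from inside the runner container is to look for the mounted socket. This is only a minimal sketch: the function name uses_host_daemon is made up for illustration, and the default path is the one from the mount option above. A socket at that path most likely means the daemon lives outside the container, but it does not prove it.

```shell
# Hypothetical helper: succeeds if the given path is a Unix socket.
# With the bind mount shown above, that socket belongs to the host daemon.
uses_host_daemon() {
  sock=${1:-/var/run/docker.sock}
  [ -S "$sock" ]
}

if uses_host_daemon; then
  echo "docker.sock found: push and pull are probably executed by a daemon outside this container"
else
  echo "no docker.sock found at the default path"
fi
```

If the check succeeds, name resolution inside the container (dig, curl) tells you nothing about what the daemon itself can resolve.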

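The solution: publish the port of your registry (-p 5000:5000) and use localhost:5000 for pull and push in your CI job. A port published this way goes through the swarm routing mesh and is reachable on every node, so localhost:5000 works from the host daemon no matter where the runner is scheduled.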
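
Put together, the workaround could look like the sketch below. The service name registry comes from the example above; the registry:2 image, the image name myapp:1.0 and the helper to_published_ref are assumptions for illustration only. The docker commands are shown as comments because they need a running swarm.

```shell
# Assumed setup, run once on a manager node:
#   docker service create --name registry --publish 5000:5000 registry:2

# Hypothetical helper: replace the registry host of an image reference
# with the published address, keeping the repository path and tag.
to_published_ref() {
  image_ref=$1
  published=${2:-localhost:5000}
  printf '%s/%s\n' "$published" "${image_ref#*/}"
}

to_published_ref registry:5000/myapp:1.0   # prints localhost:5000/myapp:1.0

# In the CI job you would then tag and push under the published name:
#   docker tag myapp:1.0 "$(to_published_ref registry:5000/myapp:1.0)"
#   docker push "$(to_published_ref registry:5000/myapp:1.0)"
```

Services inside the swarm can keep using registry:5000; only commands that go through the host daemon need the localhost:5000 name.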