Docker Pull Resolver Problem

Say you're running Docker in swarm mode and your CI (e.g. GitLab Runner) runs as a service in the swarm.
Next to your CI service, you're also running a registry inside the swarm.

When your runner enters the CD stage, it will probably want to run docker push, and that might end up like this:

docker push registry:5000/some_shit
Using default tag: latest
Error response from daemon: Get http://registry:5000/v2/: dial tcp: lookup registry on 10.50.0.2:53: no such host

Because you're clever, you attach to the runner's container and start debugging like this:

# system configuration
cat /etc/resolv.conf 
search eu-central-1.compute.internal
nameserver 127.0.0.11
options timeout:2 attempts:5 ndots:0

# resolves correctly
dig registry

; <<>> DiG 9.11.2 <<>> registry
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33867
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;registry.			IN	A
;; ANSWER SECTION:
registry.		600	IN	A	10.0.1.5
;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Wed Jan 24 10:48:31 UTC 2018
;; MSG SIZE  rcvd: 58

# other programs behave correctly too
curl http://registry:5000/v2/_catalog
{"repositories":["some_shit"]}

And now you don't understand the world anymore.

The solution to the puzzle lies in how you started your CI service:

docker service create \
...
--mount type=bind,source=/var/run/docker.sock,destination=/var/run/docker.sock \
...

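So the docker CLI inside the runner talks to the Docker daemon of your host system, and it is that daemon which performs the push. The host daemon is not attached to the swarm's overlay network, so it resolves names with the host's resolver (10.50.0.2 in the error above) instead of the swarm's embedded DNS at 127.0.0.11, which is the only place where the name registry is known.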
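The clue is already in the transcripts: the failed lookup went to 10.50.0.2, while the container itself resolves through 127.0.0.11. A minimal sketch of that comparison follows; the function names are made up for illustration, and it assumes the daemon's error keeps the "lookup HOST on IP:PORT" wording shown above.

```shell
# Print the first nameserver listed in a resolv.conf-style file
# (defaults to the container's own /etc/resolv.conf).
container_nameserver() {
  awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Extract the resolver address from a "lookup HOST on IP:PORT" error message.
# Assumption: the message format matches the docker push error shown above.
error_nameserver() {
  printf '%s\n' "$1" | sed -n 's/.*lookup [^ ]* on \([0-9.]*\):[0-9]*.*/\1/p'
}

# Compare both resolvers and say where the failing lookup took place.
# $1 = error message, $2 = optional path to a resolv.conf-style file.
compare_resolvers() {
  inside=$(container_nameserver "${2:-/etc/resolv.conf}")
  failed=$(error_nameserver "$1")
  if [ "$inside" = "$failed" ]; then
    echo "same resolver ($inside): the lookup happened inside this container"
  else
    echo "different resolvers (container: $inside, error: $failed): the lookup happened elsewhere"
  fi
}
```

Inside the runner container you would call compare_resolvers with the error text from docker push as its first argument. Different resolvers mean that some other process, here the host's Docker daemon, did the lookup.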

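The solution: publish the port of your registry (-p 5000:5000) and use localhost:5000 instead of registry:5000 for pull and push in your CI job. With the default routing mesh, a published port is reachable on every swarm node, so localhost:5000 reaches the registry from whichever host the runner lands on.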
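As a rough sketch of that workaround, assuming the registry service is named registry and listens on port 5000 as in the examples above; the helper function names are hypothetical:

```shell
# Publish port 5000 of an already running registry service.
# Run this once on a manager node; $1 is the service name.
publish_registry() {
  docker service update --publish-add published=5000,target=5000 "$1"
}

# Rewrite REGISTRY/NAME[:TAG] so it points at the published port on localhost.
# Everything up to the first slash is treated as the registry host.
to_local_ref() {
  printf 'localhost:5000/%s\n' "${1#*/}"
}

# Re-tag a locally built image for localhost:5000 and push it
# through the host daemon. $1 is the existing image reference.
push_via_localhost() {
  local_ref=$(to_local_ref "$1")
  docker tag "$1" "$local_ref" && docker push "$local_ref"
}
```

In the CI job, push_via_localhost registry:5000/some_shit would then tag and push localhost:5000/some_shit. The host daemon needs no swarm DNS for that name.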