How to Troubleshoot an Nginx Service That Fails to Start in Kubernetes

May 23, 2023, 09:25 AM
nginx kubernetes

❌ The Pod fails to start and the Nginx service cannot be reached; the Pod status is ImagePullBackOff.

[root@m1 ~]# kubectl get pods
NAME                    READY   STATUS             RESTARTS   AGE
nginx-f89759699-cgjgp   0/1     ImagePullBackOff   0          103m

Inspect the details of the Nginx Pod.

[root@m1 ~]# kubectl describe pod nginx-f89759699-cgjgp
Name:             nginx-f89759699-cgjgp
Namespace:        default
Priority:         0
Service Account:  default
Node:             n1/192.168.200.84
Start Time:       Fri, 10 Mar 2023 08:40:33 +0800
Labels:           app=nginx
                  pod-template-hash=f89759699
Annotations:      <none>
Status:           Pending
IP:               10.244.3.20
IPs:
  IP:           10.244.3.20
Controlled By:  ReplicaSet/nginx-f89759699
Containers:
  nginx:
    Container ID:   
    Image:          nginx
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zk8sj (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-zk8sj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-zk8sj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Normal   BackOff  57m (x179 over 100m)    kubelet  Back-off pulling image "nginx"
  Normal   Pulling  7m33s (x22 over 100m)   kubelet  Pulling image "nginx"
  Warning  Failed   2m30s (x417 over 100m)  kubelet  Error: ImagePullBackOff

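The events show that pulling the nginx image failed. Images are pulled by the container runtime on the node that runs the Pod, so a broken Docker service is one possible cause.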
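To find every Pod in this state at once, rather than reading describe output one Pod at a time, the waiting reason can be read from the JSON form of the Pod list. The sketch below assumes the output of kubectl get pods -o json has already been parsed into a dict; the function name pods_with_pull_errors and the sample data are illustrative and not part of the original session.

```python
import json

# Waiting reasons that the kubelet reports when an image cannot be pulled.
PULL_ERROR_REASONS = {"ImagePullBackOff", "ErrImagePull"}


def pods_with_pull_errors(pod_list: dict) -> list[tuple[str, str, str]]:
    """Return (pod name, node name, reason) for containers stuck on an image pull.

    pod_list is the parsed output of `kubectl get pods -o json`.
    """
    stuck = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        node = pod.get("spec", {}).get("nodeName", "<unscheduled>")
        for cs in pod.get("status", {}).get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") in PULL_ERROR_REASONS:
                stuck.append((name, node, waiting["reason"]))
    return stuck


# Hypothetical sample that mirrors the Pod shown above.
sample = json.loads("""
{"items": [{
  "metadata": {"name": "nginx-f89759699-cgjgp"},
  "spec": {"nodeName": "n1"},
  "status": {"containerStatuses": [
    {"name": "nginx", "state": {"waiting": {"reason": "ImagePullBackOff"}}}
  ]}
}]}
""")
print(pods_with_pull_errors(sample))
```

The node name is returned alongside the Pod name because the pull happens on that node, which is where the container runtime should be checked.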

So, check whether Docker is running:

systemctl status docker

The Docker service is not running, so try restarting it manually:

systemctl restart docker

The restart fails with the following error:

[root@m1 ~]# systemctl restart docker
Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xe" for details.

Running docker version next shows that the client cannot connect to the Docker daemon:

[root@m1 ~]# docker version
Client: Docker Engine - Community
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:03:11 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Run systemctl status docker again and read the error output:

[root@m1 ~]# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2023-03-10 10:28:16 CST; 4min 35s ago
     Docs: https://docs.docker.com
 Main PID: 2221 (code=exited, status=1/FAILURE)

Mar 10 10:28:13 m1 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Mar 10 10:28:13 m1 systemd[1]: docker.service: Failed with result 'exit-code'.
Mar 10 10:28:13 m1 systemd[1]: Failed to start Docker Application Container Engine.
Mar 10 10:28:16 m1 systemd[1]: docker.service: Service RestartSec=2s expired, scheduling restart.
Mar 10 10:28:16 m1 systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Mar 10 10:28:16 m1 systemd[1]: Stopped Docker Application Container Engine.
Mar 10 10:28:16 m1 systemd[1]: docker.service: Start request repeated too quickly.
Mar 10 10:28:16 m1 systemd[1]: docker.service: Failed with result 'exit-code'.
Mar 10 10:28:16 m1 systemd[1]: Failed to start Docker Application Container Engine.
[root@m1 ~]#

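The output shows that the Docker service process exited with status 1/FAILURE and that systemd stopped restarting it ("Start request repeated too quickly"), but it does not say why.

✅ The following steps track down and fix the problem:

1️⃣ View the Docker service log to get a more detailed reason for the failure: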

sudo journalctl -u docker.service

2️⃣ The relevant excerpt from the Docker log shows that dockerd could not parse its configuration file, /etc/docker/daemon.json:

Mar 10 10:20:17 m1 systemd[1]: Starting Docker Application Container Engine...
Mar 10 10:20:17 m1 dockerd[1572]: unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character '"' after object key:value pair
Mar 10 10:20:17 m1 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Mar 10 10:20:17 m1 systemd[1]: docker.service: Failed with result 'exit-code'.
Mar 10 10:20:17 m1 systemd[1]: Failed to start Docker Application Container Engine.
Mar 10 10:20:19 m1 systemd[1]: docker.service: Service RestartSec=2s expired, scheduling restart.
Mar 10 10:20:19 m1 systemd[1]: docker.service: Scheduled restart job, restart counter is at 2.
Mar 10 10:20:19 m1 systemd[1]: Stopped Docker Application Container Engine.
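The journal of a crash-looping unit consists mostly of systemd restart messages, while the one line that explains the failure comes from dockerd itself. A minimal sketch of filtering for it follows, assuming the journal text has been captured as a string (for example from journalctl -u docker.service --no-pager); the helper name dockerd_messages is made up for this example.

```python
import re

# Matches journal lines of the form "<timestamp> <host> dockerd[<pid>]: <message>".
DOCKERD_LINE = re.compile(r"\bdockerd\[\d+\]:\s*(.*)$")


def dockerd_messages(journal_text: str) -> list[str]:
    """Return the messages logged by dockerd, dropping systemd's own lines."""
    messages = []
    for line in journal_text.splitlines():
        match = DOCKERD_LINE.search(line)
        if match:
            messages.append(match.group(1))
    return messages


# Excerpt of the journal shown above.
journal = """\
Mar 10 10:20:17 m1 systemd[1]: Starting Docker Application Container Engine...
Mar 10 10:20:17 m1 dockerd[1572]: unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character '"' after object key:value pair
Mar 10 10:20:17 m1 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
"""
for message in dockerd_messages(journal):
    print(message)
```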

3️⃣ Check the contents of /etc/docker/daemon.json. The file sets two options: registry-mirrors points Docker at an Aliyun registry mirror, and exec-opts tells the Docker daemon to use systemd as its cgroup driver.

[root@m1 ~]# cat /etc/docker/daemon.json
{
  "registry-mirrors": ["https://w2kavmmf.mirror.aliyuncs.com"]
  "exec-opts": ["native.cgroupdriver=systemd"]
}

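At first glance the configuration looks correct, but a closer look shows that the "registry-mirrors" entry is missing its trailing comma. JSON requires a comma between the members of an object, so the parser reaches the opening quote of "exec-opts" where it expected a comma. This syntax error is the root cause.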
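A syntax error like this is easier to catch by parsing the file than by reading it. The sketch below uses Python's standard json module as a stand-in for Docker's own parser (an assumption: the two word their errors differently, but both reject a missing comma). The function name check_json and the inline samples are illustrative.

```python
import json


def check_json(text: str) -> tuple[bool, str]:
    """Parse text as JSON and return (ok, message), with the error location on failure."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, "valid JSON"


# The broken configuration: no comma after the "registry-mirrors" entry.
broken = """{
  "registry-mirrors": ["https://w2kavmmf.mirror.aliyuncs.com"]
  "exec-opts": ["native.cgroupdriver=systemd"]
}"""

# The same content with the comma restored.
fixed = broken.replace('"]\n  "exec-opts"', '"],\n  "exec-opts"')

print(check_json(broken))
print(check_json(fixed))
```

To check the real file, pass pathlib.Path("/etc/docker/daemon.json").read_text() to check_json, or run python3 -m json.tool /etc/docker/daemon.json from the shell before restarting Docker.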

Edit the file (in vi, type :wq to save and exit) so that it reads:

[root@m1 ~]# cat /etc/docker/daemon.json
{
  "registry-mirrors": ["https://w2kavmmf.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}

4️⃣ Reload the systemd configuration, restart the Docker service, and check its status:

systemctl daemon-reload
systemctl restart docker
systemctl status docker

5️⃣ Check that docker version now reports both the client and the server:

[root@m1 ~]# docker version
Client: Docker Engine - Community
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:03:11 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.17
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.11
  Git commit:       a89b842
  Built:            Mon Jun  6 23:01:29 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.6
  GitCommit:        10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc:
  Version:          1.1.2
  GitCommit:        v1.1.2-0-ga916309
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info also shows the cgroup driver and registry mirror from daemon.json in effect:

[root@m1 ~]# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 20
  Running: 8
  Paused: 0
  Stopped: 12
 Images: 20
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: v1.1.2-0-ga916309
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.18.0-372.9.1.el8.x86_64
 Operating System: Rocky Linux 8.6 (Green Obsidian)
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 9.711GiB
 Name: m1
 ID: 4YIS:FHSB:YXRI:CED5:PJSJ:EAS2:BCR3:GJJF:FDPK:EDJH:DVKU:AIYJ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://w2kavmmf.mirror.aliyuncs.com/
 Live Restore Enabled: false
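Rather than scanning the full docker info output by eye, the two settings that daemon.json controls can be pulled out programmatically. This sketch parses the plain-text output shown above; the function name daemon_settings is hypothetical, and the parsing assumes the indentation-based layout printed by this Docker version (20.10).

```python
def daemon_settings(info_text: str) -> dict:
    """Extract the cgroup driver and registry mirrors from `docker info` text output."""
    settings = {"cgroup_driver": None, "registry_mirrors": []}
    in_mirrors = False
    mirrors_indent = 0
    for line in info_text.splitlines():
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        if in_mirrors:
            # Mirror URLs are listed one per line, indented deeper than the heading.
            if stripped and indent > mirrors_indent:
                settings["registry_mirrors"].append(stripped)
                continue
            in_mirrors = False
        if stripped.startswith("Cgroup Driver:"):
            settings["cgroup_driver"] = stripped.split(":", 1)[1].strip()
        elif stripped == "Registry Mirrors:":
            in_mirrors = True
            mirrors_indent = indent
    return settings


# Excerpt of the docker info output above.
info = """\
Server:
 Cgroup Driver: systemd
 Cgroup Version: 1
 Registry Mirrors:
  https://w2kavmmf.mirror.aliyuncs.com/
 Live Restore Enabled: false
"""
print(daemon_settings(info))
```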

Docker is now running again, the Pod has recovered, and the Nginx service is reachable.

[root@m1 ~]# kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
nginx-f89759699-cgjgp   1/1     Running   0          174m

The Pod details now look normal as well.

[root@m1 ~]# kubectl describe pod nginx-f89759699-cgjgp
Name:             nginx-f89759699-cgjgp
Namespace:        default
Priority:         0
Service Account:  default
Node:             n1/192.168.200.84
Start Time:       Fri, 10 Mar 2023 08:40:33 +0800
Labels:           app=nginx
                  pod-template-hash=f89759699
Annotations:      <none>
Status:           Running
IP:               10.244.3.20
IPs:
  IP:           10.244.3.20
Controlled By:  ReplicaSet/nginx-f89759699
Containers:
  nginx:
    Container ID:   docker://88bdc2bfa592f60bf99bac2125b0adae005118ae8f2f271225245f20b7cfb3c8
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:aa0afebbb3cfa473099a62c4b32e9b3fb73ed23f2a75a65ce1d4b4f55a5c2ef2
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Fri, 10 Mar 2023 10:37:42 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zk8sj (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-zk8sj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-zk8sj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason   Age                   From     Message
  ----    ------   ----                  ----     -------
  Normal  BackOff  58m (x480 over 171m)  kubelet  Back-off pulling image "nginx"
[root@m1 ~]#