How to check hardware errors in Linux

WBOY
Release: 2022-05-17 10:02:34

In Linux, you can use mcelog to check for hardware errors. mcelog is a tool that records hardware errors, especially memory and CPU errors. Whether an uncorrected error can be captured depends on whether it caused a warm restart or a hard restart: after a warm restart the error information is captured, while after a hard restart it may not be. You can install the tool with the "yum install mcelog" command.


The operating environment of this tutorial: Linux 7.3 system, Dell G3 computer.


1. mcelog is a tool used on Linux systems to check hardware errors, especially memory and CPU errors.

Uncorrected errors are critical exceptions. If the CPU cannot recover from one, it often results in a kernel panic, which resets the system and interrupts the running applications.

For uncorrected errors, mcelog's ability to catch the error depends on whether the error resulted in a warm restart or a hard restart.

If it is a warm restart, the information is captured by mcelog and can be viewed after recovery. A hard restart can lose the error data, so mcelog may not capture the event.

2. Installation

 [root@RedHat_test ~]# yum install mcelog.x86_64
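
Before installing, it can be useful to confirm whether the tool is already present. The following is a minimal sketch and not part of the original tutorial: have_command is a hypothetical helper name, and the script only prints a hint instead of running yum itself.

```shell
# Return success if the named command can be found on PATH.
# have_command is a hypothetical helper introduced for this sketch.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command mcelog; then
    echo "mcelog is already installed at $(command -v mcelog)"
else
    # Only a hint is printed; the installation itself still needs root.
    echo "mcelog not found; install it with: yum install mcelog"
fi
```

The check relies only on the shell's built-in command lookup, so it can run as an unprivileged user.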

3. How to start mcelog

  • cron: the oldest method. mcelog runs as a scheduled task, so there is a certain delay and some events may be lost.

  • daemon: mcelog runs as a daemon process. This is the method used on el7.

  • trigger: a more advanced method in which mcelog is run when an error is triggered; see man mcelog.
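
To see whether the trigger method is configured on a given machine, the kernel's trigger setting can be read back. This is a minimal sketch that assumes the usual x86 sysfs path /sys/devices/system/machinecheck/machinecheck0/trigger, which the original does not mention; describe_trigger is a hypothetical helper name.

```shell
# describe_trigger FILE: report the program the kernel is set to run on a
# machine check event, based on the contents of the given trigger file.
describe_trigger() {
    trigger_file=$1
    trigger=""
    if [ -r "$trigger_file" ]; then
        trigger=$(cat "$trigger_file")
    fi
    if [ -n "$trigger" ]; then
        echo "trigger configured: $trigger"
    else
        echo "no trigger configured"
    fi
}

# Assumed default location on x86 kernels with machine check support.
describe_trigger /sys/devices/system/machinecheck/machinecheck0/trigger
```

Passing the path as an argument keeps the helper usable on systems where the file lives elsewhere or is absent.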

4. mcelog related files

  • /dev/mcelog: device file
  • /var/log/mcelog: log file
  • /etc/mcelog/mcelog.conf: configuration file
  • /var/run/mcelog.pid: PID file

By default, faults are recorded only in /var/log/mcelog and not in the system log.

To have them reflected in the system log as well, edit the /etc/mcelog/mcelog.conf file, remove the leading # from the corresponding option, and save the file.
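
Uncommenting an option can also be scripted. The sketch below is illustrative only: it assumes the option in question is "syslog = yes" (the original does not name it), uncomment_option is a hypothetical helper, and the demonstration runs against a temporary sample file rather than the real /etc/mcelog/mcelog.conf.

```shell
# uncomment_option FILE NAME: remove the leading "#" from lines of the
# form "#NAME = value" in FILE, leaving all other lines untouched.
uncomment_option() {
    conf_file=$1
    option=$2
    tmp_file=$(mktemp) || return 1
    if sed "s/^#[[:space:]]*\(${option}[[:space:]]*=\)/\1/" "$conf_file" > "$tmp_file"; then
        # Write back through cat so the original file keeps its permissions.
        cat "$tmp_file" > "$conf_file"
    fi
    rm -f "$tmp_file"
}

# Demonstration on a sample file; "syslog" is an assumed option name.
sample=$(mktemp)
printf '%s\n' '#syslog = yes' '#syslog-error = yes' > "$sample"
uncomment_option "$sample" syslog
cat "$sample"
rm -f "$sample"
```

Because the pattern requires the option name to be followed directly by "=", a similarly named line such as "#syslog-error = yes" stays commented out.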

5. Run mcelog in the background

 [root@RedHat_test ~]# mcelog --daemon
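
After starting the daemon, one way to confirm it is still alive is to check the PID file listed in section 4. This is a minimal sketch that assumes the daemon writes its PID to /var/run/mcelog.pid and that /proc is mounted; pid_file_alive is a hypothetical helper name.

```shell
# pid_file_alive FILE: succeed if FILE holds the PID of a running process.
pid_file_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    # Reject empty or non-numeric contents before touching /proc.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ -d "/proc/$pid" ]
}

if pid_file_alive /var/run/mcelog.pid; then
    echo "mcelog daemon is running"
else
    echo "mcelog daemon does not appear to be running"
fi
```

Checking /proc instead of sending a signal means the test also works when run as a user other than the one that owns the daemon.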

6. Check whether the system is abnormal

1. How to run mcelog manually

 [root@RedHat_test ~]# mcelog --daemon

2. Check the mcelog log

     [root@RedHat_test ~]# tail /var/log/mcelog
     # No output means everything is normal

3. Check whether the mcelog daemon detects error information

     [root@RedHat_test ~]# mcelog --client
     # No output means everything is normal

4. Decode the machine check output captured when a system exception occurred

     [root@RedHat_test ~]# mcelog --ascii < file.log
     # or
     [root@RedHat_test ~]# mcelog --ascii --file file.log
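
The log check in step 2 can be wrapped in a small helper suitable for a monitoring script. This is a sketch built on the tutorial's assumption that an empty log means no errors were recorded; check_mcelog_log is a hypothetical name.

```shell
# check_mcelog_log FILE: succeed if FILE is missing or empty,
# otherwise print a warning with the number of logged lines and fail.
check_mcelog_log() {
    log_file=$1
    if [ ! -s "$log_file" ]; then
        echo "OK: no entries in $log_file"
        return 0
    fi
    lines=$(wc -l < "$log_file" | tr -d ' ')
    echo "WARNING: $lines line(s) in $log_file, review with: tail $log_file"
    return 1
}

# Example call; "|| true" keeps the script going when entries exist.
check_mcelog_log /var/log/mcelog || true
```

The non-zero return status lets a cron job or monitoring agent react to new entries without parsing the message text.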

source:php.cn