Use log analysis to easily diagnose hardware failures

王林
Release: 2024-01-01 21:29:32

A colleague noticed that the volume of entries in the messages log on one machine had suddenly spiked. A quick check turned up memory-related errors, so he forwarded them to me for review.

Messages log

I logged in to the server and opened the messages log to see the alarm my colleague had mentioned.

[Screenshot: messages log showing the memory error]

Sure enough, the log reports a failure on the memory module in the first slot of channel 3. However, I only know the slots by their A1/B1/A2/B2 labels, so this message alone does not tell me which physical module is affected, and I kept looking.
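
When the messages log has grown sharply, it can help to filter it down to memory-related entries before reading it. The following is a minimal sketch of that step, not the procedure used in the original investigation. The path /var/log/messages and the keyword list are assumptions; the exact wording of memory errors depends on the kernel and drivers, so adjust the pattern to what your system actually logs.

```python
import re
from typing import Iterable, List

# Assumed keywords; real messages vary by kernel version and driver.
MEMORY_PATTERN = re.compile(
    r"\b(?:edac|mce|dimm|ecc)\b|machine check|memory",
    re.IGNORECASE,
)


def memory_related_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention any of the assumed memory keywords."""
    return [line.rstrip("\n") for line in lines if MEMORY_PATTERN.search(line)]


def scan_log(path: str = "/var/log/messages") -> List[str]:
    """Filter a log file on disk; the default path is an assumption."""
    with open(path, errors="replace") as fh:
        return memory_related_lines(fh)
```

Calling scan_log() usually requires root privileges, since the messages log is not world-readable on most distributions.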

ipmitool

Next I checked with ipmitool, which confirms that there is a memory alarm.

[Screenshot: ipmitool output showing the memory alarm]

The alarm is there, but the output does not identify which memory module has failed.
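
The same check can be scripted. The sketch below runs ipmitool sel list and keeps only the events that mention memory. It assumes the output is one event per line with fields separated by "|", which is the usual layout but may differ between ipmitool versions and BMC vendors. The function names are made up for this example.

```python
import subprocess
from typing import List


def read_sel() -> str:
    """Run `ipmitool sel list` and return its raw output (needs root and a BMC)."""
    result = subprocess.run(
        ["ipmitool", "sel", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def parse_sel(text: str) -> List[List[str]]:
    """Split each event line into stripped fields, assuming '|' separators."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) > 1:  # skip blank or unstructured lines
            rows.append(fields)
    return rows


def memory_events(rows: List[List[str]]) -> List[List[str]]:
    """Keep events in which any field mentions memory."""
    return [row for row in rows if any("memory" in f.lower() for f in row)]
```

A typical call is memory_events(parse_sel(read_sel())). As in the walkthrough above, the result confirms that a memory event exists but may not name the physical slot.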

iDRAC web interface

There is still Dell's own iDRAC web interface for checking hardware status. I logged in and looked at the log first. There it is: a failure in memory slot B6.

[Screenshot: iDRAC log showing the B6 memory slot failure]

The hardware status page shows the same thing, with an alarm on the B6 memory module.

[Screenshot: iDRAC hardware status showing the alarm on B6]

With that, I had the information I needed: the memory module in slot B6 has failed and needs to be replaced. How to replace it, and what to watch out for when doing so, is a topic for a later article.

Summary

Hardware health is the foundation of everything else on a server. Monitor the hardware and deal with failures promptly, or you will find out why the hard way. Several common logs are useful when analyzing hardware failures:

  1. messages log
  2. dmesg log
  3. hardware event log, viewed with ipmitool sel list
  4. logs in the remote management interface (Dell's iDRAC, HP's iLO, IBM's IMM, etc.)
  5. SMART log
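
The command-line sources in this list can be gathered in one pass. The sketch below is only an illustration under stated assumptions: the SOURCES table, the function names, and the device /dev/sda are choices made for this example, and smartctl -H reports only the overall SMART health status. The messages log is a file rather than a command, and the remote management logs are read through the vendor's interface, so neither is covered here.

```python
import shutil
import subprocess
from typing import Callable, Dict, List, Optional

# Assumed commands; /dev/sda is a placeholder device name.
SOURCES: Dict[str, List[str]] = {
    "dmesg": ["dmesg"],
    "ipmi_sel": ["ipmitool", "sel", "list"],
    "smart": ["smartctl", "-H", "/dev/sda"],
}


def available(
    sources: Dict[str, List[str]],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, List[str]]:
    """Keep only the sources whose executable is found on PATH."""
    return {name: cmd for name, cmd in sources.items() if which(cmd[0])}


def collect(sources: Dict[str, List[str]]) -> Dict[str, str]:
    """Run each available command and return its standard output by name."""
    output = {}
    for name, cmd in available(sources).items():
        result = subprocess.run(cmd, capture_output=True, text=True)
        output[name] = result.stdout
    return output
```

Running collect(SOURCES) as root returns a dictionary of raw text that can then be searched for memory or disk errors.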

source:linuxprobe.com