A colleague noticed that the volume of message logs on one machine had suddenly spiked. A quick check turned up memory-related errors, so he forwarded them to me for review.
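Message log
Log in to the server and view the message log. First, check what the alarm my colleague mentioned actually says.
Sure enough, the log reports that the memory in the first slot of channel 3 has failed. However, I only know the slots by labels such as A1/B1/A2/B2, so this alone does not tell me which physical module is affected, and I kept digging.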
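When the message log is growing quickly, it can help to filter it down to the memory-related lines before reading. The following is only a rough sketch: the keyword list (`edac`, `mce`, `hardware error`, `memory`, `dimm`) is my assumption about what such entries tend to contain, and the function names are made up for illustration. Adjust both to what your system actually logs.

```python
import re
from typing import Iterable, List

# Assumed keywords for memory-related entries; tune these for your own logs.
MEMORY_PATTERN = re.compile(
    r"\bedac\b|\bmce\b|hardware error|memory|dimm", re.IGNORECASE
)


def memory_related(lines: Iterable[str]) -> List[str]:
    """Return the lines that match any of the assumed memory keywords."""
    return [line.rstrip("\n") for line in lines if MEMORY_PATTERN.search(line)]


def scan_log(path: str) -> List[str]:
    """Read a log file and return its memory-related lines.

    Reading the system log usually requires root privileges.
    """
    with open(path, errors="replace") as fh:
        return memory_related(fh)
```

On a system that writes to `/var/log/messages`, calling `scan_log("/var/log/messages")` and printing the result gives a much shorter list to read through. A keyword filter like this can miss entries or pick up unrelated ones, so treat it as a first pass only.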
Next, I checked with ipmitool, and it does report a memory alarm.
Although the alarm is there, this output still does not identify which specific memory module has failed.
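The ipmitool check can also be scripted. The sketch below runs `ipmitool sel elist` to read the System Event Log and keeps only the lines that mention memory. The keyword list and the function names are assumptions for illustration, not the exact check used above, and ipmitool usually needs root privileges and a working BMC interface.

```python
import subprocess
from typing import Iterable, List


def read_sel() -> List[str]:
    """Run `ipmitool sel elist` and return its output as a list of lines."""
    result = subprocess.run(
        ["ipmitool", "sel", "elist"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if ipmitool exits non-zero
    )
    return result.stdout.splitlines()


def memory_events(sel_lines: Iterable[str]) -> List[str]:
    """Keep SEL lines whose text mentions memory, ECC, or DIMM (assumed keywords)."""
    keywords = ("memory", "ecc", "dimm")
    return [line for line in sel_lines if any(k in line.lower() for k in keywords)]
```

Calling `memory_events(read_sel())` on the affected server lists the memory-related events. As in the manual check, the SEL text may only confirm that a memory fault exists without naming the slot.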
iDRAC web
In any case, we still have the web interface of Dell's own iDRAC for checking hardware status. Log in and take a look. Start with the log: there it is, a failure in memory slot B6.
Then look at the hardware status: the B6 memory shows an alarm there as well.
With that, I had the information I needed: the fault is in the B6 memory module, which needs to be replaced. How to replace it, and what to watch out for when doing so, I will cover later.
Summary
Hardware health is the lowest layer of a server's safety. You must monitor the hardware and deal with hardware failures promptly; otherwise, you will find out why the hard way. Here are several common logs used in hardware failure analysis: