
Identify RAM module linked to ECC error in dmesg

2 votes
0 answers
602 views
One of my servers is logging the following ECC errors:

    [lun set 14 00:14:16 2020] {33}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
    [lun set 14 00:14:16 2020] {33}[Hardware Error]: It has been corrected by h/w and requires no further action
    [lun set 14 00:14:16 2020] {33}[Hardware Error]: event severity: corrected
    [lun set 14 00:14:16 2020] {33}[Hardware Error]:  Error 0, type: corrected
    [lun set 14 00:14:16 2020] {33}[Hardware Error]:  fru_text: CorrectedErr
    [lun set 14 00:14:16 2020] {33}[Hardware Error]:   section_type: memory error
    [lun set 14 00:14:16 2020] {33}[Hardware Error]:   node: 0 device: 1
    [lun set 14 00:14:16 2020] {33}[Hardware Error]:   error_type: 2, single-bit ECC
    [lun set 14 00:14:16 2020] ghes_edac: Internal error: Can't find EDAC structure

The server has the following RAM configuration:

    Handle 0x0029, DMI type 16, 23 bytes
    Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: Single-bit ECC
        Maximum Capacity: 64 GB
        Error Information Handle: Not Provided
        Number Of Devices: 4

    Handle 0x002A, DMI type 17, 40 bytes
    Memory Device
        Array Handle: 0x0029
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 16384 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM CHA3
        Bank Locator: BANK 0
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2133 MHz
        Manufacturer: SK Hynix
        Serial Number: 71929DA0
        Asset Tag: 1651
        Part Number: HMA82GU7MFR8N-TF
        Rank: 2
        Configured Clock Speed: 2133 MHz
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: 1.2 V

    Handle 0x002B, DMI type 17, 40 bytes
    Memory Device
        Array Handle: 0x0029
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 16384 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM CHA1
        Bank Locator: BANK 1
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2133 MHz
        Manufacturer: SK Hynix
        Serial Number: 71929CFF
        Asset Tag: 1651
        Part Number: HMA82GU7MFR8N-TF
        Rank: 2
        Configured Clock Speed: 2133 MHz
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: 1.2 V

    Handle 0x002C, DMI type 17, 40 bytes
    Memory Device
        Array Handle: 0x0029
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 16384 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM CHB4
        Bank Locator: BANK 2
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2133 MHz
        Manufacturer: SK Hynix
        Serial Number: 71929BB8
        Asset Tag: 1651
        Part Number: HMA82GU7MFR8N-TF
        Rank: 2
        Configured Clock Speed: 2133 MHz
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: 1.2 V

    Handle 0x002D, DMI type 17, 40 bytes
    Memory Device
        Array Handle: 0x0029
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 16384 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM CHB2
        Bank Locator: BANK 3
        Type: DDR4
        Type Detail: Synchronous
        Speed: 2133 MHz
        Manufacturer: Samsung
        Serial Number: 33BB5E37
        Asset Tag: 1641
        Part Number: M391A2K43BB1-CPB
        Rank: 2
        Configured Clock Speed: 2133 MHz
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: 1.2 V

How can I identify the faulty module so that I can replace it? I think the following log line contains the information I need, but I don't know how to decode it:

    [lun set 14 00:14:16 2020] {33}[Hardware Error]:   node: 0 device: 1
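To make the four modules easier to compare against the error record, the DMI type 17 entries can be condensed to one line per slot. The following is only a minimal sketch: `parse_memory_devices` and `SAMPLE` are hypothetical names introduced here, and the sample text is a shortened excerpt of the listing above. It does not assume that the index it prints corresponds to the `device: 1` value in the log, since that mapping is firmware-specific and is exactly what is in question.

```python
def parse_memory_devices(dmi_text):
    """Split dmidecode-style text into one dict per "Memory Device" record."""
    devices = []
    current = None
    for raw in dmi_text.splitlines():
        line = raw.strip()
        if line.startswith("Handle "):
            # A new handle ends whatever record came before it.
            current = None
        elif line == "Memory Device":
            current = {}
            devices.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return devices


# Shortened excerpt of the listing in this question (two of the four DIMMs).
SAMPLE = """\
Handle 0x002A, DMI type 17, 40 bytes
Memory Device
    Locator: DIMM CHA3
    Bank Locator: BANK 0
    Manufacturer: SK Hynix
    Serial Number: 71929DA0
Handle 0x002B, DMI type 17, 40 bytes
Memory Device
    Locator: DIMM CHA1
    Bank Locator: BANK 1
    Manufacturer: SK Hynix
    Serial Number: 71929CFF
"""

# The printed index is just the listing order, NOT necessarily the
# "device" number reported in the hardware error record.
for index, dev in enumerate(parse_memory_devices(SAMPLE)):
    print(index, dev["Locator"], dev["Bank Locator"], dev["Serial Number"])
```

On the real machine the same function could be fed the full text of the memory listing instead of `SAMPLE`; the serial numbers then identify the physical module once a candidate slot has been chosen.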
Asked by sKo (21 rep)
Sep 14, 2020, 10:00 AM