Hello,
I need your help 😢
I have a dedicated server at SoYouStart that no longer boots following an incident at OVH. Yesterday morning the server had very limited bandwidth (20M/5M). After a reboot, a support ticket, and several more reboots around 10 am, it was unreachable for 6-7 hours. OVH intervened with a RAM test, a network cable replacement, and work on the switch. Since then the server is reachable, but only in rescue mode.
I logged in to check the logs (/var/log), but there is nothing recent: the last entries date from the 10 am reboot and there is nothing after that.
The server is partitioned as follows:
Filesystem Size Used Avail Use% Mounted on
/dev/md2 20G 6.6G 12G 37% /
udev 10M 0 10M 0% /dev
/dev/md3 5.4T 4.9T 280G 95% /home
/dev/md2 is the RAID 1 system array and /dev/md3 is my RAID 0 for /home.
When I access the server in rescue mode, the RAID 0 is inactive...
fdisk -l gives this:
root@rescue:~# fdisk -l
Disk /dev/sdb: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: FBBA84A2-4672-497F-8504-B5088F771DFE
Device Start End Sectors Size Type
/dev/sdb1 40 2048 2009 1004.5K BIOS boot
/dev/sdb2 4096 40962047 40957952 19.5G Linux RAID
/dev/sdb3 40962048 5859477503 5818515456 2.7T Linux RAID
/dev/sdb4 5859477504 5860524031 1046528 511M Linux swap
Disk /dev/sda: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/md2: 19.5 GiB, 20970405888 bytes, 40957824 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
parted -l gives this:
root@rescue:~# parted -l
Error: /dev/sda: unrecognised disk label
Model: ATA HGST HUS724030AL (scsi)
Disk /dev/sda: 3001GB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
Disk Flags:
Model: ATA HGST HUS724030AL (scsi)
Disk /dev/sdb: 3001GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 20.5kB 1049kB 1029kB bios_grub-sdb bios_grub
2 2097kB 21.0GB 21.0GB ext4 primary raid
3 21.0GB 3000GB 2979GB primary raid
4 3000GB 3001GB 536MB linux-swap(v1) primary
Model: Linux Software RAID Array (md)
Disk /dev/md2: 21.0GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:
Number Start End Size File system Flags
1 0.00B 21.0GB 21.0GB ext4
cat /proc/mdstat gives this:
root@rescue:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md3 : inactive sdb3[1](S)
2909257664 blocks
md2 : active raid1 sdb2[1]
20478912 blocks [2/1] [_U]
unused devices: <none>
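To make the /proc/mdstat output above easier to read at a glance, here is a small read-only sketch that prints one line per md array with its state and member devices. The function name md_summary is made up for this post, and the parsing assumes the mdstat layout shown above (array lines of the form "mdN : state ..."). It only reads a file and does not touch the arrays.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration): summarise md arrays.
# Reads /proc/mdstat by default, or a saved copy if a path is given.
# Read-only: it never assembles, stops, or modifies any array.
md_summary() {
    awk '
        # Array lines look like: "md3 : inactive sdb3[1](S)"
        $1 ~ /^md[0-9]+$/ && $2 == ":" {
            printf "%s %s", $1, $3            # array name and state
            for (i = 4; i <= NF; i++)
                if ($i ~ /\[[0-9]+\]/)        # member devices carry a [slot] suffix
                    printf " %s", $i
            printf "\n"
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling md_summary with no argument reads the live /proc/mdstat; calling it with a file name parses a saved copy of the output instead. With the output above, md3 should show up as inactive with sdb3 as its only listed member.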
SMART status of the 2 disks:
root@rescue:~# smartctl -a /dev/sda
smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.14.77-mod-std-ipv6-64-rescue] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: HGST HUS724030ALA640
Serial Number: PN2A31PAKD8STT
LU WWN Device Id: 5 000cca 22befdbe0
Firmware Version: MF8OABY0
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Wed Feb 8 09:05:19 2017 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 24) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 426) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0005 137 137 054 Pre-fail Offline - 77
3 Spin_Up_Time 0x0007 159 159 024 Pre-fail Always - 560 (Average 409)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 58
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 142 142 020 Pre-fail Offline - 25
9 Power_On_Hours 0x0012 097 097 000 Old_age Always - 22940
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 57
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 138
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 138
194 Temperature_Celsius 0x0002 153 153 000 Old_age Always - 39 (Min/Max 18/55)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 21987 -
# 2 Short offline Completed without error 00% 21984 -
# 3 Short offline Completed without error 00% 21984 -
# 4 Short offline Completed without error 00% 21023 -
# 5 Short offline Completed without error 00% 21020 -
# 6 Short offline Completed without error 00% 21020 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
root@rescue:~# smartctl -a /dev/sdb
smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.14.77-mod-std-ipv6-64-rescue] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: HGST HUS724030ALA640
Serial Number: PN1234P8KAD02X
LU WWN Device Id: 5 000cca 22ceeff47
Firmware Version: MF8OABY0
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Wed Feb 8 09:05:25 2017 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 24) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 434) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0005 136 136 054 Pre-fail Offline - 81
3 Spin_Up_Time 0x0007 159 159 024 Pre-fail Always - 455 (Average 328)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 60
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 142 142 020 Pre-fail Offline - 25
9 Power_On_Hours 0x0012 098 098 000 Old_age Always - 18011
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 60
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 423
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 423
194 Temperature_Celsius 0x0002 162 162 000 Old_age Always - 37 (Min/Max 18/49)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 17058 -
# 2 Short offline Completed without error 00% 17055 -
# 3 Short offline Completed without error 00% 17055 -
# 4 Short offline Completed without error 00% 16094 -
# 5 Short offline Completed without error 00% 16091 -
# 6 Short offline Completed without error 00% 16091 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
md3 is inactive; how can I reactivate it without breaking everything? I'm new to Linux. I've spent dozens of hours on the forum reading and trying to understand, but with software RAID I'm a bit afraid of wrecking everything...
Thanks a lot 🙂