Summary: | Unlucky heuristic for calculating BAD_SECTOR_MANY | |
---|---|---|---
Product: | libatasmart | Reporter: | Jean-Louis <jelot-freedesktop>
Component: | library | Assignee: | Lennart Poettering <lennart>
Status: | REOPENED | QA Contact: | Lennart Poettering <lennart>
Severity: | normal | |
Priority: | medium | CC: | jelot-freedesktop, stephen.boddy, zeuthen
Version: | unspecified | |
Hardware: | Other | |
OS: | All | |
Whiteboard: | | |
Attachments:
- skdump of harddrive
- smart blob with slightly broken sectors
- Drop our own "many bad sectors" heuristic
Description: Jean-Louis, 2009-12-23 01:11:19 UTC
The entire SMART attribute business is highly vendor dependent, since there is no officially accepted spec for SMART attribute decoding. (It never became an official standard; all it ever was was a draft that was later withdrawn.) Fortunately, on almost all drives the raw data of quite a few fields can be decoded the same way. In libatasmart we try to include the decoding of fields where it makes sense and is commonly accepted. OTOH, the non-raw fields (i.e. "current" and "worst") encode the information about the raw number of sectors (for sector-related attributes) in a way that no longer lets us determine the actual number of sectors. The reason for this extra threshold we apply here is that we wanted vendor-independent health checking: as long as we can trust the number of raw bad sectors the drive reports, we can compare it with a threshold that is not fiddled with by the vendor to make his drives look better.

The reason I picked log2() here is simply that we do want to allow more bad sectors on bigger drives than on small ones. But a linearly related threshold seemed to increase too quickly, so the next choice was logarithmic. Do you have any empirical example where the current thresholds do not work as they should?

Please check the associated skdump save file. This is an old 20 GB laptop drive. The latest Ubuntu 9.10 ships with libatasmart 0.16. I think this drive is incorrectly flagged as failing, because the library relies on the raw value being a single raw48 value, which then looks like very many (262166) bad blocks. Using "smartctl -a /dev/sda" I get the following extracts:

SMART overall-health self-assessment test result: PASSED
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 262166
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 4

If I use the -v 5,raw8 option:
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0 0 0 4 0 22

If I use the -v 5,raw16 option:
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0 4 22

The attribute is being read as raw48, which in this case looks to be completely wrong. Decoding with the other raw# options seems to tie in with attribute 196. It could be argued that if you cannot rely on the format of the raw value, you should not base warnings on it, and should only use the normalized, worst, and threshold values. I'm technical, and I damn near junked a relative's old but still serviceable laptop because of this.

Created attachment 32330 [details]
skdump of harddrive
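To make the raw48-vs-raw16 confusion concrete, here is a small standalone sketch (not libatasmart code; it assumes a little-endian byte layout of the 6-byte raw field) that decodes the same six raw bytes both as one 48-bit counter and as three 16-bit words, reproducing the 262166 and "0 4 22" readings quoted above.

/* Illustrative only: decode the 6-byte SMART raw value of attribute 5
 * two ways. Assumes little-endian byte order within the field, which is
 * what the numbers in this report suggest; vendors need not follow it. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Bytes corresponding to the reported value 262166 == 0x040016 */
    const uint8_t raw[6] = { 0x16, 0x00, 0x04, 0x00, 0x00, 0x00 };

    uint64_t raw48 = 0;
    for (int i = 5; i >= 0; i--)            /* single 48-bit counter */
        raw48 = (raw48 << 8) | raw[i];

    uint16_t w0 = raw[0] | (raw[1] << 8);   /* low 16-bit word  */
    uint16_t w1 = raw[2] | (raw[3] << 8);
    uint16_t w2 = raw[4] | (raw[5] << 8);   /* high 16-bit word */

    printf("as raw48: %llu\n", (unsigned long long) raw48); /* 262166 */
    printf("as raw16: %u %u %u\n", w2, w1, w0);             /* 0 4 22 */
    return 0;
}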
(In reply to comment #1)
> The reason I picked log2() here is simply that we do want to allow more bad
> sectors on bigger drives than on small ones. But a linearly related threshold
> seemed to increase too quickly, so the next choice was logarithmic.
>
> Do you have any empirical example where the current thresholds do not work as
> they should?

For convenience I use kibibytes, mebibytes, gibibytes, and so on. 128 GiB = 2^37 bytes -> log2(2^37 / 512) = log2(2^37 / 2^9) = 28 sectors. So for an HDD of 128 GiB (2^37 bytes) the calculated threshold is 28 sectors (14336 bytes = 14 KiB); isn't that too low? For an HDD of 1 TiB (2^40 bytes) the calculated threshold is 31 sectors (15872 bytes = 15.5 KiB), and for a hypothetical HDD of 1 PiB (2^50 bytes, 1024 tebibytes) the calculated threshold is only 41 sectors (20992 bytes = 20.5 KiB). If we do want to allow more bad sectors on bigger drives than on small ones, IMHO this isn't a good heuristic: the difference between a 128 GiB HDD and an 8 TiB HDD is only 6 sectors (3 KiB).

I forgot to say that this bug report and the enhancement requested in Bug #25773 are due to Launchpad Bug 438136 <https://bugs.launchpad.net/ubuntu/+source/libatasmart/+bug/438136?comments=all>. On Launchpad there are also some screenshots of palimpsest that show the disk flagged as failing with relatively few bad sectors, or with a raw value that is probably in a different format (there are values like 65537, 65539, 65551, 65643 and similar numbers of bad sectors). Some examples:

117 bad sectors (58.5 KiB) on a 1000 GB HDD <http://launchpadlibrarian.net/32604239/palimpsest-screenshot.png>
66 bad sectors (33 KiB) on a 200 GB HDD <http://launchpadlibrarian.net/34794631/Screenshot-SMART%20Data.png>
466 bad sectors (233 KiB) on a 1500 GB HDD <http://launchpadlibrarian.net/34991157/Screenshot.png>
65 bad sectors (32.5 KiB) on a 120 GB HDD (all current pending sectors) <http://launchpadlibrarian.net/35201129/Pantallazo-Datos%20SMART.png>
54 bad sectors (27 KiB) on a 169 GB HDD <http://launchpadlibrarian.net/36115988/Screenshot.png>

The bigger problem here is (as you already mentioned) that the raw value is misparsed way too often. Random examples from bug reports:

http://launchpadlibrarian.net/34574037/smartctl.txt
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 327697
http://launchpadlibrarian.net/35971054/smartctl_tests.log
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 65542
http://launchpadlibrarian.net/36599746/smartctl_tests-deer.log
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 65552
https://bugzilla.redhat.com/attachment.cgi?id=382378
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 655424
https://bugzilla.redhat.com/show_bug.cgi?id=506254
reallocated-sector-count 100/100/ 5 FAIL 1900724 sectors Prefail Online

It seems that "no officially accepted spec about SMART attribute decoding" also hits here, in the sense that way too many drives get the raw counts wrong. In all of the 30 or so logs that I looked at in the various Launchpad/Red Hat/fd.o bug reports related to this, I didn't see an implausible normalized value, though.

I appreciate the effort of doing vendor-independent bad-block checking, but a lot of people get tons of false alarms because of it, and thus won't believe it any more if a disk really is failing some day. My feeling is that a more cautious approach would be to use the normalized value vs. threshold for the time being, and use the raw values if/when that can be made more reliable (then we should use something in between logarithmic and linear, though, since due to sheer probabilities, large disks will have more bad sectors and also more reserve sectors than small ones).

Created attachment 34234 [details]
smart blob with slightly broken sectors
BTW, I use this smart blob for playing around and testing, and it is a particularly interesting one: it has a few bad sectors (correctly parsed), but not enough yet to push the normalized value below the vendor-specified threshold.
5 reallocated-sector-count 77 1 63 1783 sectors 0xf70600000000 prefail online yes no
197 current-pending-sector 83 6 0 1727 sectors 0xbf0600000000 old-age offline n/a n/a
So this can be loaded into skdump or udisks for testing the desktop integration all the way through:
$ sudo udisks --ata-smart-refresh /dev/sda --ata-smart-simulate /tmp/smart.blob
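As a reference for the numbers discussed earlier in the thread, here is a minimal sketch of the log2()-based threshold as described in comment #1, assuming threshold = integer log2 of the drive's 512-byte sector count (the function name is mine, and the real libatasmart code may differ in rounding and constants). It reproduces the 28/31/41-sector figures computed above.

/* Sketch of the log2()-based bad-sector threshold discussed in this bug.
 * Assumption: threshold = floor(log2(disk_size_in_bytes / 512)). */
#include <stdint.h>
#include <stdio.h>

static uint64_t bad_sector_threshold(uint64_t size_bytes)
{
    uint64_t sectors = size_bytes / 512;
    uint64_t t = 0;

    while (sectors >>= 1)   /* integer log2 */
        t++;

    return t;
}

int main(void)
{
    const uint64_t GiB = 1024ULL * 1024 * 1024;

    printf("128 GiB -> %llu sectors\n",
           (unsigned long long) bad_sector_threshold(128 * GiB));         /* 28 */
    printf("  1 TiB -> %llu sectors\n",
           (unsigned long long) bad_sector_threshold(1024 * GiB));        /* 31 */
    printf("  1 PiB -> %llu sectors\n",
           (unsigned long long) bad_sector_threshold(1024 * 1024 * GiB)); /* 41 */
    return 0;
}

If the 1024 multiplier mentioned later in the thread is applied to the same formula, the 128 GiB figure becomes 28672 sectors (14 MiB).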
Created attachment 34242 [details] [review]
Drop our own "many bad sectors" heuristic

This patch just uses the standard "compare normalized value against threshold" check. I know that it's not necessarily how you really want it to work, but it's a pragmatic solution to avoid all those false positives, which don't help people either. So of course feel free to entirely ignore it, but at least I want to post it here for full disclosure. (I'll apply it to Debian/Ubuntu; we have to get a release out.) This patch is against the one in bug 26834.

Oh, forgot: I compared

for i in blob-examples/*; do echo "-- $i"; ./skdump --load=$i; done > /tmp/atasmart-test.out

before and after, and get two differences like

-Overall Status: BAD_SECTOR_MANY
+Overall Status: BAD_SECTOR

The first one is against blob-examples/Maxtor_96147H8--BAC51KJ0:

5 reallocated-sector-count 226 226 63 69 sectors 0x450000000000 prefail online yes yes

and the second one against blob-examples/WDC_WD5000AAKS--00TMA0-12.01C01:

5 reallocated-sector-count 192 192 140 63 sectors 0x3f0000000000 prefail online yes yes

So under the premise of changing the evaluation to use the normalized numbers, those are correct and expected changes.

(In reply to comment #1)
> The reason I picked log2() here is simply that we do want to allow more bad
> sectors on bigger drives than on small ones. But a linearly related threshold
> seemed to increase too quickly, so the next choice was logarithmic.
>
> Do you have any empirical example where the current thresholds do not work as
> they should?

According to http://www.seagate.com/ww/v/index.jsp?locale=en-US&name=SeaTools_Error_Codes_-_Seagate_Technology&vgnextoid=d173781e73d5d010VgnVCM100000dd04090aRCRD (which I first read about 18 months ago, when 1.5 TB drives were brand new), "Current disk drives contain *thousands* [my emphasis] of spare sectors which are automatically reallocated if the drive senses difficulty reading or writing". Therefore, it is my belief that your heuristic is off by somewhere between one and two orders of magnitude, as it only allows for 30 bad sectors on a 1 TB drive (Seagate's article would imply it has at least 2000 spare sectors, and maybe more, of which 30 is only 1.5%). As you say, though, this is highly manufacturer- and model-dependent; Seagate's drives might be designed with very many more spare sectors than other manufacturers' drives. The only sure-fire way to interpret the SMART attributes is to compare the cooked value with the vendor-set threshold for that attribute. If you are insistent upon doing something with the raw reallocated sector count attribute, I believe it would be far more useful to alert when it changes, or changes by a large number of sectors in a short period of time.

So, I want to give this one more try. I kept the log2() in there, but multiplied it by 1024 now, which should be a safe margin. If this brings bad results we can drop it entirely. In that case, please reopen.

Just want to reiterate what a bad idea it is to: a) make your own seat-of-the-pants algorithm to determine how many bad sectors is "too many", based on no significant data; b) do so when you can't even read the raw number correctly (due to the varying format of raw values). My wife's 120 GB laptop drive has 10 bad sectors, but palimpsest still reads this as 655424. (The 0x0a is the low-order byte in Intel byte order; see https://bugzilla.redhat.com/show_bug.cgi?id=498115#c61 for details. This still fails in Fedora 16, gnome-disk-utility-3.0.2.)
The 1024 factor *still* sees the disk as failing: it does not address the underlying problem of not having a reliable raw value, and not knowing the design parameters or even the type of technology. Please, please, just use the vendor numbers. The only thing you could add would be to keep a history, and warn of *changes* in the value (but don't say "OH MY GOD YOUR DISK IS ABOUT TO DIE!" unless the scaled value passes the vendor threshold).
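To spell out the check that this and several earlier comments argue for, here is a hedged sketch of a pure normalized-value-vs-vendor-threshold test plus an optional warn-on-change test for the raw count. The struct and function names are hypothetical, not the libatasmart API.

/* Illustrative sketch only; the field and function names are made up.
 * An attribute is conventionally considered failing when its normalized
 * current value is at or below the vendor-set threshold (a threshold of
 * 0 meaning "no threshold defined"). */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct attr {
    uint8_t current;     /* normalized value, typically 1..253   */
    uint8_t worst;       /* worst normalized value seen so far   */
    uint8_t threshold;   /* vendor threshold, 0 = not applicable */
    uint64_t raw_count;  /* decoded raw count, if trustworthy    */
};

static bool attr_failing(const struct attr *a)
{
    return a->threshold != 0 && a->current <= a->threshold;
}

/* Alternative suggested above: do not alarm on the absolute raw count,
 * only warn when it has grown since a previously stored reading. */
static bool raw_count_grew(const struct attr *a, uint64_t previous)
{
    return a->raw_count > previous;
}

int main(void)
{
    /* Values from the Maxtor blob quoted earlier in the thread:
     * current 226, worst 226, threshold 63, 69 reallocated sectors. */
    struct attr a = { .current = 226, .worst = 226, .threshold = 63,
                      .raw_count = 69 };

    printf("failing by vendor threshold: %s\n",
           attr_failing(&a) ? "yes" : "no");        /* no  */
    /* 60 is a made-up previous reading, purely for illustration. */
    printf("raw count grew since last check: %s\n",
           raw_count_grew(&a, 60) ? "yes" : "no");  /* yes */
    return 0;
}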