Bug 25772 - Unlucky heuristic for calculating BAD_SECTOR_MANY
Summary: Unlucky heuristic for calculating BAD_SECTOR_MANY
Status: REOPENED
Alias: None
Product: libatasmart
Classification: Unclassified
Component: library
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Lennart Poettering
QA Contact: Lennart Poettering
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-12-23 01:11 UTC by Jean-Louis
Modified: 2012-02-14 19:22 UTC
CC: 3 users

See Also:


Attachments
skdump of harddrive (1.54 KB, application/octet-stream)
2009-12-28 08:39 UTC, Steve
Details
smart blob with slightly broken sectors (1.54 KB, application/octet-stream)
2010-03-19 04:27 UTC, Martin Pitt
Details
Drop our own "many bad sectors" heuristic (3.28 KB, patch)
2010-03-19 07:02 UTC, Martin Pitt
Details | Splinter Review

Description Jean-Louis 2009-12-23 01:11:19 UTC
A comment in the code says:

/* We use log2(n_sectors) as a threshold here. We had to pick
 * something, and this makes a bit of sense, or doesn't it? */

this means (with 512-byte sectors):

128 GB = 2^37 bytes -> log2(2^37 / 2^9) = log2(2^28) = 28 sectors
1 TB = 2^40 bytes -> log2(2^40 / 2^9) = log2(2^31) = 31 sectors
8 TB = 2^43 bytes -> log2(2^43 / 2^9) = log2(2^34) = 34 sectors

I think that this is an unlucky heuristic.

The meaning of the raw value is vendor specific.
It could make more sense if BAD_SECTOR_MANY were calculated like:

(worst value - threshold value) <= 5 ?

Obviously this is only an example.
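
For illustration only (this is not the actual libatasmart code; the function names and the 5-point margin are made up for this example), a rough C sketch of the current heuristic next to the suggested alternative:

  #include <stdbool.h>
  #include <stdint.h>
  #include <math.h>

  /* Current heuristic (as per the code comment quoted above): flag
   * BAD_SECTOR_MANY once the raw bad sector count exceeds log2 of the
   * total number of sectors on the disk. */
  static bool too_many_bad_sectors_log2(uint64_t n_sectors, uint64_t bad_sectors) {
      uint64_t threshold = (uint64_t) log2((double) n_sectors);
      return bad_sectors > threshold;
  }

  /* Suggested alternative (only an example): look at how close the
   * normalized "worst" value has come to the vendor threshold. */
  static bool too_many_bad_sectors_normalized(uint8_t worst, uint8_t vendor_threshold) {
      return worst - vendor_threshold <= 5; /* margin of 5 is arbitrary */
  }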
Comment 1 Lennart Poettering 2009-12-23 05:15:11 UTC
The entire SMART attribute business is highly vendor dependent since there is no officially accepted spec about SMART attribute decoding. (It never became an official standard; all it ever was was a draft that was later withdrawn.) Fortunately, on almost all drives the raw data of quite a few fields can be decoded the same way. In libatasmart we try to include the decoding of fields where it makes sense and is commonly accepted.

OTOH the non-raw fields (i.e. "current" and "worst") encode the information about the raw number of sectors (for sector-related attributes) in a way that no longer lets us determine the actual number of sectors.

The reason for this extra threshold we apply here is that we wanted vendor-independent health checking: as long as we can trust the number of raw bad sectors the drive reports, we can compare it with a threshold that is not fiddled with by the vendor to make his drives look better.

The reason I picked log2() here is simply that we do want to allow more bad sectors on bigger drives than on small ones. But a linearly related threshold seemed to increase too quickly, so the next choice was logarithmic.

Do you have any empiric example where the current thresholds do not work as they should?
Comment 2 Steve 2009-12-28 08:38:04 UTC
Please check the associated skdump save file. This is an old 20GB laptop drive. In the latest Ubuntu 9.10 they ship with 0.16 of libatasmart. I think this drive is incorrectly flagged as failing, because the lib relies on the raw value being a single raw48 value. This then looks like very many (262166) bad blocks.

Using "smartctl -a /dev/sda" I get the following extracts:

SMART overall-health self-assessment test result: PASSED
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       262166
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       4

If I use the -v 5,raw8 option
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0 0 0 4 0 22 

If I use the -v 5,raw16 option
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0 4 22

The attribute is being read as raw48, which in this case looks to be completely wrong. Using the different raw# value seems to tie in with attribute 196.

It could be argued that if you cannot rely on the format of the raw value, you should not base warnings on it, and should only use the normalized, worst and threshold values. I'm technical, and I damn near junked a relative's old but still serviceable laptop because of this.
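
For illustration (this is an assumption about how the value gets mangled, not the actual libatasmart parser), splitting the misread 48-bit value 262166 into 16-bit words recovers the plausible counts:

  #include <stdint.h>
  #include <stdio.h>

  int main(void) {
      uint64_t raw48 = 262166;               /* value reported when parsed as one raw48 number */

      /* Interpreted as three 16-bit words instead (cf. smartctl -v 5,raw16): */
      unsigned w0 = raw48 & 0xFFFF;          /* 22 -> plausible reallocated sector count */
      unsigned w1 = (raw48 >> 16) & 0xFFFF;  /*  4 -> matches attribute 196 (realloc events) */
      unsigned w2 = (raw48 >> 32) & 0xFFFF;  /*  0 */

      printf("%u %u %u\n", w2, w1, w0);      /* prints "0 4 22" */
      return 0;
  }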
Comment 3 Steve 2009-12-28 08:39:08 UTC
Created attachment 32330 [details]
skdump of harddrive
Comment 4 Jean-Louis 2009-12-29 04:17:41 UTC
(In reply to comment #1)
> The reason I picked log2() here is simply that we do want to allow more bad
> sectors on bigger drives than on small ones. But a linearly related threshold
> seemed to increase too quickly, so the next choice was logarithmic.
> 
> Do you have any empiric example where the current thresholds do not work as
> they should?
> 

For convenience I use kibibyte, mebibyte, gibibyte ...

128 GiB = 2^37 -> log2(2^37/512) = log2(2^37/2^9) = 28 sectors

For an HDD of 128 GiB (2^37 Bytes) the calculated threshold value is 28 sectors (14336 Bytes = 14 KiB), isn't it too low?

For an HDD of 1 TiB (2^40 Bytes) the calculated threshold value is 31 sectors (15872 Bytes = 15.5 KiB) ...

For a hypothetical HDD of 1 PiB (2^50 Bytes, 1024 tebibytes) the calculated threshold is only 41 sectors (20992 Bytes = 20.5 KiB) ...

If we do want to allow more bad sectors on bigger drives than on small ones, IMHO this isn't a good heuristic.

The difference between a 128 GiB HDD and an 8 TiB HDD is only 6 sectors (3 KiB).

Comment 5 Jean-Louis 2009-12-30 06:12:33 UTC
I forgot to say that this bug report and the enhancement requested in Bug #25773 are due to Launchpad Bug 438136 <https://bugs.launchpad.net/ubuntu/+source/libatasmart/+bug/438136?comments=all>

On Launchpad there are also some screenshots of palimpsest that show hard disks flagged as failing with relatively few bad sectors, or with raw values probably in a different format (there are counts like 65537, 65539, 65551, 65643 and similar numbers of bad sectors).

Some examples:

117 bad sectors (58.5 KiB) on 1000GB HDD
<http://launchpadlibrarian.net/32604239/palimpsest-screenshot.png>

66 bad sectors (33 KiB) on 200GB HDD
<http://launchpadlibrarian.net/34794631/Screenshot-SMART%20Data.png>

466 bad sectors (233 KiB) on 1500GB HDD
<http://launchpadlibrarian.net/34991157/Screenshot.png>

65 bad sectors (32.5 KiB) on 120GB HDD (all current pending sectors)
<http://launchpadlibrarian.net/35201129/Pantallazo-Datos%20SMART.png>

54 bad sectors (27 KiB) on 169GB HDD
<http://launchpadlibrarian.net/36115988/Screenshot.png>
Comment 6 Martin Pitt 2010-03-19 04:00:15 UTC
The bigger problem of this is (as you already mentioned) that the raw value is misparsed way too often. Random examples from bug reports:

  http://launchpadlibrarian.net/34574037/smartctl.txt
5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       327697

  http://launchpadlibrarian.net/35971054/smartctl_tests.log
5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       65542

  http://launchpadlibrarian.net/36599746/smartctl_tests-deer.log
5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       65552

  https://bugzilla.redhat.com/attachment.cgi?id=382378
5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       655424

  https://bugzilla.redhat.com/show_bug.cgi?id=506254
reallocated-sector-count    100/100/  5   FAIL    1900724 sectors Prefail 
Online 

It seems that "no officially accepted spec about SMART attribute decoding" also hits here in the sense of that way too many drives get the raw counts wrong. In all the 30 or so logs that I looked at in the various Launchpad/RedHat/fd.o bug reports related to this I didn't see an implausible value of the normalized values, though.

I appreciate the effort of doing vendor independent bad blocks checking, but a lot of people get tons of false alarms due to that, and thus won't believe it any more if there is really a disk failing some day.

My feeling is that a more cautious approach would be to use the normalized value vs. threshold for the time being, and use the raw values if/when that can be made more reliable (then we should use something in between logarithmic and linear, though, since due to sheer probabilities, large disks will have more bad sectors and also more reserve sectors than small ones).
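
For reference, a minimal sketch of that kind of evaluation (not the actual patch; the exact rule and the idea of also checking the worst value are assumptions on my part):

  #include <stdbool.h>
  #include <stdint.h>

  /* Vendor-style evaluation: an attribute counts as failing once its
   * normalized value has dropped to or below the vendor threshold.
   * A threshold of 0 means the attribute is not used for failure
   * prediction at all. */
  static bool attribute_failing(uint8_t current, uint8_t worst, uint8_t threshold) {
      if (threshold == 0)
          return false;
      return current <= threshold || worst <= threshold;
  }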
Comment 7 Martin Pitt 2010-03-19 04:27:33 UTC
Created attachment 34234 [details]
smart blob with slightly broken sectors

BTW, I use this smart blob for playing around and testing; it is a particularly interesting one: it has a few bad sectors (correctly parsed), but not enough yet to drop below the vendor-specified threshold.

  5 reallocated-sector-count     77     1    63   1783 sectors 0xf70600000000 prefail online  yes  no  
197 current-pending-sector       83     6     0   1727 sectors 0xbf0600000000 old-age offline n/a  n/a 

So this can be loaded into skdump or udisks for testing the desktop integration all the way through:

$ sudo udisks --ata-smart-refresh /dev/sda --ata-smart-simulate /tmp/smart.blob
Comment 8 Martin Pitt 2010-03-19 07:02:09 UTC
Created attachment 34242 [details] [review]
Drop our own "many bad sectors" heuristic

This patch just uses the standard "compare normalized value against threshold". I know that it's not necessarily how you really want it to work, but it's a pragmatic solution to avoid all those false positives, which don't help people either.

So of course feel free to entirely ignore it, but at least I want to post it here for full disclosure. (I'll apply it to Debian/Ubuntu, we have to get a release out).

This patch is against the one in bug 26834.
Comment 9 Martin Pitt 2010-03-19 07:05:13 UTC
Oh, forgot: I compared

  for i in blob-examples/*; do echo "-- $i"; ./skdump --load=$i; done > /tmp/atasmart-test.out

before and after, and get two differences like

-Overall Status: BAD_SECTOR_MANY
+Overall Status: BAD_SECTOR

The first one is against blob-examples/Maxtor_96147H8--BAC51KJ0:
 5 reallocated-sector-count    226   226    63   69 sectors  0x450000000000 prefail online  yes  yes 

and the second one against blob-examples/WDC_WD5000AAKS--00TMA0-12.01C01

  5 reallocated-sector-count    192   192   140   63 sectors  0x3f0000000000 prefail online  yes  yes 

so under the premise of changing the evaluation to use the normalized numbers those are correct and expected changes.
Comment 10 Alex Butcher 2010-07-04 02:09:56 UTC
(In reply to comment #1)

> The reason I picked log2() here is simply that we do want to allow more bad
> sectors on bigger drives than on small ones. But a linearly related threshold
> seemed to increase too quickly, so the next choice was logarithmic.
> 
> Do you have any empiric example where the current thresholds do not work as
> they should?

According to http://www.seagate.com/ww/v/index.jsp?locale=en-US&name=SeaTools_Error_Codes_-_Seagate_Technology&vgnextoid=d173781e73d5d010VgnVCM100000dd04090aRCRD (which I first read about 18 months ago, when 1.5TB drives were brand new), "Current disk drives contain *thousands* [my emphasis] of spare sectors which are automatically reallocated if the drive senses difficulty reading or writing". Therefore, it is my belief that your heuristic is off by somewhere between one and two orders of magnitude, as it only allows for 30 bad sectors on a 1TB drive (Seagate's article would imply such a drive has at least 2000 spare sectors - and maybe more - of which 30 are only 1.5%).

As you say, though, this is highly manufacturer- and model-dependent; Seagate's drives might be designed with very many more spare sectors than other manufacturers' drives. The only sure-fire way to interpret the SMART attributes is to compare the cooked value with the vendor-set threshold for that attribute.

If you are insistent upon doing something with the raw reallocated sector count attribute, I believe it would be far more useful to alert when it changes, or changes by a large number of sectors in a short period of time.
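
A rough sketch of what such a change-based check could look like (purely illustrative; the persisted previous reading and the limit of 8 new sectors per day are invented for this example):

  #include <stdbool.h>
  #include <stdint.h>
  #include <time.h>

  /* Warn when the reallocated sector count grows, or grows quickly,
   * rather than when it crosses an absolute guess of "too many". */
  static bool reallocations_growing_fast(uint64_t previous, uint64_t current,
                                         time_t prev_ts, time_t now) {
      if (current <= previous)
          return false;                          /* no new reallocations */
      double days = difftime(now, prev_ts) / 86400.0;
      if (days <= 0)
          days = 1.0;                            /* avoid division by zero */
      return (current - previous) / days > 8.0;  /* > 8 new sectors/day: arbitrary */
  }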
Comment 11 Lennart Poettering 2011-10-11 15:01:49 UTC
So, I wanna give this one more try. I kept the log2() in there, but now multiplied it by 1024, which should be a safe margin.

If this brings bad results we can drop this entirely. In that case, please reopen.
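
For reference, what the adjusted heuristic works out to (my own back-of-the-envelope arithmetic, not taken from the commit), assuming 512-byte sectors:

  120 GB drive: n_sectors ~= 2^28 -> threshold ~= 1024 * 28 = 28672 sectors
  1 TB drive:   n_sectors ~= 2^31 -> threshold ~= 1024 * 31 = 31744 sectors

The genuinely high counts quoted in comment 5 (54-466 sectors) now stay well below the threshold, but a misparsed raw value such as 655424 (comment 6) would still exceed it.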
Comment 12 Stuart Gathman 2012-02-14 19:20:21 UTC
Just want to reiterate what a bad idea it is to:

a) make your own seat-of-the-pants algorithm to determine how many bad sectors is "too many" based on no significant data.

b) do so when you can't even read the raw number correctly (due to varying format of raw values).  

My wife's 120G laptop drive has 10 bad sectors, but palimpsest still reads this as 655424. (The 0x0a is the low-order byte in Intel byte order; see https://bugzilla.redhat.com/show_bug.cgi?id=498115#c61 for details. It still fails in Fedora 16, gnome-disk-utility-3.0.2.) The 1024 factor *still* sees the disk as failing - it does not address the underlying problem of not having a reliable raw value, and of not knowing the design parameters or even the type of technology.

Please, please, just use the vendor numbers.  The only thing you could add would be to keep a history, and warn of *changes* in the value (but don't say "OH MY GOD YOUR DISK IS ABOUT TO DIE!" unless the scaled value passes the vendor threshold).

