Enabling Write-Read-Verify Feature on Disks

Given the appalling reliability of modern disks, any feature that helps ensure data integrity and early detection of failure has to be deemed a good thing. What caught my attention recently is that all of the Seagate Barracuda disks I have (a number of ST31000333AS, ST31000340AS and ST31000528AS models) support the Write-Read-Verify feature. But there is a snag – disks from different batches, even of the same model, seem to disagree about the default state of this feature. Worse, the feature gets reset to its default setting on every reboot. This wouldn’t be a problem if the usual tool for such things on Linux, hdparm, had an option for controlling the state of this feature – but it doesn’t. So I wrote a patch to add control of the write-read-verify capability to hdparm. Hopefully this will help keep your data a little safer.

14 thoughts on “Enabling Write-Read-Verify Feature on Disks”

  1. I applied your patch to hdparm 9.28 and tried it on an ST31500541AS, but it gives an I/O error when enabling WRV. Not sure if this is a problem with the patch/hdparm, the SATA controller of my NAS, or the drive falsely reporting it supports WRV.

    • Hmm, curious, it works OK on my Seagates (models listed above).

      I am, however, using hdparm 9.36. Can you try that?

      Could you also try using sg_sat_set_features from the sg3_utils package? Try:

      sg_sat_set_features --feature=0bh --lba=0 /path/to/device

      Does that work or does it also return an error?

      • I made a typo and meant hdparm 9.38, but have now also tried it with 9.36. I’ve posted some debugging output at https://pastebin.com/ZvhusQNe

        What I noticed is that sg_sat_set_features tries to issue the CDB 85 06 0c 00 0b 00 00 00 00 00 00 00 00 00 ef 00, while hdparm -R 0 sends 85 06 20 00 8b 00 00 00 00 00 00 00 00 40 ef 00 and hdparm -R 2 sends 85 06 20 00 0b 00 02 00 00 00 00 00 00 40 ef 00.

        Is -R 0 supposed to disable WRV or enable Mode 0 (which is what I want)? It seems to send the opcode to disable WRV (8b).
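To make the three CDBs quoted above easier to compare, here is a minimal sketch of a decoder for the fields that matter for SET FEATURES. It assumes the standard SAT ATA PASS-THROUGH (16) layout (opcode 0x85, feature in byte 4, sector count in byte 6, LBA low in byte 8, device in byte 13, command in byte 14). The struct and function names are made up for illustration and are not part of hdparm or sg3_utils.

```c
#include <stddef.h>
#include <stdint.h>

/* Subset of a SAT ATA PASS-THROUGH (16) CDB (hypothetical helper type). */
struct ata_pt16 {
    uint8_t protocol;  /* byte 1, bits 4:1 (3 = non-data) */
    uint8_t ck_cond;   /* byte 2, bit 5 */
    uint8_t feature;   /* byte 4: 0x0b = enable WRV, 0x8b = disable WRV */
    uint8_t count;     /* byte 6: sector count register */
    uint8_t lba_low;   /* byte 8: LBA bits 7:0 */
    uint8_t device;    /* byte 13 */
    uint8_t command;   /* byte 14: 0xef = SET FEATURES */
};

/* Returns 0 on success, -1 if an argument is NULL or the opcode is not 0x85. */
static int decode_ata_pt16(const uint8_t *cdb, struct ata_pt16 *out)
{
    if (cdb == NULL || out == NULL || cdb[0] != 0x85)
        return -1;
    out->protocol = (cdb[1] >> 1) & 0x0f;
    out->ck_cond  = (cdb[2] >> 5) & 0x01;
    out->feature  = cdb[4];
    out->count    = cdb[6];
    out->lba_low  = cdb[8];
    out->device   = cdb[13];
    out->command  = cdb[14];
    return 0;
}
```

Fed the hdparm -R 2 bytes, this puts the 02 in the sector count field and leaves LBA low at 0, so the value given to -R never reaches the LBA register.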

        • IIRC, -R1 enables it, -R0 disables it. It doesn’t, however, set the extended flags in the LBA register because this should default to 0.

          Look at the patch code – if write_read_verify is true, then it sets 0x0b, which is enable, otherwise it sets 0x8b.

          Setting -R2 is the same as -R1 because the flag is boolean. You may be confusing the WRV feature flag with the LBA register extended parameter values here.

          • I don’t think that’s quite right. The parameter should be between 0-3 (“bad/missing write-read-verify value (0..3)”), so I assume you intended these to correspond to the various modes as defined in the T13 standard.

            The value passed to -R is indeed stored in the CDB, but then it also determines whether to disable (0) or enable (non-zero) WRV based on this value. So with this patch you wouldn’t be able to enable Mode 0 (the most useful one).

        • I don’t think you’re reading the patch quite right. The passed -R value is checked only in one place:

          args[2] = write_read_verify ? 0x0b : 0x8b;

          If it’s true, it sends the 0x0b opcode to enable WRV, otherwise it sends the 0x8b opcode to disable it.

          No allowance is made for any other value, nor is it stored. The check is done based on a different structure:

          int supported = id[119] & 0x2;
          if (supported)
              printf(" write-read-verify = %2u\n", id[120] & 0x2);
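For anyone wanting to check the flags by hand, the test quoted above can be sketched as two small helpers. This assumes id points at the 256 16-bit words returned by IDENTIFY DEVICE, with bit 1 of word 119 meaning "WRV supported" and bit 1 of word 120 meaning "WRV enabled", as in the patch. The helper and macro names are hypothetical. Note that the raw mask yields 0 or 2, never 1, so the helpers normalise the result to 0/1.

```c
#include <stdint.h>

#define WRV_WORD_SUPPORTED 119      /* IDENTIFY DEVICE word, as used in the patch */
#define WRV_WORD_ENABLED   120      /* IDENTIFY DEVICE word, as used in the patch */
#define WRV_BIT            0x0002u  /* bit 1 in both words */

/* Returns 1 if the drive advertises the Write-Read-Verify feature set, else 0. */
static int wrv_supported(const uint16_t *id)
{
    return (id[WRV_WORD_SUPPORTED] & WRV_BIT) != 0;
}

/* Returns 1 only if WRV is both supported and currently enabled, else 0. */
static int wrv_enabled(const uint16_t *id)
{
    return wrv_supported(id) && (id[WRV_WORD_ENABLED] & WRV_BIT) != 0;
}
```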

          • Are we looking at the same patch?

            + args[0] = ATA_OP_SETFEATURES;
            + args[1] = write_read_verify;
            + args[2] = write_read_verify ? 0x0b : 0x8b;
            + args[3] = 0;

            It also passes the write_read_verify value directly to the drive in args[1]. Perhaps you have a newer version than I have.

            When getting the WRV status, id[120] & 0x2 would evaluate to 2 if enabled, so that solves one mystery at least.

        • Yes, it passes the value to do_drive_cmd in args[1] but from what I can see, that doesn’t actually do anything in this case.

          If I pass -R0, the WRV reads disabled, as returned by hdparm -I and hdparm -R. If I pass hdparm -R1 to enable it, WRV reads enabled, as returned by hdparm -I and hdparm -R (hdparm -R returns that the register is set to 2, which is correct).

          Granted, it is debatable whether this should accept values other than 0 and 1 to enable or disable it, but I’m not sure it’d be useful. Adding 2 in there would just confuse things.

          • You’re right! I assumed args[1] ended up in the LBA/Mode field, but it’s the (Verify) Sector Count which should be ignored if Mode != 3.

            My assumption was based on the fact that you explicitly limited the values that can be passed to -R to 0-3 (the valid modes) instead of 0-1:
            case GET_SET_PARM('R',"write-read-verify",write_read_verify,0,3);
            I actually would be a bit surprised if you weren’t thinking of the Mode field at the time you wrote this patch.

        • Ah, yes, the 0-3 value range was a typo. I submitted an updated patch where that is fixed (range 0-1) upstream to Mark (hdparm maintainer) earlier today. I’m hoping it’ll get included in 9.39. 🙂

      • I’ve written my own little tool to enable WRV using HDIO_DRIVE_TASK (https://pastebin.com/aSLYTyVK). WRV is now reported as enabled by hdparm -I, but hdparm -R reports it as being in Mode 2, which if true might not be all that useful. There might still be a bug somewhere or the drive forces it to that value like it forces AAM to 128, 208 or 254. Let’s see if this alleviates the random write errors I’ve been experiencing with this drive.
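For reference, here is a rough sketch of how the register block for such a tool could be filled in. It is not the code from the pastebin. It assumes the 7-byte HDIO_DRIVE_TASK order from the Linux HDIO ioctl documentation (command, feature, sector count, LBA low, LBA mid, LBA high, device), that the WRV mode goes in LBA bits 7:0, and that the count is only meaningful in Mode 3, as discussed in this thread. The function name is hypothetical, the device value 0x40 simply mirrors the hdparm CDBs quoted earlier, and the ioctl call itself is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ATA_CMD_SET_FEATURES 0xef
#define FEAT_WRV_ENABLE      0x0b
#define FEAT_WRV_DISABLE     0x8b

/*
 * Fill task[0..6] for HDIO_DRIVE_TASK (hypothetical helper):
 *   [0] command, [1] feature, [2] sector count, [3] LBA low,
 *   [4] LBA mid, [5] LBA high, [6] device.
 * Returns 0 on success, -1 on invalid arguments (task is left untouched).
 */
static int build_wrv_task(uint8_t *task, int enable, unsigned mode, unsigned count)
{
    if (task == NULL || mode > 3 || count > 255)
        return -1;
    memset(task, 0, 7);
    task[0] = ATA_CMD_SET_FEATURES;
    task[1] = enable ? FEAT_WRV_ENABLE : FEAT_WRV_DISABLE;
    if (enable) {
        /* Assumption: the count only matters in Mode 3, so zero it otherwise. */
        task[2] = (mode == 3) ? (uint8_t)count : 0;
        task[3] = (uint8_t)mode;   /* assumption: mode lives in LBA bits 7:0 */
    }
    task[6] = 0x40;                /* mirrors the device byte hdparm sends */
    return 0;
}
```

The filled array would then be handed to ioctl(fd, HDIO_DRIVE_TASK, task) on an open device node. Unlike the args[] block used by the patch, this layout has a slot for LBA low, so Mode 0 can be requested explicitly.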

        • Most interesting. What controller are you using? Is it perchance a SAS controller in SATA passthrough mode? And/or through a SAS expander, perhaps? These are known to behave unpredictably when passing low-level SATA feature-enabling commands such as those sent by my hdparm patch and sg_sat_set_features.

          • It seems to be some non-standard SATA controller connected to (or embedded on) an FPGA-based SPARC running a non-standard 2.6.17 kernel. It doesn’t make debugging any easier, but unfortunately I currently don’t have any SATA controllers I can connect 3.5″ drives to lying around.

        • From what you put on pastebin, it looks like it is operating in SATA passthrough mode, which implies it is either SAS or something “special” (possibly a port muxer). Either way, it implies that it isn’t talking SATA natively, which is likely where the problem is: whatever the SATA commands are being passed through is getting confused by the WRV enable commands, since hdparm sends them raw. This theory also seems to be corroborated by the fact that you are getting the same results from sg_sat_set_features.
