Post by Jan Schmidt via illumos-zfs
That patch looks somewhat promising, though I have not tried it yet. How did you
decide which of the overlapping space map ranges to drop? From my understanding,
either range might be the one that's currently correct, isn't it?
It's actually worse than that, because there are a lot of different
cases, depending on whether the overlapping ranges are alloc or free,
whether there are overlapping sub-ranges within them, whether they're
partial or complete overlaps, etc. And then there is the possibility of
subsequent ranges that partially overlap the previous bad ones. You
didn't mention which form of corruption you're hitting or how severe it
is, so I don't know which cases might apply to you. zdb is helpful in
getting a handle on that.
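
To make the overlap cases concrete, here is a minimal sketch that classifies how two ranges relate. It is purely illustrative: the function name, the half-open (start, end) representation, and the category labels are assumptions of this example and are not taken from ZFS or from any patch mentioned here.

```python
def classify_overlap(a, b):
    """Classify how half-open range a = (start, end) relates to range b.

    Hypothetical helper for illustration only; not ZFS code.
    """
    a_start, a_end = a
    b_start, b_end = b
    # Ranges that merely touch at an endpoint do not overlap.
    if a_end <= b_start or b_end <= a_start:
        return "disjoint"
    if a_start == b_start and a_end == b_end:
        return "identical"
    # Complete overlap: one range lies entirely inside the other.
    if a_start <= b_start and b_end <= a_end:
        return "a contains b"
    if b_start <= a_start and a_end <= b_end:
        return "b contains a"
    # Anything else shares only part of each range.
    return "partial"


if __name__ == "__main__":
    print(classify_overlap((0, 10), (5, 15)))
    print(classify_overlap((0, 10), (2, 4)))
```

Each of these geometric cases then multiplies by whether the two records are alloc or free, which is why the number of situations to handle grows so quickly.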
I have a different patch (George gets most of the credit, I take most of
the blame) that I used to recover from spacemap corruption we had at Joyent
(albeit from a different cause, 4504). It's intended for one-time use;
you boot it, it fixes the spacemaps by leaking ambiguous regions,
preferring to lose a little space rather than risk later overwriting of
data, and condenses them back out, then you reboot onto normal bits
again. This covers a lot more cases; I tested many of them, but there
may yet be edge cases that aren't addressed. I recommend building a
libzpool with this first and trying zdb with that before booting with
the zfs module.
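
The general idea of leaking ambiguous regions can be sketched as a toy model. This is not the code in the patch and does not reflect how ZFS stores space maps. It assumes a hypothetical log of ("alloc" | "free", start, end) records over small integer offsets, and it treats any double alloc or double free as ambiguous, keeping that space marked in use so it is never handed out again.

```python
def replay_leaking(records):
    """Replay hypothetical (op, start, end) records with half-open ranges.

    Returns (in_use, leaked) as sets of unit offsets. Toy model only:
    real space maps work on ranges, not per-unit sets.
    """
    allocated = set()
    leaked = set()
    for op, start, end in records:
        units = set(range(start, end))
        if op == "alloc":
            # Allocating already-allocated space is inconsistent: leak it.
            leaked |= units & allocated
            allocated |= units
        elif op == "free":
            # Freeing space that is not allocated is inconsistent: leak it.
            leaked |= units - allocated
            allocated -= units
        else:
            raise ValueError(f"unknown record type: {op!r}")
    # Leaked space stays marked in use, so it can never be overwritten.
    return allocated | leaked, leaked


if __name__ == "__main__":
    log = [("alloc", 0, 10), ("alloc", 5, 15), ("free", 0, 5)]
    in_use, leaked = replay_leaking(log)
    print(sorted(in_use), sorted(leaked))
```

The trade-off in this sketch is the one described above: an ambiguous unit costs a little space forever, but a unit that might hold live data is never reported as free.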
This comes with absolutely no warranty of any kind and should be used
only where dumping the data somewhere else (harder than you might think,
since you can't create snapshots in read-only mode) and recreating the
pool is not an option. It's on you to understand what it does and why
and to satisfy yourself that it will solve your problem safely before
using it. The comments might help a little, but you're really on your
own.
See
https://github.com/wesolows/illumos-joyent/commit/dc4d7e06c8e0af213619f0aa517d819172911005