Alexander Motin
2013-08-01 18:39:51 UTC
Hi.
I have been working, on the FreeBSD side, on problems with restoring ZFS pool
operation after losing some disks. I've made several patches, and since
they don't look particularly FreeBSD-specific, I would like to get some
comments from the community. The patches can be found here:
http://people.freebsd.org/~mav/zfs_patches/.
cmd_on_suspend.patch -- allows three IOCTLs to be used on a suspended
pool. Judging from the git history, this patch simply restores the state
of things we had before the IOCTL code refactoring some time ago. The
change allows `zpool clear` to be used to recover a suspended pool and
`zpool status -v` to print the list of errors for a suspended pool. As far
as I can see in the sources, that is the only intended way to recover in
that case.
remove_last3.patch -- makes SPA_ASYNC_REMOVE async events in ZFS be
processed even after the pool was suspended, by using a separate async
thread that is scheduled immediately. Without that, async events are
handled only after a successful txg commit, which is obviously impossible
without a minimal set of disks. The lack of handling for these events
makes ZFS fail to close some lost devices, which causes problems (tested
on both FreeBSD and OpenIndiana) after disk reconnection.
vdev_clear_all.patch -- makes `zpool clear` also reopen reconnected
cache and spare devices. Since `zpool status` reports these kinds of
errors, it is strange that they are not cleared by `zpool clear`.
online_on_suspend.patch -- is a questionable patch. The idea was to
make the `zpool online` command work for suspended pools, as some sources
recommend. The problem is that this command tries to change pool content
(such as updating the history) or to wait for completion, which required
some hacks to keep it from getting stuck while the pool is still
suspended. The question is whether `zpool online` should work for
suspended pools at all.
no_features.patch -- I've found a way to hang ZFS via device detach, up
to the point of freezing `zpool status`. The issue was introduced with
the pool features implementation. The `zpool` tool reads the pool
configuration on every pool open. Previously there was no problem, since
the configuration was recreated by the kernel from data that seems to
always be present in memory. But the newly introduced feature counters
may not be. They are stored in a ZAP and normally just live in the ARC.
But if the system is under strong ARC pressure, that information may be
evicted, and then, if we are losing devices, we are stuck. I am not sure
what a proper fix for this situation would be. I've created a workaround
that blocks feature reporting to user level if the pool is suspended.
That comes with one predictable flaw: `zpool upgrade` always wants to
update suspended pools, but fortunately it can't, due to the same
suspension.
I am relatively new to ZFS internals, so please feel free to correct me
if my approaches are wrong somehow. Thank you!
--
Alexander Motin