question about vdev_label_read_config

Bob

2014-04-24 12:19:58 UTC

Hello All,

We have a corrupted zpool. This is what zpool import shows:
***@openindiana:~# zpool import
  pool: POOLSAS
    id: 6657267340672818258
 state: ONLINE
status: One or more devices were being resilvered.
action: The pool can be imported using its name or numeric identifier.
config:

        POOLSAS                      ONLINE
          raidz1-0                   ONLINE
            c2t5000C50068DE43B3d0    UNAVAIL  corrupted data
            c2t5000C50068DE4F63d0    UNAVAIL  corrupted data
            c2t5000C50068DE600Bd0    UNAVAIL  corrupted data
            c2t5000C50068E16F3Fd0    UNAVAIL  corrupted data
            c2t5000C50068E177B7d0    UNAVAIL  corrupted data
            c2t5000C50068E19A53d0    UNAVAIL  corrupted data
            c2t5000C50068E1DEE7d0    UNAVAIL  corrupted data
            c2t5000C50068E34F0Fd0    UNAVAIL  corrupted data
          raidz1-1                   ONLINE
            c2t5000C50068DE6957d0    ONLINE
            c2t5000C50068E16983d0    ONLINE
            c2t5000C50068E18C73d0    ONLINE
            c2t5000C50068E1AF2Fd0    ONLINE
            c2t5000C50068E1DB6Fd0    ONLINE
            c2t5000C50068E1F8A7d0    ONLINE
            c2t5000C50068E208EBd0    ONLINE
            c2t5000C50068E266ABd0    ONLINE
          raidz1-2                   ONLINE
            c2t5000C500688BAB87d0    UNAVAIL  corrupted data
            c2t5000C500688BB0CBd0    UNAVAIL  corrupted data
            spare-2                  ONLINE
              c2t5000C50068950F3Bd0  FAULTED  corrupted data
              c2t5000C500688B97BBd0  UNAVAIL  corrupted data
            c2t5000C500689AB39Bd0    UNAVAIL  corrupted data
            c2t5000C500689AB4A7d0    UNAVAIL  corrupted data
            c2t5000C50068A34BA3d0    UNAVAIL  corrupted data
            c2t5000C50068DEA203d0    UNAVAIL  corrupted data
            spare-7                  ONLINE
              c2t5000C5006B5C78B3d0  FAULTED  corrupted data
              c2t5000C50068E1EEE7d0  FAULTED  corrupted data
          raidz1-3                   ONLINE
            c2t5000C50068883F83d0    FAULTED  corrupted data
            spare-1                  ONLINE
              c2t5000C500688AEAB7d0  UNAVAIL  corrupted data
              c2t5000C50068E150ABd0  FAULTED  corrupted data
            spare-2                  ONLINE
              c2t5000C50068950F93d0  UNAVAIL  corrupted data
              c2t5000C500688B91B3d0  FAULTED  corrupted data
            c2t5000C50068951093d0    FAULTED  corrupted data
            c2t5000C5006898C36Fd0    FAULTED  corrupted data
            c2t5000C50068A181ABd0    FAULTED  corrupted data
            c2t5000C50068EFB347d0    FAULTED  corrupted data
            c2t5000C50068EFBC17d0    UNAVAIL  corrupted data
        cache
          c2t5000A7203008D09Bd0
        spares
          c2t5000C500688B91B3d0
          c2t5000C500688B97BBd0
          c2t5000C50068E150ABd0
          c2t5000C50068E1EEE7d0
        logs
          mirror-4                   ONLINE
            c2t5000A7203008D0A4d0    ONLINE
            c2t5000A7203008D0A5d0    ONLINE

With zpool import -f POOLSAS we got this unhappy output :(
***@openindiana:~# zpool import -f POOLSAS
cannot import 'POOLSAS': one or more devices is currently unavailable

From what I found by searching, it seems we have hit a corrupted pool, as
many others have.
I checked with zdb and the labels seem OK. After some tracing (thanks to
DTrace, which is really powerful and useful in a case like this), I found
that vdev_label_read_config() returns NULL, which puts the vdev into the
"can't open" state (or should we say dead?). To cut the rest of the story
short, the zio is then marked with ENXIO in zio_vdev_io_start():
    if (vd->vdev_ops->vdev_op_leaf &&
        (zio->io_type == ZIO_TYPE_READ || zio->io_type == ZIO_TYPE_WRITE)) {

        if (zio->io_type == ZIO_TYPE_READ && vdev_cache_read(zio) == 0)
            return (ZIO_PIPELINE_CONTINUE);

        if ((zio = vdev_queue_io(zio)) == NULL)
            return (ZIO_PIPELINE_STOP);

        if (!vdev_accessible(vd, zio)) {
            zio->io_error = ENXIO;
            zio_interrupt(zio);
            return (ZIO_PIPELINE_STOP);
        }
    }

    return (vd->vdev_ops->vdev_op_io_start(zio));

Finally, we get ENXIO in spa_import->...->spa_load_impl. I guess it happens
right after vdev_load(), which hits ENXIO when vdev_dtl_load() fails in
space_map_load(), via dmu_read->dmu_buf_hold_array_by_dnode->zio_wait (with
the ENXIO set in zio_vdev_io_start?).

In vdev_validate->vdev_label_read_config:
    uint64_t txg = strict ? spa->spa_config_txg : -1ULL;

    if ((label = vdev_label_read_config(vd, txg)) == NULL) {
        vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN,
            VDEV_AUX_BAD_LABEL);
        return (0);
    }
I found that spa->spa_config_txg is 0, which makes
vdev_label_read_config() return NULL.
So my questions are:
1. Does spa->spa_config_txg = 0 mean that the data is corrupted?
2. Is there any way to recover my data, that is, to import the pool?
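
A minimal sketch of why a requested txg of 0 can reject every label even when the labels themselves are readable. It is a toy model, not the illumos source: the toy_* names and the txg values in the usage note are made up for illustration, and the "label txg must not exceed the requested txg" filter is an assumption inferred from the behaviour described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one on-disk label copy; not an illumos type. */
typedef struct toy_label {
	int      tl_readable;	/* nonzero if this copy could be read */
	uint64_t tl_txg;	/* pool txg recorded in this copy */
} toy_label_t;

/*
 * Return the readable label with the highest txg that does not exceed
 * max_txg, or NULL if no copy qualifies.
 * ASSUMPTION: the "tl_txg <= max_txg" filter is a simplification, not a
 * copy of vdev_label_read_config().
 */
static const toy_label_t *
toy_label_pick(const toy_label_t *labels, size_t n, uint64_t max_txg)
{
	const toy_label_t *best = NULL;

	assert(labels != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!labels[i].tl_readable || labels[i].tl_txg > max_txg)
			continue;
		if (best == NULL || labels[i].tl_txg > best->tl_txg)
			best = &labels[i];
	}
	return (best);
}

/* Mirrors "txg = strict ? spa->spa_config_txg : -1ULL" from the excerpt. */
static uint64_t
toy_requested_txg(int strict, uint64_t spa_config_txg)
{
	return (strict ? spa_config_txg : UINT64_MAX);
}
```

With readable copies at, say, txg 1200, toy_label_pick(labels, n, toy_requested_txg(1, 0)) returns NULL, while the non-strict call toy_requested_txg(0, 0) lets the newest copy through. Under this model a zero spa_config_txg says nothing about the data on disk; it only makes the strict check impossible to satisfy.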

Thanks.

George Wilson

2014-04-24 16:32:53 UTC

Bob,

I believe you need the fix for:

3422 zpool create/syseventd race yield non-importable pool

That changed the logic in vdev_validate() to better deal with the strict
checks. It's possible that even with that fix we may need to see why we
can't read the label. Can you pull in the fix for the bug above and retry?

Thanks,
George

-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/22008002-303f2ff4
Modify Your Subscription: https://www.listbox.com/member/?&
Powered by Listbox: http://www.listbox.com

Bob

2014-04-25 03:33:11 UTC

Dear George,

I think you are totally right :)
After picking up the patch, the 'corrupted' pool came back!
I really appreciate your help.

Bob

2014-04-25 12:24:16 UTC

Hello All,

After the import I ran a scrub on the pool, and it seems that there are
still data errors (checksum errors are reported when I try to use
send/recv to back up the data):
***@openindiana:~# zpool status
  pool: POOLSAS
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 125G in 0h38m with 378 errors on Fri Apr 25 20:03:32 2014
config:

        NAME                         STATE     READ WRITE CKSUM
        POOLSAS                      ONLINE       0     0     2
          raidz1-0                   ONLINE       0     0     0
            c2t5000C50068DE43B3d0    ONLINE       0     0     0
            c2t5000C50068DE4F63d0    ONLINE       0     0     0
            c2t5000C50068DE600Bd0    ONLINE       0     0     0
            c2t5000C50068E16F3Fd0    ONLINE       0     0     0
            c2t5000C50068E177B7d0    ONLINE       0     0     0
            c2t5000C50068E19A53d0    ONLINE       0     0     0
            c2t5000C50068E1DEE7d0    ONLINE       0     0     0
            c2t5000C50068E34F0Fd0    ONLINE       0     0     0
          raidz1-1                   ONLINE       0     0     0
            c2t5000C50068DE6957d0    ONLINE       0     0     0
            c2t5000C50068E16983d0    ONLINE       0     0     0
            c2t5000C50068E18C73d0    ONLINE       0     0     0
            c2t5000C50068E1AF2Fd0    ONLINE       0     0     0
            c2t5000C50068E1DB6Fd0    ONLINE       0     0     0
            c2t5000C50068E1F8A7d0    ONLINE       0     0     0
            c2t5000C50068E208EBd0    ONLINE       0     0     0
            c2t5000C50068E266ABd0    ONLINE       0     0     0
          raidz1-2                   ONLINE       0     0     4
            c2t5000C500688BAB87d0    ONLINE       0     0     0
            c2t5000C500688BB0CBd0    ONLINE       0     0     0
            spare-2                  ONLINE       0     0     0
              c2t5000C50068950F3Bd0  ONLINE       0     0     0
              c2t5000C500688B97BBd0  ONLINE       0     0     0
            c2t5000C500689AB39Bd0    ONLINE       0     0     0
            c2t5000C500689AB4A7d0    ONLINE       0     0     0
            c2t5000C50068A34BA3d0    ONLINE       0     0     0
            c2t5000C50068DEA203d0    ONLINE       0     0     0
            spare-7                  ONLINE       0     0     0
              c2t5000C5006B5C78B3d0  ONLINE       0     0     0
              c2t5000C50068E1EEE7d0  ONLINE       0     0     0
          raidz1-3                   ONLINE       0     0     0
            c2t5000C50068883F83d0    ONLINE       0     0     0
            spare-1                  ONLINE       0     0     0
              c2t5000C500688AEAB7d0  ONLINE       0     0     0
              c2t5000C50068E150ABd0  ONLINE       0     0     0
            spare-2                  ONLINE       0     0     0
              c2t5000C50068950F93d0  ONLINE       0     0     0
              c2t5000C500688B91B3d0  ONLINE       0     0     0
            c2t5000C50068951093d0    ONLINE       0     0     0
            c2t5000C5006898C36Fd0    ONLINE       0     0     0
            c2t5000C50068A181ABd0    ONLINE       0     0     0
            c2t5000C50068EFB347d0    ONLINE       0     0     0
            c2t5000C50068EFBC17d0    ONLINE       0     0     0
        logs
          mirror-4                   ONLINE       0     0     0
            c2t5000A7203008D0A4d0    ONLINE       0     0     0
            c2t5000A7203008D0A5d0    ONLINE       0     0     0
        cache
          c2t5000A7203008D09Bd0      ONLINE       0     0     0
        spares
          c2t5000C500688B91B3d0      INUSE     currently in use
          c2t5000C500688B97BBd0      INUSE     currently in use
          c2t5000C50068E150ABd0      INUSE     currently in use
          c2t5000C50068E1EEE7d0      INUSE     currently in use

errors: 379 data errors, use '-v' for a list

        NAME                         STATE     READ WRITE CKSUM
        POOLSAS                      ONLINE       0     0   264
          raidz1-0                   ONLINE       0     0     0
            c2t5000C50068DE43B3d0    ONLINE       0     0     0
            c2t5000C50068DE4F63d0    ONLINE       0     0     0
            c2t5000C50068DE600Bd0    ONLINE       0     0     0
            c2t5000C50068E16F3Fd0    ONLINE       0     0     0
            c2t5000C50068E177B7d0    ONLINE       0     0     0
            c2t5000C50068E19A53d0    ONLINE       0     0     0
            c2t5000C50068E1DEE7d0    ONLINE       0     0     0
            c2t5000C50068E34F0Fd0    ONLINE       0     0     0
          raidz1-1                   ONLINE       0     0     0
            c2t5000C50068DE6957d0    ONLINE       0     0     0
            c2t5000C50068E16983d0    ONLINE       0     0     0
            c2t5000C50068E18C73d0    ONLINE       0     0     0
            c2t5000C50068E1AF2Fd0    ONLINE       0     0     0
            c2t5000C50068E1DB6Fd0    ONLINE       0     0     0
            c2t5000C50068E1F8A7d0    ONLINE       0     0     0
            c2t5000C50068E208EBd0    ONLINE       0     0     0
            c2t5000C50068E266ABd0    ONLINE       0     0     0
          raidz1-2                   ONLINE       0     0   528
            c2t5000C500688BAB87d0    ONLINE       0     0     0
            c2t5000C500688BB0CBd0    ONLINE       0     0     0
            spare-2                  ONLINE       0     0     0
              c2t5000C50068950F3Bd0  ONLINE       0     0     0
              c2t5000C500688B97BBd0  ONLINE       0     0     0  (repairing)
            c2t5000C500689AB39Bd0    ONLINE       0     0     0
            c2t5000C500689AB4A7d0    ONLINE       0     0     0
            c2t5000C50068A34BA3d0    ONLINE       0     0     0
            c2t5000C50068DEA203d0    ONLINE       0     0     0
            spare-7                  ONLINE       0     0     0
              c2t5000C5006B5C78B3d0  ONLINE       0     0     0  (repairing)
              c2t5000C50068E1EEE7d0  ONLINE       0     0     0  (repairing)
          raidz1-3                   ONLINE       0     0     0
            c2t5000C50068883F83d0    ONLINE       0     0     0
            spare-1                  ONLINE       0     0     0
              c2t5000C500688AEAB7d0  ONLINE       0     0     0
              c2t5000C50068E150ABd0  ONLINE       0     0     0  (repairing)
            spare-2                  ONLINE       0     0     0
              c2t5000C50068950F93d0  ONLINE       0     0     0  (repairing)
              c2t5000C500688B91B3d0  ONLINE       0     0     0  (repairing)
            c2t5000C50068951093d0    ONLINE       0     0     0
            c2t5000C5006898C36Fd0    ONLINE       0     0     0
            c2t5000C50068A181ABd0    ONLINE       0     0     0
            c2t5000C50068EFB347d0    ONLINE       0     0     0
            c2t5000C50068EFBC17d0    ONLINE       0     0     0
        logs
          mirror-4                   ONLINE       0     0     0
            c2t5000A7203008D0A4d0    ONLINE       0     0     0
            c2t5000A7203008D0A5d0    ONLINE       0     0     0
        cache
          c2t5000A7203008D09Bd0      ONLINE       0     0     0
        spares
          c2t5000C500688B91B3d0      INUSE     currently in use
          c2t5000C500688B97BBd0      INUSE     currently in use
          c2t5000C50068E150ABd0      INUSE     currently in use
          c2t5000C50068E1EEE7d0      INUSE     currently in use

errors: Permanent errors have been detected in the following files:

POOLSAS/Cloud-desk-3TB-***@snap0409:<0x1>
POOLSAS/Cloud-desk-3TB-***@snap_poollsas:<0x1>

I'd like to know whether it is possible to recover from these errors.
Should I try more scrubs?

Jim Klimov via illumos-zfs

2014-04-28 15:53:56 UTC

Post by Bob
Hello All,
After the import and tried to use scrub for the POOL, It seems that
there's still data errors(checksum error reported when try using
send/recv to backup the data),
I'd like to know if it's possible to recover these errors? should I try
more scrubs?

Well, it certainly should not hurt to try scrubbing, but it may still be the case that the file is broken on storage, i.e. that more than the number of redundant disks for that block contain invalid data, so that no recombination of the available pieces results in a match to the block checksum stored elsewhere. Thanks to ZFS, this data corruption is no longer silent.
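
To illustrate the "no recombination matches the checksum" point, here is a toy single-parity sketch. It is not the raidz code: the toy_* names, the three 4-byte data columns, the XOR parity and the FNV-1a checksum are all assumptions chosen to keep the example short (ZFS uses its own parity math and fletcher or SHA-256 checksums).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	TOY_NDATA	3	/* data columns per stripe (toy value) */
#define	TOY_COLSZ	4	/* bytes per column (toy value) */

/* Toy stripe: TOY_NDATA data columns plus one XOR parity column. */
typedef struct toy_stripe {
	uint8_t ts_data[TOY_NDATA][TOY_COLSZ];
	uint8_t ts_parity[TOY_COLSZ];
} toy_stripe_t;

/* Toy block checksum: FNV-1a over the data columns only. */
static uint32_t
toy_cksum(const toy_stripe_t *s)
{
	uint32_t h = 2166136261u;

	for (int c = 0; c < TOY_NDATA; c++)
		for (int b = 0; b < TOY_COLSZ; b++)
			h = (h ^ s->ts_data[c][b]) * 16777619u;
	return (h);
}

/* Compute the parity column as the XOR of all data columns. */
static void
toy_gen_parity(toy_stripe_t *s)
{
	memset(s->ts_parity, 0, TOY_COLSZ);
	for (int c = 0; c < TOY_NDATA; c++)
		for (int b = 0; b < TOY_COLSZ; b++)
			s->ts_parity[b] ^= s->ts_data[c][b];
}

/*
 * Try to produce data matching `want`. First take the data as read; then,
 * for each data column in turn, assume it is the bad one and rebuild it
 * from parity and the other columns. Returns 0 and repairs *s on success,
 * -1 if no combination matches (more damage than one parity can absorb).
 */
static int
toy_reconstruct(toy_stripe_t *s, uint32_t want)
{
	assert(s != NULL);
	if (toy_cksum(s) == want)
		return (0);
	for (int bad = 0; bad < TOY_NDATA; bad++) {
		toy_stripe_t t = *s;

		memcpy(t.ts_data[bad], t.ts_parity, TOY_COLSZ);
		for (int c = 0; c < TOY_NDATA; c++) {
			if (c == bad)
				continue;
			for (int b = 0; b < TOY_COLSZ; b++)
				t.ts_data[bad][b] ^= t.ts_data[c][b];
		}
		if (toy_cksum(&t) == want) {
			*s = t;
			return (0);
		}
	}
	return (-1);
}
```

With one damaged column, one of the candidates matches the stored checksum and the stripe is repaired. With two damaged columns every candidate fails, which is the situation a scrub reports as a permanent error; repeating the scrub does not change the outcome unless the disks start returning different data.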

I've had such events on my home NAS, possibly due to desktop-grade hardware, where all 6 disks occasionally happened to spew garbage at the same offsets, ruining parts of the same block. One can argue about what exactly had malfunctioned (old CPU, non-ECC RAM, power source spikes, controller... maybe even the disks, which could theoretically misinterpret some power surges as commands), but I never saw such behavior on real servers.

What you can do, even while scrubs are underway, is query the disks with zdb to find the chain of block pointers to the file data involved, and from that calculate the disk offsets and look up the pieces of blocks on specific disk sectors. 'zdb -R' might also help, but I think it was less than satisfactory for me back then; I can't really say why now, or whether it has been improved since... perhaps it tried to reconstruct data while I was interested in raw sector contents? Anyhow, if there are any suspicious irregularities (or rather regularities: in my case 6 sectors at the same offsets of different disks were filled with the same pattern of bytes, 2 or 4 bytes long, over and over...), you'd see that something is wrong indeed ;)
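
A minimal sketch of how such a regularity could be checked once a raw sector has been read into memory (for instance with dd). The function name toy_pattern_period and the limit of 8 bytes in the usage note are hypothetical; nothing here is part of zdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the shortest period (in bytes, 1..max_period) with which buf
 * repeats over its whole length, or 0 if there is none. A sector filled
 * with the same 2- or 4-byte pattern returns 2 or 4.
 */
static size_t
toy_pattern_period(const uint8_t *buf, size_t len, size_t max_period)
{
	assert(buf != NULL);
	for (size_t p = 1; p <= max_period && p < len; p++) {
		size_t i;

		/* The buffer has period p if every byte equals the one p back. */
		for (i = p; i < len; i++)
			if (buf[i] != buf[i - p])
				break;
		if (i == len)
			return (p);
	}
	return (0);
}
```

Calling toy_pattern_period(sector, 512, 8) on each suspect sector and comparing the results across disks would show whether the same short pattern turns up at the same offsets. A result of 1 also covers an all-zero sector, which can be perfectly legitimate, so the period is only a hint and not proof of corruption.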

To find the offset in the broken file you can use dd with the dataset's block size and conv=noerror; it should give you the location(s) of the I/O error(s) in the file, easily translatable to the line number in the zdb block-walk for the file ;)
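
The same scan can be sketched in C for cases where the offsets are wanted programmatically. This is an illustrative equivalent of the dd approach, not a tool from the thread: toy_scan_file is a hypothetical name, and it assumes that a block which cannot be read makes pread() fail (typically with EIO) while the following blocks remain readable.

```c
#define	_POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the file at `path` in bs-byte blocks and keep going past read
 * errors, as dd conv=noerror does. Stores up to `max` failing byte offsets
 * in `bad` and returns the total number of failing blocks, or -1 if the
 * file cannot be opened or examined.
 */
static int
toy_scan_file(const char *path, size_t bs, off_t *bad, int max)
{
	struct stat st;
	char *buf;
	int fd, nbad = 0;

	assert(path != NULL && bs > 0);
	if ((fd = open(path, O_RDONLY)) < 0)
		return (-1);
	if (fstat(fd, &st) != 0 || (buf = malloc(bs)) == NULL) {
		(void) close(fd);
		return (-1);
	}
	for (off_t off = 0; off < st.st_size; off += (off_t)bs) {
		if (pread(fd, buf, bs, off) < 0) {
			/* Remember where the error was, then move on. */
			if (nbad < max)
				bad[nbad] = off;
			nbad++;
		}
	}
	free(buf);
	(void) close(fd);
	return (nbad);
}
```

Dividing each reported offset by the block size gives the index of the damaged block, which is the number to look for in the zdb block listing for that object.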

Finally, your zpool report says that this file has broken data in snapshots. Apparently, the live data for that block has been replaced since... so do you really care to restore that piece of history, or is it just a matter of principle and learning?

HTH, Jim
