deadlock in spa_config_enter()
Rafael Vanoni via illumos-zfs
2014-09-29 18:28:10 UTC
Hi folks,

I have a system with a mirrored rpool on two SSDs that started hanging
when I run 'zpool status'. This box has a fairly old version of illumos,
but I haven't found any related bug online.

From what I could gather, the zpool(1) thread is trying to acquire a
SPA config lock and is stuck in cv_wait(), as is a syslogd thread:

ffffff03e6584be0::findstack -v
stack pointer for thread ffffff03e6584be0: ffffff00185c1a70
[ ffffff00185c1a70 _resume_from_idle+0xf1() ]
  ffffff00185c1aa0 swtch+0x145()
  ffffff00185c1ad0 cv_wait+0x61(ffffff03de2b5404, ffffff03de2b53f0)
  ffffff00185c1b40 spa_config_enter+0x86(ffffff03de2b4b00, 7f, ffffff03de2b4b00, 0)
  ffffff00185c1b80 spa_vdev_state_enter+0x73(ffffff03de2b4b00, 7f)
  ffffff00185c1bd0 spa_vdev_set_common+0x37(ffffff03de2b4b00, ae3b2a900a0041eb, ffffff0479fca400, 1)
  ffffff00185c1c00 spa_vdev_setpath+0x22(ffffff03de2b4b00, ae3b2a900a0041eb, ffffff0479fca400)
  ffffff00185c1c40 zfs_ioc_vdev_setpath+0x48(ffffff0479fca000)
  ffffff00185c1cc0 zfsdev_ioctl+0x177(8500000000, 5a10, 8042610, 100003, ffffff0490dcfde0, ffffff00185c1de4)
  ffffff00185c1d00 cdev_ioctl+0x45(8500000000, 5a10, 8042610, 100003, ffffff0490dcfde0, ffffff00185c1de4)
  ffffff00185c1d40 spec_ioctl+0x5a(ffffff03e595ed00, 5a10, 8042610, 100003, ffffff0490dcfde0, ffffff00185c1de4, 0)
  ffffff00185c1dc0 fop_ioctl+0x7b(ffffff03e595ed00, 5a10, 8042610, 100003, ffffff0490dcfde0, ffffff00185c1de4, 0)
  ffffff00185c1ec0 ioctl+0x18e(3, 5a10, 8042610)
  ffffff00185c1f10 _sys_sysenter_post_swapgs+0x149()

ffffff047d4c8b20::findstack -v
stack pointer for thread ffffff047d4c8b20: ffffff0018c26bf0
[ ffffff0018c26bf0 _resume_from_idle+0xf1() ]
  ffffff0018c26c20 swtch+0x145()
  ffffff0018c26c50 cv_wait+0x61(ffffff03de2b5404, ffffff03de2b53f0)
  ffffff0018c26cc0 spa_config_enter+0xcf(ffffff03de2b4b00, 2, fffffffff7a32d68, 1)
  ffffff0018c26d10 zil_flush_vdevs+0x55(ffffff03e0857100)
  ffffff0018c26d80 zil_commit_writer+0x262(ffffff03e0857100, 1df3, 198a48)
  ffffff0018c26dd0 zil_commit+0x8c(ffffff03e0857100, 1df3, 198a48)
  ffffff0018c26e40 zfs_fsync+0xd3(ffffff04798ea140, 10, ffffff04662ea2f0, 0)
  ffffff0018c26e90 fop_fsync+0x5a(ffffff04798ea140, 10, ffffff04662ea2f0, 0)
  ffffff0018c26ec0 fdsync+0x38(5, 10)
  ffffff0018c26f10 _sys_sysenter_post_swapgs+0x149()

Or as shown in mdb:

ADDR             TYPE NWAITERS   THREAD           PROC
ffffff03de2b5404 cond        2:  ffffff047d4c8b20 syslogd
                                 ffffff03e6584be0 zpool

And here's the SPA config lock, showing that the first thread already
has the lock as a writer:

ffffff03de2b53f0 {
    ffffff03de2b53f0 scl_lock = {
        ffffff03de2b53f0 _opaque = [ 0 ]
    }
    ffffff03de2b53f8 scl_writer = 0xffffff03e6584be0
    ffffff03de2b5400 scl_write_wanted = 0x1
    ffffff03de2b5404 scl_cv = {
        ffffff03de2b5404 _opaque = 0x2
    }
    ffffff03de2b5408 scl_count = {
        ffffff03de2b5408 rc_count = 0x1
    }
}
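
To make the state above easier to reason about, here is a minimal Python
sketch of the wait conditions as I understand them from spa_config_enter():
a reader waits while there is a writer or a pending writer, and a writer
waits while the hold count is non-zero. This is a simplified model and an
assumption on my part, not the actual illumos code; ConfigLock and
would_block are made-up names, and the field values are copied from the
dump.

```python
from dataclasses import dataclass
from typing import Optional

# krw_t values as they appear in the last argument of spa_config_enter()
# in the two stacks above (0 for the zpool thread, 1 for syslogd).
RW_WRITER, RW_READER = 0, 1


@dataclass
class ConfigLock:
    """Hypothetical, simplified stand-in for one SPA config lock."""
    writer: Optional[int]   # scl_writer (thread address, or None)
    write_wanted: int       # scl_write_wanted
    count: int              # scl_count.rc_count


def would_block(lock: ConfigLock, rw: int) -> bool:
    """Assumed wait conditions, simplified from spa_config_enter()."""
    if rw == RW_READER:
        # Readers wait while a writer holds or wants the lock.
        return lock.writer is not None or lock.write_wanted > 0
    # Writers wait until every existing hold has been dropped.
    return lock.count != 0


ZPOOL_THREAD = 0xffffff03e6584be0
SYSLOGD_THREAD = 0xffffff047d4c8b20

# State taken from the structure dump above.
lock = ConfigLock(writer=ZPOOL_THREAD, write_wanted=1, count=1)

zpool_blocks = would_block(lock, RW_WRITER)
syslogd_blocks = would_block(lock, RW_READER)
# The waiting writer is also the recorded owner, so nobody is left to
# drop the hold and wake either thread.
self_deadlock = zpool_blocks and lock.writer == ZPOOL_THREAD

print("zpool blocks:", zpool_blocks)
print("syslogd blocks:", syslogd_blocks)
print("self deadlock:", self_deadlock)
```

If that model is right, the zpool thread is waiting for a hold that only
it can release, and syslogd is stuck behind it as a reader.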

Has anyone seen this before?

Thanks,
Rafael
