Discussion:
zpool export hang
Alex via illumos-zfs
2014-06-11 17:19:01 UTC
I'm seeing zpool export <pool> hang sometimes on OmniOS r151010.

The kernel thread is stuck here:
ffffff0032178ad0 txg_wait_synced+0x83(ffffff052fe74400, 0)
ffffff0032178b50 spa_export_common+0x17e(ffffff05d8c5f000, 1, 0, 1, 0)
ffffff0032178b80 spa_export+0x2a(ffffff05d8c5f000, 0, 1, 0)
ffffff0032178bd0 zfs_ioc_pool_export+0x3e(ffffff05d8c5f000)

The whole stack is here:
http://pastebin.com/CR5Gx4Cm

After rebooting, the pool has to be imported manually, but it is not otherwise
damaged.

Has anyone seen this on r151010?
Any advice on what to check to understand why we get stuck here?

Cheers,
alex



-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/23047029-187a0c8d
Modify Your Subscription: https://www.listbox.com/member/?member_id=23047029&id_secret=23047029-2e85923f
Powered by Listbox: http://www.listbox.com
George Wilson via illumos-zfs
2014-06-11 19:53:44 UTC
Can you provide the output of:

::stacks -c spa_sync

Is it possible that the pool is suspended and is unable to completely
sync out the transaction group? You can check that with:

::walk spa | ::print spa_t spa_name spa_suspended

- George
Alex via illumos-zfs
2014-06-12 17:48:35 UTC
I reproduced it today (the live dump from the previous hang was no good).
The kernel thread has the same stack as before.
Post by George Wilson via illumos-zfs
::stacks -c spa_sync
THREAD STATE SOBJ COUNT
ffffff0036dd7c40 SLEEP CV 1
swtch+0x141
cv_wait+0x70
taskq_wait+0x43
metaslab_group_preload+0x45
metaslab_sync_reassess+0x37
vdev_sync_done+0x74
spa_sync+0x4f3
txg_sync_thread+0x227
thread_start+8

ffffff003832bc40 SLEEP CV 1
swtch+0x141
cv_wait+0x70
zio_wait+0x5b
dsl_pool_sync+0x16c
spa_sync+0x2ff
txg_sync_thread+0x227
thread_start+8
Post by George Wilson via illumos-zfs
::walk spa | ::print spa_t spa_name spa_suspended
This gives spa_suspended = 0 for all pools.

The whole output is here: http://pastebin.com/jZhgAdmS
--
alex
George Wilson via illumos-zfs
2014-06-12 19:02:23 UTC
Can you also get the output of:

::stacks -c metaslab_preload

It seems that the metaslab_group is still in the process of loading new
metaslabs and the taskq is waiting for that to complete.

Thanks,
George
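
The reading of the ::stacks output above (one sync thread asleep in taskq_wait under metaslab_group_preload) can be reproduced mechanically when there are many threads to sift through. Below is a minimal Python sketch, assuming the ::stacks output has been saved as text. The helper names parse_stacks and blocked_in and the WAIT_FRAMES list are hypothetical, chosen for illustration; they are not part of mdb.

```python
import re

# Frames that only say "this thread is asleep". This list is an assumption
# based on the two stacks in this thread; extend it for other dumps.
WAIT_FRAMES = {"swtch", "cv_wait", "taskq_wait", "zio_wait"}

SAMPLE = """\
THREAD STATE SOBJ COUNT
ffffff0036dd7c40 SLEEP CV 1
swtch+0x141
cv_wait+0x70
taskq_wait+0x43
metaslab_group_preload+0x45
metaslab_sync_reassess+0x37
vdev_sync_done+0x74
spa_sync+0x4f3
txg_sync_thread+0x227
thread_start+8

ffffff003832bc40 SLEEP CV 1
swtch+0x141
cv_wait+0x70
zio_wait+0x5b
dsl_pool_sync+0x16c
spa_sync+0x2ff
txg_sync_thread+0x227
thread_start+8
"""


def parse_stacks(text):
    """Split ::stacks output into {thread address: [function names, top first]}."""
    threads = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "THREAD":
            continue  # blank separator or column header
        # A thread summary line starts with a 16-digit hex address plus state columns.
        if len(fields) >= 2 and re.fullmatch(r"[0-9a-f]{16}", fields[0]):
            current = threads.setdefault(fields[0], [])
        elif current is not None:
            # Frame lines look like "function+offset"; keep only the function name.
            current.append(fields[0].split("+")[0])
    return threads


def blocked_in(frames):
    """Return (last sleep frame, first frame below it) for one stack."""
    for i, name in enumerate(frames):
        if name not in WAIT_FRAMES:
            waited_on = frames[i - 1] if i > 0 else None
            return waited_on, name
    return (frames[-1] if frames else None), None


for thread, frames in parse_stacks(SAMPLE).items():
    waited_on, caller = blocked_in(frames)
    print(f"{thread}: {waited_on} called from {caller}")
# ffffff0036dd7c40: taskq_wait called from metaslab_group_preload
# ffffff003832bc40: zio_wait called from dsl_pool_sync
```

This only summarizes where each thread sleeps; it says nothing about why the taskq work has not finished, which is what the follow-up ::stacks -c metaslab_preload is meant to show.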
Alex via illumos-zfs
2014-06-17 10:07:50 UTC
While trying to reproduce the problem, I ran into a different situation:
"zpool export" still hangs, but the pool is shown in state 'EXPORTED' in ::spa.

The stack is also different:
0t5679::pid2proc | ::walk thread | ::findstack -v
stack pointer for thread ffffff05216033e0: ffffff0020d0b8f0
[ ffffff0020d0b8f0 _resume_from_idle+0xf4() ]
ffffff0020d0b920 swtch+0x141()
ffffff0020d0b960 cv_wait+0x70(ffffff493c63126a, ffffff493c631258)
ffffff0020d0b9a0 taskq_wait+0x43(ffffff493c631238)
ffffff0020d0b9d0 metaslab_group_passivate+0x41(ffffff493c767c90)
ffffff0020d0ba10 vdev_metaslab_fini+0x38(ffffff050925d580)
ffffff0020d0ba50 vdev_free+0x69(ffffff050925d580)
ffffff0020d0ba90 vdev_free+0x4b(ffffff0513248ac0)
ffffff0020d0bad0 spa_unload+0x7c(ffffff0510d06000)
ffffff0020d0bb50 spa_export_common+0x115(ffffff49480f1000, 1, 0, 1, 0)
ffffff0020d0bb80 spa_export+0x2a(ffffff49480f1000, 0, 1, 0)
ffffff0020d0bbd0 zfs_ioc_pool_export+0x3e(ffffff49480f1000)
(..)

(This is still on r151010; I haven't tried 'bloody' yet.)
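
To pull the addresses out of a ::findstack -v listing like the one above for further inspection, a small parser helps. This is a rough Python sketch under the assumption that the listing was saved as text. The names parse_findstack and frame_and_caller are hypothetical helpers, not mdb features.

```python
import re

# Matches "ffffff0020d0b920 swtch+0x141()" as well as the bracketed
# "[ ffffff0020d0b8f0 _resume_from_idle+0xf4() ]" form.
FRAME_RE = re.compile(
    r"^\[?\s*[0-9a-f]{16}\s+(\w+)\+(?:0x)?[0-9a-f]+\(([^)]*)\)"
)

FINDSTACK = """\
stack pointer for thread ffffff05216033e0: ffffff0020d0b8f0
[ ffffff0020d0b8f0 _resume_from_idle+0xf4() ]
ffffff0020d0b920 swtch+0x141()
ffffff0020d0b960 cv_wait+0x70(ffffff493c63126a, ffffff493c631258)
ffffff0020d0b9a0 taskq_wait+0x43(ffffff493c631238)
ffffff0020d0b9d0 metaslab_group_passivate+0x41(ffffff493c767c90)
ffffff0020d0ba10 vdev_metaslab_fini+0x38(ffffff050925d580)
ffffff0020d0ba50 vdev_free+0x69(ffffff050925d580)
ffffff0020d0ba90 vdev_free+0x4b(ffffff0513248ac0)
ffffff0020d0bad0 spa_unload+0x7c(ffffff0510d06000)
ffffff0020d0bb50 spa_export_common+0x115(ffffff49480f1000, 1, 0, 1, 0)
ffffff0020d0bb80 spa_export+0x2a(ffffff49480f1000, 0, 1, 0)
ffffff0020d0bbd0 zfs_ioc_pool_export+0x3e(ffffff49480f1000)
"""


def parse_findstack(text):
    """Return [(function, [args])] in stack order, top frame first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            args = [a.strip() for a in match.group(2).split(",") if a.strip()]
            frames.append((match.group(1), args))
    return frames


def frame_and_caller(frames, name):
    """Return the first frame named `name` and the frame directly below it."""
    for i, (func, _args) in enumerate(frames):
        if func == name:
            caller = frames[i + 1] if i + 1 < len(frames) else None
            return frames[i], caller
    return None, None


frames = parse_findstack(FINDSTACK)
waiter, caller = frame_and_caller(frames, "taskq_wait")
print(f"{waiter[0]}({', '.join(waiter[1])}) "
      f"called from {caller[0]}({', '.join(caller[1])})")
# taskq_wait(ffffff493c631238) called from metaslab_group_passivate(ffffff493c767c90)
```

Both hangs in this thread sleep in taskq_wait under a metaslab_group_* function (metaslab_group_preload earlier, metaslab_group_passivate here). Whether they share a root cause is not established by the output posted so far.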
--
alex


