Discussion: zfs has stopped writing to pool disks
Danw via illumos-zfs
2014-05-29 07:46:16 UTC
Hi All,

I've encountered a weird problem with my OI install.

- Writes to the raidz2 disks have stopped working, as if I/O is blocking
- Writes to the ZIL work until memory fills up and can no longer be
  flushed to disk, at which point all writes block
- Reads always work
- iostat -E shows zero errors
- Attempts to change ZFS settings on this pool block
- Writing data to a spare via "dd if=/dev/zero of=/dev/dsk/c6t18d0s0"
  works fine
- Export/import doesn't solve anything

Environment:
IBM x3650
2x dual-core Xeon
12GB RAM
1x LSI 3801E (1068E-based) SAS HBA
2x Rackable SE3016 16-disk JBOD
12x 2TB SATA WD Red NAS disks
2x 120GB SATA Intel S3500 SSDs for log/cache (11GB/100GB split)

I did an update in the last month, and once I can get the system to boot
without the pool I'll try again.

Any ideas why this is happening?

Thanks!
Dan
George Wilson via illumos-zfs
2014-05-29 18:14:32 UTC
Check to see if the pool is suspended:

As root run:

# echo "::walk spa | ::print spa_t spa_name spa_suspended"

If it returns 1 for spa_suspended, the pool encountered an I/O error and
has suspended I/O. You can also look at fmdump to see whether any errors
have been logged to FMA.
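
For example, assuming the standard FMA tooling, something along these
lines will show what (if anything) has been logged:

# fmdump
# fmdump -eV

The first lists any diagnosed faults, the second dumps the raw error
reports (ereports) in detail.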

- George
George Wilson via illumos-zfs
2014-05-30 12:47:46 UTC
It would be useful to see the kernel stacks when the I/O stops by running:

# echo "::stacks -m zfs" | mdb -kw

Can you post that output?

Thanks,
George
Post by Danw via illumos-zfs
Thanks for the reply.

Running

echo "::walk spa | ::print spa_t spa_name spa_suspended" | mdb -k

returned a zero for my pool.

It seems to be related to having compression set to lz4 on the root of
the pool. I ran "zfs inherit compression tank" before I went to work,
and by the time I got home it had worked its way through and I/O ran
fine.

I re-enabled lz4 compression, and after some time writes stopped
working. It appears to work when I enable it on a sub-filesystem
(tank/share), but when it's on the root of the pool (tank) it seems to
stop working correctly.

I'm starting to wonder if it may have something to do with the root
being set mountpoint=none.
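
For what it's worth, a quick way to see where compression is actually
being set vs. inherited (just a sanity check using standard zfs get
options):

zfs get -r -o name,property,value,source compression tank
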
Ideas?
Thanks!
Dan
George Wilson via illumos-zfs
2014-05-30 16:53:50 UTC
It does look like you're in the write code path:

ffffff0017424c40 SLEEP CV 1
swtch+0x145
cv_wait+0x61
zio_wait+0x5d
dbuf_read+0x1e5
dbuf_findbp+0xf8
dbuf_hold_impl+0x81
dbuf_hold+0x2e
dmu_buf_hold+0x8e
zap_get_leaf_byblk+0x5c
zap_deref_leaf+0x78
fzap_update+0x6f
zap_update_uint64+0x8e
ddt_zap_update+0x78
ddt_object_update+0x51
ddt_sync_entry+0x237
ddt_sync_table+0x9f
ddt_sync+0x6b
spa_sync+0x3b4
txg_sync_thread+0x27b
thread_start+8

Is it possible that most of the data you're writing is being
deduplicated, so you're not seeing many writes? You should still see
some metadata updates, though.
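
If it helps, a few quick ways to confirm how much dedup is in play
(standard dataset/pool properties, plus zdb if you want the full DDT
histogram; zdb can take a while on a large table):

# zfs get -r dedup tank
# zpool get dedupratio tank
# zdb -DD tank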

You could also try to run the following several times:

echo "::stacks -c spa_sync" | mdb -k

Also, can you please include 'zpool iostat tank 1' output collected
over this period?
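
For example, something like this captures a minute of each to a file
(the output paths are just placeholders):

# zpool iostat -v tank 1 60 > /var/tmp/zpool-iostat.txt
# iostat -xdne 1 60 > /var/tmp/iostat.txt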

Thanks,
George
Post by Danw via illumos-zfs
It took approx. 30-60 minutes after enabling lz4 on the root of the
zpool for writes to stop, and I noticed because an
"rsync bigfile.gz /path/ -aP" got 700GB in and just stopped. No error
and no progress.
# echo "::stacks -m zfs" | mdb -kw
THREAD STATE SOBJ COUNT
ffffff00171eac40 SLEEP CV 2
swtch+0x145
cv_wait+0x61
spa_thread+0x225
thread_start+8

ffffff0017796c40 SLEEP CV 2
swtch+0x145
cv_wait+0x61
txg_thread_wait+0x5f
txg_quiesce_thread+0x94
thread_start+8

ffffff001713dc40 SLEEP CV 1
swtch+0x145
cv_timedwait_hires+0xe0
cv_timedwait+0x5a
arc_reclaim_thread+0x13d
thread_start+8

ffffff0017143c40 SLEEP CV 1
swtch+0x145
cv_timedwait_hires+0xe0
cv_timedwait+0x5a
l2arc_feed_thread+0xa6
thread_start+8

ffffff001779fc40 SLEEP CV 1
swtch+0x145
cv_timedwait_hires+0xe0
cv_timedwait+0x5a
txg_thread_wait+0x7c
txg_sync_thread+0x12c
thread_start+8

ffffff0017424c40 SLEEP CV 1
swtch+0x145
cv_wait+0x61
zio_wait+0x5d
dbuf_read+0x1e5
dbuf_findbp+0xf8
dbuf_hold_impl+0x81
dbuf_hold+0x2e
dmu_buf_hold+0x8e
zap_get_leaf_byblk+0x5c
zap_deref_leaf+0x78
fzap_update+0x6f
zap_update_uint64+0x8e
ddt_zap_update+0x78
ddt_object_update+0x51
ddt_sync_entry+0x237
ddt_sync_table+0x9f
ddt_sync+0x6b
spa_sync+0x3b4
txg_sync_thread+0x27b
thread_start+8

ffffff0566af1820 ONPROC <NONE> 1
zio_destroy+0x53
zio_wait+0x7e
dmu_tx_count_write+0x25a
dmu_tx_hold_sa+0x46
zfs_write+0x55c
fop_write+0x6b
write+0x2e2
write32+0x22
_sys_sysenter_post_swapgs+0x149

#
Now I'm going to switch off lz4 and see how long the command takes to
complete.
Danw via illumos-zfs
2014-05-31 00:02:38 UTC
Post by George Wilson via illumos-zfs
echo "::stacks -c spa_sync" | mdb -k
Also, can you please include 'zpool iostat tank 1' output collected
over this period?
I've collected the zpool iostat, iostat, and mdb output. Below is the
layout of the zpool, showing that c6t26 and c6t30 are the SSDs used for
log/cache and that the regular disks are getting zero writes.

Thanks!
Dan

iostat -xdne 1 output
http://www.krakadikt.com/wp-content/uploads/2014/05/problem.iostat.txt

echo "::stacks -c spa_sync" | mdb -k output
http://www.krakadikt.com/wp-content/uploads/2014/05/problem.mdb.txt

zpool iostat -T d -v tank 1 output
http://www.krakadikt.com/wp-content/uploads/2014/05/problem.zpooliostat.txt


***@Gangsta:~# zpool status tank
  pool: tank
 state: ONLINE
  scan: scrub repaired 103K in 106h36m with 0 errors on Fri May 23
        22:36:19 2014
config:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          raidz2-0   ONLINE       0     0     0
            c6t16d0  ONLINE       0     0     0
            c6t20d0  ONLINE       0     0     0
            c6t24d0  ONLINE       0     0     0
            c6t28d0  ONLINE       0     0     0
            c6t55d0  ONLINE       0     0     0
            c6t53d0  ONLINE       0     0     0
            c6t49d0  ONLINE       0     0     0
            c6t51d0  ONLINE       0     0     0
            c6t54d0  ONLINE       0     0     0
            c6t56d0  ONLINE       0     0     0
            c6t52d0  ONLINE       0     0     0
            c6t50d0  ONLINE       0     0     0
        logs
          c6t26d0s0  ONLINE       0     0     0
          c6t30d0s0  ONLINE       0     0     0
        cache
          c6t26d0s1  ONLINE       0     0     0
          c6t30d0s1  ONLINE       0     0     0
        spares
          c6t18d0    AVAIL
          c6t22d0    AVAIL

errors: No known data errors
***@Gangsta:~#
George Wilson via illumos-zfs
2014-05-31 03:11:37 UTC
The data shows that you're spending all your time reading the dedup
table, and since that's mostly random reads against a single raidz2
vdev, it's probably not very fast. Can you dump out the dedup table
summary using 'zpool status -D tank'?

Another test worth trying is to disable dedup and then see what
happens with the writes.
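
If you go that route it's just:

# zfs set dedup=off tank

Keep in mind that only affects newly written blocks; the existing DDT
entries still get updated as old blocks are freed or overwritten.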

- George