Discussion:
zio_arena fragmentation under certain loads
Boris Protopopov
2013-10-04 18:30:18 UTC
Hi, guys,

I've been looking at one particular case of vmem fragmentation where we run
out of free vmem segments of 128K and up in zio_arena due to fragmentation,
which makes it impossible for zio caches to get any more slabs. The latter
leads to system lockup.
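
The failure mode described above can be shown with a toy model. This is only an illustrative sketch, not the illumos vmem implementation: the 64K chunk granularity, the alternating allocation pattern, and the names `free_runs` and `can_import_slab` are all assumptions made for the example. It shows how an arena can have plenty of free space in total and still have no contiguous free segment of 128K or more.

```python
from itertools import groupby

KB = 1024
CHUNK = 64 * KB   # granularity of this toy arena (assumption, not a vmem quantum)
SLAB = 128 * KB   # smallest contiguous segment size that the post says runs out


def free_runs(in_use):
    """Return the size in bytes of each contiguous run of free chunks."""
    return [sum(1 for _ in run) * CHUNK
            for used, run in groupby(in_use) if not used]


def can_import_slab(in_use, size=SLAB):
    """True if at least one contiguous free run is large enough for a slab."""
    return any(run >= size for run in free_runs(in_use))


# Hypothetical worst case: 1024 chunks, every other chunk still allocated.
arena = [i % 2 == 0 for i in range(1024)]
total_free = sum(free_runs(arena))

print(total_free // KB, "KB free in total")
print("128K slab available:", can_import_slab(arena))
```

Half of this toy arena is free, yet every free run is a single 64K chunk, so a request for a 128K segment cannot be satisfied.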

I am working on code that trails the latest Illumos to some degree (a paying
customer needs support), but I have checked that the 1618 fix is in. I also
tried adjusting how early we start reaping ARC caches, bringing the
threshold up to 1/8th of the zio arena from 1/16th (as in 1618).
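
The effect of moving the threshold from 1/16th to 1/8th can be sketched as a simple comparison. This is a hedged illustration that follows the wording of the post (free space compared against a fraction of the arena); the function name `reap_needed` and the sizes are hypothetical, and the actual check in the ARC code may be expressed differently.

```python
GB = 1 << 30


def reap_needed(free_bytes, arena_bytes, shift):
    """True when free space falls below arena_bytes / 2**shift.

    shift=4 models the 1/16th threshold, shift=3 the 1/8th threshold.
    """
    return free_bytes < (arena_bytes >> shift)


# Hypothetical numbers: a 16 GB arena with 1.5 GB still free.
arena_size = 16 * GB
free = 3 * GB // 2

at_sixteenth = reap_needed(free, arena_size, 4)  # threshold is 1 GB
at_eighth = reap_needed(free, arena_size, 3)     # threshold is 2 GB
print(at_sixteenth, at_eighth)
```

With the larger fraction, reaping starts while more of the arena is still free, which is the intent of the experiment described above.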

I also tried the latest Illumos and found that it did not fragment vmem to
the fatal degree. Yet I did not find relevant fixes in the commit list
past the 1618 fix.

Perhaps I am missing it, so if anyone could point to something relevant, I
would appreciate it.
--
Best regards,

Boris Protopopov
Nexenta Systems

455 El Camino Real, Santa Clara, CA 95050

[d] 408.791.3366 | [c] 978.621.6901
Skype: bprotopopov



-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/23047029-187a0c8d
Modify Your Subscription: https://www.listbox.com/member/?member_id=23047029&id_secret=23047029-2e85923f
Powered by Listbox: http://www.listbox.com
George Wilson
2013-10-04 18:47:26 UTC
Boris,

There were fixes that went into OpenSolaris prior to the illumos fix for
1618. Do you happen to know if the code base that the customer is
running is based off of Nevada build 147?

Thanks,
George
Boris Protopopov
2013-10-04 19:10:25 UTC
Hi, George,
Thanks for the quick reply. Yes, that code is based on onnv 134, with lots
of backports, etc. I will look more closely in that range. If there is
anything you recall :) please let me know.
Boris.