Discussion:
vmware store nfs target high CPU load
Koping Wang
2013-09-27 15:51:13 UTC
Permalink
Hi,
I have a ZFS server running Nexenta 3.1.3.5 on a Dell R720 with MD1200 JBODs. This is a pretty large system: 180 drives, plus SSDs for L2ARC and ZIL. The R720 has two 8-core CPUs. I recently created several volumes as VMware NFS targets, following Nexenta's best practice guide of a 16K record size. So far we have around 90 VMs running on it. The load average on the ZFS server is high: peak hours can hit 20, and after business hours it is still 6 or 7 (just using uptime). During a Storage vMotion the load can go over 20. vmstat shows "sy" at 50-75% during peak hours and around 25% after hours. I have never seen a load this high. My question is: can a 16K record size cause high load?

Thanks
Koping Wang




-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/23047029-187a0c8d
Modify Your Subscription: https://www.listbox.com/member/?member_id=23047029&id_secret=23047029-2e85923f
Powered by Listbox: http://www.listbox.com
Robert Mustacchi
2013-09-27 15:53:33 UTC
Permalink
Post by Koping Wang
Hi,
I have a zfs server running nexenta 3.1.3.5. [...] My question is "Can 16K record size cause High load?
The best thing to do here is to see where your CPU usage is going. The
simplest way to do that is to put together one of Brendan Gregg's flame
graphs:

https://github.com/brendangregg/flamegraph

We can observe these things rather easily, so let's not guess.
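For reference, the kernel-profiling recipe from the FlameGraph repository looks roughly like this (run as root on the illumos box; the output filenames are arbitrary):

```shell
# Sample kernel stacks on all CPUs at 997 Hz for 60 seconds.
# The /arg0/ predicate keeps only samples taken in kernel context.
dtrace -x stackframes=100 \
  -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-60s { exit(0); }' \
  -o out.kern_stacks

# Fold the stacks and render the SVG (scripts from the flamegraph repo).
./stackcollapse.pl out.kern_stacks > out.kern_folded
./flamegraph.pl out.kern_folded > kernel.svg
```

Open kernel.svg in a browser; the widest towers are where the CPU time is going.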

Robert
Ray Van Dolson
2013-10-18 06:41:35 UTC
Permalink
Post by Robert Mustacchi
Post by Koping Wang
Hi,
I have a zfs server running nexenta 3.1.3.5. [...] My question
is "Can 16K record size cause High load?
The best thing to do here is to see where your CPU usage is going. The
simplest way to do that is to put together one of Brendan Gregg's flame
https://github.com/brendangregg/flamegraph
We can observe these things rather easily, so let's not guess.
Robert
FYI -- here is FlameGraph output for our current workload:

https://esri.box.com/shared/static/lhh4afjnc4duyuacebbj.svg

1-minute load currently at 17.

Ray
Matthew Ahrens
2013-10-18 15:57:56 UTC
Permalink
Post by Ray Van Dolson
Post by Robert Mustacchi
Post by Koping Wang
Hi,
I have a zfs server running nexenta 3.1.3.5. [...] My question
is "Can 16K record size cause High load?
The best thing to do here is to see where your CPU usage is going. The
simplest way to do that is to put together one of Brendan Gregg's flame
https://github.com/brendangregg/flamegraph
We can observe these things rather easily, so let's not guess.
Robert
https://esri.box.com/shared/static/lhh4afjnc4duyuacebbj.svg
1-minute load currently at 17.
Looks like the CPU load is mainly caused by NFS read and write requests.

--matt



Ray Van Dolson
2013-10-18 16:10:14 UTC
Permalink
Post by Matthew Ahrens
Post by Ray Van Dolson
[...]
https://esri.box.com/shared/static/lhh4afjnc4duyuacebbj.svg
1-minute load currently at 17.
Looks like the CPU load is mainly caused by NFS read and write requests.
--matt
Interesting. It seems a bit odd that the NFS load on this system would
generate such a high system load number. Maybe this isn't even truly
indicative of a problem, but I'm used to the numbers being much lower.

It's a pretty beefy system -- 16 processor cores, 144GB of memory and
STEC ZIL/L2ARC drives. 12 vdevs of 15 7.2K RPM disks each... at the
time of this load we were seeing maybe 120MB/sec of NFS traffic and
Richard's NFS top utility didn't show any particularly high levels of
write latency going on (though some read responses were nearing 25ms).

Thanks,
Ray
Sam Zaydel
2013-10-18 17:09:48 UTC
Permalink
I do not recall dedup being mentioned in this discussion, so I just wanted
to quickly raise it as an item. Also, some compression algorithms with
certain data may be adding to load. Are you using compression and dedup at
all?
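A quick way to answer that on the box itself, using the standard ZFS CLI ("tank" is a placeholder pool name, not from this thread):

```shell
# Show compression and dedup settings for the pool and every dataset in it.
zfs get -r compression,dedup tank

# If dedup has ever been enabled, -D also prints dedup table (DDT) statistics.
zpool status -D tank
```

With 144GB of RAM a large DDT can still spill out of ARC, so if dedup=on shows up anywhere it is worth investigating.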
Post by Ray Van Dolson
[...]
Interesting. It seems a bit odd that the NFS load on this system would
generate such a high system load number. Maybe this isn't even truly
indicative of a problem, but I'm used to the numbers being much lower.
It's a pretty beefy system -- 16 processor cores, 144GB of memory and
STEC ZIL/L2ARC drives. 12 vdevs of 15 7.2K RPM disks each... at the
time of this load we were seeing maybe 120MB/sec of NFS traffic and
Richard's NFS top utility didn't show any particularly high levels of
write latency going on (though some read responses were nearing 25ms).
Thanks,
Ray
--
Join the geek side, we have π!

Please feel free to connect with me on LinkedIn.
http://www.linkedin.com/in/samzaydel



Boris Protopopov
2013-09-27 16:43:31 UTC
Permalink
Hey Koping,
Did you try getting in touch with Nexenta support?
Best regards, Boris

Typos courtesy of my iPhone
Hi,
I have a zfs server running nexenta 3.1.3.5. [...] My question is "Can 16K record size cause High load?
Thanks
Koping Wang
Richard Elling
2013-09-28 15:49:55 UTC
Permalink
Hi,
[...] I found the load average on the zfs server is high: peak hours can hit 20, and after business hours it is still 6 or 7. (Just use uptime).
Uptime is a poor metric to judge performance, especially NFS. It shows the rolling average
of the number of threads on the run queue. If that number is less than the number of CPUs,
as measured by psrinfo, then there are idle CPUs. For NFS, the requests tend to be short-lived.
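That comparison is easy to script. A rough sketch for a Bourne-compatible shell (psrinfo is the illumos command; getconf is only a portable fallback for testing elsewhere):

```shell
# Count online CPUs: psrinfo prints one line per virtual processor on illumos.
ncpu=$(psrinfo 2>/dev/null | wc -l)
[ "$ncpu" -eq 0 ] && ncpu=$(getconf _NPROCESSORS_ONLN)

# Pull the 1-minute load average out of uptime(1).
load1=$(uptime | awk -F'load average[s]*: *' '{print $2}' | cut -d, -f1)

echo "1-min load: $load1  CPUs: $ncpu"
# load1 below ncpu means there are idle CPUs; a load that sits well
# above ncpu for long stretches means threads are queueing for CPU.
```

On this box (16 cores) a load of 6 or 7 still leaves most CPUs idle; a sustained 20 means roughly 4 threads are waiting at any instant.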
If there is a Storage vMotion, the load can go over 20. vmstat shows "sy" at 50-75% in peak hours and 25% after hours. I have never seen a load this high. My question is: "Can a 16K record size cause high load?"
Yes. But many other things can also cause high loads.

The number of NFS daemon (nfsd) threads varies based on demand. Each client can have a
number of threads; for ESXi this is a tunable (NFS.MaxConnPerIP). Thus the worst-case burst
load on the NFS server is:
theoretical peak client threads = sum(NFS.MaxConnPerIP) for all ESXi servers

By default, this will peak at 1024 on the server side. NexentaStor used to allow higher
numbers, though it is not clear that is achievable on Nexenta's version of the OS. Don't worry
about this unless you see the vmstat run queue depth (or its uptime rolling average) pegging at 1024.
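As a toy illustration of that arithmetic (the host count and per-host setting below are made-up numbers, not recommendations):

```shell
# Hypothetical fleet: 8 ESXi hosts, NFS.MaxConnPerIP assumed set to 4 on each.
hosts=8
max_conn_per_ip=4
peak_client_threads=$((hosts * max_conn_per_ip))

# The server will not run more nfsd threads than its own cap (1024 by default),
# so the effective worst-case burst is the smaller of the two.
server_cap=1024
burst=$peak_client_threads
[ "$burst" -gt "$server_cap" ] && burst=$server_cap

echo "worst-case burst: $burst concurrent NFS requests"
```

With realistic fleet sizes the client-side sum, not the 1024 server cap, is almost always the binding limit.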

For ESXi workloads, I have seen that ZFS prefetch can be counter-productive. This can lead to
higher CPU and I/O activity for little gain. The test here is to disable prefetch and see if the
performance improves. Do not use the %sys time metric to determine this, because if there is
any idle CPU time, then you are not CPU bound for this workload. What you need is...

Shameless plug: get nfssvrtop from http://github.com/richardelling/tools. This will show you the
performance and latency of the NFS service. This is invaluable because you can make changes
on the server side and see their effect on the NFS response times to the clients.

Finally, NFS, TCP/IP, and ZFS are all running in the kernel, so their CPU usage is accounted
against system time.
-- richard

--

***@RichardElling.com
+1-760-896-4422