Inline.
- Andrew
On Sun, Nov 17, 2013 at 10:56 AM, Richard Elling wrote:
Post by aurfalien
Hi,
I thought to combine 2 subjects as at first glance I sort of find them odd.
RAM and ARC;
I've read a bit on how 128GB on avg seems to be the sweet spot for ARC and
not to go over that.
This is poor advice. Where did you read it? The authors need to be enlightened.
Actually, at the moment, I'd stand by that advice. There are a number of
problems identified on 'large memory systems' (> 128-192 GB or so) that
have culminated in Nexenta forcing the ARC max to 128 GB on many builds
today. Identification and resolution of the bugs is ongoing. AFAIK, at
least some of the identified issues will affect any illumos-based OS, not
just Nexenta [which is a bit older], but I can't speak for Linux or FreeBSD
ZFS. The common wisdom in the field is 128-192 GB, and that limiting the
ARC to that range is enough. We have systems in production with 256, 512,
and more that are fine with the ARC limited to 128-192 GB, and the hope is
that once the 'bugs' are resolved, the artificial limit can be removed. We
also have systems in production with 512+ that are fine /without/ any
limit -- the issues have to do with more than /just/ the amount of RAM in
the system. In fact, AFAIK, one could argue they have nothing to do with
the amount of RAM in the system at all, but with other things; however,
the amount of RAM in the system makes the symptoms of the 'bugs' go from
manageable to unmanageable, and it can be hard or impossible to tell
beforehand whether you will hit the problems, thus the field rule of thumb
of limiting the ARC to 128 GB.
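For anyone who wants to apply that rule of thumb in practice, a minimal
sketch of capping the ARC on an illumos-based box would look something
like the following (the value is just 128 GiB expressed in bytes; check
your distribution's documentation before relying on it):

    * /etc/system -- cap the ARC at 128 GiB (0x2000000000 bytes); takes effect after a reboot
    set zfs:zfs_arc_max = 0x2000000000

    # after the reboot, confirm the new ceiling via the ARC kstats
    kstat -p zfs:0:arcstats:c_max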
Post by aurfalien
I've also read the more RAM for ARC the better.
If one does go over 128GB, slowdowns could occur. Nothing definitive in
my readings other than that.
Is there any merit to this?
Not really. There are behaviours that vary across OSes in how they deal with
large memory. ZFS uses the kernel memory allocator, so if the kernel can
scale well, then ZFS can scale well. If the kernel doesn't scale well...
ZFS can't do it better than the kernel.
True and untrue. ZFS IS beholden to the kernel when it comes to memory
stuff, to some degree. However, saying it is all about the kernel would
assume there are zero inefficiencies in ZFS and how it handles its own
in-memory structures, and that all perceived latency or issues regarding
memory (ARC) are a result of kernel code. That's just patently false.
Random example: ZFS decides it needs to 'free' 100 GB of RAM from ARC this
very moment, on a box where the average block size on the pool is 4-8 KB.
AFAIK, the act of 'freeing' that RAM isn't really so much a kernel task (it
doesn't just go to the kernel and say, hey dude, this huge block of RAM, I
don't need it anymore) as it is ZFS walking through its own in-memory
structures, freeing up tiny pieces of RAM after it identifies where they
are -- a process that takes considerable time (and in some circumstances
seems to completely freeze the box until it is done). But I'm ill-equipped
to explain this better than that and (thankfully for us all!) am not one of
the developers working on identifying and fixing these issues. I know
enough to know that the above statement might be technically accurate in
some lights, but it falls short of field-usable truth.
Post by aurfalien
What tools could one use to monitor this phenomenon? I assume arcstat.py and zpool iostat?
ARC kstats and the arcstat tool will show the current ARC size and targets.
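To make that concrete, on an illumos-based system something along these
lines will show the numbers in question (the 5-second interval is just an
example):

    # current ARC size, current target, and max target, in bytes
    kstat -p zfs:0:arcstats:size zfs:0:arcstats:c zfs:0:arcstats:c_max

    # rolling per-interval view of ARC size, hit rates, and target
    arcstat.py 5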
Fragmentation;
I've read that if one goes over 70% storage utilization, fragmentation
will occur and be noticeable.
This is true of all file systems. There is a point at which the allocation
algorithms must make hard decisions. However, there is nothing magic about
70% for ZFS. There is some magic that occurs at 96%, and a well-managed
datacenter often has a policy about going over 80% for capacity planning
purposes.
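For keeping an eye on where a pool sits relative to those thresholds (the
96% point is roughly where the metaslab allocator gives up on cheap
first-fit allocation and falls back to more expensive best-fit searching,
at least with the default tunables), something like this is enough:

    # allocated vs. free space and percent full, per pool
    zpool list -o name,size,allocated,free,capacity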
Do tools exist to measure ZFS fragmentation?
Can you define "fragmentation" in this context? It is an overloaded term.
While the above is all technically true, it again seems to presume there's
no problem with going over 70-80%, and that 96% is where the problems
start. That is also factually untrue. Field experience tells us that
70-80% as a hard cap on your utilization is 'good enough' to prevent
significant performance slowdown on a large percentage of deployed systems
(of which we have 1000's). WHY that is has a couple of answers depending on
your environment and workload, and it ISN'T a hard & fast safety number
(it's possible to cause significant "fragmentation"-related slowdown with
well under 50% of the pool ever used, especially if you design a test load
specifically to cause issues like this), and some get bitten even while
obeying a 70% rule, but the majority do not, and so it makes a good field
rule of thumb.
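If you want to see how chopped up the free space actually is, rather than
just how full the pool is, zdb can dump per-metaslab free-space
information; free space that is mostly tiny segments will behave
"fragmented" long before the pool is anywhere near full. It is read-only
but can take a while on big pools, so treat it as an occasional diagnostic
(the pool name here is just a placeholder):

    # summary of free space per metaslab; a second -m dumps the full space maps
    zdb -m tank
    zdb -mm tank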
There's a big difference between what is academically true and what is the
best advice for the field. Academically, there should be no problem with a
ZFS solution on a 1+ PB RAM box with quad-proc, hex-core processors and 7
x8 PCIe slots, each with a dual-port SAS HBA, each in turn plugged into
separate SAS switches, which in turn connect to a total of, let's say, 60
or so JBODs, each containing 24 hard disks, for a total of 5,760 TB (with
4 TB drives) of raw space. However, field experience tells us you would be
an astronomical idiot to actually do this, and the resulting system would
be a disaster in the making -- one whose owner would never be happy or even
satisfied, and would definitely regret buying. We know this without ever
having seen such a system, because we've seen a number of builds anywhere
from 1/5 to 1/3 this size, and they've all been problems.
-- richard