Discussion:
Current recommendations for SSDs for ZFS rigs on a budget
Jim Klimov
2013-10-02 09:52:45 UTC
Permalink
Hello all,

I believe it is valid to say that the most important quality of SSDs
for small-box ZFS usage (rpool, ZIL for a data pool, L2ARC) is their
reliability, then size as fits the rpool and the caching of a particular
data pool, then low price (provided that the constraints above are met)
and performance (at a low price, expected to be nothing stellar, though
still above HDDs).

Reliability involves good wear-leveling chipsets, which should maximize
the total writable volume and/or the time until total device death given
the chip technology (SLC, MLC, eMLC, etc.) and cell size (nm), and
minimize block errors (I/O errors or garbage upon read), as well as
power-loss protection (do save cached sync writes, don't sprinkle
garbage over existing data).

I believe the latter aspect - powerloss protection - severely limits the
choice in both cheap and "almost enterprise" series. At least,
this is important for ZIL and rpool, less so for L2ARC.

For commercial servers I was looking at the Intel DC S3700 or DC S3500
lineups, which tout reliability and a long warranty, but are a bit
pricey and maybe overkill for a home enthusiast (store photos, serve a
scratch area to edit them over CIFS/NFS, store backups of the home zoo
of PCs, occasionally run some compilations or VMs).

Is there anything of similar quality and reliability, but cheaper
(perhaps smaller and slower), that list members can recommend for home
use, without worrying that the data/OS would actually be served more
reliably without SSDs in the box (i.e. something not crappy)?

Also, what sizes should reasonably be set aside for the ZIL (mirrored)
and L2ARC (striped) for a 4*4TB raidz1 (probably) pool on a planned
maxed-out HP N54L with 16GB RAM and an LSI 9211-based HBA? I guess,
based on RAM size, there is some limit after which more L2ARC won't
do any good anyhow? :)

Thanks,
//Jim Klimov

Typos courtesy of my Samsung Mobile
Stefan Ring
2013-10-02 14:44:46 UTC
Permalink
Post by Jim Klimov
I believe the latter aspect - powerloss protection - severely limits the
choice in both cheap and "almost enterprise" series. At least,
this is important for ZIL and rpool, less so for L2ARC.
Do you honestly care if you lose a few seconds' worth of writes in the
home use scenario? I don't.

It's another matter if unlucky timing causes me to lose a file that
has been there "forever" and has just been rewritten by
write-to-new-file / rename. But this cannot happen with ZFS in the
first place, and even if it could, recovery from an automatic snapshot
would be trivial.
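
For reference, the pattern being described is the usual
write-temp-then-rename idiom. A minimal Python sketch, with a
hypothetical temp-file name, just to make the failure mode concrete:

    import os

    def atomic_replace(path, data):
        """Replace path with data (bytes) so readers see old or new, never a mix."""
        tmp = path + ".tmp"                      # hypothetical temp name
        fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            os.write(fd, data)
            os.fsync(fd)                         # new contents stable first
        finally:
            os.close(fd)
        os.rename(tmp, path)                     # atomic swap on POSIX

The worry would be ending up with neither the old nor the new contents
after a badly timed crash; ZFS's copy-on-write transaction groups mean
the pool always imports as some consistent point in time, so that
particular failure shouldn't occur.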

It’s also another matter if the device gets bricked by power loss,
which does happen. I have read about this in some report linked to
from this list, IIRC.
Erik ABLESON
2013-10-02 14:57:08 UTC
Permalink
Post by Stefan Ring
Post by Jim Klimov
I believe the latter aspect - powerloss protection - severely limits the
choice in both cheap and "almost enterprise" series. At least,
this is important for ZIL and rpool, less so for L2ARC.
Do you honestly care if you lose a few seconds worth of writes in the
home use scenario? I don't.
It's another matter if unlucky timing causes me to lose a file that
has been there "forever" and has just been rewritten by
write-to-new-file / rename. But this cannot happen with ZFS in the
first place, and even if it could, recovery from an automatic snapshot
would be trivial.
It’s also another matter if the device gets bricked by power loss,
which does happen. I have read about this in some report linked to
from this list, IIRC.
Which is why the best investment is a dedicated UPS for your storage server, rather than worrying about fancy high-end enterprise SSDs. Depending on your distribution, you will have easy or difficult access to apcupsd, which lets you configure automatic shutdown behaviour when the UPS alerts via USB that remaining battery time is short.

Plus it's just a good idea to have a UPS to smooth out power issues, especially for machines that are on all of the time. Heck, in this case you can even disable sync if you're confident the UPS gives you enough time to shut down gracefully (I know, this is a bad idea™, but if you want to pull the most out of a single RAIDZ1 vdev, sometimes it's a reasonable approach).

Cheers,

Erik
Stefan Ring
2013-10-02 17:05:01 UTC
Permalink
Post by Stefan Ring
It’s also another matter if the device gets bricked by power loss,
which does happen. I have read about this in some report linked to
from this list, IIRC.
Here: http://www.listbox.com/member/archive/182191/2013/04/sort/time_rev/page/10/entry/24:311/20130410122555:50155488-A1FB-11E2-AD8A-BF08FAA940BA/
Schweiss, Chip
2013-10-02 17:21:43 UTC
Permalink
I suspect, though I've yet to be able to confirm for sure, that the
Samsung 840 Pro does properly honor 'SYNCHRONIZE CACHE'.

If you send sync writes to the 840 Pro, they are orders of magnitude
slower than non-sync writes.

I don't know how to actually prove it works correctly without some
system-controlled mechanism to cut power mere microseconds after a
'SYNCHRONIZE CACHE' command.

Anyone know of a good way to test this?
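
One indirect check (not proof of power-loss safety, just a smell test)
is to compare how fast the drive acknowledges O_DSYNC writes versus
plain cached writes to a file on the device under test. A rough Python
sketch, with a hypothetical scratch path:

    import os, time

    PATH = "/ssd-under-test/syncprobe.dat"   # hypothetical scratch file
    BLOCK = b"\0" * 4096
    N = 2000

    def writes_per_second(extra_flags):
        fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | extra_flags, 0o600)
        start = time.time()
        for _ in range(N):
            os.write(fd, BLOCK)
        os.close(fd)
        return N / (time.time() - start)

    cached = writes_per_second(0)            # lands in cache, returns fast
    dsync = writes_per_second(os.O_DSYNC)    # must be stable before returning
    print("cached: %.0f w/s, O_DSYNC: %.0f w/s" % (cached, dsync))

A drive that really honours flushes should show a large gap between the
two numbers; a suspiciously small gap on a consumer drive without
capacitor-backed protection is a hint that it may be acknowledging
flushes it hasn't performed. It still takes the cut-the-power test to
prove anything either way.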

Also, if you slice many consumer SSDs, the 840 Pro in particular, to
about 80% usage, they will have really good write consistency over
time, on par with the Intel DC S3700:
http://www.anandtech.com/show/6489/playing-with-op
I have confirmed this on my own ZFS pool with 840 Pros with 20% spare
area.

-Chip
Post by Stefan Ring
It’s also another matter if the device gets bricked by power loss,
which does happen. I have read about this in some report linked to
from this list, IIRC.
http://www.listbox.com/member/archive/182191/2013/04/sort/time_rev/page/10/entry/24:311/20130410122555:50155488-A1FB-11E2-AD8A-BF08FAA940BA/
Timothy Coalson
2013-10-02 21:29:01 UTC
Permalink
Post by Jim Klimov
I believe the latter aspect - powerloss protection - severely limits the
choice in both cheap and "almost enterprise" series. At least,
this is important for ZIL and rpool, less so for L2ARC.
For commercial servers i was looking at intel dc3700 or dc3500 lineups,
which tout reliability and a long warranty, but are a bit pricey and maybe
overkill for a home enthusiast (store photos, serve a scratch area to edit
them over cifs/nfs, store backups of the home zoo of pc's, occasionally run
some compilations or vm's).
Is there anything of similar quality and reliability, but cheaper
(perhaps smaller and slower) that list members can recommend to use
at home without worries that data/OS would be more reliably served
without SSDs in the box (i.e. something not crappy)?
Intel's SSD 320 also has power loss protection, and is somewhat cheaper for
the size than the DC S3700 (also slower, which may negate its price
advantage for use as SLOG, see below).
Post by Jim Klimov
Also, what sizes should be reasonably set aside for zil (mirrored)
and l2arc (striped) for a 4*4tb raidz1 (probably) pool on a planned
maxed-out HP n54l with 16gb ram and an LSI9211-based HBA? I guess,
based on RAM size, there is some limit after which more L2ARC won't
do any good anyhow? :)
SLOG devices don't always need to be mirrored, in my opinion. The main
thing mirroring protects against is an unclean pool export (crash,
power loss, etc.) coinciding with the failure of an SLOG device, and
the impact of such an event is the loss of a few seconds of data (the
other thing it protects against is loss of synchronous write
performance if an SLOG device dies). With power-loss protection on the
SSD, I think the probability of this is often low enough to ignore,
especially for home use. You should also consider how much you
actually want to pay (if anything) for the benefits of an SLOG:
sync=disabled for home use is cheap and fast, at the price of losing
the last few seconds of data on a crash, or you can leave sync enabled
without an SLOG and have small sync writes be slow, but all data safe
on crash. As always, you can set it up without an SLOG and compare
your workload before and after setting sync=disabled to see what kind
of difference an SLOG could make.

As for sizing SLOG devices, that depends on how fast you will be doing
synchronous writes to the pool. A rule of thumb I seem to recall from
one of these lists is 3 txgs' worth (data in the ZIL expires after the
txg including it makes it to main storage, but one txg is syncing to
disk while the next is gathering writes, and some other detail, or
just a fudge factor, makes it 3). So with the default 5-second txgs,
and assuming you write over gigabit ethernet, 3 * 5sec * 120MB/s =
1.8GB. If the network isn't your bottleneck for the sync writes you
do, consider the max throughput of your data vdevs instead: for raidz1
on 4 HDDs, 3 * 150MB/s * 15sec = 6.75GB.
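
The same arithmetic as a throwaway Python snippet, so the assumptions
(3 txgs, 5-second txg interval, and the throughput guesses) are
explicit and easy to swap out:

    TXG_SECONDS = 5                 # default txg sync interval
    TXGS_TO_COVER = 3               # rule-of-thumb fudge factor
    SCENARIOS_MB_S = {
        "gigabit-limited": 120,     # ~1 Gbit/s of incoming sync writes
        "vdev-limited": 3 * 150,    # raidz1 on 4 HDDs: 3 data disks
    }

    for label, mb_per_s in SCENARIOS_MB_S.items():
        slog_gb = mb_per_s * TXG_SECONDS * TXGS_TO_COVER / 1000.0
        print("%s: roughly %.2f GB of SLOG in use" % (label, slog_gb))

Either way the number is tiny next to any SSD you would actually buy,
which is the point below: capacity is not the constraint for an SLOG.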

However, it is also important that the SLOG device(s) have the throughput
to keep up, and for SSDs, this will be your main consideration, as the
space needed for your ZIL is rather small compared to typical SSD sizes.
NFS issues synchronous writes (even ones that the client sends as
asynchronous), so if that is what you use the system for, it will need to
be able to take all the NFS write bandwidth.

There is a rule for sizing L2ARC based on ARC size, which depends on
block sizes (and the size of some header struct for L2ARC entries?),
but I am not familiar with it. For home use, you may not even notice
the impact of having any L2ARC (it helps random read latency to a
static dataset larger than the ARC can hold). I don't think there is
an easy way to test this other than trying it, though.
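
For a rough feel of that RAM/L2ARC relationship: every block cached in
L2ARC keeps a header in ARC (main memory), and the header size varies
by ZFS version, so the 180 bytes below is an assumption rather than a
quoted figure (check the arc code on your build). A quick estimate:

    HEADER_BYTES = 180     # assumed per-block header cost in ARC (varies)
    L2ARC_GB = 100         # proposed L2ARC size
    AVG_BLOCK_KB = 64      # effective average size of cached blocks

    blocks = L2ARC_GB * 1024 * 1024 // AVG_BLOCK_KB
    overhead_mb = blocks * HEADER_BYTES / (1024.0 * 1024.0)
    print("%d GB of L2ARC at %d KB blocks costs ~%.0f MB of ARC"
          % (L2ARC_GB, AVG_BLOCK_KB, overhead_mb))

With small blocks (say 8 KB) the overhead is eight times larger, which
is where the "more L2ARC stops helping once RAM is the limit" intuition
comes from.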

Tim



Chris Siebenmann
2013-10-03 16:17:57 UTC
Permalink
| However, it is also important that the SLOG device(s) have the
| throughput to keep up, and for SSDs, this will be your main
| consideration, as the space needed for your ZIL is rather small
| compared to typical SSD sizes. NFS issues synchronous writes (even
| ones that the client sends as asynchronous), so if that is what you
| use the system for, it will need to be able to take all the NFS write
| bandwidth.

It's worth noting that modern versions of NFS support asynchronous
writes and generally semi-default to them. Many NFS operations are still
synchronous and large NFS data writes will force syncs far more often
than they would for local IO, but you do get some relief here (depending
on both your exact workload and on the specific clients involved).

- cks
