John Woods
2013-11-18 16:45:26 UTC
We are looking at putting the ZIL on SSDs, and would like some input.
Background Problem:
During resilvering of a mirror, ZFS must balance application I/O
against resilvering I/O.
Solaris has a kernel parameter named "zfs:zfs_resilver_delay" that
controls this balance. The parameter can be between "0" and "4".
A value of "0" tells the O/S to give priority to resilvering I/O,
at the expense of application I/O rates.
A value of "4" tells the O/S to give priority to application I/O,
at the expense of resilvering I/O.
In Solaris 10, it is supposed to default to "2", but actually
defaults to "0".
Gotcha: We got bitten by this, and had severe performance degradation.
Recommended: Change it to at least "2", but keep in mind that resilvering
times may increase by 100% or more.
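
For reference, a minimal sketch of how the setting above could be written out as an /etc/system line. The helper name resilver_delay_line is hypothetical, and the 0-4 range check simply mirrors the range described in this post. The script only prints the line; it changes nothing on the system, and an /etc/system change would normally need a reboot to take effect.

```shell
#!/bin/sh
# Hypothetical helper: print the /etc/system line for a given delay value.
# The accepted range (0-4) is taken from the description in this post.
resilver_delay_line() {
  delay="$1"
  case "$delay" in
    0|1|2|3|4)
      printf 'set zfs:zfs_resilver_delay = %s\n' "$delay"
      ;;
    *)
      echo "delay must be between 0 and 4" >&2
      return 1
      ;;
  esac
}

# Print the line for the recommended value; review it before appending
# it to /etc/system by hand.
resilver_delay_line 2
# prints: set zfs:zfs_resilver_delay = 2
```

The output is meant to be reviewed and then added to /etc/system manually, rather than applied automatically.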
Our Configuration:
- x86_64 (Intel Xeon-based)
- Solaris 10 64-bit
- 240GB physical RAM (max is 384GB)
- Root Pool:
- 2 x 300GB 10K RPM SAS drives
- Mirrored
- Mixed-Workload Database/File/Web Pool:
- 14 x 300GB 10K RPM SAS drives
- Mirrored, then striped.
Read Performance:
Monitoring the ARC with "echo ::memstat | mdb -k" shows that the ZFS
ARC is using between 60 and 90 GB. (no problem there)
Write Performance:
This pool hosts a mixed workload of database, web, and file server
I/O. We run a MySQL instance, with InnoDB tables that have fairly high
transaction rates, and we do not want to lose transactions. So, the
ACID-compliant InnoDB engine will be doing a lot of "sync" writes to ZFS.
Goal: We don't need the fastest possible performance during normal
workloads, just more consistent write performance in the event of a disk
failure.
Question #1: Is this strategy sound?
Strategy: Reduce the number of database/application write I/O
operations to the physical disks.
Action: Add a mirror of two SSDs, and place the ZIL on it as a separate log device.
Will this configuration reduce the impact of many database "sync"
writes to the disk mirrors themselves?
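
As a rough sketch of the action described above, a mirrored log device is attached with "zpool add <pool> log mirror <dev1> <dev2>". The pool name dbpool and the device names c4t0d0 and c4t1d0 below are placeholders, not our real names, and the helper slog_add_cmd is hypothetical. The script only prints the command so it can be checked before it is run.

```shell
#!/bin/sh
# Hypothetical helper: build the zpool command that adds a mirrored
# log device to a pool. It prints the command instead of running it.
slog_add_cmd() {
  pool="$1"
  ssd_a="$2"
  ssd_b="$3"
  printf 'zpool add %s log mirror %s %s\n' "$pool" "$ssd_a" "$ssd_b"
}

# Placeholder pool and device names; substitute the real ones.
slog_add_cmd dbpool c4t0d0 c4t1d0
# prints: zpool add dbpool log mirror c4t0d0 c4t1d0
```

After the command has been run for real, "zpool status" on the pool should list the two SSDs under a separate "logs" section.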
Question #2: Which SSD is the best?
I've seen several posts about the Intel DC S3700 series, which
looks like a great option for servers.
However, there is also a ZIL-optimized SSD, the sTec s840z, on the
market. Has anyone used these? Good experience with these? Bad? The link
is: http://www.stec-inc.com/products/s840z-zil-sas-ssd/
What is your SSD of choice for enterprise-level protection of the ZIL?