Discussion:
[review request] 5034 ARC's buf_hash_table is too small
Matthew Ahrens via illumos-zfs
2014-07-24 03:36:29 UTC
https://reviews.csiden.org/r/61/

5034 ARC's buf_hash_table is too small
Reviewed by: George Wilson <***@delphix.com>
Reviewed by: Christopher Siden <***@delphix.com>

Original author: Matthew Ahrens

The ARC puts all (non-anonymous) arc_buf_hdr_t's in a hash table, which is
created at system boot time. The hash table is sized such that if all of
physical memory were filled with 64K blocks, the hash chain length would
average less than 1.0. However, on a system with a typical block size of 8K,
this can lead to long hash chains. I've observed an average length of ~6.5;
theoretically it could be up to 16 (because evicted "ghost" entries are also
in the hash table).
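
The figure of 16 above can be reproduced with simple arithmetic. The sketch
below is illustrative C, not code from the patch: sketch_avg_chain is a
hypothetical helper, and the ghost factor of 2 is an assumption that ghost
entries can at most match the number of cached headers.

```c
#include <stdint.h>

/*
 * Illustrative only (hypothetical helper, not from arc.c).
 *
 * If the table has one bucket per `sized_for_blksz` bytes of memory and
 * memory actually holds one header per `actual_blksz` bytes, the average
 * chain length is the ratio of the two. `ghost_factor` scales the header
 * count to account for ghost entries (assumed: 2 means as many ghost
 * headers as cached headers, 1 means no ghost headers).
 */
static double
sketch_avg_chain(uint64_t sized_for_blksz, uint64_t actual_blksz,
    unsigned ghost_factor)
{
	return ((double)sized_for_blksz / (double)actual_blksz *
	    (double)ghost_factor);
}
```

With a table sized for 64K blocks and memory full of 8K blocks,
sketch_avg_chain(64 * 1024, 8 * 1024, 2) gives 16, and 8 without ghost
entries. The observed ~6.5 falls below both.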

By increasing the hash table size to have enough entries for an average length
of 1.0 when memory is filled with 8K blocks, we can obtain an 18% performance
improvement on cached reads (680MB/s -> 805MB/s).

The hash table size should also be tunable, rather than hard coded.
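
One way to picture a tunable sizing rule is the user-space sketch below. It
is a minimal illustration under stated assumptions, not the code under
review: the names sketch_average_blocksize and sketch_hash_buckets are
hypothetical, and the 4096-bucket starting point and power-of-two doubling
are assumed here for the example.

```c
#include <stdint.h>

/*
 * Hypothetical tunable: the average block size, in bytes, that the hash
 * table is sized for. 8K follows the proposal; 64K models the old sizing.
 */
static uint64_t sketch_average_blocksize = 8 * 1024;

/*
 * Return a power-of-two bucket count large enough that memory full of
 * blocks of the assumed average size averages at most one header per
 * bucket. Illustrative only.
 */
static uint64_t
sketch_hash_buckets(uint64_t physmem_bytes)
{
	/* Guard against a zero tunable, which would never terminate. */
	uint64_t blksz = sketch_average_blocksize != 0 ?
	    sketch_average_blocksize : 8 * 1024;
	uint64_t hsize = 1ULL << 12;	/* assumed minimum table size */

	while (hsize * blksz < physmem_bytes)
		hsize <<= 1;
	return (hsize);
}
```

For 64 GiB of memory this yields 2^23 buckets at 8K and 2^20 buckets at 64K,
so the smaller assumed block size produces a table eight times larger.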

--matt



-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/23047029-187a0c8d
Modify Your Subscription: https://www.listbox.com/member/?member_id=23047029&id_secret=23047029-2e85923f
Powered by Listbox: http://www.listbox.com
Matthew Ahrens via illumos-zfs
2014-07-24 05:06:09 UTC
Post by Prakash Surya via illumos-zfs
Why not prefix the variable with "zfs_" like the rest of the parameters?
Seems reasonable otherwise; I've seen large hash lengths on the Linux
port running Lustre-ZFS MDS servers, which make use of many 512B buffers.
Good point, it should be consistent with the others. Review updated.

--matt
_______________________________________________
developer mailing list
http://lists.open-zfs.org/mailman/listinfo/developer
Matthew Ahrens via illumos-zfs
2014-07-24 06:07:58 UTC
Post by Prakash Surya via illumos-zfs
Out of curiosity, is the workload used to measure the 18% performance
benefit as simple as it seems? i.e. fill the arc, then do reads which
are guaranteed to be arc hits?
That's right.

--matt
Steven Hartland via illumos-zfs
2014-07-24 09:26:36 UTC
Is it worth increasing the default value as well?

Steve

Matthew Ahrens via illumos-zfs
2014-07-24 15:26:37 UTC
Post by Steven Hartland via illumos-zfs
Is it worth increasing the default value as well?
I decreased the default value from 64K to 8K. What value do you think
would be appropriate, and why?

--matt
Steven Hartland via illumos-zfs
2014-07-24 16:05:42 UTC
Post by Matthew Ahrens via illumos-zfs
Post by Steven Hartland via illumos-zfs
Is it worth increasing the default value as well?
I decreased the default value from 64K to 8K. What value do you think
would be appropriate, and why?
Ahh, I see now. Review Board's color (the light yellow) for changed lines is too
light to see on my screen, so I only noticed the added lines; ignore me.

Regards
Steve
Prakash Surya via illumos-zfs
2014-07-24 05:47:35 UTC
Out of curiosity, is the workload used to measure the 18% performance
benefit as simple as it seems? i.e. fill the arc, then do reads which
are guaranteed to be arc hits?
--
Cheers, Prakash
Prakash Surya via illumos-zfs
2014-07-24 04:50:38 UTC
Why not prefix the variable with "zfs_" like the rest of the parameters?

Seems reasonable otherwise; I've seen large hash lengths on the Linux
port running Lustre-ZFS MDS servers, which make use of many 512B buffers.
--
Cheers, Prakash