Discussion:
A question about ZFS dedup and what happens when you turn it off
Chris Siebenmann via illumos-zfs
2014-04-28 16:18:44 UTC
I've been scanning through the ZFS kernel code for deduplication, which
has both taught me things and left me kind of confused about some aspects
of dedup handling. So it's time to ask questions of people who know more
than I do.

Suppose that you do the following:
- turn deduplication on for tank/testfs.
- write a file to tank/testfs with unique blocks, call it tank/testfs/fred.
These unique blocks will create new DDT entries for themselves.
- turn deduplication off on tank/testfs
- delete tank/testfs/fred.

Does this remove the DDT entries for the blocks of tank/testfs/fred? In
the current Illumos source I can't see where this happens (if it does). I
believe that DDT removal is normally done in zio.c's zio_ddt_free(),
but that seems to only be part of the ZIO pipeline if dedup is enabled
on the filesystem (well, the objset technically). However I'm not sure
I'm fully following the code here and there may be other paths too.

Thanks in advance.

- cks
George Wilson via illumos-zfs
2014-04-28 22:05:52 UTC
It's been a while since I played with the dedup code so I'm a little
rusty but let me take a shot at clarifying here:

1. The freeing code path is based on the blkptr dedup bit being set and
not on the filesystem property. So even though you've turned off dedup
any existing dedup block will still go through the dedup pipeline.
2. The removal of the ddt entry actually happens in ddt_sync() where it
eventually calls ddt_phys_free(). So in your case the block would get
freed in zio_ddt_free() and the entry would be removed in ddt_phys_free().
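
The two points above can be sketched with a toy model. This is only an illustration, not the ZFS implementation: ToyPool, BlockPointer, and their methods are hypothetical names, the DDT is reduced to a dictionary of reference counts, and the "sync" step is a loose stand-in for the ddt_sync() pass mentioned above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BlockPointer:
    """Hypothetical stand-in for a block pointer."""
    checksum: str  # plays the role of the DDT key
    dedup: bool    # recorded at write time from the dataset property


@dataclass
class ToyPool:
    """Hypothetical pool with a dictionary standing in for the DDT."""
    ddt: dict = field(default_factory=dict)  # checksum -> reference count
    dirty: set = field(default_factory=set)  # entries touched since last sync

    def write(self, checksum, dedup_property_on):
        bp = BlockPointer(checksum, dedup=dedup_property_on)
        if bp.dedup:
            self.ddt[checksum] = self.ddt.get(checksum, 0) + 1
        return bp

    def free(self, bp):
        # The decision looks only at the block pointer, never at the
        # dataset's current dedup property.
        if bp.dedup:
            self.ddt[bp.checksum] -= 1
            self.dirty.add(bp.checksum)

    def sync(self):
        # Entries whose reference count reached zero are dropped here,
        # not at the moment of the free.
        for checksum in self.dirty:
            if self.ddt[checksum] == 0:
                del self.ddt[checksum]
        self.dirty.clear()


pool = ToyPool()
fred = [pool.write(c, dedup_property_on=True) for c in ("a1", "b2")]
# Turning the dedup property off would happen here; in this model it
# changes nothing about the blocks that were already written.
for bp in fred:
    pool.free(bp)
entries_before_sync = len(pool.ddt)
pool.sync()
entries_after_sync = len(pool.ddt)
print(entries_before_sync, entries_after_sync)
```

In this model the entries for fred's blocks are still present right after the frees and are gone after the next sync, whatever the property is set to at that point.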

Hope that helps,
George
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/22008002-303f2ff4
Modify Your Subscription: https://www.listbox.com/member/?&
Powered by Listbox: http://www.listbox.com
Chris Siebenmann via illumos-zfs
2014-04-28 22:23:55 UTC
| 1. The freeing code path is based on the blkptr dedup bit being set
| and not on the filesystem property. So even though you've turned
| off dedup any existing dedup block will still go through the dedup
| pipeline.

Ah! Thank you (and Matthew Ahrens). For some reason I had missed that
the dedup bit was saved as part of the on-disk block pointer. So every
BP that was written with dedup active is marked and will go through
dedup processing on removal (and also on reads if there is a read error),
regardless of the current dedup setting for the filesystem.
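
As a rough illustration of a flag living inside the block pointer itself, the sketch below packs and reads a single bit in a 64-bit property word. The helper names are hypothetical, and the bit position (62) is an assumption taken from memory of the blkptr layout in spa.h; check it against the source before relying on it.

```python
MASK64 = (1 << 64) - 1
DEDUP_SHIFT = 62  # ASSUMED bit position of the dedup flag; verify in spa.h


def bp_get_dedup(blk_prop):
    """Return 1 if the dedup bit is set in the property word, else 0."""
    return (blk_prop >> DEDUP_SHIFT) & 1


def bp_set_dedup(blk_prop, on):
    """Return a copy of the property word with the dedup bit set or cleared."""
    cleared = blk_prop & ~(1 << DEDUP_SHIFT) & MASK64
    return cleared | (int(bool(on)) << DEDUP_SHIFT)


# A block written while dedup was on carries the bit from then on.
prop_at_write = bp_set_dedup(0x1234, True)

# Later the filesystem property is switched off; the stored word is untouched,
# so the free path can still tell that this block was deduplicated.
filesystem_dedup_property = False
flag_seen_at_free = bp_get_dedup(prop_at_write)
print(flag_seen_at_free)
```

The point of the sketch is only that the flag is per-block state fixed at write time, which is why the current property value does not matter when the block is freed.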

- cks
Matthew Ahrens
2014-04-28 16:27:06 UTC
Post by Chris Siebenmann via illumos-zfs
I've been scanning through the ZFS kernel code for deduplication, which
has both taught me things and left me kind of confused about some aspects
of dedup handling. So it's time to ask questions of people who know more
than I do.
- turn deduplication on for tank/testfs.
- write a file to tank/testfs with unique blocks, call it tank/testfs/fred.
These unique blocks will create new DDT entries for themselves.
- turn deduplication off on tank/testfs
- delete tank/testfs/fred.
Does this remove the DDT entries for the blocks of tank/testfs/fred?

Yes.

Post by Chris Siebenmann via illumos-zfs
In the current Illumos source I can't see where this happens (if it does). I
believe that DDT removal is normally done in zio.c's zio_ddt_free(),
but that seems to only be part of the ZIO pipeline if dedup is enabled
on the filesystem (well, the objset technically).

It should do DDT free if the dedup flag is set in the block pointer.