Discussion:
Q on async destroy of a dataset...
surya
2013-12-23 14:53:33 UTC
Permalink
My quick reading of the code, along with a small experiment on a SmartOS-based
VM, shows that the 'async' feature is attained by punting the heavy lifting of
the destroy to sync_thread context. Is that right?
In that case, has anybody done any study on what impact this will have on the
sync_thread's performance and its ability to open a new txg in the context of
a massive delete - how many times does it synchronously wait for a dbuf_read()
to finish? The fix for 3122 might lessen the impact but doesn't prevent the
waits(?).
ta,
Surya
PS: I felt the "proprietary" way of zombification of a dataset and a "daemon"
handling it asynchronously was better - No?
George Wilson
2013-12-24 01:08:43 UTC
Permalink
Post by surya
My quick reading of the code, along with a small experiment on a SmartOS-based
VM, shows that the 'async' feature is attained by punting the heavy lifting of
the destroy to sync_thread context. Is that right?
That's correct.
Post by surya
In that case, has anybody done any study on what impact this will have on the
sync_thread's performance and its ability to open a new txg in the context of
a massive delete - how many times does it synchronously wait for a dbuf_read()
to finish? The fix for 3122 might lessen the impact but doesn't prevent the
waits(?).
You definitely want the fix for 3122, but I've not personally done any
performance analysis on how long you end up waiting on dbuf_read().

- George
Post by surya
ta,
Surya
PS: I felt the "proprietary" way of zombification of a dataset and a "daemon"
handling it asynchronously was better - No?
Eric Schrock
2013-12-24 01:29:46 UTC
Permalink
Post by surya
PS: I felt the "proprietary" way of zombification of a dataset and a "daemon"
handling it asynchronously was better - No?
As someone who wrote the code that you are presumably referring to in the
ZFS Storage appliance, I can say the answer is "no". That solution only
worked for cases where you could explicitly issue a "zfs destroy" and could
guarantee not being interrupted, and required significant complexity for
every consumer (and all consumers of any complexity needed this).

If the process was interrupted, then when the pool was loaded we would
synchronously destroy the inconsistent dataset. I have vague memories of
this causing multi-hour outages at customer sites. It also didn't help
if you did a rollback or a receive, where the destroy was initiated by the
kernel code and the affected datasets were not known a priori.

- Eric
--
Eric Schrock
VP of Engineering, Delphix

275 Middlefield Road, Suite 50
Menlo Park, CA 94025
http://www.delphix.com



surya
2013-12-24 08:16:24 UTC
Permalink
On Mon, Dec 23, 2013 at 8:08 PM, George Wilson
PS: I felt the "proprietary" way of zombification of a dataset and a "daemon"
handling it asynchronously was better - No?
As someone who wrote the code that you are presumably referring to in
the ZFS Storage appliance,
Right :-)
I can say the answer is "no". That solution only worked for cases
where you could explicitly issue a "zfs destroy" and could guarantee
not being interrupted, and required significant complexity for every
consumer (and all consumers of any complexity needed this).
If the process was interrupted, then when the pool was loaded we would
synchronously destroy the inconsistent dataset. I have vague memories
of this causing multi-hour outages at customer sites.
Yes, it did. But my concern is that a similar dataset could pin down the
sync_thread now as well - won't it?
It also didn't help if you did a rollback or a receive, where the
destroy was initiated by the kernel code and the affected datasets
were not known a priori.
Right. Even then it wouldn't be entirely done in the sync context - objects
get removed in the open context, and the mdn and other cleanup would happen
in the sync context. So data services won't be affected - though the
application thread doing the remove would be in the kernel for that long.
Lastly, is there any plan to do object removals this way as well?
thanks,
Surya
- Eric
George Wilson
2013-12-24 12:52:53 UTC
Permalink
Post by surya
On Mon, Dec 23, 2013 at 8:08 PM, George Wilson
PS: I felt the "proprietary" way of zombification of a dataset and a
"daemon" handling it asynchronously was better - No?
As someone who wrote the code that you are presumably referring to in
the ZFS Storage appliance,
Right :-)
I can say the answer is "no". That solution only worked for cases
where you could explicitly issue a "zfs destroy" and could guarantee
not being interrupted, and required significant complexity for every
consumer (and all consumers of any complexity needed this).
If the process was interrupted, then when the pool was loaded we
would synchronously destroy the inconsistent dataset. I have vague
memories of this causing multi-hour outages at customer sites.
Yes, it did. But my concern is that a similar dataset could pin down the
sync_thread now as well - won't it?
Keep in mind that the entire dataset may not get removed in one txg. It
will chunk up the remove across multiple txgs if necessary. So you
aren't pinning down the sync thread here.
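As a rough illustration of that chunking, here's a toy user-space model (a
sketch only - the names and the fixed per-txg block bound are made up, not
the real illumos code, where the per-txg chunk is bounded by elapsed time
rather than a block count):

    #include <stdio.h>

    #define BLOCKS_PER_TXG 4    /* stand-in for the real time-based bound */

    static long to_free;        /* blocks queued for background freeing */

    /* open context: the "destroy" is cheap - it only queues the work */
    static void
    dataset_destroy_async(long nblocks)
    {
            to_free += nblocks;
    }

    /* sync context: free a bounded chunk, then let the txg close */
    static void
    txg_sync(int txg)
    {
            long done = 0;

            while (to_free > 0 && done < BLOCKS_PER_TXG) {
                    to_free--;      /* "free" one block */
                    done++;
            }
            printf("txg %d: freed %ld, %ld remaining\n", txg, done, to_free);
    }

    int
    main(void)
    {
            dataset_destroy_async(10);      /* destroy returns immediately */
            for (int txg = 1; to_free > 0; txg++)
                    txg_sync(txg);          /* freeing spans several txgs */
            return (0);
    }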
Post by surya
It also didn't help if you did a rollback or a receive, where the
destroy was initiated by the kernel code and the affected datasets
were not known a priori.
Right. Even then it wouldn't be entirely done in the sync context - objects
get removed in the open context, and the mdn and other cleanup would happen
in the sync context. So data services won't be affected - though the
application thread doing the remove would be in the kernel for that long.
Lastly, is there any plan to do object removals this way as well?
Not that I'm aware of.

- George
Surya prakki
2013-12-24 17:48:02 UTC
Permalink
Ok - your assertion that this won't pin the sync_thread made me read the
code again, and this time I came across dsl_scan_free_should_pause(), which
addresses my concern to a good degree. Ta.
-surya
Matthew Ahrens
2013-12-29 17:34:34 UTC
Permalink
Catching up with this thread...

If a subsequent txg needs to be synced (e.g. due to an administrative
command like "zfs set", or due to there being lots of dirty data that needs
to be synced out), then the background destroy will be paused and then
resumed in the next txg. This clause in dsl_scan_free_should_pause() is
the critical one:

    (NSEC2MSEC(elapsed_nanosecs) > zfs_free_min_time_ms &&
        txg_sync_waiting(scn->scn_dp))

Note that we will work on the destroys for at least zfs_free_min_time_ms
(default is 1 second), so the next txg might need to wait up to one second
for background destroy processing. You can tune this down if you are
seeing problems. As low as 100ms is probably reasonable.
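
For context, the surrounding check looks roughly like this (a paraphrase of
the illumos dsl_scan.c of that era, not the verbatim source):

    /* Paraphrase of dsl_scan_free_should_pause(); details may differ. */
    static boolean_t
    dsl_scan_free_should_pause(dsl_scan_t *scn)
    {
            uint64_t elapsed_nanosecs = gethrtime() - scn->scn_sync_start_time;

            /*
             * Stop freeing for this txg if we've run past the txg timeout
             * outright, or if we've put in at least zfs_free_min_time_ms
             * of work and someone is waiting for the txg to sync.
             */
            return (elapsed_nanosecs / NANOSEC > zfs_txg_timeout ||
                (NSEC2MSEC(elapsed_nanosecs) > zfs_free_min_time_ms &&
                txg_sync_waiting(scn->scn_dp)));
    }

If you do lower the tunable, something like this with mdb on a live illumos
system should work (0t100 being the 100ms example value above):

    echo 'zfs_free_min_time_ms/W 0t100' | mdb -kw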

--matt
