Discussion:
review needed
k***@cflinux.hu
2014-02-10 14:53:56 UTC
Permalink
Dear ZFS team,

I found that in zfs/zfs_ctldir.c the snapshot_inactive() function
obtains a hold on its parent directory with gfs_dir_lookup(), but does
not drop that hold when in the meantime other thread uses the actual
snapshot vnode.

Please review the attached patch to eliminate this leak.

Thanks in advance,
Matthew Ahrens
2014-02-10 18:31:08 UTC
Permalink
Looks good to me.

--matt
Post by k***@cflinux.hu
Dear ZFS team,
I found that in zfs/zfs_ctldir.c the snapshot_inactive() function obtains
a hold on its parent directory with gfs_dir_lookup(), but does not drop
that hold when in the meantime other thread uses the actual snapshot vnode.
Please review the attached patch to eliminate this leak.
Thanks in advance,
Richard Kojedzinszky
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/
21635000-ebd1d460
Modify Your Subscription: https://www.listbox.com/
member/?&
Powered by Listbox: http://www.listbox.com
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/23047029-187a0c8d
Modify Your Subscription: https://www.listbox.com/member/?member_id=23047029&id_secret=23047029-2e85923f
Powered by Listbox: http://www.listbox.com
Richard Kojedzinszky
2014-02-12 13:29:51 UTC
Permalink
Is there someone also to review/approve?
Post by Matthew Ahrens
Looks good to me.
--matt
Post by k***@cflinux.hu
Dear ZFS team,
I found that in zfs/zfs_ctldir.c the snapshot_inactive() function
obtains a hold on its parent directory with gfs_dir_lookup(), but
does not drop that hold when in the meantime other thread uses the
actual snapshot vnode.
Please review the attached patch to eliminate this leak.
Thanks in advance,
Richard Kojedzinszky
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now [1]
https://www.listbox.com/member/archive/rss/182191/21635000-ebd1d460
[2]
https://www.listbox.com/member/?& [3]
Powered by Listbox: http://www.listbox.com [4]
ILLUMOS-ZFS | Archives [1] [5] | Modify [6] Your Subscription [4]
------
[1] https://www.listbox.com/member/archive/182191/=now
[2] https://www.listbox.com/member/archive/rss/182191/21635000-ebd1d460
[3] https://www.listbox.com/member/?&
[4] http://www.listbox.com
[5] https://www.listbox.com/member/archive/rss/182191/25402478-0858cafa
[6]
https://www.listbox.com/member/?&
--
Richard Kojedzinszky
Matthew Ahrens
2014-02-12 23:30:24 UTC
Permalink
Anyone else care to review this one-liner?

The fix seems obviously correct by inspection, but what testing have you
done? Ideally you would reproduce the problem (the vnode hold leak) and
then test again with the fix in place and see that the problem does not
reproduce.

Assuming the testing is good, your next step is to submit an RTI by
emailing the illumos advocates. See
http://wiki.illumos.org/display/illumos/How+To+Contribute#HowToContribute-5SubmittingAPatch

--matt
Post by Richard Kojedzinszky
Is there someone also to review/approve?
Post by Matthew Ahrens
Looks good to me.
--matt
Dear ZFS team,
Post by k***@cflinux.hu
I found that in zfs/zfs_ctldir.c the snapshot_inactive() function
obtains a hold on its parent directory with gfs_dir_lookup(), but
does not drop that hold when in the meantime other thread uses the
actual snapshot vnode.
Please review the attached patch to eliminate this leak.
Thanks in advance,
Richard Kojedzinszky
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now [1]
https://www.listbox.com/member/archive/rss/182191/21635000-ebd1d460
[2]
https://www.listbox.com/member/?& [3]
Powered by Listbox: http://www.listbox.com [4]
ILLUMOS-ZFS | Archives [1] [5] | Modify [6] Your Subscription [4]
------
[1] https://www.listbox.com/member/archive/182191/=now
[2] https://www.listbox.com/member/archive/rss/182191/21635000-ebd1d460
[3] https://www.listbox.com/member/?&
[4] http://www.listbox.com
[5] https://www.listbox.com/member/archive/rss/182191/25402478-0858cafa
[6]
https://www.listbox.com/member/?&
--
Richard Kojedzinszky
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/
21635000-ebd1d460
Modify Your Subscription: https://www.listbox.com/
member/?&
Powered by Listbox: http://www.listbox.com
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/23047029-187a0c8d
Modify Your Subscription: https://www.listbox.com/member/?member_id=23047029&id_secret=23047029-2e85923f
Powered by Listbox: http://www.listbox.com
Richard Kojedzinszky
2014-02-13 09:48:48 UTC
Permalink
Dear Matt,

Actually the testing was to run concurrently an snapshot umount and an
ls in .zfs/snapshot. They sometimes got to the race where
snapshot_inactive() would leave the hold on its parent. After it, a
simple umount of that filesystem failed stating it is busy. But as my
other mail wrote, I am courious that in Illumos that race can even
happen.

Regards,
Post by Richard Kojedzinszky
Is there someone also to review/approve?
Post by Matthew Ahrens
Looks good to me.
--matt
Post by k***@cflinux.hu
Dear ZFS team,
I found that in zfs/zfs_ctldir.c the snapshot_inactive() function
obtains a hold on its parent directory with gfs_dir_lookup(), but
does not drop that hold when in the meantime other thread uses the
actual snapshot vnode.
Please review the attached patch to eliminate this leak.
Thanks in advance,
Richard Kojedzinszky
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now [1]
https://www.listbox.com/member/archive/rss/182191/21635000-ebd1d460
[2]
https://www.listbox.com/member/?& [3]
Powered by Listbox: http://www.listbox.com [4]
ILLUMOS-ZFS | Archives [1] [5] | Modify [6] Your Subscription [4]
------
[1] https://www.listbox.com/member/archive/182191/=now
[2]
https://www.listbox.com/member/archive/rss/182191/21635000-ebd1d460
[3] https://www.listbox.com/member/?&
[4] http://www.listbox.com
[5]
https://www.listbox.com/member/archive/rss/182191/25402478-0858cafa
[6]
https://www.listbox.com/member/?&
--
Richard Kojedzinszky
Richard Yao
2014-02-13 15:35:45 UTC
Permalink
Do you have a script that can reproduce this? This feels like something
that belongs in a test suite.
Post by Richard Kojedzinszky
Dear Matt,
Actually the testing was to run concurrently an snapshot umount and an
ls in .zfs/snapshot. They sometimes got to the race where
snapshot_inactive() would leave the hold on its parent. After it, a
simple umount of that filesystem failed stating it is busy. But as my
other mail wrote, I am courious that in Illumos that race can even happen.
Regards,
Post by Richard Kojedzinszky
Is there someone also to review/approve?
Post by Matthew Ahrens
Looks good to me.
--matt
Post by k***@cflinux.hu
Dear ZFS team,
I found that in zfs/zfs_ctldir.c the snapshot_inactive() function
obtains a hold on its parent directory with gfs_dir_lookup(), but
does not drop that hold when in the meantime other thread uses the
actual snapshot vnode.
Please review the attached patch to eliminate this leak.
Thanks in advance,
Richard Kojedzinszky
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now [1]
https://www.listbox.com/member/archive/rss/182191/21635000-ebd1d460
[2]
https://www.listbox.com/member/?& [3]
Powered by Listbox: http://www.listbox.com [4]
ILLUMOS-ZFS | Archives [1] [5] | Modify [6] Your
Subscription [4]
------
[1] https://www.listbox.com/member/archive/182191/=now
[2] https://www.listbox.com/member/archive/rss/182191/21635000-ebd1d460
[3] https://www.listbox.com/member/?&
[4] http://www.listbox.com
[5] https://www.listbox.com/member/archive/rss/182191/25402478-0858cafa
[6]
https://www.listbox.com/member/?&
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/23047029-187a0c8d
Modify Your Subscription: https://www.listbox.com/member/?member_id=23047029&id_secret=23047029-2e85923f
Powered by Listbox: http://www.listbox.com
George Wilson
2014-02-13 16:11:49 UTC
Permalink
LGTM.

- George
Post by k***@cflinux.hu
Dear ZFS team,
I found that in zfs/zfs_ctldir.c the snapshot_inactive() function
obtains a hold on its parent directory with gfs_dir_lookup(), but does
not drop that hold when in the meantime other thread uses the actual
snapshot vnode.
Please review the attached patch to eliminate this leak.
Thanks in advance,
Richard Kojedzinszky
-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
https://www.listbox.com/member/archive/rss/182191/22008002-303f2ff4
https://www.listbox.com/member/?&
Powered by Listbox: http://www.listbox.com
Matthew Ahrens via illumos-zfs
2014-09-13 04:06:36 UTC
Permalink
Post by k***@cflinux.hu
Dear ZFS team,
I found that in zfs/zfs_ctldir.c the snapshot_inactive() function obtains
a hold on its parent directory with gfs_dir_lookup(), but does not drop
that hold when in the meantime other thread uses the actual snapshot vnode.
Please review the attached patch to eliminate this leak.
I managed to reproduce this issue on illumos. The proposed patch does not
completely fix the problem, because the hold on the target vnode (vp) is
also leaked. I have a patch that I've tested and will integrate to
illumos. Details in the bug report: https://www.illumos.org/issues/5160

--matt



-------------------------------------------
illumos-zfs
Archives: https://www.listbox.com/member/archive/182191/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182191/23047029-187a0c8d
Modify Your Subscription: https://www.listbox.com/member/?member_id=23047029&id_secret=23047029-2e85923f
Powered by Listbox: http://www.listbox.com

Loading...