Darren Reed via illumos-zfs
2014-10-19 13:55:34 UTC
For reasons unknown, my mind was doing garbage collection
and the problem of why can't two files in the same zpool
but different zfs filesystems be ln'd danced through my
head. On the way out, the thought of using deduplication
crossed my mind as a way to make it faster - or at least
dispense with writing new data to disk. The idea being to
use deduplication to achieve cross-zfs filesystem linking
rather than using link(2). It wouldn't be as fast as
supporting link(2) but it would result in the required
space savings and would be faster than doing a copy as
the file data doesn't need to be written out.
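
For context, a minimal sketch in Python of the behaviour being
discussed. It assumes link(2) fails with EXDEV when source and
destination sit on different filesystems (which is the case for two
zfs filesystems in one zpool, since each is a separate mount), and
falls back to an ordinary copy. The name link_or_copy is made up for
illustration; it is not something cp(1) or ZFS provides.

```python
import errno
import os
import shutil


def link_or_copy(src, dst):
    """Hard-link src to dst; fall back to a full copy across filesystems.

    Hypothetical helper, for illustration only.
    """
    try:
        os.link(src, dst)  # thin wrapper around link(2)
        return "linked"
    except OSError as e:
        # EXDEV: src and dst are on different filesystems, so no hard link.
        if e.errno != errno.EXDEV:
            raise
        shutil.copy2(src, dst)  # all file data is read and written again
        return "copied"
```

The "copied" branch is where the data gets written out a second
time, which is the cost the dedup idea is trying to avoid.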
Given that deduplication is already a property of a file,
does it make sense to be able to turn it on for a selection
of file(s) rather than an entire filesystem?
And in this case, turning it on for a file when it is created
and before any data gets written to it so that there is no
need to write new data?
Heck, if the system was so capable, is there any reason why
cp(1) wouldn't use that by default if the source and
destination are within the same zpool?
Apologies if this all sounds somewhat fantastical...
Oh, it is...
deduplication only exists within a zfs filesystem, not a pool...
But that still leaves the question of cp(1) of a file within
a filesystem ... why shouldn't cp(1) be able to turn on dedup
just for that new file?
Too much coding effort for not much gain?
Darren