About reflinks
Copy-on-write filesystems have the nice property that files can be “cloned” in \(\mathcal{O}(1)\), as opposed to the classical \(\mathcal{O}(n)\): the new file refers to the old file's blocks, and a block is copied only when it is changed. Unlike with a hard link, a write to the clone therefore never changes the original. This saves both time and space, and can be very beneficial in many situations.
Since Linux standardised the FICLONE ioctl, reflink cloning has become filesystem-agnostic. Filesystems that currently support it include btrfs, XFS, and OCFS2.
The (currently) sad part of the story is language support: Python's shutil.copyfile does not (yet) make use of this new call [1].
Hereby, I introduce a reflink library for Python, which uses cffi to make the system call:
from reflink import reflink
reflink("large_file.img", "copy_of_file.img")
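To give an idea of what happens underneath, here is a minimal sketch that issues FICLONE directly through the standard library's fcntl module. It is not how the reflink library is implemented (that one goes through cffi). The function name clone_or_copy and the fallback to a plain copy are assumptions made for illustration, and the request number is the value that FICLONE expands to in linux/fs.h on common Linux platforms.

```python
import fcntl
import shutil

# FICLONE as defined in <linux/fs.h>: _IOW(0x94, 9, int)
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Try to reflink src_path to dst_path; fall back to a byte copy.

    Hypothetical helper, for illustration only.
    Returns True if the file was cloned, False if it was copied.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # The ioctl is issued on the destination, with the source
            # file descriptor as its argument.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # The filesystem does not support reflinks, or the two files
            # live on different filesystems: copy the data instead.
            shutil.copyfileobj(src, dst)
            return False
```

On a filesystem without reflink support (ext4 or tmpfs, for instance) the ioctl fails and the sketch falls back to the classical copy, so the return value tells you which path was taken.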
I hope someone finds this useful, and that it may gain ReFS and APFS support one day.
In the future, I would also like to incorporate the FICLONERANGE call, which “clones” a range of blocks instead of a whole file.
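As a rough sketch of what such a range clone could look like (this is not part of the reflink library, and the helper name clone_range is made up for illustration), the ioctl takes a struct file_clone_range that names the source file descriptor, the source offset, the length, and the destination offset. The request number below is the value that FICLONERANGE expands to in linux/fs.h on common Linux platforms.

```python
import fcntl
import struct

# FICLONERANGE as defined in <linux/fs.h>:
# _IOW(0x94, 13, struct file_clone_range)
FICLONERANGE = 0x4020940D

# struct file_clone_range { __s64 src_fd; __u64 src_offset;
#                           __u64 src_length; __u64 dest_offset; };
FILE_CLONE_RANGE = struct.Struct("qQQQ")


def clone_range(src_fd, dst_fd, src_offset, length, dst_offset):
    """Reflink `length` bytes from src_fd into dst_fd.

    Hypothetical helper, for illustration only. Raises OSError if the
    filesystem cannot clone the requested range.
    """
    arg = FILE_CLONE_RANGE.pack(src_fd, src_offset, length, dst_offset)
    fcntl.ioctl(dst_fd, FICLONERANGE, arg)
```

Offsets and lengths generally have to be aligned to the filesystem's block size, and a length of 0 means "up to the end of the source file". Unlike the whole-file case, there is no fallback here: an unsupported filesystem simply surfaces as an OSError.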
[1] Please note that Python is unlikely to change its shutil.copyfile routine to actually use FICLONE, as this would change the semantics of the call. One might be writing an application that is supposed to make a backup of a file (against corruption), while the call would not actually produce an independent copy of the data.