Disk sync at shutdown and fusefs filesystems
Alejandro Pulver
2007-12-11 03:18:28 UTC
Hello.

The port fusefs-ntfs (NTFS-3G is the official name) is an NTFS
read/write driver using FUSE (a user-space kernel independent API for
writing filesystem drivers). The latter uses a (user-space) cache for
improving performance as there isn't a block device cache in the
kernel, and it was originally made in Linux with that assumption.

The problem with NTFS-3G (and all other FUSE based drivers maybe) is
that it doesn't flush the cache data to the disk at shutdown, but it
does when unmounted (and I guess unmounting doesn't happen automatically
at shutdown). I noticed this because files I write persist if I unmount
manually before shutting down, but otherwise they are sometimes lost.

So I guess with native (here I mean written directly for the system kernel,
using the kernel cache) FreeBSD filesystems the kernel flushes the cache at
shutdown, but they aren't unmounted.

Generally this isn't a problem since most FUSE filesystems are
"virtual" (for example: over SSH, FTP, HTTP, etc.) and don't use cache
nor need flushing. But this isn't the case with NTFS-3G.

Are my assumptions right? Then I have to look for some way to manually
unmount FUSE filesystems at shutdown, because they are already mounted
at startup. I thought about instructing the fusefs-kmod rc.d script to
unmount FUSE filesystems before attempting to unload the kernel module
(currently it only loads/unloads fuse.ko).

Thanks and Best Regards,
Ale
Doug Barton
2007-12-11 04:18:26 UTC
Post by Alejandro Pulver
Then I have to look for some way to manually
unmount FUSE filesystems at shutdown, because they are already mounted
at startup. I thought about instructing the fusefs-kmod rc.d script to
unmount FUSE filesystems before attempting to unload the kernel module
(currently it only loads/unloads fuse.ko).
Yes, I think that given what we're working with here, that would be a
good idea regardless. It should be pretty easy to do, you can find a
sample of something like what you would want in /etc/rc.d/dumpon. Let
me know if you need help, I'm more than a little interested in getting
fuse-ntfs set up here.

Doug
--
This .signature sanitized for your protection
Alejandro Pulver
2007-12-11 15:02:11 UTC
On Mon, 10 Dec 2007 20:18:26 -0800
Post by Doug Barton
Post by Alejandro Pulver
Then I have to look for some way to manually
unmount FUSE filesystems at shutdown, because they are already mounted
at startup. I thought about instructing the fusefs-kmod rc.d script to
unmount FUSE filesystems before attempting to unload the kernel module
(currently it only loads/unloads fuse.ko).
Yes, I think that given what we're working with here, that would be a
good idea regardless. It should be pretty easy to do, you can find a
sample of something like what you would want in /etc/rc.d/dumpon. Let
me know if you need help, I'm more than a little interested in getting
fuse-ntfs set up here.
Thanks, here is what I've got so far: it seems /dev/fuse[0-9]* devices
aren't removed after the corresponding filesystem is unmounted (I guess
they are reused), so instead of listing /dev the list has to be taken
from 'mount'. Also there should be a delay between the 'umount' and
'kldunload' commands. What do you think about the following
(replacement for fusefs_stop function)?

echo "Stopping ${name}."
for fs in `mount | grep '^/dev/fuse[0-9]*' | cut -d ' ' -f 1`; do
    umount "$fs"
done
sleep 2
kldunload "$kmod"

Unfortunately it doesn't have a status function to avoid loading when
already loaded and the other way, but can easily be added.
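
One possible alternative to the fixed delay, as a minimal sketch: retry the unload a bounded number of times instead of sleeping for a hard-coded interval. The helper name retry_until is made up for illustration, and the assumption that kldunload fails while the module is still in use has not been verified against fusefs-kmod.

```shell
#!/bin/sh
# retry_until (hypothetical helper): run the command given after the first
# argument up to $1 times, one second apart. Returns 0 as soon as the
# command succeeds, 1 if every attempt fails.
retry_until() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        # Only sleep if another attempt is still to come.
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}
```

In fusefs_stop the last two lines could then become something like: retry_until 5 kldunload "$kmod"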

Best Regards,
Ale
Doug Barton
2007-12-11 20:22:35 UTC
Post by Alejandro Pulver
Thanks, here is what I've got so far: it seems /dev/fuse[0-9]* devices
aren't removed after the corresponding filesystem is unmounted (I guess
they are reused), so instead of listing /dev the list has to be taken
from 'mount'.
Yeah, I think that's better than using fstab anyway, since this way we get
them all with limited processing. Wish I'd thought of it. :)
Post by Alejandro Pulver
Also there should be a delay between the 'umount' and
'kldunload' commands. What do you think about the following
(replacement for fusefs_stop function)?
I suppose this is mostly a style difference, but I like to avoid all those
subshells if we can. I also think it might be a good idea to wait a second
between unmounts, just to be paranoid. How about:

mount | while read dev d1 mountpoint d2; do
    case "$dev" in
    /dev/fuse[0-9]*) umount "$mountpoint" ; sleep 1 ;;
    esac
done
sleep 1
Post by Alejandro Pulver
kldunload $kmod
hth,

Doug
--
This .signature sanitized for your protection
Alejandro Pulver
2007-12-11 21:02:17 UTC
On Tue, 11 Dec 2007 12:22:35 -0800 (PST)
Post by Doug Barton
Post by Alejandro Pulver
Thanks, here is what I've got so far: it seems /dev/fuse[0-9]* devices
aren't removed after the corresponding filesystem is unmounted (I guess
they are reused), so instead of listing /dev the list has to be taken
from 'mount'.
Yeah, I think that's better than using fstab anyway, since this way we get
them all with limited processing. Wish I'd thought of it. :)
Actually, I first tried "umount -a -t {fusefs,ntfs-3g,fuse,...}"
but it didn't work.
Post by Doug Barton
Post by Alejandro Pulver
Also there should be a delay between the 'umount' and
'kldunload' commands. What do you think about the following
(replacement for fusefs_stop function)?
I suppose this is mostly a style difference, but I like to avoid all those
subshells if we can. I also think it might be a good idea to wait a second
mount | while read dev d1 mountpoint d2; do
case "$dev" in
/dev/fuse[0-9]*) umount $mountpoint ; sleep 1 ;;
esac
done
sleep 1
It looks fine to me. And what about echoing the mountpoints as they are
unmounted?

mount | while read dev d1 mountpoint d2; do
    case "$dev" in
    /dev/fuse[0-9]*)
        echo "fusefs: unmounting ${mountpoint}."
        umount "$mountpoint" ; sleep 1
        ;;
    esac
done

Also, these checks would avoid kldload/kldunload errors:

In fusefs_start:
    if kldstat | grep -q fuse\\.ko; then
        echo "${name} is already running."
        return 0
    fi

In fusefs_stop:
    if ! kldstat | grep -q fuse\\.ko; then
        echo "${name} is not running."
        return 1
    fi

Well, the word "loaded" instead of "running" would be better. Also a
status command could be added, but I don't think it's needed.
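
A status check along these lines could be factored into a small helper. This is only a sketch under assumptions: kld_listed is a hypothetical name, and the parsing assumes the usual five-column kldstat listing (Id, Refs, Address, Size, Name).

```shell
#!/bin/sh
# kld_listed (hypothetical helper): succeed if the file name given in $1
# appears in the Name column of kldstat-style output read from stdin.
kld_listed() {
    while read -r _id _refs _addr _size name; do
        # The header line has "Name" in this column and never matches a .ko file.
        [ "$name" = "$1" ] && return 0
    done
    return 1
}
```

A status command could then be as small as: if kldstat | kld_listed fuse.ko; then echo "${name} is loaded."; else echo "${name} is not loaded."; fi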

Also
Karsten Behrmann
2007-12-11 21:53:29 UTC
Post by Doug Barton
I suppose this is mostly a style difference, but I like to avoid all those
subshells if we can. I also think it might be a good idea to wait a second
mount | while read dev d1 mountpoint d2; do
case "$dev" in
/dev/fuse[0-9]*) umount $mountpoint ; sleep 1 ;;
esac
done
sleep 1
Hmm, if you truly want to be paranoid, you probably should be unmounting
those in reverse order, because someone might be mounting one fuse-fs
inside another ;)
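
The reverse-order idea can be sketched as follows. This is an illustration only: fuse_mountpoints_reversed is a hypothetical helper name, it assumes mount(8) lists filesystems in the order they were mounted, and it does not handle mount points containing whitespace.

```shell
#!/bin/sh
# fuse_mountpoints_reversed (hypothetical helper): read mount(8)-style lines
# ("device on mountpoint (options)") from stdin and print the mount points
# of /dev/fuseN devices, last mounted first.
fuse_mountpoints_reversed() {
    reversed=""
    while read -r dev _on mountpoint _rest; do
        case "$dev" in
        /dev/fuse[0-9]*)
            # Prepend, so later mounts end up earlier in the list.
            reversed="$mountpoint
$reversed"
            ;;
        esac
    done
    printf '%s' "$reversed"
}
```

The unmount loop could then read from it, for example: mount | fuse_mountpoints_reversed | while read -r mp; do umount "$mp"; sleep 1; done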

just my 2 cents,
Karsten
--
Open source is not about suing someone who sells your software. It is
about being able to walk behind him, grinning, and waving free CDs with
the equivalent of what he is trying to sell.
Csaba Henk
2007-12-12 02:00:14 UTC
Post by Alejandro Pulver
The problem with NTFS-3G (and all other FUSE based drivers maybe) is
that it doesn't flush the cache data to the disk at shutdown, but it
does when unmounted (and I guess this doesn't happen automatically). I
noticed this when files I write before manually unmounting persist, and
otherwise sometimes they don't.
I just happened to be discussing this issue with Szaka (ntfs-3g developer)
and Miklos Szeredi (FUSE developer). At least, we're discussing something
which might be relevant here.

They have already discovered issues with system shutdown on Linux, and
Miklos has implemented a solution for this, dubbed "synchronous
umount". With this, the protocol is enhanced with a new message
called DESTROY. Upon unmounting the fs, the kernel sends a DESTROY to
the daemon and waits for answer. That is, unmount(2) won't complete
until the fs says to the kernel "OK, I'm done".

This was introduced in the following commit (as seen in my HG mirror):

http://mercurial.creo.hu/repos/fuse-hg/?rev/a5df6fb4a0e6

and it's already included in the current sysutils/fusefs-libs port.

And it wouldn't be hard to add kernel side support for FreeBSD. There
are some questions though:

- Do you think it could be actually useful for solving the shutdown
issue on FreeBSD?

- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).

- Security issue: with synch unmount, any user who can mount (w/ synch
unmount) is capable of making the unmount get stuck (which is easy to
fix when the system is up -- just kill the fs daemon -- but can
make the shutdown process hopelessly stuck). So we'd have to
decide who/when shall be able to do mounts for which the unmount is
synchronous. (The current criterion for this on Linux -- ie.,
is the fuseblk fs variant being used? -- is N/A to FreeBSD for
reasons which are OT here. However, Miklos decided to
change this so that synch unmount will be tied to the "allow_other"
option, which is tied to root privileges, and does make sense
on FreeBSD, too. I'd be happy to hear more suitable criteria.)

Regards,
Csaba
Alejandro Pulver
2007-12-12 16:43:16 UTC
On Wed, 12 Dec 2007 03:00:07 +0100
[This message has also been posted to gmane.os.freebsd.devel.hackers.]
Post by Alejandro Pulver
The problem with NTFS-3G (and all other FUSE based drivers maybe) is
that it doesn't flush the cache data to the disk at shutdown, but it
does when unmounted (and I guess this doesn't happen automatically). I
noticed this when files I write before manually unmounting persist, and
otherwise sometimes they don't.
I just happen to discuss this issue with Szaka (ntfs-3g developer) and
Miklos Szeredi (FUSE developer). At least, we're discussing something
which might have a relevance here.
They have already discovered issues with system shutdown on Linux, and
Miklos has implemented a solution for this dubbed as "synchronous
umount". According to this, the protocol is enhanced with a new message
called DESTROY. Upon unmounting the fs, the kernel sends a DESTROY to
the daemon and waits for answer. That is, unmount(2) won't complete
until the fs says to the kernel "OK, I'm done".
http://mercurial.creo.hu/repos/fuse-hg/?rev/a5df6fb4a0e6
and it's already included in the current sysutils/fusefs-libs port.
And it wouldn't be hard to add kernel side support for FreeBSD. There
- Do you think it could be actually useful for solving the shutdown
issue on FreeBSD?
Hmm, I don't know much of this, but isn't the Linux problem related to
flushing its own block device cache? In FreeBSD it doesn't exist (i.e.
ublio is only user-space), so I wonder if just unmounting before
shutdown solves the issue. I mean, does the kernel still keep
information after a FUSE filesystem is unmounted?

Please correct me if I'm wrong.

At least the currently discussed trick only works because it waits a few
seconds after unmounting to let it flush the cache (but I think it's a
common fact that filesystems get registered/unregistered with a small
delay, and may not be related to that).
- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).
Never seen this, but I have also never unmounted at shutdown before. I have
a patch for it (see thread). Then we could easily see if it gets stalled
at shutdown (or when manually stopping the rc.d script).
- Security issue: with synch unmount, any user who can mount (w/ synch
unmount), is capable of making the unmount stuck (which is easy to
fix when the system is up -- just kill the fs daemon -- but can
make the shutdown process hopelessly stuck). So we'd have to
decide who/when shall be able to do mounts for which the unmount is
synchronous. (The current criteria for this on Linux -- ie.,
is the fuseblk fs variant being used? -- is N/A to FreeBSD for
reasons which are OT here. However, Miklos decided to
change this so that sych unmount will be tied to the "allow_other"
option, which is tied to root privileges, and does make sense
on FreeBSD, too. I'd be happy to hear more suitable criteria.
This would depend on the previous point.

Please CC me as I'm not (yet) subscribed.

Best Regards,
Ale
Csaba Henk
2007-12-17 01:22:08 UTC
Post by Alejandro Pulver
On Wed, 12 Dec 2007 03:00:07 +0100
Post by Csaba Henk
Post by Alejandro Pulver
The problem with NTFS-3G (and all other FUSE based drivers maybe) is
that it doesn't flush the cache data to the disk at shutdown, but it
does when unmounted (and I guess this doesn't happen automatically). I
noticed this when files I write before manually unmounting persist, and
otherwise sometimes they don't.
I just happen to discuss this issue with Szaka (ntfs-3g developer) and
Miklos Szeredi (FUSE developer). At least, we're discussing something
which might have a relevance here.
They have already discovered issues with system shutdown on Linux, and
Miklos has implemented a solution for this dubbed as "synchronous
umount". According to this, the protocol is enhanced with a new message
called DESTROY. Upon unmounting the fs, the kernel sends a DESTROY to
the daemon and waits for answer. That is, unmount(2) won't complete
until the fs says to the kernel "OK, I'm done".
http://mercurial.creo.hu/repos/fuse-hg/?rev/a5df6fb4a0e6
and it's already included in the current sysutils/fusefs-libs port.
And it wouldn't be hard to add kernel side support for FreeBSD. There
- Do you think it could be actually useful for solving the shutdown
issue on FreeBSD?
Hmm, I don't know much of this, but isn't the Linux problem related to
flushing its own block device cache? In FreeBSD it doesn't exist (i.e.
ublio is only user-space), so I wonder if just unmounting before
shutdown solves the issue. I mean, does the kernel still keep
information after a FUSE filesystem is unmounted?
Please correct me if I'm wrong.
At least the currently discussed trick only works because it waits a few
seconds after unmounting to let it flush the cache (but I think it's a
common fact that filesystems get registered/unregistered with a small
delay, and may not be related to that).
The point in synch umount is that you don't need to wait for an ad hoc
amount of time in order to have the various caches flushed / media
sync'd -- it enables the filesystem daemon itself to notify the umount
procedure that it's done and the world can go on.

The exact nature of caches (userspace / in-kernel) doesn't really make a
difference from this POV. (Implementing the appropriate synchronization
mechanisms is up to the fs daemon.)
Post by Alejandro Pulver
Post by Csaba Henk
- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).
Never seen this, but also never unmounted at shutdown before. I have a
patch for it (see thread). Then we could easily see if it get stalled
at shutdown (or when manually stopping the rc.d script).
Of course you've never seen this! -- these have appeared as consequences
of the synchronous umount (ie., the DESTROY message) which is not yet
implemented on FreeBSD.

The actual question is whether it is worth implementing. To me the
answer seems to be "yes", but I don't know the ins and outs of the FreeBSD
init/shutdown system, that's why I'd like to hear the opinion of people
like you about this before I go and code it.

Whether hangs occur or not if fuse4bsd does sync umount is not that
important. I mean, first I would code a basic implementation of DESTROY
(that's pretty simple to do!) and we'd see how well that works and if we
see problems I try to tune the implementation. That's just business as
usual.

The next issue is more important...
Post by Alejandro Pulver
Post by Csaba Henk
- Security issue: with synch unmount, any user who can mount (w/ synch
unmount), is capable of making the unmount stuck (which is easy to
fix when the system is up -- just kill the fs daemon -- but can
make the shutdown process hopelessly stuck). So we'd have to
decide who/when shall be able to do mounts for which the unmount is
synchronous. (The current criteria for this on Linux -- ie.,
is the fuseblk fs variant being used? -- is N/A to FreeBSD for
reasons which are OT here. However, Miklos decided to
change this so that sych unmount will be tied to the "allow_other"
option, which is tied to root privileges, and does make sense
on FreeBSD, too. I'd be happy to hear more suitable criteria.
This would depend on the previous point.
It's more important to specify a suitable security policy -- who/when
should be capable of mounting an fs in a way so that the umount will
be synchronous? (Short term for this: "mounting with synch umount").

I think it's quite a "standalone" question, it depends on nothing else.
If a FUSE fs is mounted with synch umount its daemon can block the umount
command/syscall, and therefore it can probably block the whole shutdown
sequence (killing the process or umounting with -f might help, but these
probably won't happen during the normal shutdown course).

A possible compromise is letting only the superuser mount with synch
umount. That would mean only superuser-mounted ntfs fs-es will be
cleanly unmounted during shutdown (I mean, without resorting to hacks in
the shutdown scripts). ("Tying synch umount to 'allow_other'", as
mentioned above, is practically the same choice.)

This of course can be refined, eg. make it configurable with a sysctl,
etc.
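
The policy described above (superuser always, other users only if an administrator opts in) can be written down as a tiny predicate. This is a sketch, not part of fuse4bsd: sync_unmount_allowed is a hypothetical name, and the second argument stands in for the value of a sysctl knob that does not exist yet.

```shell
#!/bin/sh
# sync_unmount_allowed (hypothetical): decide whether a mount may request
# synchronous unmount.
#   $1 = numeric uid of the mounting user
#   $2 = value of an assumed, not yet existing, admin knob (1 = users allowed)
sync_unmount_allowed() {
    # The superuser is always trusted.
    if [ "$1" -eq 0 ]; then
        return 0
    fi
    # Everyone else depends on the knob.
    [ "$2" -eq 1 ]
}
```

For example, sync_unmount_allowed "$(id -u)" 0 succeeds only when run as root.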

Either way, it's a design issue and not just an implementation detail,
so it would be wise to try to make up our minds about this.

Csaba
Alejandro Pulver
2007-12-17 02:26:55 UTC
On Mon, 17 Dec 2007 02:21:53 +0100
Post by Csaba Henk
Post by Alejandro Pulver
Post by Csaba Henk
They have already discovered issues with system shutdown on Linux, and
Miklos has implemented a solution for this dubbed as "synchronous
umount". According to this, the protocol is enhanced with a new message
called DESTROY. Upon unmounting the fs, the kernel sends a DESTROY to
the daemon and waits for answer. That is, unmount(2) won't complete
until the fs says to the kernel "OK, I'm done".
[...]
Post by Csaba Henk
Post by Alejandro Pulver
Hmm, I don't know much of this, but isn't the Linux problem related to
flushing its own block device cache? In FreeBSD it doesn't exist (i.e.
ublio is only user-space), so I wonder if just unmounting before
shutdown solves the issue. I mean, does the kernel still keep
information after a FUSE filesystem is unmounted?
Please correct me if I'm wrong.
At least the currently discussed trick only works because it waits a few
seconds after unmounting to let it flush the cache (but I think it's a
common fact that filesystems get registered/unregistered with a small
delay, and may not be related to that).
The point in synch umount is that you don't need to wait for an ad hoc
amount of time in order to have the various caches flushed / media
sync'd -- it enables the filesystem daemon itself to notify the umount
procedure that it's done and the world can go on.
The exact nature of caches (userspace / in-kernel) doesn't really make a
difference from this POV. (Implementing the appropriate synchronization
mechanisms is up to the fs daemon.)
I see, thanks for the clarification.
Post by Csaba Henk
Post by Alejandro Pulver
Post by Csaba Henk
- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).
Never seen this, but also never unmounted at shutdown before. I have a
patch for it (see thread). Then we could easily see if it get stalled
at shutdown (or when manually stopping the rc.d script).
Of course you've never seen this! -- these have appeared as consequences
of the synchronous umount (ie., the DESTROY message) which is not yet
implemented on FreeBSD.
That's logical. I missed the point.
Post by Csaba Henk
The actual question is whether it is worth to implement it. For me it
seems "yes", but I don't know the ins and outs of the FreeBSD
init/shutdown system, that's why I'd like to hear the opinion of people
like you about this before I go and code it.
I don't work on the kernel side, but I think it's the correct thing to do.
The kernel syncer does the same with other filesystems, so...
Post by Csaba Henk
Whether hangs occur or not if fuse4bsd does sync umount is not that
important. I mean, first I would code a basic implementation of DESTROY
(that's pretty simple to do!) and we'd see how well that works and if we
see problems I try to tune the implementation. That's just business as
usual.
From what you said, we won't know if there are hangs until we have an
implementation of DESTROY. So this will be addressed later, as you said.
Post by Csaba Henk
The next issue is more important...
Post by Alejandro Pulver
Post by Csaba Henk
- Security issue: with synch unmount, any user who can mount (w/ synch
unmount), is capable of making the unmount stuck (which is easy to
fix when the system is up -- just kill the fs daemon -- but can
make the shutdown process hopelessly stuck). So we'd have to
decide who/when shall be able to do mounts for which the unmount is
synchronous. (The current criteria for this on Linux -- ie.,
is the fuseblk fs variant being used? -- is N/A to FreeBSD for
reasons which are OT here. However, Miklos decided to
change this so that sych unmount will be tied to the "allow_other"
option, which is tied to root privileges, and does make sense
on FreeBSD, too. I'd be happy to hear more suitable criteria.
This would depend on the previous point.
It's more important to specify a suitable security policy -- who/when
should be capable of mounting an fs in a way so that the umount will
be synchronous? (Short term for this: "mounting with synch umount").
I think it's quite a "standalone" question, it depends on nothing else.
<quote>
- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).
</quote>

IIRC you are saying that any user could make umount hang. And you said
this is an unintended behavior caused by the implementation, which
appeared on Linux and we don't know if it will happen on FreeBSD.

Otherwise the daemon would synchronize the fs and let umount return
normally, and this wouldn't happen, right?

If this always happens then what is the difference between happening on
a root/non-root mount, as it will hang anyways?

If I missed the point again please correct me, and clarify the following:
Are the hang (point 2) and the stuck umount (point 3) the same issue
(I assumed so)? If not, please point out the differences.
Post by Csaba Henk
If a FUSE fs is mounted with synch umount its daemon can block the umount
command/syscall, and therefore it can probably block the whole shutdown
sequence (killing the process or umounting with -f might help, but these
probably won't happen during the normal shutdown course).
A possible compromise is letting only the superuser to mount with synch
umount. That would mean only superuser mounted ntfs fs-es will be
cleanly unmounted during shutdown (I mean, without resorting to hacks in
the shutdown scripts). ("Tying synch umount to 'allow_other'", as
mentioned above, is practically the same choice.)
This of course can be refined, eg. make it configurable with a sysctl,
etc.
Either case, it's a design issue and not just an implementation detail
so it would be clever to try to make up our mind about this.
Csaba
Best Regards,
Ale
Csaba Henk
2007-12-17 16:41:32 UTC
Post by Alejandro Pulver
<quote>
- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).
</quote>
IIRC you are saying that any user could make umount hang. And you said
this is an unintended behavior caused by the implementation, which
appeared on Linux and we don't know if it will happen on FreeBSD.
Otherwise the daemon would synchronize the fs and let umount return
normally, and this wouldn't happen, right?
If this always happens then what is the difference between happening on
a root/non-root mount, as it will hang anyways?
Does the hang (point 2)/umount stuck (point 3) issues consist of the
same (I assumed so)? If not, please point out the differences.
Oh sorry, I see my wording was still not clear enough...

Point 3) and point 2) are completely different issues.

Point 2) is about a _bug_ which might make unmount hang (contrary to our
intentions). Point 3) is about the access control _policy_ of the
"mounting with sync unmount" feature, which is DoS capable: it enables
some malicious code to make the shutdown sequence hang.

[Btw, I used/use the "hang", "block", "make stuck" expressions
interchangeably, I'm sorry if it's not correct English or just sounds
unnatural in some cases.]


The same statement with more details:

Point 2) is about a defect of a naive, straightforward implementation of
handling the DESTROY message in the FUSE library. That is, if the daemon
is mounted with synch umount, under certain circumstances (if I
understood correctly, this amounts to killing the daemon with a SIGTERM [ie.,
the FUSE session terminates due to the SIGTERM and not because of doing an
unmount(2) on the fs]) the umount code of the lib falls into an infinite
loop. This is a bug which may or may not affect FreeBSD once DESTROY is
implemented for it -- the umount code in the lib is platform specific,
anyway. So this is just about a possible bug which really falls into
the "we will see it when we get there" category.

OTOH, by the essence of the synch umount, mounting an fs daemon w/ synch
umount means that the daemon gets the control over the termination
of the unmount syscall. So being able to mount w/ synch umount assumes
some kind of trusted state -- it enables a malicious daemon to block its
unmounting. It's not a real risk if the unmount is done manually -- when
the person who unmounts the fs observes that the daemon is blocking the
unmount, she can turn to either a forced unmount or killing the daemon.
However, during the shutdown sequence, which is automated, no one will be
there to forcibly terminate the FUSE session, and shutdown might get
stuck this way.

So we have to decide how to control the access to mounting with synch
umount. This is point 3).


I will probably just provide an implementation of it for the kernel module,
add a "sync_umount" option to mount_fusefs(8), leave it at that for now,
and we'll see how well it works out and what to do
access-control-wise.

Csaba
Alejandro Pulver
2007-12-17 17:15:04 UTC
On Mon, 17 Dec 2007 17:41:16 +0100
[This message has also been posted to gmane.os.freebsd.devel.hackers.]
Post by Alejandro Pulver
<quote>
- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).
</quote>
IIRC you are saying that any user could make umount hang. And you said
this is an unintended behavior caused by the implementation, which
appeared on Linux and we don't know if it will happen on FreeBSD.
Otherwise the daemon would synchronize the fs and let umount return
normally, and this wouldn't happen, right?
If this always happens then what is the difference between happening on
a root/non-root mount, as it will hang anyways?
Does the hang (point 2)/umount stuck (point 3) issues consist of the
same (I assumed so)? If not, please point out the differences.
Oh sorry, I see my wording was still not clear enough...
Point 3) and point 2) are completely different issues.
Point 2) is about a _bug_ which might make unmount hang (contrary to our
intentions). Point 3) is about the access control _policy_ of the
"mounting with sync unmount" feature which is DOS capable: it enables
some malicious code to make the shutdown sequence hung.
I understand now (about 3). So a user could write a FUSE daemon which
never replies properly (or doesn't reply at all) to the DESTROY message,
and the kernel module would wait indefinitely, stalling the
shutdown sequence.

Maybe this could be solved with a timeout (see below).
[Btw, I used/use the "hang", "block", "make stuck" expressions
interchangeably, I'm sorry if it's not correct English or just sounds
unnaturally in some cases.]
I'm not a native English speaker, so these all seemed the same to me.

Thanks for the clarifications.
Point 2) is about a defect of a naive, straightforward implementation of
handling the DESTROY message in the FUSE library. That is, if the daemon
is mounted with synch umount, under certain circumstances (if I
understood correctly, this amounts to killing the daemon with a SIGTERM [ie.,
the FUSE session terminates due the sigterm and not because of doing an
unmount(2) on the fs]) the umount code of the lib falls into an infinite
loop. This is a bug which may or may not affect FreeBSD once DESTROY is
implemented for it -- the umount code in the lib is platform specific,
anyway. So this is just about a possible bug which really falls into
the "we will see it when we get there" category.
O.K.
OTOH, by the essence of the synch umount, mounting an fs daemon w/ synch
umount means that the daemon gets the control over the termination
of the unmount syscall. So being able to mount w/ synch umount assumes
some kind of trusted state -- it enables a malicious daemon to block its
unmounting. It's not a real risk if the unmount is done manually -- when
the person who unmounts the fs observes that the daemon is blocking the
unmount, she can turn to either a forced unmount or killing the daemon.
However, during the shutdown sequence, which is automated, noone will be
there to forcedly terminate the FUSE session, and shutdown might get
stuck this way.
So we have to decide how to control the access to mounting with synch
umount. This is point 3).
I can see 2 approaches to solving this (not sure if both are possible
though):

Maybe adding a timeout to the FUSE kernel module? I've seen the syncer
daemon say "giving up on ..." when I had ATA errors related to
configuration/cable problems (it retried for a while, and as disk
couldn't sync, it terminated anyways). So maybe something similar could
be implemented.

Otherwise IMHO only root should be allowed to do such mounts, or having
a sysctl disabled by default to allow users do this (like
vfs.usermount).
Probably I just provide an implementation of it for the kernel module
and add a "sync_umount" option to mount_fusefs(8) and let it be as is
and we'll see how well it works out, and what to do about access control
-wise.
Csaba
When the implementation is ready, and if these problems are sorted out,
do you think it could be enabled by default (at least for root)? Because
that's the behavior most filesystems would prefer I think.

Best Regards,
Ale
Csaba Henk
2008-01-03 15:27:56 UTC
Post by Alejandro Pulver
When the implementation is ready, and if these problems are sorted out,
do you think it could be enabled by default (at least for root)? Because
that's the behavior most filesystems would prefer I think.
I made up a testable implementation, see it here:

http://mercurial.creo.hu/repos/fuse4bsd-hg-experimental/?rev/abc018d9f535

It seems to work fine. In order to have a fuse daemon which is
"malicious", ie. tries to stall shutdown, I hacked fusexmp_fh.c as
follows:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
--- /dispatch/root/fuse-2.7.0/example/fusexmp_fh.c 2007-05-20 13:05:43.000000000 +0200
+++ fusexmp_fh.c 2008-01-03 02:47:00.000000000 +0100
@@ -416,6 +416,17 @@ static int xmp_lock(const char *path, st
 			   sizeof(fi->lock_owner));
 }
 
+static void xmp_destroy(void *foo)
+{
+	unsigned i = 0;
+
+	for(;;) {
+		fprintf(stderr, "%d ", i++);
+		sleep(1);
+	}
+}
+
+
 static struct fuse_operations xmp_oper = {
 	.getattr	= xmp_getattr,
 	.fgetattr	= xmp_fgetattr,
@@ -451,6 +462,7 @@ static struct fuse_operations xmp_oper =
 	.removexattr= xmp_removexattr,
 #endif
 	.lock		= xmp_lock,
+	.destroy	= xmp_destroy,
 };
 
 int main(int argc, char *argv[])
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

shutdown(8) was able to complete. The above hack could only cause a ten second
delay -- see the output of the daemon:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
unique: 0, opcode: INIT (26), nodeid: 0, insize: 56
INIT: 7.8
flags=0x00000000
max_readahead=0x00000000
INIT: 7.8
flags=0x00000002
max_readahead=0x00000000
max_write=0x00020000
unique: 0, error: 0 (Unknown error: 0), outsize: 40
0 1 2 3 4 5 6 7 8 9
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

So FreeBSD's shutdown handles this fine, the system can't really be
DoS'd this way.

Therefore we don't need too strict an access policy. I think that from
our POV, it would be sufficient to add a "-osync_unmount" mount option
and a sysctl via which its availability for unprivileged users can be
set. But life is not that simple: if we added such a mount option, it
would remain FreeBSD-specific (on Linux it won't happen, for reasons I
don't want to digress on here), and therefore filesystem authors -- who
usually use fuse options internally -- won't use it, although they are
the authoritative persons on whether their filesystem needs a
synchronized unmount or not. I'll try to find the fine middle ground
with Miklos with respect to this.

So while the interface to this feature is under construction, you can
already play with it and I'd like to know about your experiences.

This can be done as follows:

- Get the experimental version of fuse4bsd from the above mentioned
URL. (More exactly, the above URL shows the cset which brings in
the current implementation; if you want to get the latest of
this branch, use

hg {clone,pull} -r sync_unmount0 http://mercurial.creo.hu/repos/fuse4bsd-hg-experimental

or

fetch http://mercurial.creo.hu/repos/fuse4bsd-hg-experimental/?archive/sync_unmount0.tar.gz

As of the time of writing this, abc018d9f535 and sync_unmount0 refer to
the same revision.)

- Compile it with the CFLAG -DFUSE_HAS_DESTROY=1 (the proto version
wasn't bumped when DESTROY was added, so the code can't detect
whether DESTROY is available; you have to pass this setting
manually). (Apart from loading the kld, don't forget to use the
mount_fusefs(8) binary compiled from this code!)

- Recompile fusefs-libs using the following revision of
lib/mount_bsd.c:

http://fuse.cvs.sourceforge.net/*checkout*/fuse/fuse/lib/mount_bsd.c?revision=1.14

(this includes the patch
http://fuse.cvs.sourceforge.net/fuse/fuse/lib/mount_bsd.c?r1=1.13&r2=1.14
which fixes a bug referred to in this thread as "issue 2").

- Go wild with your experiments. ATM the easiest way to enable sync unmount
is adding MOUNT_FUSEFS_SYNC_UNMOUNT=1 to the environment. (Making it
settable via the environment lets us leave the lib/fs code intact.)
ATM sync unmount is available without restrictions.
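
Setting the variable for a single mount, rather than exporting it globally, could look like the sketch below. The wrapper name with_sync_unmount is hypothetical; only the variable name MOUNT_FUSEFS_SYNC_UNMOUNT comes from the description above.

```shell
#!/bin/sh
# with_sync_unmount (hypothetical wrapper): run the given command with
# MOUNT_FUSEFS_SYNC_UNMOUNT=1 in its environment only, leaving the calling
# shell's environment untouched.
with_sync_unmount() {
    env MOUNT_FUSEFS_SYNC_UNMOUNT=1 "$@"
}
```

Usage would be along the lines of: with_sync_unmount ntfs-3g /dev/ad4s1 /mnt/windows (the device and mount point here are placeholders).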

Have fun,
Csaba
