Disk sync at shutdown and fusefs filesystems

Post by Alejandro Pulver
Then I have to look for some way to manually
unmount FUSE filesystems at shutdown, because they are already mounted
at startup. I thought about instructing the fusefs-kmod rc.d script to
unmount FUSE filesystems before attempting to unload the kernel module
(currently it only loads/unloads fuse.ko).

Yes, I think that given what we're working with here, that would be a
good idea regardless. It should be pretty easy to do, you can find a
sample of something like what you would want in /etc/rc.d/dumpon. Let
me know if you need help, I'm more than a little interested in getting
fuse-ntfs set up here.

Doug

--
This .signature sanitized for your protection

Alejandro Pulver

2007-12-11 15:02:11 UTC

On Mon, 10 Dec 2007 20:18:26 -0800

Post by Doug Barton

Thanks, here is what I've got so far: it seems /dev/fuse[0-9]* devices
aren't removed after the corresponding filesystem is unmounted (I guess
they are reused), so instead of listing /dev the list has to be taken
from 'mount'. Also there should be a delay between the 'umount' and
'kldunload' commands. What do you think about the following
(replacement for fusefs_stop function)?

echo "Stopping ${name}."
for fs in `mount | grep '^/dev/fuse[0-9]*' | cut -d ' ' -f 1`; do
    umount "$fs"
done
sleep 2
kldunload "$kmod"

Unfortunately the script doesn't have a status function to avoid loading
the module when it is already loaded (and the other way around), but one
can easily be added.

Best Regards,
Ale

Doug Barton

2007-12-11 20:22:35 UTC

Post by Alejandro Pulver
Thanks, here is what I've got so far: it seems /dev/fuse[0-9]* devices
aren't removed after the corresponding filesystem is unmounted (I guess
they are reused), so instead of listing /dev the list has to be taken
from 'mount'.

Yeah, I think that's better than using fstab anyway, since this way we get
them all with limited processing. Wish I'd thought of it. :)

Post by Alejandro Pulver
Also there should be a delay between the 'umount' and
'kldunload' commands. What do you think about the following
(replacement for fusefs_stop function)?

I suppose this is mostly a style difference, but I like to avoid all those
subshells if we can. I also think it might be a good idea to wait a second
between unmounts, just to be paranoid. How about:

mount | while read dev d1 mountpoint d2; do
    case "$dev" in
    /dev/fuse[0-9]*) umount "$mountpoint" ; sleep 1 ;;
    esac
done
sleep 1

Post by Alejandro Pulver
kldunload $kmod

hth,

Doug

--
This .signature sanitized for your protection

Alejandro Pulver

2007-12-11 21:02:17 UTC

On Tue, 11 Dec 2007 12:22:35 -0800 (PST)

Post by Doug Barton

Yeah, I think that's better than using fstab anyway, since this way we get
them all with limited processing. Wish I'd thought of it. :)

Actually, I tried first with "umount -a -t {fusefs,ntfs-3g,fuse,...}"
but it didn't work.

Post by Doug Barton

Post by Alejandro Pulver
Also there should be a delay between the 'umount' and
'kldunload' commands. What do you think about the following
(replacement for fusefs_stop function)?

I suppose this is mostly a style difference, but I like to avoid all those
subshells if we can. I also think it might be a good idea to wait a second
between unmounts, just to be paranoid. How about:

mount | while read dev d1 mountpoint d2; do
    case "$dev" in
    /dev/fuse[0-9]*) umount "$mountpoint" ; sleep 1 ;;
    esac
done
sleep 1

It looks fine to me. And what about echoing the mountpoints as they are
unmounted?

mount | while read dev d1 mountpoint d2; do
    case "$dev" in
    /dev/fuse[0-9]*)
        echo "fusefs: unmounting ${mountpoint}."
        umount "$mountpoint" ; sleep 1
        ;;
    esac
done

Also, these checks would avoid kldload/kldunload errors:

In fusefs_start:
    if kldstat | grep -q fuse\\.ko; then
        echo "${name} is already running."
        return 0
    fi

In fusefs_stop:
    if ! kldstat | grep -q fuse\\.ko; then
        echo "${name} is not running."
        return 1
    fi

Well, the word "loaded" instead of "running" would be better. Also a
status command could be added, but I don't think it's needed.
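
As a rough illustration of the check above, here is a minimal sketch of a helper that decides whether fuse.ko is loaded. The function name fuse_module_loaded is hypothetical, and it assumes the usual five-column kldstat listing (Id, Refs, Address, Size, Name). It reads the listing on stdin so the logic can be tried without a FreeBSD kernel.

```sh
# Sketch only: succeed if a kldstat-style listing on stdin contains fuse.ko.
# Assumes the file name is the fifth column, as in plain `kldstat` output.
fuse_module_loaded() {
    # Compare the whole field, so names such as "notfuse.ko" do not match.
    awk '$5 == "fuse.ko" { found = 1 } END { exit found ? 0 : 1 }'
}
```

In the rc.d script this would be used as "kldstat | fuse_module_loaded" in place of the grep, in both fusefs_start and fusefs_stop.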

Also

Karsten Behrmann

2007-12-11 21:53:29 UTC

Post by Doug Barton
I suppose this is mostly a style difference, but I like to avoid all those
subshells if we can. I also think it might be a good idea to wait a second
between unmounts, just to be paranoid. How about:

mount | while read dev d1 mountpoint d2; do
    case "$dev" in
    /dev/fuse[0-9]*) umount "$mountpoint" ; sleep 1 ;;
    esac
done
sleep 1

Hmm, if you truly want to be paranoid, you probably should be unmounting
those in reverse order, because someone might be mounting one fuse-fs
inside another ;)
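
A minimal sketch of the reverse-order idea, under two assumptions: that 'mount' lists filesystems in the order they were mounted, and that mountpoints contain no spaces. The helper name fuse_mountpoints_reversed is made up for illustration. It reads 'mount'-style lines on stdin and prints the FUSE mountpoints last-mounted first.

```sh
# Sketch only: print FUSE mountpoints from `mount`-style input in reverse order.
# Expects lines such as "/dev/fuse0 on /mnt/a (fusefs, local)".
fuse_mountpoints_reversed() {
    awk '$1 ~ /^\/dev\/fuse[0-9]+$/ { mp[n++] = $3 }
         END { for (i = n - 1; i >= 0; i--) print mp[i] }'
}
```

The unmount loop would then become something like "mount | fuse_mountpoints_reversed | while read mountpoint; do umount "$mountpoint"; sleep 1; done", so that a FUSE filesystem mounted inside another one is unmounted first.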

just my 2 cents,
Karsten

--
Open source is not about suing someone who sells your software. It is
about being able to walk behind him, grinning, and waving free CDs with
the equivalent of what he is trying to sell.

Csaba Henk

2007-12-12 02:00:14 UTC

Post by Alejandro Pulver
The problem with NTFS-3G (and all other FUSE based drivers maybe) is
that it doesn't flush the cache data to the disk at shutdown, but it
does when unmounted (and I guess this doesn't happen automatically). I
noticed this when files I write before manually unmounting persist, and
otherwise sometimes they don't.

I happen to be discussing this issue with Szaka (ntfs-3g developer) and
Miklos Szeredi (FUSE developer). At least, we're discussing something
which might be relevant here.

They have already discovered issues with system shutdown on Linux, and
Miklos has implemented a solution for this dubbed "synchronous
umount". With it, the protocol is enhanced with a new message
called DESTROY. Upon unmounting the fs, the kernel sends a DESTROY to
the daemon and waits for an answer. That is, unmount(2) won't complete
until the fs says to the kernel "OK, I'm done".

This was introduced in the following commit (as seen in my HG mirror):

http://mercurial.creo.hu/repos/fuse-hg/?rev/a5df6fb4a0e6

and it's already included in the current sysutils/fusefs-libs port.

And it wouldn't be hard to add kernel side support for FreeBSD. There
are some questions though:

- Do you think it could be actually useful for solving the shutdown
issue on FreeBSD?

- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).

- Security issue: with synch unmount, any user who can mount (w/ synch
unmount), is capable of making the unmount stuck (which is easy to
fix when the system is up -- just kill the fs daemon -- but can
make the shutdown process hopelessly stuck). So we'd have to
decide who/when shall be able to do mounts for which the unmount is
synchronous. (The current criteria for this on Linux -- ie.,
is the fuseblk fs variant being used? -- is N/A to FreeBSD for
reasons which are OT here. However, Miklos decided to
change this so that synch unmount will be tied to the "allow_other"
option, which is tied to root privileges, and does make sense
on FreeBSD, too.) I'd be happy to hear more suitable criteria.

Regards,
Csaba

Alejandro Pulver

2007-12-12 16:43:16 UTC

On Wed, 12 Dec 2007 03:00:07 +0100

[This message has also been posted to gmane.os.freebsd.devel.hackers.]

Hmm, I don't know much of this, but isn't the Linux problem related to
flushing its own block device cache? In FreeBSD it doesn't exist (i.e.
ublio is only user-space), so I wonder if just unmounting before
shutdown solves the issue. I mean, does the kernel still keep
information after a FUSE filesystem is unmounted?

Please correct me if I'm wrong.

At least the currently discussed trick only works because it waits a few
seconds after unmounting to let it flush the cache (but I think it's a
common fact that filesystems get registered/unregistered with a small
delay, and may not be related to that).

- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).

Never seen this, but I've also never unmounted at shutdown before. I have a
patch for it (see thread). Then we could easily see if it gets stalled
at shutdown (or when manually stopping the rc.d script).

- Security issue: with synch unmount, any user who can mount (w/ synch
unmount), is capable of making the unmount stuck (which is easy to
fix when the system is up -- just kill the fs daemon -- but can
make the shutdown process hopelessly stuck). So we'd have to
decide who/when shall be able to do mounts for which the unmount is
synchronous. (The current criteria for this on Linux -- ie.,
is the fuseblk fs variant being used? -- is N/A to FreeBSD for
reasons which are OT here. However, Miklos decided to
change this so that synch unmount will be tied to the "allow_other"
option, which is tied to root privileges, and does make sense
on FreeBSD, too.) I'd be happy to hear more suitable criteria.

This would depend on the previous point.

Please CC me as I'm not (yet) subscribed.

Best Regards,
Ale

Csaba Henk

2007-12-17 01:22:08 UTC

Post by Alejandro Pulver
On Wed, 12 Dec 2007 03:00:07 +0100

Hmm, I don't know much of this, but isn't the Linux problem related to
flushing its own block device cache? In FreeBSD it doesn't exist (i.e.
ublio is only user-space), so I wonder if just unmounting before
shutdown solves the issue. I mean, does the kernel still keep
information after a FUSE filesystem is unmounted?
Please correct me if I'm wrong.
At least the currently discussed trick only works because it waits a few
seconds after unmounting to let it flush the cache (but I think it's a
common fact that filesystems get registered/unregistered with a small
delay, and may not be related to that).

The point of synch umount is that you don't need to wait for an ad hoc
amount of time in order to have the various caches flushed / media
sync'd -- it enables the filesystem daemon itself to notify the umount
procedure that it's done and the world can go on.

The exact nature of caches (userspace / in-kernel) doesn't really make a
difference from this POV. (Implementing the appropriate synchronization
mechanisms is up to the fs daemon.)
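
To illustrate the difference from a fixed sleep, here is a rough userland approximation. It is not the DESTROY mechanism itself: after umount, the script polls until the filesystem daemon's process has exited instead of sleeping for an arbitrary time. It assumes the daemon exits once it has finished flushing and that its PID is known. wait_for_exit is a hypothetical helper name.

```sh
# Sketch only: wait until process $1 has exited, polling once per second
# for at most $2 seconds. Returns 0 if it is gone, 1 if it is still alive.
wait_for_exit() {
    pid=$1
    limit=$2
    waited=0
    # kill -0 sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

The upper bound is still an ad hoc number, but it is only a safety limit: in the normal case the script moves on as soon as the daemon is done, which is the behavior synch umount would provide properly inside unmount(2).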

Post by Csaba Henk
- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).

Never seen this, but also never unmounted at shutdown before. I have a
patch for it (see thread). Then we could easily see if it get stalled
at shutdown (or when manually stopping the rc.d script).

Of course you've never seen this! -- these have appeared as consequences
of the synchronous umount (ie., the DESTROY message) which is not yet
implemented on FreeBSD.

The actual question is whether it is worth implementing. To me the answer
seems to be "yes", but I don't know the ins and outs of the FreeBSD
init/shutdown system, that's why I'd like to hear the opinion of people
like you about this before I go and code it.

Whether hangs occur or not if fuse4bsd does sync umount is not that
important. I mean, first I would code a basic implementation of DESTROY
(that's pretty simple to do!) and we'd see how well that works, and if we
see problems I'll try to tune the implementation. That's just business as
usual.

The next issue is more important...

Post by Csaba Henk
- Security issue: with synch unmount, any user who can mount (w/ synch
unmount), is capable of making the unmount stuck (which is easy to
fix when the system is up -- just kill the fs daemon -- but can
make the shutdown process hopelessly stuck). So we'd have to
decide who/when shall be able to do mounts for which the unmount is
synchronous. (The current criteria for this on Linux -- ie.,
is the fuseblk fs variant being used? -- is N/A to FreeBSD for
reasons which are OT here. However, Miklos decided to
change this so that synch unmount will be tied to the "allow_other"
option, which is tied to root privileges, and does make sense
on FreeBSD, too.) I'd be happy to hear more suitable criteria.

This would depend on the previous point.

It's more important to specify a suitable security policy -- who/when
should be capable of mounting an fs in a way so that the umount will
be synchronous? (Short term for this: "mounting with synch umount").

I think it's quite a "standalone" question, it depends on nothing else.
If a FUSE fs is mounted with synch umount its daemon can block the umount
command/syscall, and therefore it can probably block the whole shutdown
sequence (killing the process or umounting with -f might help, but these
probably won't happen during the normal shutdown course).

A possible compromise is letting only the superuser mount with synch
umount. That would mean only superuser-mounted ntfs fs-es will be
cleanly unmounted during shutdown (I mean, without resorting to hacks in
the shutdown scripts). ("Tying synch umount to 'allow_other'", as
mentioned above, is practically the same choice.)

This of course can be refined, eg. make it configurable with a sysctl,
etc.

Either way, it's a design issue and not just an implementation detail,
so it would be wise to make up our minds about this.

Csaba

Alejandro Pulver

2007-12-17 02:26:55 UTC

On Mon, 17 Dec 2007 02:21:53 +0100

Post by Csaba Henk
They have already discovered issues with system shutdown on Linux, and
Miklos has implemented a solution for this dubbed as "synchronous
umount". According to this, the protocol is enhanced with a new message
called DESTROY. Upon unmounting the fs, the kernel sends a DESTROY to
the daemon and waits for answer. That is, unmount(2) won't complete
until the fs says to the kernel "OK, I'm done".

[...]

Post by Alejandro Pulver
Hmm, I don't know much of this, but isn't the Linux problem related to
flushing its own block device cache? In FreeBSD it doesn't exist (i.e.
ublio is only user-space), so I wonder if just unmounting before
shutdown solves the issue. I mean, does the kernel still keep
information after a FUSE filesystem is unmounted?
Please correct me if I'm wrong.
At least the currently discussed trick only works because it waits a few
seconds after unmounting to let it flush the cache (but I think it's a
common fact that filesystems get registered/unregistered with a small
delay, and may not be related to that).

The point in synch umount is that you don't need to wait for an ad hoc
amount of time in order to have the various caches flushed / media
sync'd -- it enables the filesystem daemon itself to notify the umount
procedure that it's done and the world can go on.
The exact nature of caches (userspace / in-kernel) doesn't really make a
difference from this POV. (Implementing the appropriate synchronization
mechanisms is up to the fs daemon.)

I see, thanks for the clarification.

Post by Csaba Henk
- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).

Never seen this, but also never unmounted at shutdown before. I have a
patch for it (see thread). Then we could easily see if it get stalled
at shutdown (or when manually stopping the rc.d script).

Of course you've never seen this! -- these have appeared as consequences
of the synchronous umount (ie., the DESTROY message) which is not yet
implemented on FreeBSD.

That's logical. I missed the point.

Post by Csaba Henk
The actual question is whether it is worth to implement it. For me it
seems "yes", but I don't know the ins and outs of the FreeBSD
init/shutdown system, that's why I'd like to hear the opinion of people
like you about this before I go and code it.

I'm not on the kernel side, but I think it's the correct thing to do.
The kernel syncer does the same with other filesystems, so...

Post by Csaba Henk
Whether hangs occur or not if fuse4bsd does sync umount is not that
important. I mean, first I would code a basic implementation of DESTROY
(that's pretty simple to do!) and we'd see how well that works and if we
see problems I try to tune the implementation. That's just business as
usual.

From what you said, we won't know if there are hangs until we have an
implementation of DESTROY. So this will be attended to later, as you said.

Post by Csaba Henk
The next issue is more important...

This would depend on the previous point.

<quote>
- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).
</quote>

IIRC you are saying that any user could make umount hang. And you said
this is an unintended behavior caused by the implementation, which
appeared on Linux and we don't know if it will happen on FreeBSD.

Otherwise the daemon would synchronize the fs and let umount return
normally, and this wouldn't happen, right?

If this always happens then what is the difference between happening on
a root/non-root mount, as it will hang anyways?

If I missed the point again please correct me, and clarify the following:
Are the hang (point 2) and stuck umount (point 3) issues the same
thing (I assumed so)? If not, please point out the differences.

Post by Csaba Henk
If a FUSE fs is mounted with synch umount its daemon can block the umount
command/syscall, and therefore it can probably block the whole shutdown
sequence (killing the process or umounting with -f might help, but these
probably won't happen during the normal shutdown course).
A possible compromise is letting only the superuser to mount with synch
umount. That would mean only superuser mounted ntfs fs-es will be
cleanly unmounted during shutdown (I mean, without resorting to hacks in
the shutdown scripts). ("Tying synch umount to 'allow_other'", as
mentioned above, is practically the same choice.)
This of course can be refined, eg. make it configurable with a sysctl,
etc.
Either case, it's a design issue and not just an implementation detail
so it would be clever to try to make up our mind about this.
Csaba

Best Regards,
Ale

Csaba Henk

2007-12-17 16:41:32 UTC

Post by Alejandro Pulver
<quote>
- Some "got hung in unmount" issues are to be sorted out (these
appeared on Linux, and they might or might not appear on FreeBSD).
</quote>
IIRC you are saying that any user could make umount hang. And you said
this is an unintended behavior caused by the implementation, which
appeared on Linux and we don't know if it will happen on FreeBSD.
Otherwise the daemon would synchronize the fs and let umount return
normally, and this wouldn't happen, right?
If this always happens then what is the difference between happening on
a root/non-root mount, as it will hang anyways?
Does the hang (point 2)/umount stuck (point 3) issues consist of the
same (I assumed so)? If not, please point out the differences.

Oh sorry, I see my wording was still not clear enough...

Point 3) and point 2) are completely different issues.

Point 2) is about a _bug_ which might make unmount hang (contrary to our
intentions). Point 3) is about the access control _policy_ of the
"mounting with sync unmount" feature, which is DoS-capable: it enables
some malicious code to make the shutdown sequence hang.

[Btw, I used/use the "hang", "block", "make stuck" expressions
interchangeably, I'm sorry if it's not correct English or just sounds
unnaturally in some cases.]

The same statement with more details:

Point 2) is about a defect of a naive, straightforward implementation of
handling the DESTROY message in the FUSE library. That is, if the daemon
is mounted with synch umount, under certain circumstances (if I
understood correctly, this amounts to killing the daemon with a SIGTERM [ie.,
the FUSE session terminates due to the SIGTERM and not because of doing an
unmount(2) on the fs]) the umount code of the lib falls into an infinite
loop. This is a bug which may or may not affect FreeBSD once DESTROY is
implemented for it -- the umount code in the lib is platform specific,
anyway. So this is just about a possible bug which really falls into
the "we will see it when we get there" category.

OTOH, by the essence of the synch umount, mounting an fs daemon w/ synch
umount means that the daemon gets the control over the termination
of the unmount syscall. So being able to mount w/ synch umount assumes
some kind of trusted state -- it enables a malicious daemon to block its
unmounting. It's not a real risk if the unmount is done manually -- when
the person who unmounts the fs observes that the daemon is blocking the
unmount, she can turn to either a forced unmount or killing the daemon.
However, during the shutdown sequence, which is automated, no one will be
there to forcibly terminate the FUSE session, and shutdown might get
stuck this way.

So we have to decide how to control the access to mounting with synch
umount. This is point 3).

I'll probably just provide an implementation of it for the kernel module,
add a "sync_umount" option to mount_fusefs(8), and leave it at that for
now. Then we'll see how well it works out, and what to do access
control-wise.

Csaba

Alejandro Pulver

2007-12-17 17:15:04 UTC

On Mon, 17 Dec 2007 17:41:16 +0100

[This message has also been posted to gmane.os.freebsd.devel.hackers.]

I understand now (about 3). So a user could write a FUSE daemon which
never replies properly (or doesn't reply at all) to the DESTROY message,
and the kernel module would wait indefinitely, stalling the
shutdown sequence.

Maybe this could be solved with a timeout (see below).

[Btw, I used/use the "hang", "block", "make stuck" expressions
interchangeably, I'm sorry if it's not correct English or just sounds
unnaturally in some cases.]

I'm not a native English speaker, so these all seemed the same to me.

Thanks for the clarifications.

Point 2) is about a defect of a naive, straightforward implementation of
handling the DESTROY message in the FUSE library. That is, if the daemon
is mounted with synch umount, under certain circumstances (if I
understood correctly, this amounts to killing the daemon with a SIGTERM [ie.,
the FUSE session terminates due the sigterm and not because of doing an
unmount(2) on the fs]) the umount code of the lib falls into an infinite
loop. This is a bug which may or may not affect FreeBSD once DESTROY is
implemented for it -- the umount code in the lib is platform specific,
anyway. So this is just about a possible bug which really falls into
the "we will see it when we get there" category.

O.K.

OTOH, by the essence of the synch umount, mounting an fs daemon w/ synch
umount means that the daemon gets the control over the termination
of the unmount syscall. So being able to mount w/ synch umount assumes
some kind of trusted state -- it enables a malicious daemon to block its
unmounting. It's not a real risk if the unmount is done manually -- when
the person who unmounts the fs observes that the daemon is blocking the
unmount, she can turn to either a forced unmount or killing the daemon.
However, during the shutdown sequence, which is automated, noone will be
there to forcedly terminate the FUSE session, and shutdown might get
stuck this way.
So we have to decide how to control the access to mounting with synch
umount. This is point 3).

I can see 2 approaches to solving this (not sure if both are possible
though):

Maybe adding a timeout to the FUSE kernel module? I've seen the syncer
daemon say "giving up on ..." when I had ATA errors related to
configuration/cable problems (it retried for a while, and as the disk
couldn't sync, it gave up anyway). So maybe something similar could
be implemented.

Otherwise IMHO only root should be allowed to do such mounts, or there
could be a sysctl, disabled by default, to allow users to do this (like
vfs.usermount).
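
For comparison, the timeout idea can be sketched at the rc.d script level instead of in the kernel module. This is only an analogue of the proposal above, not an implementation of it. run_with_timeout is a hypothetical helper, and the exit status 124 is simply borrowed from the convention of timeout(1). A process blocked inside unmount(2) may not react to the signal, so this bounds how long the script waits rather than guaranteeing that the unmount is aborted.

```sh
# Sketch only: run a command, but stop waiting for it after $1 seconds.
# Returns the command's exit status, or 124 if the time limit was reached.
run_with_timeout() {
    limit=$1
    shift
    "$@" &
    cmd_pid=$!
    waited=0
    # Poll once per second while the background command is still alive.
    while kill -0 "$cmd_pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            # Ask the command to stop, then give up without waiting for it.
            kill "$cmd_pid" 2>/dev/null
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # The command finished in time; report its own exit status.
    wait "$cmd_pid"
}
```

A shutdown script could then call, for example, "run_with_timeout 10 umount "$mountpoint"" and fall back to "umount -f" or to killing the daemon when the status is 124.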

Probably I just provide an implementation of it for the kernel module
and add a "sync_umount" option to mount_fusefs(8) and let it be as is
and we'll see how well it works out, and what to do about access control
-wise.
Csaba

When the implementation is ready, and if these problems are sorted out,
do you think it could be enabled by default (at least for root)? Because
that's the behavior most filesystems would prefer I think.

Best Regards,
Ale

Csaba Henk

2008-01-03 15:27:56 UTC