514 Commits

Author SHA1 Message Date
Stricted bdecc6d184 Merge tag 'v3.10.108' into update
This is the 3.10.108 stable release
2018-03-21 23:07:40 +01:00
Stricted 47e5ca72da Merge tag 'v3.10.106' into update
This is the 3.10.106 stable release
2018-03-21 23:06:23 +01:00
Stricted b9e7bc93d6 Merge tag 'v3.10.103' into update
This is the 3.10.103 stable release
2018-03-21 22:58:21 +01:00
Stricted 647f2da1e2 Merge tag 'v3.10.98' into update
This is the 3.10.98 stable release
2018-03-21 22:51:37 +01:00
Stricted 38b8911896 Merge tag 'v3.10.85' into update
This is the 3.10.85 stable release
2018-03-21 22:46:39 +01:00
Stricted eff333fa3b Merge tag 'v3.10.73' into update
This is the 3.10.73 stable release
2018-03-21 22:41:03 +01:00
Stricted 6fa3eb70c0 import PULS_20160108 2018-03-13 20:29:02 +01:00
Mateusz Jurczyk c2ace3043f fuse: initialize the flock flag in fuse_file on allocation
commit 68227c03cba84a24faf8a7277d2b1a03c8959c2c upstream.

Before the patch, the flock flag could remain uninitialized for the
lifespan of the fuse_file allocation. Unless set to true in
fuse_file_flock(), it would remain in an indeterminate state until read in
an if statement in fuse_release_common(). This could consequently lead to
taking an unexpected branch in the code.

The bug was discovered by a runtime instrumentation designed to detect use
of uninitialized memory in the kernel.

Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Fixes: 37fb3a30b4 ("fuse: fix flock")
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-11-01 22:12:44 +01:00
Miklos Szeredi 5beea85761 fuse: add missing FR_FORCE
commit 2e38bea99a80eab408adee27f873a188d57b76cb upstream.

fuse_file_put() was missing the "force" flag for the RELEASE request when
sending synchronously (fuseblk).

If this flag is not set, then a sync request may be interrupted before it
is dequeued by the userspace filesystem.  In this case the OPEN won't be
balanced with a RELEASE.

[js] force is a variable, not a bit

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 5a18ec176c ("fuse: fix hang of single threaded fuseblk filesystem")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-06-08 00:47:01 +02:00
Robert Doebbelin d3005b09a6 fuse: do not use iocb after it may have been freed
commit 7cabc61e01a0a8b663bd2b4c982aa53048218734 upstream.

There's a race in fuse_direct_IO(), whereby is_sync_kiocb() is called on an
iocb that could have been freed if async io has already completed.  The fix
in this case is simple and obvious: cache the result before starting io.

It was discovered by KASan:

Kernel: ==================================================================
Kernel: BUG: KASan: use after free in fuse_direct_IO+0xb1a/0xcc0 at addr ffff88036c414390

Signed-off-by: Robert Doebbelin <robert@quobyte.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: bcba24ccdc ("fuse: enable asynchronous processing direct IO")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-06-08 00:46:57 +02:00
Wei Fang 7f92e56587 fuse: fix wrong assignment of ->flags in fuse_send_init()
commit 9446385f05c9af25fed53dbed3cc75763730be52 upstream.

FUSE_HAS_IOCTL_DIR should be assigned to ->flags, it may be a typo.

Signed-off-by: Wei Fang <fangwei1@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 69fe05c90e ("fuse: add missing INIT flags")
Cc: <stable@vger.kernel.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27 11:40:35 +02:00
Roman Gushchin 020ef19153 fuse: break infinite loop in fuse_fill_write_pages()
commit 3ca8138f014a913f98e6ef40e939868e1e9ea876 upstream.

I got a report about unkillable task eating CPU. Further
investigation shows, that the problem is in the fuse_fill_write_pages()
function. If iov's first segment has zero length, we get an infinite
loop, because we never reach iov_iter_advance() call.

Fix this by calling iov_iter_advance() before repeating an attempt to
copy data from userspace.

A similar problem is described in 124d3b7041 ("fix writev regression:
pan hanging unkillable and un-straceable"). If zero-length segmend
is followed by segment with invalid address,
iov_iter_fault_in_readable() checks only first segment (zero-length),
iov_iter_copy_from_user_atomic() skips it, fails at second and
returns zero -> goto again without skipping zero-length segment.

Patch calls iov_iter_advance() before goto again: we'll skip zero-length
segment at second iteraction and iov_iter_fault_in_readable() will detect
invalid address.

Special thanks to Konstantin Khlebnikov, who helped a lot with the commit
description.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Maxim Patlasov <mpatlasov@parallels.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Fixes: ea9b9907b8 ("fuse: implement perform_write")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25 11:57:48 -08:00
Miklos Szeredi cc522c517a fuse: initialize fc->release before calling it
commit 0ad0b3255a08020eaf50e34ef0d6df5bdf5e09ed upstream.

fc->release is called from fuse_conn_put() which was used in the error
cleanup before fc->release was initialized.

[Jeremiah Mahler <jmmahler@gmail.com>: assign fc->release after calling
fuse_conn_init(fc) instead of before.]

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Fixes: a325f9b922 ("fuse: update fuse_conn_init() and separate out fuse_conn_kill()")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03 09:29:46 -07:00
Miklos Szeredi 12bda52e8f fuse: notify: don't move pages
commit 0d2783626a53d4c922f82d51fa675cb5d13f0d36 upstream.

fuse_try_move_page() is not prepared for replacing pages that have already
been read.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26 15:00:57 +01:00
Miklos Szeredi ce6f883465 fuse: set stolen page uptodate
commit aa991b3b267e24f578bac7b09cc57579b660304b upstream.

Regular pipe buffers' ->steal method (generic_pipe_buf_steal()) doesn't set
PG_uptodate.

Don't warn on this condition, just set the uptodate flag.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26 15:00:57 +01:00
Miklos Szeredi d609df085c fuse: handle large user and group ID
commit 233a01fa9c4c7c41238537e8db8434667ff28a2f upstream.

If the number in "user_id=N" or "group_id=N" mount options was larger than
INT_MAX then fuse returned EINVAL.

Fix this to handle all valid uid/gid values.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-28 08:00:02 -07:00
Miklos Szeredi d840f9899a fuse: fix pipe_buf_operations
commit 28a625cbc2a14f17b83e47ef907b2658576a32aa upstream.

Having this struct in module memory could Oops when if the module is
unloaded while the buffer still persists in a pipe.

Since sock_pipe_buf_ops is essentially the same as fuse_dev_pipe_buf_steal
merge them into nosteal_pipe_buf_ops (this is the same as
default_pipe_buf_ops except stealing the page from the buffer is not
allowed).

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-13 13:47:59 -08:00
Maxim Patlasov 9abd30b435 fuse: fix fallocate vs. ftruncate race
commit 0ab08f576b9e6a6b689fc6b4e632079b978e619b upstream.

A former patch introducing FUSE_I_SIZE_UNSTABLE flag provided detailed
description of races between ftruncate and anyone who can extend i_size:

> 1. As in the previous scenario fuse_dentry_revalidate() discovered that i_size
> changed (due to our own fuse_do_setattr()) and is going to call
> truncate_pagecache() for some  'new_size' it believes valid right now. But by
> the time that particular truncate_pagecache() is called ...
> 2. fuse_do_setattr() returns (either having called truncate_pagecache() or
> not -- it doesn't matter).
> 3. The file is extended either by write(2) or ftruncate(2) or fallocate(2).
> 4. mmap-ed write makes a page in the extended region dirty.

This patch adds necessary bits to fuse_file_fallocate() to protect from that
race.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-13 16:08:32 -07:00
Maxim Patlasov eb12ca30f1 fuse: wait for writeback in fuse_file_fallocate()
commit bde52788bdb755b9e4b75db6c434f30e32a0ca0b upstream.

The patch fixes a race between mmap-ed write and fallocate(PUNCH_HOLE):

1) An user makes a page dirty via mmap-ed write.
2) The user performs fallocate(2) with mode == PUNCH_HOLE|KEEP_SIZE
   and <offset, size> covering the page.
3) Before truncate_pagecache_range call from fuse_file_fallocate,
   the page goes to write-back. The page is fully processed by fuse_writepage
   (including end_page_writeback on the page), but fuse_flush_writepages did
   nothing because fi->writectr < 0.
4) truncate_pagecache_range is called and fuse_file_fallocate is finishing
   by calling fuse_release_nowrite. The latter triggers processing queued
   write-back request which will write stale data to the hole soon.

Changed in v2 (thanks to Brian for suggestion):
 - Do not truncate page cache until FUSE_FALLOCATE succeeded. Otherwise,
   we can end up in returning -ENOTSUPP while user data is already punched
   from page cache. Use filemap_write_and_wait_range() instead.
Changed in v3 (thanks to Miklos for suggestion):
 - fuse_wait_on_writeback() is prone to livelocks; use fuse_set_nowrite()
   instead. So far as we need a dirty-page barrier only, fuse_sync_writes()
   should be enough.
 - rebased to for-linus branch of fuse.git

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-13 16:08:31 -07:00
Miklos Szeredi e5362560a0 fuse: readdir: check for slash in names
commit efeb9e60d48f7778fdcad4a0f3ad9ea9b19e5dfd upstream.

Userspace can add names containing a slash character to the directory
listing.  Don't allow this as it could cause all sorts of trouble.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-26 17:18:30 -07:00
Maxim Patlasov 4e20830311 fuse: hotfix truncate_pagecache() issue
commit 06a7c3c2781409af95000c60a5df743fd4e2f8b4 upstream.

The way how fuse calls truncate_pagecache() from fuse_change_attributes()
is completely wrong. Because, w/o i_mutex held, we never sure whether
'oldsize' and 'attr->size' are valid by the time of execution of
truncate_pagecache(inode, oldsize, attr->size). In fact, as soon as we
released fc->lock in the middle of fuse_change_attributes(), we completely
loose control of actions which may happen with given inode until we reach
truncate_pagecache. The list of potentially dangerous actions includes
mmap-ed reads and writes, ftruncate(2) and write(2) extending file size.

The typical outcome of doing truncate_pagecache() with outdated arguments
is data corruption from user point of view. This is (in some sense)
acceptable in cases when the issue is triggered by a change of the file on
the server (i.e. externally wrt fuse operation), but it is absolutely
intolerable in scenarios when a single fuse client modifies a file without
any external intervention. A real life case I discovered by fsx-linux
looked like this:

1. Shrinking ftruncate(2) comes to fuse_do_setattr(). The latter sends
FUSE_SETATTR to the server synchronously, but before getting fc->lock ...
2. fuse_dentry_revalidate() is asynchronously called. It sends FUSE_LOOKUP
to the server synchronously, then calls fuse_change_attributes(). The
latter updates i_size, releases fc->lock, but before comparing oldsize vs
attr->size..
3. fuse_do_setattr() from the first step proceeds by acquiring fc->lock and
updating attributes and i_size, but now oldsize is equal to
outarg.attr.size because i_size has just been updated (step 2). Hence,
fuse_do_setattr() returns w/o calling truncate_pagecache().
4. As soon as ftruncate(2) completes, the user extends file size by
write(2) making a hole in the middle of file, then reads data from the hole
either by read(2) or mmap-ed read. The user expects to get zero data from
the hole, but gets stale data because truncate_pagecache() is not executed
yet.

The scenario above illustrates one side of the problem: not truncating the
page cache even though we should. Another side corresponds to truncating
page cache too late, when the state of inode changed significantly.
Theoretically, the following is possible:

1. As in the previous scenario fuse_dentry_revalidate() discovered that
i_size changed (due to our own fuse_do_setattr()) and is going to call
truncate_pagecache() for some 'new_size' it believes valid right now. But
by the time that particular truncate_pagecache() is called ...
2. fuse_do_setattr() returns (either having called truncate_pagecache() or
not -- it doesn't matter).
3. The file is extended either by write(2) or ftruncate(2) or fallocate(2).
4. mmap-ed write makes a page in the extended region dirty.

The result will be the lost of data user wrote on the fourth step.

The patch is a hotfix resolving the issue in a simplistic way: let's skip
dangerous i_size update and truncate_pagecache if an operation changing
file size is in progress. This simplistic approach looks correct for the
cases w/o external changes. And to handle them properly, more sophisticated
and intrusive techniques (e.g. NFS-like one) would be required. I'd like to
postpone it until the issue is well discussed on the mailing list(s).

Changed in v2:
 - improved patch description to cover both sides of the issue.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-26 17:18:30 -07:00
Anand Avati eb97a45d12 fuse: invalidate inode attributes on xattr modification
commit d331a415aef98717393dda0be69b7947da08eba3 upstream.

Calls like setxattr and removexattr result in updation of ctime.
Therefore invalidate inode attributes to force a refresh.

Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-26 17:18:30 -07:00
Maxim Patlasov 3002e63bec fuse: postpone end_page_writeback() in fuse_writepage_locked()
commit 4a4ac4eba1010ef9a804569058ab29e3450c0315 upstream.

The patch fixes a race between ftruncate(2), mmap-ed write and write(2):

1) An user makes a page dirty via mmap-ed write.
2) The user performs shrinking truncate(2) intended to purge the page.
3) Before fuse_do_setattr calls truncate_pagecache, the page goes to
   writeback. fuse_writepage_locked fills FUSE_WRITE request and releases
   the original page by end_page_writeback.
4) fuse_do_setattr() completes and successfully returns. Since now, i_mutex
   is free.
5) Ordinary write(2) extends i_size back to cover the page. Note that
   fuse_send_write_pages do wait for fuse writeback, but for another
   page->index.
6) fuse_writepage_locked proceeds by queueing FUSE_WRITE request.
   fuse_send_writepage is supposed to crop inarg->size of the request,
   but it doesn't because i_size has already been extended back.

Moving end_page_writeback to the end of fuse_writepage_locked fixes the
race because now the fact that truncate_pagecache is successfully returned
infers that fuse_writepage_locked has already called end_page_writeback.
And this, in turn, infers that fuse_flush_writepages has already called
fuse_send_writepage, and the latter used valid (shrunk) i_size. write(2)
could not extend it because of i_mutex held by ftruncate(2).

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-26 17:18:30 -07:00
Miklos Szeredi 223828d8a6 fuse: readdirplus: sanity checks
commit a28ef45cbb1e7fadd5159deb17b02de15c6e4aaf upstream.

Add sanity checks before adding or updating an entry with data received
from readdirplus.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-28 16:30:08 -07:00
Miklos Szeredi dc2a6c2d3e fuse: readdirplus: fix instantiate
commit 2914941e3178d84a216fc4eb85292dfef3b6d628 upstream.

Fuse does instantiation slightly differently from NFS/CIFS which use
d_materialise_unique().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-28 16:30:07 -07:00
Niels de Vos 05ac7b3a7d fuse: readdirplus: fix dentry leak
commit 53ce9a3364de0723b27d861de93bfc882f7db050 upstream.

In case d_lookup() returns a dentry with d_inode == NULL, the dentry is not
returned with dput(). This results in triggering a BUG() in
shrink_dcache_for_umount_subtree():

  BUG: Dentry ...{i=0,n=...} still in use (1) [unmount of fuse fuse]

[SzM: need to d_drop() as well]

Reported-by: Justin Clift <jclift@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Brian Foster <bfoster@redhat.com>
Tested-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-28 16:30:07 -07:00
Maxim Patlasov 14c14414d1 fuse: hold i_mutex in fuse_file_fallocate()
Changing size of a file on server and local update (fuse_write_update_size)
should be always protected by inode->i_mutex. Otherwise a race like this is
possible:

1. Process 'A' calls fallocate(2) to extend file (~FALLOC_FL_KEEP_SIZE).
fuse_file_fallocate() sends FUSE_FALLOCATE request to the server.
2. Process 'B' calls ftruncate(2) shrinking the file. fuse_do_setattr()
sends shrinking FUSE_SETATTR request to the server and updates local i_size
by i_size_write(inode, outarg.attr.size).
3. Process 'A' resumes execution of fuse_file_fallocate() and calls
fuse_write_update_size(inode, offset + length). But 'offset + length' was
obsoleted by ftruncate from previous step.

Changed in v2 (thanks Brian and Anand for suggestions):
 - made relation between mutex_lock() and fuse_set_nowrite(inode) more
   explicit and clear.
 - updated patch description to use ftruncate(2) in example

Signed-off-by: Maxim V. Patlasov <MPatlasov@parallels.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-06-18 01:39:03 +02:00
Maxim Patlasov e5c5f05dca fuse: fix alignment in short read optimization for async_dio
The bug was introduced with async_dio feature: trying to optimize short reads,
we cut number-of-bytes-to-read to i_size boundary. Hence the following example:

	truncate --size=300 /mnt/file
	dd if=/mnt/file of=/dev/null iflag=direct

led to FUSE_READ request of 300 bytes size. This turned out to be problem
for userspace fuse implementations who rely on assumption that kernel fuse
does not change alignment of request from client FS.

The patch turns off the optimization if async_dio is disabled. And, if it's
enabled, the patch fixes adjustment of number-of-bytes-to-read to preserve
alignment.

Note, that we cannot throw out short read optimization entirely because
otherwise a direct read of a huge size issued on a tiny file would generate
a huge amount of fuse requests and most of them would be ACKed by userspace
with zero bytes read.

Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-06-03 15:15:42 +02:00
Brian Foster c9ecf989cc fuse: return -EIOCBQUEUED from fuse_direct_IO() for all async requests
If request submission fails for an async request (i.e.,
get_user_pages() returns -ERESTARTSYS), we currently skip the
-EIOCBQUEUED return and drop into wait_for_sync_kiocb() forever.

Avoid this by always returning -EIOCBQUEUED for async requests. If
an error occurs, the error is passed into fuse_aio_complete(),
returned via aio_complete() and thus propagated to userspace via
io_getevents().

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-06-03 15:15:42 +02:00
Miklos Szeredi 28420dad23 fuse: fix readdirplus Oops in fuse_dentry_revalidate
Fix bug introduced by commit 4582a4ab2a "FUSE: Adapt readdirplus to application
usage patterns".

We need to check for a positive dentry; negative dentries are not added by
readdirplus.  Secondly we need to advise the use of readdirplus on the *parent*,
otherwise the whole thing is useless.  Thirdly all this is only relevant if
"readdirplus_auto" mode is selected by the filesystem.

We advise the use of readdirplus only if the dentry was still valid.  If we had
to redo the lookup then there was no use in doing the -plus version.

Reported-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Feng Shuo <steve.shuo.feng@gmail.com>
CC: stable@vger.kernel.org
2013-06-03 14:40:22 +02:00
Brian Foster bee6c30780 fuse: update inode size and invalidate attributes on fallocate
An fallocate request without FALLOC_FL_KEEP_SIZE set can extend the
size of a file. Update the inode size after a successful fallocate.

Also invalidate the inode attributes after a successful fallocate
to ensure we pick up the latest attribute values (i.e., i_blocks).

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-05-20 16:58:58 +02:00
Brian Foster 3634a63278 fuse: truncate pagecache range on hole punch
fuse supports hole punch via the fallocate() FALLOC_FL_PUNCH_HOLE
interface. When a hole punch is passed through, the page cache
is not cleared and thus allows reading stale data from the cache.

This is easily demonstrable (using FOPEN_KEEP_CACHE) by reading a
smallish random data file into cache, punching a hole and creating
a copy of the file. Drop caches or remount and observe that the
original file no longer matches the file copied after the hole
punch. The original file contains a zeroed range and the latter
file contains stale data.

Protect against writepage requests in progress and punch out the
associated page cache range after a successful client fs hole
punch.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-05-20 16:58:57 +02:00
Brian Foster de82b92301 fuse: allocate for_background dio requests based on io->async state
Commit 8b41e671 introduced explicit background checking for fuse_req
structures with BUG_ON() checks for the appropriate type of request in
in the associated send functions. Commit bcba24cc introduced the ability
to send dio requests as background requests but does not update the
request allocation based on the type of I/O request. As a result, a
BUG_ON() triggers in the fuse_request_send_background() background path if
an async I/O is sent.

Allocate a request based on the async state of the fuse_io_priv to avoid
the BUG.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-05-14 18:25:44 +02:00
Linus Torvalds 5af43c24ca Merge branch 'akpm' (incoming from Andrew)
Merge more incoming from Andrew Morton:

 - Various fixes which were stalled or which I picked up recently

 - A large rotorooting of the AIO code.  Allegedly to improve
   performance but I don't really have good performance numbers (I might
   have lost the email) and I can't raise Kent today.  I held this out
   of 3.9 and we could give it another cycle if it's all too late/scary.

I ended up taking only the first two thirds of the AIO rotorooting.  I
left the percpu parts and the batch completion for later.  - Linus

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (33 commits)
  aio: don't include aio.h in sched.h
  aio: kill ki_retry
  aio: kill ki_key
  aio: give shared kioctx fields their own cachelines
  aio: kill struct aio_ring_info
  aio: kill batch allocation
  aio: change reqs_active to include unreaped completions
  aio: use cancellation list lazily
  aio: use flush_dcache_page()
  aio: make aio_read_evt() more efficient, convert to hrtimers
  wait: add wait_event_hrtimeout()
  aio: refcounting cleanup
  aio: make aio_put_req() lockless
  aio: do fget() after aio_get_req()
  aio: dprintk() -> pr_debug()
  aio: move private stuff out of aio.h
  aio: add kiocb_cancel()
  aio: kill return value of aio_complete()
  char: add aio_{read,write} to /dev/{null,zero}
  aio: remove retry-based AIO
  ...
2013-05-07 20:49:51 -07:00
Kent Overstreet a27bb332c0 aio: don't include aio.h in sched.h
Faster kernel compiles by way of fewer unnecessary includes.

[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-07 20:16:25 -07:00
Linus Torvalds a26ea93a3d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
 "This contains two patchsets from Maxim Patlasov.

  The first reworks the request throttling so that only async requests
  are throttled.  Wakeup of waiting async requests is also optimized.

  The second series adds support for async processing of direct IO which
  optimizes direct IO and enables the use of the AIO userspace
  interface."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: add flag to turn on async direct IO
  fuse: truncate file if async dio failed
  fuse: optimize short direct reads
  fuse: enable asynchronous processing direct IO
  fuse: make fuse_direct_io() aware about AIO
  fuse: add support of async IO
  fuse: move fuse_release_user_pages() up
  fuse: optimize wake_up
  fuse: implement exclusive wakeup for blocked_waitq
  fuse: skip blocking on allocations of synchronous requests
  fuse: add flag fc->initialized
  fuse: make request allocations for background processing explicit
2013-05-07 10:12:32 -07:00
Miklos Szeredi 60b9df7a54 fuse: add flag to turn on async direct IO
Without async DIO write requests to a single file were always serialized.
With async DIO that's no longer the case.

So don't turn on async DIO by default for fear of breaking backward
compatibility.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-05-01 14:37:21 +02:00
Maxim Patlasov efb9fa9e91 fuse: truncate file if async dio failed
The patch improves error handling in fuse_direct_IO(): if we successfully
submitted several fuse requests on behalf of synchronous direct write
extending file and some of them failed, let's try to do our best to clean-up.

Changed in v2: reuse fuse_do_setattr(). Thanks to Brian for suggestion.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-18 10:55:24 +02:00
Maxim Patlasov 439ee5f0c5 fuse: optimize short direct reads
If user requested direct read beyond EOF, we can skip sending fuse requests
for positions beyond EOF because userspace would ACK them with zero bytes read
anyway. We can trust to i_size in fuse_direct_IO for such cases because it's
called from fuse_file_aio_read() and the latter updates fuse attributes
including i_size.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-17 21:50:59 +02:00
Maxim Patlasov bcba24ccdc fuse: enable asynchronous processing direct IO
In case of synchronous DIO request (i.e. read(2) or write(2) for a file
opened with O_DIRECT), the patch submits fuse requests asynchronously, but
waits for their completions before return from fuse_direct_IO().

In case of asynchronous DIO request (i.e. libaio io_submit() or a file opened
with O_DIRECT), the patch submits fuse requests asynchronously and return
-EIOCBQUEUED immediately.

The only special case is async DIO extending file. Here the patch falls back
to old behaviour because we can't return -EIOCBQUEUED and update i_size later,
without i_mutex hold. And we have no method to wait on real async I/O
requests.

The patch also clean __fuse_direct_write() up: it's better to update i_size
in its callers. Thanks Brian for suggestion.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-17 21:50:59 +02:00
Maxim Patlasov 36cf66ed9f fuse: make fuse_direct_io() aware about AIO
The patch implements passing "struct fuse_io_priv *io" down the stack up to
fuse_send_read/write where it is used to submit request asynchronously.
io->async==0 designates synchronous processing.

Non-trivial part of the patch is changes in fuse_direct_io(): resources
like fuse requests and user pages cannot be released immediately in async
case.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-17 21:50:59 +02:00
Maxim Patlasov 01e9d11a3e fuse: add support of async IO
The patch implements a framework to process an IO request asynchronously. The
idea is to associate several fuse requests with a single kiocb by means of
fuse_io_priv structure. The structure plays the same role for FUSE as 'struct
dio' for direct-io.c.

The framework is supposed to be used like this:
 - someone (who wants to process an IO asynchronously) allocates fuse_io_priv
   and initializes it setting 'async' field to non-zero value.
 - as soon as fuse request is filled, it can be submitted (in non-blocking way)
   by fuse_async_req_send()
 - when all submitted requests are ACKed by userspace, io->reqs drops to zero
   triggering aio_complete()

In case of IO initiated by libaio, aio_complete() will finish processing the
same way as in case of dio_complete() calling aio_complete(). But the
framework may be also used for internal FUSE use when initial IO request
was synchronous (from user perspective), but it's beneficial to process it
asynchronously. Then the caller should wait on kiocb explicitly and
aio_complete() will wake the caller up.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-17 21:50:59 +02:00
Maxim Patlasov 187c5c3633 fuse: move fuse_release_user_pages() up
fuse_release_user_pages() will be indirectly used by fuse_send_read/write
in future patches.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-17 21:50:58 +02:00
Miklos Szeredi 3c18ef8117 fuse: optimize wake_up
Normally blocked_waitq will be inactive, so optimize this case.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-17 21:50:58 +02:00
Maxim Patlasov 722d2bea8c fuse: implement exclusive wakeup for blocked_waitq
The patch solves thundering herd problem. So far as previous patches ensured
that only allocations for background may block, it's safe to wake up one
waiter. Whoever it is, it will wake up another one in request_end() afterwards.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-17 12:31:45 +02:00
Maxim Patlasov 0aada88476 fuse: skip blocking on allocations of synchronous requests
A task may have at most one synchronous request allocated. So these
requests need not be otherwise limited.

The patch re-works fuse_get_req() to follow this idea.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-17 12:31:45 +02:00
Maxim Patlasov 796523fb24 fuse: add flag fc->initialized
Existing flag fc->blocked is used to suspend request allocation both in case
of many background request submitted and period of time before init_reply
arrives from userspace. Next patch will skip blocking allocations of
synchronous request (disregarding fc->blocked). This is mostly OK, but
we still need to suspend allocations if init_reply is not arrived yet. The
patch introduces flag fc->initialized which will serve this purpose.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-17 12:31:44 +02:00
Maxim Patlasov 8b41e6715e fuse: make request allocations for background processing explicit
There are two types of processing requests in FUSE: synchronous (via
fuse_request_send()) and asynchronous (via adding to fc->bg_queue).

Fortunately, the type of processing is always known in advance, at the time
of request allocation. This preparatory patch utilizes this fact making
fuse_get_req() aware about the type. Next patches will use it.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-04-17 12:31:44 +02:00
Al Viro 6447a3cf19 get rid of pipe->inode
it's used only as a flag to distinguish normal pipes/FIFOs from the
internal per-task one used by file-to-file splice.  And pipe->files
would work just as well for that purpose...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:13:01 -04:00
Al Viro 8d71db4f08 lift sb_start_write/sb_end_write out of ->aio_write()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:12:55 -04:00