kernel_optimize_test

Author	SHA1	Message	Date
Mark Fasheh	e8aed3450c	ocfs2: Re-journal buffers after transaction extend ocfs2_extend_trans() might call journal_restart() which will commit dirty buffers and then restart the transaction. This means that any buffers which still need changes should be passed to journal_access() again. Some paths during extend weren't doing this right. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-12-17 10:51:23 -08:00
Mark Fasheh	0879c584ff	ocfs2: Allow for debugging of transaction extends The nastiest cases of transaction extends are also the rarest. We can expose them more quickly at the expense of performance by going straight to the journal_restart() in ocfs2_extend_trans(). Wrap things in OCFS2_DEBUG_FS so that we only do this when "expensive debugging" is turned on. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-12-17 10:51:14 -08:00
Mark Fasheh	92295d8054	ocfs2: Don't panic when truncating an empty extent This BUG_ON() was unintentionally left in after the sparse file support was written. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-12-17 10:51:04 -08:00
Mark Fasheh	a86370fbb6	ocfs2: fix exit-while-locked bug in ocfs2_queue_orphans() We're holding the cluster lock when a failure might happen in ocfs2_dir_foreach() so it needs to be released. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-12-17 10:49:43 -08:00
Al Viro	97bd7919e2	remove nonsense force-casts from ocfs2 endianness annotations in networking code had been in place for quite a while; in particular, sin_port and s_addr are annotated as big-endian. Code in ocfs2 had __force casts added apparently to shut the sparse warnings up; of course, these days they only serve to produce warnings for no reason whatsoever... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:25:20 -08:00
Mark Fasheh	b1967d0edd	ocfs2: reverse inline-data truncate args ocfs2_truncate() and ocfs2_remove_inode_range() had reversed their "set i_size" arguments to ocfs2_truncate_inline(). Fix things so that truncate sets i_size, and punching a hole ignores it. This exposed a problem where punching a hole in an inline-data file wasn't updating the page cache, so fix that too. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:03 -08:00
Mark Fasheh	0d8a4e0cd6	ocfs2: Fix comparison in ocfs2_size_fits_inline_data() This was causing us to prematurely push out inline data by one byte. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:03 -08:00
Mark Fasheh	bccb9dad89	ocfs2: Remove bug statement in ocfs2_dentry_iput() The existing bug statement didn't take into account unhashed dentries which might not have a cluster lock on them. This could happen if a node exporting the file system via NFS is rebooted, re-exported to nfs clients and then unmounted. It's fine in this case to not have a dentry cluster lock. Just remove the bug statement and replace it with an error print, which does the proper checks. Though we want to know if something has happened which might have prevented a cluster lock from being created, it's definitely not necessary to panic the machine for this. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:02 -08:00
Jan Kara	5a58c3ef22	[PATCH] ocfs2: Remove expensive bitmap scanning Enable expensive bitmap scanning only if DEBUG option is enabled. The bitmap scanning quite loads the CPU and on my machine the write throughput of dd if=/dev/zero of=/ocfs2/file bs=1M count=500 conv=sync improves from 37 MB/s to 45.4 MB/s in local mode... Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:02 -08:00
Mark Fasheh	a46043e08f	ocfs2: log valid inode # on bad inode If the inode block isn't valid then we don't want to print the value from that, instead print the block number which was passed in (which should always be correct). Also, turn this into a debug print for now - folks who hit an actual problem always have other logs indicating what the source is. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:02 -08:00
Mark Fasheh	ef9f86ceb6	ocfs2: Filter -ENOSPC in mlog_errno() It's almost never worth printing in that situation and we keep forgetting to manually filter it out. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:01 -08:00
Joe Perches	2759236f84	[PATCH] fs/ocfs2: Add missing "space" Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:01 -08:00
Mark Fasheh	e001e796e4	ocfs2: Reset journal parameters after s_mount_opt update Right now we're just setting them from the existing parameters, not the new ones that a remount specified. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:01 -08:00
Trond Myklebust	91cf45f02a	[NET]: Add the helper kernel_sock_shutdown() ...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers. Looking at the sock->op->shutdown() handlers, it looks as if all of them take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the RCV_SHUTDOWN/SEND_SHUTDOWN arguments. Add a helper, and then define the SHUT_* enum to ensure that kernel users of shutdown() don't get confused. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-12 18:10:39 -08:00
Srinivas Eeda	e325a88f17	ocfs2: fix rename vs unlink race If another node unlinks the destination while ocfs2_rename() is waiting on a cluster lock, ocfs2_rename() simply logs an error and continues. This causes a crash because the renaming node is now trying to delete a non-existent inode. The correct solution is to return -ENOENT. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:35:40 -08:00
Jan Kara	bc7e97cbdd	[PATCH] Fix possibly too long write in o2hb_setup_one_bio() We should subtract start of our IO from PAGE_CACHE_SIZE to get the right length of the write we want to perform. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:35:35 -08:00
Mark Fasheh	4e9563fd55	ocfs2: fix write() performance regression On file systems which don't support sparse files, Ocfs2_map_page_blocks() was reading blocks on appending writes. This caused write performance to suffer dramatically. Fix this by detecting an appending write on a nonsparse fs and skipping the read. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:35:29 -08:00
Mark Fasheh	9ea2d32f40	ocfs2: Commit journal on sync writes We're missing a meta data commit for extending sync writes. In thoery, write could return with the meta data required to read the data uncommitted to disk. Fix that by detecting an allocating write and forcing a journal commit in the sync case. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:32:00 -08:00
Mark Fasheh	9f70968af3	ocfs2: Re-order iput in ocfs2_drop_dentry_lock Do this to avoid a theoretical (I haven't seen this in practice) race where the downconvert thread might drop the dentry lock, allowing a remote unlink to proceed before dropping the inode locks. This could bounce access to the orphan dir between nodes. There doesn't seem to be a need to do the same in ocfs2_dentry_iput() as that's never called for the last ref drop from the downconvert thread. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:31:52 -08:00
Mark Fasheh	019d1b2247	ocfs2: Create locks at initially requested level If we have not yet created a cluster lock, ocfs2_cluster_lock() will first create it at NLMODE, and then convert the lock to either PRMODE or EXMODE (whichever is requested). Change ocfs2_cluster_lock() to just create the lock at the initially requested level. ocfs2_locking_ast() handles this case fine, so the only update required was in setup of locking state. This should reduce the number of network messages required for a new lock by one, providing an incremental performance enhancement. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:31:45 -08:00
Roel Kluin	3cf0c507dd	[PATCH] Fix priority mistakes in fs/ocfs2/{alloc.c, dlmglue.c} Fixes priority mistakes similar to '!x & y' Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:31:39 -08:00
Adrian Bunk	0af4bd3887	[2.6 patch] make ocfs2_find_entry_el() static ocfs2_find_entry_el() can become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:31:06 -08:00
Christoph Hellwig	3965516440	exportfs: make struct export_operations const Now that nfsd has stopped writing to the find_exported_dentry member we an mark the export_operations const Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:21 -07:00
Christoph Hellwig	644f9ab3b0	ocfs2: new export ops OCFS2 has it's own 64bit-firendly filehandle format so we can't use the generic helpers here. I'll add a struct for the types later. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Robert P. J. Day	3a4fa0a25d	Fix misspellings of "system", "controller", "interrupt" and "necessary". Fix the various misspellings of "system", controller", "interrupt" and "[un]necessary". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-19 23:10:43 +02:00
Pavel Emelyanov	ba25f9dcc4	Use helpers to obtain task pid in printks The task_struct->pid member is going to be deprecated, so start using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in the kernel. The first thing to start with is the pid, printed to dmesg - in this case we may safely use task_pid_nr(). Besides, printks produce more (much more) than a half of all the explicit pid usage. [akpm@linux-foundation.org: git-drm went and changed lots of stuff] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:43 -07:00
Mathieu Desnoyers	2b47c3611d	Fix f_version type: should be u64 instead of unsigned long Fix f_version type: should be u64 instead of long There is a type inconsistency between struct inode i_version and struct file f_version. fs.h: struct inode u64 i_version; and struct file unsigned long f_version; Users do: fs/ext3/dir.c: if (filp->f_version != inode->i_version) { So why isn't f_version a u64 ? It becomes a problem if versions gets higher than 2^32 and we are on an architecture where longs are 32 bits. This patch changes the f_version type to u64, and updates the users accordingly. It applies to 2.6.23-rc2-mm2. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Martin Bligh <mbligh@google.com> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: <linux-ext4@vger.kernel.org> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:53 -07:00
Christoph Lameter	4ba9b9d0ba	Slab API: remove useless ctor parameter and reorder parameters Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void object, struct kmem_cache s, unsigned long flags) to ctor(struct kmem_cache s, void object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Peter Zijlstra	e0bf68ddec	mm: bdi init hooks provide BDI constructor/destructor hooks [akpm@linux-foundation.org: compile fix] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Nick Piggin	b6af1bcd87	ocfs2: convert to new aops Plug ocfs2 into the ->write_begin and ->write_end aops. A bunch of custom code is now gone - the iovec iteration stuff during write and the ocfs2 splice write actor. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:42:58 -07:00
Linus Torvalds	efefc6eb38	Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (75 commits) PM: merge device power-management source files sysfs: add copyrights kobject: update the copyrights kset: add some kerneldoc to help describe what these strange things are Driver core: rename ktype_edd and ktype_efivar Driver core: rename ktype_driver Driver core: rename ktype_device Driver core: rename ktype_class driver core: remove subsystem_init() sysfs: move sysfs file poll implementation to sysfs_open_dirent sysfs: implement sysfs_open_dirent sysfs: move sysfs_dirent->s_children into sysfs_dirent->s_dir sysfs: make sysfs_root a regular directory dirent sysfs: open code sysfs_attach_dentry() sysfs: make s_elem an anonymous union sysfs: make bin attr open get active reference of parent too sysfs: kill unnecessary NULL pointer check in sysfs_release() sysfs: kill unnecessary sysfs_get() in open paths sysfs: reposition sysfs_dirent->s_mode. sysfs: kill sysfs_update_file() ...	2007-10-12 15:49:37 -07:00
Greg Kroah-Hartman	34980ca8fa	Drivers: clean up direct setting of the name of a kset A kset should not have its name set directly, so dynamically set the name at runtime. This is needed to remove the static array in the kobject structure which will be changed in a future patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-10-12 14:51:02 -07:00
Mark Fasheh	e7b3401960	ocfs2: Optionally return filldir errors Modify ocfs2_dir_foreach_blk() to optionally return any error from the filldir callback. This way ocfs2_dirforeach() can terminate early, as opposed to always passing through the entire directory. This fixes a bug introduced during a previous code refactor where ocfs2_empty_dir() would loop infinitely. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-10-12 11:54:41 -07:00
Mark Fasheh	5b6a3a2b4a	ocfs2: Write support for directories with inline data Create all new directories with OCFS2_INLINE_DATA_FL and the inline data bytes formatted as an empty directory. Inode size field reflects the actual amount of inline data available, which makes searching for dirent space very similar to the regular directory search. Inline-data directories are automatically pushed out to extents on any insert request which is too large for the available space. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:41 -07:00
Mark Fasheh	23193e513d	ocfs2: Read support for directories with inline data This splits out extent based directory read support and implements inline-data versions of those functions. All knowledge of inline-data versus extent based directories is internalized. For lookups the code uses ocfs2_find_entry_id(), full dir iterations make use of ocfs2_dir_foreach_blk_id(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:40 -07:00
Mark Fasheh	1afc32b952	ocfs2: Write support for inline data This fixes up write, truncate, mmap, and RESVSP/UNRESVP to understand inline inode data. For the most part, the changes to the core write code can be relied on to do the heavy lifting. Any code calling ocfs2_write_begin (including shared writeable mmap) can count on it doing the right thing with respect to growing inline data to an extent tree. Size reducing truncates, including UNRESVP can simply zero that portion of the inode block being removed. Size increasing truncatesm, including RESVP have to be a little bit smarter and grow the inode to an extent tree if necessary. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:40 -07:00
Mark Fasheh	6798d35a31	ocfs2: Read support for inline data This hooks up ocfs2_readpage() to populate a page with data from an inode block. Direct IO reads from inline data are modified to fall back to buffered I/O. Appropriate checks are also placed in the extent map code to avoid reading an extent list when inline data might be stored. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:39 -07:00
Mark Fasheh	15b1e36bdb	ocfs2: Structure updates for inline data Add the disk, network and memory structures needed to support data in inode. Struct ocfs2_inline_data is defined and embedded in ocfs2_dinode for storing inline data. A new inode field, i_dyn_features, is added to facilitate tracking of dynamic inode state. Since it will be used often, we want to mirror it on ocfs2_inode_info, and transfer it via the meta data lvb. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:39 -07:00
Mark Fasheh	8553cf4f36	ocfs2: Cleanup dirent size check The check to see if a new dirent would fit in an old one is pretty ugly, and it's done at least twice. Clean things up by putting this in it's own easier-to-read function. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:39 -07:00
Mark Fasheh	38760e2432	ocfs2: Rename cleanups ocfs2_rename() does direct manipulation of the dirent it's gotten back from a directory search. Wrap this manipulation inside of a function so that we can transparently change directory update behavior in the future. As an added bonus, this gets rid of an ugly macro. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:38 -07:00
Mark Fasheh	be94d11704	ocfs2: Provide convenience function for ino lookup A couple paths which needed to just match a parent dir + name pair to an inode number were a bit messy because they had to deal with ocfs2_find_files_on_disk() which returns a larger number of values. Provide a convenience function, ocfs2_lookup_ino_from_name() which internalizes all the extra accounting. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:38 -07:00
Mark Fasheh	0bfbbf62a8	ocfs2: Implement ocfs2_empty_dir() as a caller of ocfs2_dir_foreach() We can preserve the behavior of ocfs2_empty_dir(), while getting rid of the open coded directory walk by just providing a smart filldir callback. This also automatically gets to use the dir readahead code, though in this case any advantage is minor at best. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:37 -07:00
Mark Fasheh	5eae5b96fc	ocfs2: Remove open coded readdir() ocfs2_queue_orphans() has an open coded readdir loop which can easily just use a directory accessor function. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:37 -07:00
Mark Fasheh	7e8536797d	ocfs2: Pass raw u64 to filldir filldir_t can take this, so don't turn de->inode into a 32 bit value. Right now this doesn't make a difference since no ocfs2 inodes overflow that, but it could be a nasty surprise later on if some kernel code is calling ocfs2_dir_foreach_blk() and expecting real inode numbers back... Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:37 -07:00
Mark Fasheh	b8bc5f4fde	ocfs2: Abstract out core dir listing functionality Put this in it's own function so that the functionality can be overridden. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:36 -07:00
Mark Fasheh	316f4b9f98	ocfs2: Move directory manipulation code into dir.c The code for adding, removing, deleting directory entries was splattered all over namei.c. I'd rather have this all centralized, so that it's easier to make changes for inline dir data, and eventually indexed directories. None of the code in any of the functions was changed. I only removed the static keyword from some prototypes so that they could be exported. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:36 -07:00
Mark Fasheh	1d410a6e33	ocfs2: Small refactor of truncate zeroing code We'll want to reuse most of this when pushing inline data back out to an extent. Keeping this part as a seperate patch helps to keep the upcoming changes for write support uncluttered. The core portion of ocfs2_zero_cluster_pages() responsible for making sure a page is mapped and properly dirtied is abstracted out into it's own function, ocfs2_map_and_dirty_page(). Actual functionality doesn't change, though zeroing becomes optional. We also turn part of ocfs2_free_write_ctxt() into a common function for unlocking and freeing a page array. This operation is very common (and uniform) for Ocfs2 cluster sizes greater than page size, so it makes sense to keep the code in one place. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:35 -07:00
Mark Fasheh	65ed39d6ca	ocfs2: move nonsparse hole-filling into ocfs2_write_begin() By doing this, we can remove any higher level logic which has to have knowledge of btree functionality - any callers of ocfs2_write_begin() can now expect it to do anything necessary to prepare the inode for new data. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by: Joel Becker <joel.becker@oracle.com>	2007-10-12 11:54:35 -07:00
Mark Fasheh	92e91ce2a3	ocfs2: Sync ocfs2_fs.h with ocfs2-tools ocfs2-tools added some on-disk fields and flags which are used by tunefs.ocfs2. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-10-12 11:54:34 -07:00
Denis Cheng	bddb8eb37f	[PATCH] fs/ocfs2/: removed unneeded initial value and function's return value Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-10-12 11:54:34 -07:00
Sunil Mushran	d550071c03	ocfs2: Implement show_options() Implement sops->show_options() so as to allow /proc/mounts to show the mount options. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-10-12 11:54:33 -07:00
Mark Fasheh	19b613d410	ocfs2: Clear slot map when umounting a local volume This is technically harmless (recovery will clean it out later), but leaves a bogus entry in the slot_map which really shouldn't be there. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-10-12 11:54:33 -07:00
Mark Fasheh	015452b15f	ocfs2: Remove unused structure field c_used_tail_recs in struct ocfs2_merge_ctxt is only ever set, so we can remove it. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-10-12 11:54:32 -07:00
Tao Mao	518d7269f3	ocfs2: remove unused variable delete_tail_recs in ocfs2_try_to_merge_extent() was only ever set, remove it. Signed-off-by: Tao Mao <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-10-12 11:54:32 -07:00
Tao Mao	c77534f6fb	ocfs2: remove mostly unused field from insert structure ocfs2_insert_type->ins_free_records was only used in one place, and was set incorrectly in most places. We can free up some memory and lose some code by removing this. * Small warning fixup contributed by Andrew Mortom <akpm@linux-foundation.org> Signed-off-by: Tao Mao <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-10-12 11:54:32 -07:00
Al Viro	782e3b3b38	Fix up more bio fallout Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-12 00:29:50 -07:00
NeilBrown	6712ecf8f6	Drop 'size' argument from bio_endio and bi_end_io As bi_end_io is only called once when the reqeust is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:57 +02:00
Sunil Mushran	bda0233b89	ocfs2: Unlock mutex in local alloc failure case The fs was not unlocking the local alloc inode mutex in the code path in which it failed to find a window of free bits in the global bitmap. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-10-03 11:14:45 -07:00
Sunil Mushran	813d974c53	ocfs2: Pack vote message and response structures The ocfs2_vote_msg and ocfs2_response_msg structs needed to be packed to ensure similar sizeofs in 32-bit and 64-bit arches. Without this, we had inadvertantly broken 32/64 bit cross mounts. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-20 15:06:10 -07:00
Mark Fasheh	5c26a7b70f	ocfs2: Don't double set write parameters The target page offsets were being incorrectly set a second time in ocfs2_prepare_page_for_write(), which was causing problems on a 16k page size kernel. Additionally, ocfs2_write_failure() was incorrectly using those parameters instead of the parameters for the individual page being cleaned up. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-20 15:06:10 -07:00
Mark Fasheh	db56246c69	ocfs2: Fix pos/len passed to ocfs2_write_cluster This was broken for file systems whose cluster size is greater than page size. Pos needs to be incremented as we loop through the descriptors, and len needs to be capped to the size of a single cluster. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-20 15:06:09 -07:00
Mark Fasheh	415cb80037	ocfs2: Allow smaller allocations during large writes The ocfs2 write code loops through a page much like the block code, except that ocfs2 allocation units can be any size, including larger than page size. Typically it's equal to or larger than page size - most kernels run 4k pages, the minimum ocfs2 allocation (cluster) size. Some changes introduced during 2.6.23 changed the way writes to pages are handled, and inadvertantly broke support for > 4k page size. Instead of just writing one cluster at a time, we now handle the whole page in one pass. This means that multiple (small) seperate allocations might happen in the same pass. The allocation code howver typically optimizes by getting the maximum which was reserved. This triggered a BUG_ON in the extend code where it'd ask for a single bit (for one part of a > 4k page) and get back more than it asked for. Fix this by providing a variant of the high level allocation function which allows the caller to specify a maximum. The traditional function remains and just calls the new one with a maximum determined from the initial reservation. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-20 15:06:09 -07:00
Mark Fasheh	e535e2efd2	ocfs2: Fix calculation of i_blocks during truncate We were setting i_blocks too early - before truncating any allocation. Correct things to set i_blocks after the allocation change. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-11 11:39:46 -07:00
tao.ma@oracle.com	30b8548f2c	[PATCH] ocfs2: Fix a wrong cluster calculation. In ocfs2_alloc_write_write_ctxt, the written clusters length is calculated by the byte length only. This may cause some problems if we start to write at some position in the end of one cluster and last to a second cluster while the "len" is smaller than a cluster size. In that case, we have to write 2 clusters actually. So we have to take the start position into consideration also. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-11 11:39:05 -07:00
Tiger Yang	c0123adef6	[PATCH] ocfs2: fix mount option parsing For some mount option types, ocfs2_parse_options() will try to access sb->s_fs_info to get at the ocfs2 private superblock. Unfortunately, that hasn't been allocated yet and will cause a kernel crash. Fix this by storing options in a struct which can then get pushed into the ocfs2_super once it's been allocated later. If we need more options which store to the ocfs2_super in the future, we can just fields to this struct. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-11 11:38:48 -07:00
Mark Fasheh	e0dceaf0a4	ocfs2: set non-default s_time_gran during mount We need to manually set this to '1' during mount, otherwise inode_setattr() will chop off the nanosecond portion of our timestamps. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:27:58 -07:00
Sunil Mushran	ce17204ae6	ocfs2: Retry sendpage() if it returns EAGAIN Instead of treating EAGAIN, returned from sendpage(), as an error, this patch retries the operation. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:27:38 -07:00
Sunil Mushran	480214d71f	ocfs2: Fix rename/extend race If one process is extending a file while another is renaming it, there exists a window when rename could flush the old inode's stale i_size to disk. This patch recognizes the fact that rename is only updating the old inode's ctime, so it ensures only that value is flushed to disk. Signed-off-by: Sunil Mushran <sunil.musran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:27:10 -07:00
Adrian Bunk	6a18380e7d	[2.6 patch] ocfs2_insert_extent(): remove dead code This patch removes some now dead code. Spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:26:03 -07:00
Mark Fasheh	5a25403175	ocfs2: Fix max offset calculations ocfs2_max_file_offset() was over-estimating the largest file size for several cases. This wasn't really a problem before, but now that we support sparse files, it needs to be more accurate. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:49 -07:00
Mark Fasheh	ce76fd30ce	ocfs2: check ia_size limits in setattr We have to manually check the requested truncate size as the check in vmtruncate() comes too late for Ocfs2. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:38 -07:00
Mark Fasheh	7c08d70c69	ocfs2: Fix some casting errors related to file writes ocfs2_align_clusters_to_page_index() needs to cast the clusters shift to pgoff_t and ocfs2_file_buffered_write() needs loff_t when calculating destination start for memcpy. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:27 -07:00
Mark Fasheh	a00cce356b	ocfs2: use s_maxbytes directly in ocfs2_change_file_space() There's no need to recalculate things via ocfs2_max_file_offset() as we've already done that to fill s_maxbytes, so use that instead. We can also un-export ocfs2_max_file_offset() then. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:07 -07:00
Mark Fasheh	c11e9fafb3	ocfs2: Restrict inode changes in ocfs2_update_inode_atime() ocfs2_update_inode_atime() calls ocfs2_mark_inode_dirty() to push changes from the struct inode into the ocfs2 disk inode. The problem is, ocfs2_mark_inode_dirty() might change other fields, depending on what happened to the struct inode. Since we don't always have locking to serialize changes to other fields (like i_size, etc), just fix things up to only touch the atime field. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:23:50 -07:00
Jens Axboe	3836df6b52	ocfs2: bad kunmap_atomic() kunmap_atomic() takes the virtual address, not the mapped page as argument. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-24 16:02:55 -07:00
Nick Piggin	1833633803	fix some conversion overflows Fix page index to offset conversion overflows in buffer layer, ecryptfs, and ocfs2. It would be nice to convert the whole tree to page_offset, but for now just fix the bugs. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-20 08:44:19 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Linus Torvalds	f745bb1c73	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: ->fallocate() support	2007-07-19 14:16:44 -07:00
Nick Piggin	d0217ac04c	mm: fault feedback #1 Change ->fault prototype. We now return an int, which contains VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte. FAULT_RET_ code tells the VM whether a page was found, whether it has been locked, and potentially other things. This is not quite the way he wanted it yet, but that's changed in the next patch (which requires changes to arch code). This means we no longer set VM_CAN_INVALIDATE in the vma in order to say that a page is locked which requires filemap_nopage to go away (because we can no longer remain backward compatible without that flag), but we were going to do that anyway. struct fault_data is renamed to struct vm_fault as Linus asked. address is now a void __user * that we should firmly encourage drivers not to use without really good reason. The page is now returned via a page pointer in the vm_fault struct. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Nick Piggin	54cb8821de	mm: merge populate and nopage into fault (fixes nonlinear) Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes the virtual address -> file offset differently from linear mappings. ->populate is a layering violation because the filesystem/pagecache code should need to know anything about the virtual memory mapping. The hitch here is that the ->nopage handler didn't pass down enough information (ie. pgoff). But it is more logical to pass pgoff rather than have the ->nopage function calculate it itself anyway (because that's a similar layering violation). Having the populate handler install the pte itself is likewise a nasty thing to be doing. This patch introduces a new fault handler that replaces ->nopage and ->populate and (later) ->nopfn. Most of the old mechanism is still in place so there is a lot of duplication and nice cleanups that can be removed if everyone switches over. The rationale for doing this in the first place is that nonlinear mappings are subject to the pagefault vs invalidate/truncate race too, and it seemed stupid to duplicate the synchronisation logic rather than just consolidate the two. After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in pagecache. Seems like a fringe functionality anyway. NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no users have hit mainline yet. [akpm@linux-foundation.org: cleanup] [randy.dunlap@oracle.com: doc. fixes for readahead] [akpm@linux-foundation.org: build fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Nick Piggin	d00806b183	mm: fix fault vs invalidate race for linear mappings Fix the race between invalidate_inode_pages and do_no_page. Andrea Arcangeli identified a subtle race between invalidation of pages from pagecache with userspace mappings, and do_no_page. The issue is that invalidation has to shoot down all mappings to the page, before it can be discarded from the pagecache. Between shooting down ptes to a particular page, and actually dropping the struct page from the pagecache, do_no_page from any process might fault on that page and establish a new mapping to the page just before it gets discarded from the pagecache. The most common case where such invalidation is used is in file truncation. This case was catered for by doing a sort of open-coded seqlock between the file's i_size, and its truncate_count. Truncation will decrease i_size, then increment truncate_count before unmapping userspace pages; do_no_page will read truncate_count, then find the page if it is within i_size, and then check truncate_count under the page table lock and back out and retry if it had subsequently been changed (ptl will serialise against unmapping, and ensure a potentially updated truncate_count is actually visible). Complexity and documentation issues aside, the locking protocol fails in the case where we would like to invalidate pagecache inside i_size. do_no_page can come in anytime and filemap_nopage is not aware of the invalidation in progress (as it is when it is outside i_size). The end result is that dangling (->mapping == NULL) pages that appear to be from a particular file may be mapped into userspace with nonsense data. Valid mappings to the same place will see a different page. Andrea implemented two working fixes, one using a real seqlock, another using a page->flags bit. He also proposed using the page lock in do_no_page, but that was initially considered too heavyweight. However, it is not a global or per-file lock, and the page cacheline is modified in do_no_page to increment _count and _mapcount anyway, so a further modification should not be a large performance hit. Scalability is not an issue. This patch implements this latter approach. ->nopage implementations return with the page locked if it is possible for their underlying file to be invalidated (in that case, they must set a special vm_flags bit to indicate so). do_no_page only unlocks the page after setting up the mapping completely. invalidation is excluded because it holds the page lock during invalidation of each page (and ensures that the page is not mapped while holding the lock). This also allows significant simplifications in do_no_page, because we have the page locked in the right place in the pagecache from the start. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Mark Fasheh	385820a38d	ocfs2: ->fallocate() support Plug ocfs2 into the ->fallocate() callback. This just re-uses the existing preallocation code. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-19 00:23:55 -07:00
Jeremy Fitzhardinge	86313c488a	usermodehelper: Tidy up waiting Rather than using a tri-state integer for the wait flag in call_usermodehelper_exec, define a proper enum, and use that. I've preserved the integer values so that any callers I've missed should still work OK. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Andi Kleen <ak@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: David Howells <dhowells@redhat.com>	2007-07-18 08:47:40 -07:00
Linus Torvalds	b8c638acac	Merge branch 'uninit-var' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6 * 'uninit-var' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6: arch/i386/* fs/* ipc/: mark variables with uninitialized_var() drivers/: mark variables with uninitialized_var()	2007-07-17 15:19:06 -07:00
Jeff Garzik	8e1c091ccc	arch/i386/* fs/* ipc/*: mark variables with uninitialized_var() Mark variables with uninitialized_var() if such a warning appears, and analysis proves that the var is initialized properly on all paths it is used. Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-07-17 16:23:19 -04:00
Satyam Sharma	3bd858ab1c	Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check Introduce is_owner_or_cap() macro in fs.h, and convert over relevant users to it. This is done because we want to avoid bugs in the future where we check for only effective fsuid of the current task against a file's owning uid, without simultaneously checking for CAP_FOWNER as well, thus violating its semantics. [ XFS uses special macros and structures, and in general looked ... untouchable, so we leave it alone -- but it has been looked over. ] The (current->fsuid != inode->i_uid) check in generic_permission() and exec_permission_lite() is left alone, because those operations are covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone. Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Cc: Al Viro <viro@ftp.linux.org.uk> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-17 12:00:03 -07:00
Christoph Hellwig	a569425512	knfsd: exportfs: add exportfs.h header currently the export_operation structure and helpers related to it are in fs.h. fs.h is already far too large and there are very few places needing the export bits, so split them off into a separate header. [akpm@linux-foundation.org: fix cifs build] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Neil Brown <neilb@suse.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-17 10:23:06 -07:00
Linus Torvalds	add096909d	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (32 commits) [PATCH] ocfs2: zero_user_page conversion ocfs2: Support xfs style space reservation ioctls ocfs2: support for removing file regions ocfs2: update truncate handling of partial clusters ocfs2: btree support for removal of arbirtrary extents ocfs2: Support creation of unwritten extents ocfs2: support writing of unwritten extents ocfs2: small cleanup of ocfs2_write_begin_nolock() ocfs2: btree changes for unwritten extents ocfs2: abstract btree growing calls ocfs2: use all extent block suballocators ocfs2: plug truncate into cached dealloc routines ocfs2: simplify deallocation locking ocfs2: harden buffer check during mapping of page blocks ocfs2: shared writeable mmap ocfs2: factor out write aops into nolock variants ocfs2: rework ocfs2_buffered_write_cluster() ocfs2: take ip_alloc_sem during entire truncate ocfs2: Add "preferred slot" mount option [KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c ...	2007-07-16 10:52:55 -07:00
Tejun Heo	7b595756ec	sysfs: kill unnecessary attribute->owner sysfs is now completely out of driver/module lifetime game. After deletion, a sysfs node doesn't access anything outside sysfs proper, so there's no reason to hold onto the attribute owners. Note that often the wrong modules were accounted for as owners leading to accessing removed modules. This patch kills now unnecessary attribute->owner. Note that with this change, userland holding a sysfs node does not prevent the backing module from being unloaded. For more info regarding lifetime rule cleanup, please read the following message. http://article.gmane.org/gmane.linux.kernel/510293 (tweaked by Greg to not delete the field just yet, to make it easier to merge things properly.) Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:06 -07:00
Eric Sandeen	54c57dc3b6	[PATCH] ocfs2: zero_user_page conversion Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:10 -07:00
Mark Fasheh	b25801038d	ocfs2: Support xfs style space reservation ioctls We re-use the RESVSP/UNRESVSP ioctls from xfs which allow the user to allocate and deallocate regions to a file without zeroing data or changing i_size. Though renamed, the structure passed in from user is identical to struct xfs_flock64. The three fields that are actually used right now are l_whence, l_start and l_len. This should get ocfs2 immediate compatibility with userspace software using the pre-existing xfs ioctls. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:09 -07:00
Mark Fasheh	063c4561f5	ocfs2: support for removing file regions Provide an internal interface for the removal of arbitrary file regions. ocfs2_remove_inode_range() takes a byte range within a file and will remove existing extents within that range. Partial clusters will be zeroed so that any read from within the region will return zeros. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:08 -07:00
Mark Fasheh	35edec1d52	ocfs2: update truncate handling of partial clusters The partial cluster zeroing code used during truncate usually assumes that the rightmost byte in the range to be zeroed lies on a cluster boundary. This makes sense for truncate, but punching holes might require zeroing on non-aligned rightmost boundaries. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:07 -07:00
Mark Fasheh	d0c7d7082e	ocfs2: btree support for removal of arbirtrary extents Add code to the btree paths to support the removal of arbitrary regions within an existing extent. With proper higher level support this can be used to "punch holes" in a file. Truncate (a special case of hole punching) could also be converted to use these methods. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:05 -07:00
Mark Fasheh	2ae99a6037	ocfs2: Support creation of unwritten extents This can now be trivially supported with re-use of our existing extend code. ocfs2_allocate_unwritten_extents() takes a start offset and a byte length and iterates over the inode, adding extents (marked as unwritten) until len is reached. Existing extents are skipped over. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:04 -07:00
Mark Fasheh	b27b7cbcf1	ocfs2: support writing of unwritten extents Update the write code to detect when the user is asking to write to an unwritten extent. Like writing to a hole, we must zero the region between the write and the cluster boundaries. Most of the existing cluster zeroing logic can be re-used with some additional checks for the unwritten flag on extent records. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:03 -07:00
Mark Fasheh	0d172baa55	ocfs2: small cleanup of ocfs2_write_begin_nolock() We can easily seperate out the write descriptor setup and manipulation into helper functions. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:01 -07:00
Mark Fasheh	328d5752e1	ocfs2: btree changes for unwritten extents Writes to a region marked as unwritten might result in a record split or merge. We can support splits by making minor changes to the existing insert code. Merges require left rotations which mostly re-use right rotation support functions. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:00 -07:00
Mark Fasheh	c3afcbb344	ocfs2: abstract btree growing calls The top level calls and logic for growing a tree can easily be abstracted out of ocfs2_insert_extent() into a seperate function - ocfs2_grow_tree(). This allows future code to easily grow btrees when needed. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:31:58 -07:00
Mark Fasheh	1f6697d072	ocfs2: use all extent block suballocators Now that we have a method to deallocate blocks from them, each node should allocate extent blocks from their local suballocator file. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:31:56 -07:00

1 2 3 4 5 ...

494 Commits