The former is now strict, it will fail if it cannot honor the allocation
within the node, while the later implements the previous semantic which
falls back to allocating anywhere.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We now provide a default (weak) implementation of memblock_nid_range()
which uses the early_pfn_map[] if CONFIG_ARCH_POPULATES_NODE_MAP
is set. Sparc still needs to use its own method due to the way
the pages can be scattered between nodes.
This implementation is inefficient due to our main algorithm and
callback construct wanting to work on an ascending addresses bases
while early_pfn_map[] would rather work with nid's (it's unsorted
at that stage). But it should work and we can look into improving
it subsequently, possibly using arch compile options to chose a
different algorithm alltogether.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To constraint the search of a region between two boundaries,
which will be used by the new NUMA aware allocator among others.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Some archs such as ARM want to avoid coalescing accross things such
as the lowmem/highmem boundary or similar. This provides the option
to control it via an arch callback for which a weak default is provided
which always allows coalescing.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When one of the array gets full, we resize it. After much thinking and
a few iterations of that code, I went back to on-demand resizing using
the (new) internal memblock_find_base() function, which is pretty much what
Yinghai initially proposed, though there some differences in the details.
To work this relies on the default alloc limit being set sensibly by
the architecture.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Some shuffling is needed for doing array resize so we may as well
put some sense into the ordering of the functions in the whole memblock.c
file. No code change. Added some comments.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This will be used by the array resize code and might prove useful
to some arch code as well at which point it can be made non-static.
Also add comment as to why aligning size is important
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
v2. Fix loss of size alignment
v3. Fix result code
This function will be used to locate a free area to put the new memblock
arrays when attempting to resize them. memblock_alloc_region() is gone,
the two callsites now call memblock_add_region().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
v2. Fix membase_alloc_nid_region() conversion
Since we allocate one more than needed, why not do a bit of sanity checking
here to ensure we don't walk past the end of the array ?
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is in preparation for having resizable arrays.
Note that we still allocate one more than needed, this is unchanged from
the previous implementation.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Right now, both the "memory" and "reserved" memblock_type structures have
a "size" member. It represents the calculated memory size in the former
case and is unused in the latter.
This moves it out to the main memblock structure instead
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Let's not waste space and cycles on archs that don't support >32-bit
physical address space.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The RMA (RMO is a misnomer) is a concept specific to ppc64 (in fact
server ppc64 though I hijack it on embedded ppc64 for similar purposes)
and represents the area of memory that can be accessed in real mode
(aka with MMU off), or on embedded, from the exception vectors (which
is bolted in the TLB) which pretty much boils down to the same thing.
We take that out of the generic MEMBLOCK data structure and move it into
arch/powerpc where it belongs, renaming it to "RMA" while at it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This introduce memblock.current_limit which is used to limit allocations
from memblock_alloc() or memblock_alloc_base(..., MEMBLOCK_ALLOC_ACCESSIBLE).
The old MEMBLOCK_ALLOC_ANYWHERE changes value from 0 to ~(u64)0 and can still
be used with memblock_alloc_base() to allocate really anywhere.
It is -no-longer- cropped to MEMBLOCK_REAL_LIMIT which disappears.
Note to archs: I'm leaving the default limit to MEMBLOCK_ALLOC_ANYWHERE. I
strongly recommend that you ensure that you set an appropriate limit
during boot in order to guarantee that an memblock_alloc() at any time
results in something that is accessible with a simple __va().
The reason is that a subsequent patch will introduce the ability for
the array to resize itself by reallocating itself. The MEMBLOCK core will
honor the current limit when performing those allocations.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Walk memblock's using for_each_memblock() and use memblock_region_base/end_pfn() for
getting to PFNs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The implementation is pretty much similar. There is a -small- added
overhead by having another function call and the address shift.
If that becomes a concern, I suppose we could actually have memblock
itself expose a memblock_pfn_valid() which then ARM can use directly
with an appropriate #define...
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To make it fast, we steal ARM's binary search for memblock_is_memory()
and we use that to also the replace existing implementation of
memblock_is_reserved().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
All callers expect a boolean result which is true if the region
overlaps a reserved region. However, the implementation actually
returns -1 if there is no overlap, and a region index (0 based)
if there is.
Make it behave as callers (and common sense) expect.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw:
GFS2: Fix recovery stuck bug (try #2)
GFS2: Fix typo in stuffed file data copy handling
Revert "GFS2: recovery stuck on transaction lock"
GFS2: Make "try" lock not try quite so hard
GFS2: remove dependency on __GFP_NOFAIL
GFS2: Simplify gfs2_write_alloc_required
GFS2: Wait for journal id on mount if not specified on mount command line
GFS2: Use nobh_writepage
* 'linux-next' of git://git.infradead.org/ubi-2.6:
UBI: do not warn unnecessarily
UBI: do not print message about corruptes PEBs if we have none of them
UBI: improve delete-compatible volumes handling
UBI: fix error message and compilation warnings
UBI: generate random image_seq when formatting MTD devices
UBI: improve ECC error message
UBI: improve corrupted flash handling
UBI: introduce eraseblock counter variables
UBI: introduce a new IO return code
UBI: simplify IO error codes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: (22 commits)
9p: fix sparse warnings in new xattr code
fs/9p: remove sparse warning in vfs_inode
fs/9p: destroy fid on failed remove
fs/9p: Prevent parallel rename when doing fid_lookup
fs/9p: Add support user. xattr
net/9p: Implement TXATTRCREATE 9p call
net/9p: Implement attrwalk 9p call
9p: Implement LOPEN
fs/9p: This patch implements TLCREATE for 9p2000.L protocol.
9p: Implement TMKDIR
9p: Implement TMKNOD
9p: Define and implement TSYMLINK for 9P2000.L
9p: Define and implement TLINK for 9P2000.L
9p: Define and implement TLINK for 9P2000.L
9p: Implement client side of setattr for 9P2000.L protocol.
9p: getattr client implementation for 9P2000.L protocol.
fs/9p: Pass the correct user credentials during attach
net/9p: Handle the server returned error properly
9p: readdir implementation for 9p2000.L
9p: Make use of iounit for read/write
...
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (29 commits)
cifs: fsc should not default to "on"
[CIFS] remove redundant path walking in dfs_do_refmount
cifs: ignore the "mand", "nomand" and "_netdev" mount options
cifs: map NT_STATUS_ERROR_WRITE_PROTECTED to -EROFS
cifs: don't allow cifs_iget to match inodes of the wrong type
[CIFS] relinquish fscache cookie before freeing CIFSTconInfo
cifs: add separate cred_uid field to sesInfo
fs: cifs: check kmalloc() result
[CIFS] Missing ifdef
[CIFS] Missing line from previous commit
[CIFS] Fix build break when CONFIG_CIFS_FSCACHE disabled
cifs: add mount option to enable local caching
cifs: read pages from FS-Cache
cifs: store pages into local cache
cifs: FS-Cache page management
cifs: define inode-level cache object and register them
cifs: define superblock-level cache index objects and register them
cifs: remove unused cifsUidInfo struct
cifs: clean up cifs_find_smb_ses (try #2)
cifs: match secType when searching for existing tcp session
...
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (291 commits)
ARM: AMBA: Add pclk support to AMBA bus infrastructure
ARM: 6278/2: fix regression in RealView after the introduction of pclk
ARM: 6277/1: mach-shmobile: Allow users to select HZ, default to 128
ARM: 6276/1: mach-shmobile: remove duplicate NR_IRQS_LEGACY
ARM: 6246/1: mmci: support larger MMCIDATALENGTH register
ARM: 6245/1: mmci: enable hardware flow control on Ux500 variants
ARM: 6244/1: mmci: add variant data and default MCICLOCK support
ARM: 6243/1: mmci: pass power_mode to the translate_vdd callback
ARM: 6274/1: add global control registers definition header file for nuc900
mx2_camera: fix type of dma buffer virtual address pointer
mx2_camera: Add soc_camera support for i.MX25/i.MX27
arm/imx/gpio: add spinlock protection
ARM: Add support for the LPC32XX arch
ARM: LPC32XX: Arch config menu supoport and makefiles
ARM: LPC32XX: Phytec 3250 platform support
ARM: LPC32XX: Misc support functions
ARM: LPC32XX: Serial support code
ARM: LPC32XX: System suspend support
ARM: LPC32XX: GPIO, timer, and IRQ drivers
ARM: LPC32XX: Clock driver
...
In 'mount_ubifs()', in case of 'ubifs_leb_unmap()' falure,
free allocated resources.
Signed-off-by: Matthieu CASTET <matthieu.castet@parrot.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
fixes:
CHECK fs/9p/xattr.c
fs/9p/xattr.c:73:6: warning: Using plain integer as NULL pointer
fs/9p/xattr.c:135:6: warning: Using plain integer as NULL pointer
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
9P spec says:
"It is correct to consider remove to be a clunk with the
side effect of removing the file if permissions allow. "
So even if remove fails we need to destroy the fid.
Without this patch an rmdir on a directory with contents leave
the new cloned directory fid fid attached to fidlist. On umount
we dump the fids on the fidlist
~# rmdir /mnt2/test4/
rmdir: failed to remove `/mnt2/test4/': Directory not empty
~# umount /mnt2/
~# dmesg
[ 228.474323] Found fid 3 not clunked
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
During fid lookup we need to make sure that the dentry->d_parent doesn't
change so that we can safely walk the parent dentries. To ensure that
we need to prevent cross directory rename during fid_lookup. Add a
per superblock rename_sem rw_semaphore to prevent parallel fid lookup and
rename.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
TXATTRCREATE: Prepare a fid for setting xattr value on a file system object.
size[4] TXATTRCREATE tag[2] fid[4] name[s] attr_size[8] flags[4]
size[4] RXATTRCREATE tag[2]
txattrcreate gets a fid pointing to xattr. This fid can later be
used to set the xattr value.
flag value is derived from set Linux setxattr. The manpage says
"The flags parameter can be used to refine the semantics of the operation.
XATTR_CREATE specifies a pure create, which fails if the named attribute
exists already. XATTR_REPLACE specifies a pure replace operation, which
fails if the named attribute does not already exist. By default (no flags),
the extended attribute will be created if need be, or will simply replace
the value if the attribute exists."
The actual setxattr operation happens when the fid is clunked. At that point
the written byte count and the attr_size specified in TXATTRCREATE should be
same otherwise an error will be returned.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
TXATTRWALK: Descend a ATTR namespace
size[4] TXATTRWALK tag[2] fid[4] newfid[4] name[s]
size[4] RXATTRWALK tag[2] size[8]
txattrwalk gets a fid pointing to xattr. This fid can later be
used to read the xattr value. If name is NULL the fid returned
can be used to get the list of extended attribute associated to
the file system object.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>