2005-04-17 06:20:36 +08:00
|
|
|
/*
|
|
|
|
* This is <linux/capability.h>
|
|
|
|
*
|
Implement file posix capabilities
Implement file posix capabilities. This allows programs to be given a
subset of root's powers regardless of who runs them, without having to use
setuid and giving the binary all of root's powers.
This version works with Kaigai Kohei's userspace tools, found at
http://www.kaigai.gr.jp/index.php. For more information on how to use this
patch, Chris Friedhoff has posted a nice page at
http://www.friedhoff.org/fscaps.html.
Changelog:
Nov 27:
Incorporate fixes from Andrew Morton
(security-introduce-file-caps-tweaks and
security-introduce-file-caps-warning-fix)
Fix Kconfig dependency.
Fix change signaling behavior when file caps are not compiled in.
Nov 13:
Integrate comments from Alexey: Remove CONFIG_ ifdef from
capability.h, and use %zd for printing a size_t.
Nov 13:
Fix endianness warnings by sparse as suggested by Alexey
Dobriyan.
Nov 09:
Address warnings of unused variables at cap_bprm_set_security
when file capabilities are disabled, and simultaneously clean
up the code a little, by pulling the new code into a helper
function.
Nov 08:
For pointers to required userspace tools and how to use
them, see http://www.friedhoff.org/fscaps.html.
Nov 07:
Fix the calculation of the highest bit checked in
check_cap_sanity().
Nov 07:
Allow file caps to be enabled without CONFIG_SECURITY, since
capabilities are the default.
Hook cap_task_setscheduler when !CONFIG_SECURITY.
Move capable(TASK_KILL) to end of cap_task_kill to reduce
audit messages.
Nov 05:
Add secondary calls in selinux/hooks.c to task_setioprio and
task_setscheduler so that selinux and capabilities with file
cap support can be stacked.
Sep 05:
As Seth Arnold points out, uid checks are out of place
for capability code.
Sep 01:
Define task_setscheduler, task_setioprio, cap_task_kill, and
task_setnice to make sure a user cannot affect a process in which
they called a program with some fscaps.
One remaining question is the note under task_setscheduler: are we
ok with CAP_SYS_NICE being sufficient to confine a process to a
cpuset?
It is a semantic change, as without fsccaps, attach_task doesn't
allow CAP_SYS_NICE to override the uid equivalence check. But since
it uses security_task_setscheduler, which elsewhere is used where
CAP_SYS_NICE can be used to override the uid equivalence check,
fixing it might be tough.
task_setscheduler
note: this also controls cpuset:attach_task. Are we ok with
CAP_SYS_NICE being used to confine to a cpuset?
task_setioprio
task_setnice
sys_setpriority uses this (through set_one_prio) for another
process. Need same checks as setrlimit
Aug 21:
Updated secureexec implementation to reflect the fact that
euid and uid might be the same and nonzero, but the process
might still have elevated caps.
Aug 15:
Handle endianness of xattrs.
Enforce capability version match between kernel and disk.
Enforce that no bits beyond the known max capability are
set, else return -EPERM.
With this extra processing, it may be worth reconsidering
doing all the work at bprm_set_security rather than
d_instantiate.
Aug 10:
Always call getxattr at bprm_set_security, rather than
caching it at d_instantiate.
[morgan@kernel.org: file-caps clean up for linux/capability.h]
[bunk@kernel.org: unexport cap_inode_killpriv]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 14:31:36 +08:00
|
|
|
* Andrew G. Morgan <morgan@kernel.org>
|
2005-04-17 06:20:36 +08:00
|
|
|
* Alexander Kjeldaas <astor@guardian.no>
|
|
|
|
* with help from Aleph1, Roland Buresund and Andrew Main.
|
|
|
|
*
|
|
|
|
* See here for the libcap library ("POSIX draft" compliance):
|
|
|
|
*
|
2009-06-16 16:26:25 +08:00
|
|
|
* ftp://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/
|
Implement file posix capabilities
Implement file posix capabilities. This allows programs to be given a
subset of root's powers regardless of who runs them, without having to use
setuid and giving the binary all of root's powers.
This version works with Kaigai Kohei's userspace tools, found at
http://www.kaigai.gr.jp/index.php. For more information on how to use this
patch, Chris Friedhoff has posted a nice page at
http://www.friedhoff.org/fscaps.html.
Changelog:
Nov 27:
Incorporate fixes from Andrew Morton
(security-introduce-file-caps-tweaks and
security-introduce-file-caps-warning-fix)
Fix Kconfig dependency.
Fix change signaling behavior when file caps are not compiled in.
Nov 13:
Integrate comments from Alexey: Remove CONFIG_ ifdef from
capability.h, and use %zd for printing a size_t.
Nov 13:
Fix endianness warnings by sparse as suggested by Alexey
Dobriyan.
Nov 09:
Address warnings of unused variables at cap_bprm_set_security
when file capabilities are disabled, and simultaneously clean
up the code a little, by pulling the new code into a helper
function.
Nov 08:
For pointers to required userspace tools and how to use
them, see http://www.friedhoff.org/fscaps.html.
Nov 07:
Fix the calculation of the highest bit checked in
check_cap_sanity().
Nov 07:
Allow file caps to be enabled without CONFIG_SECURITY, since
capabilities are the default.
Hook cap_task_setscheduler when !CONFIG_SECURITY.
Move capable(TASK_KILL) to end of cap_task_kill to reduce
audit messages.
Nov 05:
Add secondary calls in selinux/hooks.c to task_setioprio and
task_setscheduler so that selinux and capabilities with file
cap support can be stacked.
Sep 05:
As Seth Arnold points out, uid checks are out of place
for capability code.
Sep 01:
Define task_setscheduler, task_setioprio, cap_task_kill, and
task_setnice to make sure a user cannot affect a process in which
they called a program with some fscaps.
One remaining question is the note under task_setscheduler: are we
ok with CAP_SYS_NICE being sufficient to confine a process to a
cpuset?
It is a semantic change, as without fsccaps, attach_task doesn't
allow CAP_SYS_NICE to override the uid equivalence check. But since
it uses security_task_setscheduler, which elsewhere is used where
CAP_SYS_NICE can be used to override the uid equivalence check,
fixing it might be tough.
task_setscheduler
note: this also controls cpuset:attach_task. Are we ok with
CAP_SYS_NICE being used to confine to a cpuset?
task_setioprio
task_setnice
sys_setpriority uses this (through set_one_prio) for another
process. Need same checks as setrlimit
Aug 21:
Updated secureexec implementation to reflect the fact that
euid and uid might be the same and nonzero, but the process
might still have elevated caps.
Aug 15:
Handle endianness of xattrs.
Enforce capability version match between kernel and disk.
Enforce that no bits beyond the known max capability are
set, else return -EPERM.
With this extra processing, it may be worth reconsidering
doing all the work at bprm_set_security rather than
d_instantiate.
Aug 10:
Always call getxattr at bprm_set_security, rather than
caching it at d_instantiate.
[morgan@kernel.org: file-caps clean up for linux/capability.h]
[bunk@kernel.org: unexport cap_inode_killpriv]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 14:31:36 +08:00
|
|
|
*/
|
2005-04-17 06:20:36 +08:00
|
|
|
#ifndef _LINUX_CAPABILITY_H
|
|
|
|
#define _LINUX_CAPABILITY_H
|
|
|
|
|
2012-10-13 17:46:48 +08:00
|
|
|
#include <uapi/linux/capability.h>
|
2008-02-05 14:29:42 +08:00
|
|
|
|
2008-05-28 13:05:17 +08:00
|
|
|
|
|
|
|
#define _KERNEL_CAPABILITY_VERSION _LINUX_CAPABILITY_VERSION_3
|
|
|
|
#define _KERNEL_CAPABILITY_U32S _LINUX_CAPABILITY_U32S_3
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2009-01-30 23:09:30 +08:00
|
|
|
extern int file_caps_enabled;
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
typedef struct kernel_cap_struct {
|
2008-05-28 13:05:17 +08:00
|
|
|
__u32 cap[_KERNEL_CAPABILITY_U32S];
|
2005-04-17 06:20:36 +08:00
|
|
|
} kernel_cap_t;
|
|
|
|
|
2008-11-11 18:48:10 +08:00
|
|
|
/* exact same as vfs_cap_data but in cpu endian and always filled completely */
|
|
|
|
struct cpu_vfs_cap_data {
|
|
|
|
__u32 magic_etc;
|
|
|
|
kernel_cap_t permitted;
|
|
|
|
kernel_cap_t inheritable;
|
|
|
|
};
|
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
#define _USER_CAP_HEADER_SIZE (sizeof(struct __user_cap_header_struct))
|
2005-04-17 06:20:36 +08:00
|
|
|
#define _KERNEL_CAP_T_SIZE (sizeof(kernel_cap_t))
|
|
|
|
|
|
|
|
|
2013-04-15 01:06:31 +08:00
|
|
|
struct file;
|
2011-11-15 08:24:06 +08:00
|
|
|
struct inode;
|
userns: security: make capabilities relative to the user namespace
- Introduce ns_capable to test for a capability in a non-default
user namespace.
- Teach cap_capable to handle capabilities in a non-default
user namespace.
The motivation is to get to the unprivileged creation of new
namespaces. It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.
I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.
Changelog:
11/05/2010: [serge] add apparmor
12/14/2010: [serge] fix capabilities to created user namespaces
Without this, if user serge creates a user_ns, he won't have
capabilities to the user_ns he created. THis is because we
were first checking whether his effective caps had the caps
he needed and returning -EPERM if not, and THEN checking whether
he was the creator. Reverse those checks.
12/16/2010: [serge] security_real_capable needs ns argument in !security case
01/11/2011: [serge] add task_ns_capable helper
01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
02/16/2011: [serge] fix a logic bug: the root user is always creator of
init_user_ns, but should not always have capabilities to
it! Fix the check in cap_capable().
02/21/2011: Add the required user_ns parameter to security_capable,
fixing a compile failure.
02/23/2011: Convert some macros to functions as per akpm comments. Some
couldn't be converted because we can't easily forward-declare
them (they are inline if !SECURITY, extern if SECURITY). Add
a current_user_ns function so we can use it in capability.h
without #including cred.h. Move all forward declarations
together to the top of the #ifdef __KERNEL__ section, and use
kernel-doc format.
02/23/2011: Per dhowells, clean up comment in cap_capable().
02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.
(Original written and signed off by Eric; latest, modified version
acked by him)
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-24 07:43:17 +08:00
|
|
|
struct dentry;
|
|
|
|
struct user_namespace;
|
|
|
|
|
|
|
|
struct user_namespace *current_user_ns(void);
|
|
|
|
|
|
|
|
extern const kernel_cap_t __cap_empty_set;
|
|
|
|
extern const kernel_cap_t __cap_init_eff_set;
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/*
|
|
|
|
* Internal kernel functions only
|
|
|
|
*/
|
Implement file posix capabilities
Implement file posix capabilities. This allows programs to be given a
subset of root's powers regardless of who runs them, without having to use
setuid and giving the binary all of root's powers.
This version works with Kaigai Kohei's userspace tools, found at
http://www.kaigai.gr.jp/index.php. For more information on how to use this
patch, Chris Friedhoff has posted a nice page at
http://www.friedhoff.org/fscaps.html.
Changelog:
Nov 27:
Incorporate fixes from Andrew Morton
(security-introduce-file-caps-tweaks and
security-introduce-file-caps-warning-fix)
Fix Kconfig dependency.
Fix change signaling behavior when file caps are not compiled in.
Nov 13:
Integrate comments from Alexey: Remove CONFIG_ ifdef from
capability.h, and use %zd for printing a size_t.
Nov 13:
Fix endianness warnings by sparse as suggested by Alexey
Dobriyan.
Nov 09:
Address warnings of unused variables at cap_bprm_set_security
when file capabilities are disabled, and simultaneously clean
up the code a little, by pulling the new code into a helper
function.
Nov 08:
For pointers to required userspace tools and how to use
them, see http://www.friedhoff.org/fscaps.html.
Nov 07:
Fix the calculation of the highest bit checked in
check_cap_sanity().
Nov 07:
Allow file caps to be enabled without CONFIG_SECURITY, since
capabilities are the default.
Hook cap_task_setscheduler when !CONFIG_SECURITY.
Move capable(TASK_KILL) to end of cap_task_kill to reduce
audit messages.
Nov 05:
Add secondary calls in selinux/hooks.c to task_setioprio and
task_setscheduler so that selinux and capabilities with file
cap support can be stacked.
Sep 05:
As Seth Arnold points out, uid checks are out of place
for capability code.
Sep 01:
Define task_setscheduler, task_setioprio, cap_task_kill, and
task_setnice to make sure a user cannot affect a process in which
they called a program with some fscaps.
One remaining question is the note under task_setscheduler: are we
ok with CAP_SYS_NICE being sufficient to confine a process to a
cpuset?
It is a semantic change, as without fsccaps, attach_task doesn't
allow CAP_SYS_NICE to override the uid equivalence check. But since
it uses security_task_setscheduler, which elsewhere is used where
CAP_SYS_NICE can be used to override the uid equivalence check,
fixing it might be tough.
task_setscheduler
note: this also controls cpuset:attach_task. Are we ok with
CAP_SYS_NICE being used to confine to a cpuset?
task_setioprio
task_setnice
sys_setpriority uses this (through set_one_prio) for another
process. Need same checks as setrlimit
Aug 21:
Updated secureexec implementation to reflect the fact that
euid and uid might be the same and nonzero, but the process
might still have elevated caps.
Aug 15:
Handle endianness of xattrs.
Enforce capability version match between kernel and disk.
Enforce that no bits beyond the known max capability are
set, else return -EPERM.
With this extra processing, it may be worth reconsidering
doing all the work at bprm_set_security rather than
d_instantiate.
Aug 10:
Always call getxattr at bprm_set_security, rather than
caching it at d_instantiate.
[morgan@kernel.org: file-caps clean up for linux/capability.h]
[bunk@kernel.org: unexport cap_inode_killpriv]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 14:31:36 +08:00
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
#define CAP_FOR_EACH_U32(__capi) \
|
2008-05-28 13:05:17 +08:00
|
|
|
for (__capi = 0; __capi < _KERNEL_CAPABILITY_U32S; ++__capi)
|
2008-02-05 14:29:42 +08:00
|
|
|
|
2009-04-13 22:56:14 +08:00
|
|
|
/*
|
|
|
|
* CAP_FS_MASK and CAP_NFSD_MASKS:
|
|
|
|
*
|
|
|
|
* The fs mask is all the privileges that fsuid==0 historically meant.
|
|
|
|
* At one time in the past, that included CAP_MKNOD and CAP_LINUX_IMMUTABLE.
|
|
|
|
*
|
|
|
|
* It has never meant setting security.* and trusted.* xattrs.
|
|
|
|
*
|
|
|
|
* We could also define fsmask as follows:
|
|
|
|
* 1. CAP_FS_MASK is the privilege to bypass all fs-related DAC permissions
|
|
|
|
* 2. The security.* and trusted.* xattrs are fs-related MAC permissions
|
|
|
|
*/
|
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
# define CAP_FS_MASK_B0 (CAP_TO_MASK(CAP_CHOWN) \
|
2009-04-13 22:56:14 +08:00
|
|
|
| CAP_TO_MASK(CAP_MKNOD) \
|
2008-02-05 14:29:42 +08:00
|
|
|
| CAP_TO_MASK(CAP_DAC_OVERRIDE) \
|
|
|
|
| CAP_TO_MASK(CAP_DAC_READ_SEARCH) \
|
|
|
|
| CAP_TO_MASK(CAP_FOWNER) \
|
|
|
|
| CAP_TO_MASK(CAP_FSETID))
|
|
|
|
|
Smack: Simplified Mandatory Access Control Kernel
Smack is the Simplified Mandatory Access Control Kernel.
Smack implements mandatory access control (MAC) using labels
attached to tasks and data containers, including files, SVIPC,
and other tasks. Smack is a kernel based scheme that requires
an absolute minimum of application support and a very small
amount of configuration data.
Smack uses extended attributes and
provides a set of general mount options, borrowing technics used
elsewhere. Smack uses netlabel for CIPSO labeling. Smack provides
a pseudo-filesystem smackfs that is used for manipulation of
system Smack attributes.
The patch, patches for ls and sshd, a README, a startup script,
and x86 binaries for ls and sshd are also available on
http://www.schaufler-ca.com
Development has been done using Fedora Core 7 in a virtual machine
environment and on an old Sony laptop.
Smack provides mandatory access controls based on the label attached
to a task and the label attached to the object it is attempting to
access. Smack labels are deliberately short (1-23 characters) text
strings. Single character labels using special characters are reserved
for system use. The only operation applied to Smack labels is equality
comparison. No wildcards or expressions, regular or otherwise, are
used. Smack labels are composed of printable characters and may not
include "/".
A file always gets the Smack label of the task that created it.
Smack defines and uses these labels:
"*" - pronounced "star"
"_" - pronounced "floor"
"^" - pronounced "hat"
"?" - pronounced "huh"
The access rules enforced by Smack are, in order:
1. Any access requested by a task labeled "*" is denied.
2. A read or execute access requested by a task labeled "^"
is permitted.
3. A read or execute access requested on an object labeled "_"
is permitted.
4. Any access requested on an object labeled "*" is permitted.
5. Any access requested by a task on an object with the same
label is permitted.
6. Any access requested that is explicitly defined in the loaded
rule set is permitted.
7. Any other access is denied.
Rules may be explicitly defined by writing subject,object,access
triples to /smack/load.
Smack rule sets can be easily defined that describe Bell&LaPadula
sensitivity, Biba integrity, and a variety of interesting
configurations. Smack rule sets can be modified on the fly to
accommodate changes in the operating environment or even the time
of day.
Some practical use cases:
Hierarchical levels. The less common of the two usual uses
for MLS systems is to define hierarchical levels, often
unclassified, confidential, secret, and so on. To set up smack
to support this, these rules could be defined:
C Unclass rx
S C rx
S Unclass rx
TS S rx
TS C rx
TS Unclass rx
A TS process can read S, C, and Unclass data, but cannot write it.
An S process can read C and Unclass. Note that specifying that
TS can read S and S can read C does not imply TS can read C, it
has to be explicitly stated.
Non-hierarchical categories. This is the more common of the
usual uses for an MLS system. Since the default rule is that a
subject cannot access an object with a different label no
access rules are required to implement compartmentalization.
A case that the Bell & LaPadula policy does not allow is demonstrated
with this Smack access rule:
A case that Bell&LaPadula does not allow that Smack does:
ESPN ABC r
ABC ESPN r
On my portable video device I have two applications, one that
shows ABC programming and the other ESPN programming. ESPN wants
to show me sport stories that show up as news, and ABC will
only provide minimal information about a sports story if ESPN
is covering it. Each side can look at the other's info, neither
can change the other. Neither can see what FOX is up to, which
is just as well all things considered.
Another case that I especially like:
SatData Guard w
Guard Publish w
A program running with the Guard label opens a UDP socket and
accepts messages sent by a program running with a SatData label.
The Guard program inspects the message to ensure it is wholesome
and if it is sends it to a program running with the Publish label.
This program then puts the information passed in an appropriate
place. Note that the Guard program cannot write to a Publish
file system object because file system semanitic require read as
well as write.
The four cases (categories, levels, mutual read, guardbox) here
are all quite real, and problems I've been asked to solve over
the years. The first two are easy to do with traditonal MLS systems
while the last two you can't without invoking privilege, at least
for a while.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Cc: Joshua Brindle <method@manicmethod.com>
Cc: Paul Moore <paul.moore@hp.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: "Ahmed S. Darwish" <darwish.07@gmail.com>
Cc: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 14:29:50 +08:00
|
|
|
# define CAP_FS_MASK_B1 (CAP_TO_MASK(CAP_MAC_OVERRIDE))
|
|
|
|
|
2008-05-28 13:05:17 +08:00
|
|
|
#if _KERNEL_CAPABILITY_U32S != 2
|
2008-02-05 14:29:42 +08:00
|
|
|
# error Fix up hand-coded capability macro initializers
|
|
|
|
#else /* HAND-CODED capability initializers */
|
|
|
|
|
2014-07-24 03:36:26 +08:00
|
|
|
#define CAP_LAST_U32 ((_KERNEL_CAPABILITY_U32S) - 1)
|
|
|
|
#define CAP_LAST_U32_VALID_MASK (CAP_TO_MASK(CAP_LAST_CAP + 1) -1)
|
|
|
|
|
2008-04-30 03:54:28 +08:00
|
|
|
# define CAP_EMPTY_SET ((kernel_cap_t){{ 0, 0 }})
|
2014-07-24 03:36:26 +08:00
|
|
|
# define CAP_FULL_SET ((kernel_cap_t){{ ~0, CAP_LAST_U32_VALID_MASK }})
|
2009-04-13 22:56:14 +08:00
|
|
|
# define CAP_FS_SET ((kernel_cap_t){{ CAP_FS_MASK_B0 \
|
|
|
|
| CAP_TO_MASK(CAP_LINUX_IMMUTABLE), \
|
|
|
|
CAP_FS_MASK_B1 } })
|
2009-03-17 06:34:20 +08:00
|
|
|
# define CAP_NFSD_SET ((kernel_cap_t){{ CAP_FS_MASK_B0 \
|
2009-04-13 22:56:14 +08:00
|
|
|
| CAP_TO_MASK(CAP_SYS_RESOURCE), \
|
|
|
|
CAP_FS_MASK_B1 } })
|
2008-02-05 14:29:42 +08:00
|
|
|
|
2008-05-28 13:05:17 +08:00
|
|
|
#endif /* _KERNEL_CAPABILITY_U32S != 2 */
|
2008-02-05 14:29:42 +08:00
|
|
|
|
|
|
|
# define cap_clear(c) do { (c) = __cap_empty_set; } while (0)
|
|
|
|
|
|
|
|
#define cap_raise(c, flag) ((c).cap[CAP_TO_INDEX(flag)] |= CAP_TO_MASK(flag))
|
|
|
|
#define cap_lower(c, flag) ((c).cap[CAP_TO_INDEX(flag)] &= ~CAP_TO_MASK(flag))
|
|
|
|
#define cap_raised(c, flag) ((c).cap[CAP_TO_INDEX(flag)] & CAP_TO_MASK(flag))
|
|
|
|
|
|
|
|
#define CAP_BOP_ALL(c, a, b, OP) \
|
|
|
|
do { \
|
|
|
|
unsigned __capi; \
|
|
|
|
CAP_FOR_EACH_U32(__capi) { \
|
|
|
|
c.cap[__capi] = a.cap[__capi] OP b.cap[__capi]; \
|
|
|
|
} \
|
|
|
|
} while (0)
|
|
|
|
|
|
|
|
#define CAP_UOP_ALL(c, a, OP) \
|
|
|
|
do { \
|
|
|
|
unsigned __capi; \
|
|
|
|
CAP_FOR_EACH_U32(__capi) { \
|
|
|
|
c.cap[__capi] = OP a.cap[__capi]; \
|
|
|
|
} \
|
|
|
|
} while (0)
|
|
|
|
|
|
|
|
static inline kernel_cap_t cap_combine(const kernel_cap_t a,
|
|
|
|
const kernel_cap_t b)
|
|
|
|
{
|
|
|
|
kernel_cap_t dest;
|
|
|
|
CAP_BOP_ALL(dest, a, b, |);
|
|
|
|
return dest;
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
static inline kernel_cap_t cap_intersect(const kernel_cap_t a,
|
|
|
|
const kernel_cap_t b)
|
|
|
|
{
|
|
|
|
kernel_cap_t dest;
|
|
|
|
CAP_BOP_ALL(dest, a, b, &);
|
|
|
|
return dest;
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
static inline kernel_cap_t cap_drop(const kernel_cap_t a,
|
|
|
|
const kernel_cap_t drop)
|
|
|
|
{
|
|
|
|
kernel_cap_t dest;
|
|
|
|
CAP_BOP_ALL(dest, a, drop, &~);
|
|
|
|
return dest;
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
static inline kernel_cap_t cap_invert(const kernel_cap_t c)
|
|
|
|
{
|
|
|
|
kernel_cap_t dest;
|
|
|
|
CAP_UOP_ALL(dest, c, ~);
|
|
|
|
return dest;
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
static inline int cap_isclear(const kernel_cap_t a)
|
|
|
|
{
|
|
|
|
unsigned __capi;
|
|
|
|
CAP_FOR_EACH_U32(__capi) {
|
|
|
|
if (a.cap[__capi] != 0)
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
return 1;
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-11-11 18:48:07 +08:00
|
|
|
/*
|
|
|
|
* Check if "a" is a subset of "set".
|
|
|
|
* return 1 if ALL of the capabilities in "a" are also in "set"
|
|
|
|
* cap_issubset(0101, 1111) will return 1
|
|
|
|
* return 0 if ANY of the capabilities in "a" are not in "set"
|
|
|
|
* cap_issubset(1111, 0101) will return 0
|
|
|
|
*/
|
2008-02-05 14:29:42 +08:00
|
|
|
static inline int cap_issubset(const kernel_cap_t a, const kernel_cap_t set)
|
|
|
|
{
|
|
|
|
kernel_cap_t dest;
|
|
|
|
dest = cap_drop(a, set);
|
|
|
|
return cap_isclear(dest);
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
/* Used to decide between falling back on the old suser() or fsuser(). */
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
static inline kernel_cap_t cap_drop_fs_set(const kernel_cap_t a)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2008-02-05 14:29:42 +08:00
|
|
|
const kernel_cap_t __cap_fs_set = CAP_FS_SET;
|
|
|
|
return cap_drop(a, __cap_fs_set);
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
static inline kernel_cap_t cap_raise_fs_set(const kernel_cap_t a,
|
|
|
|
const kernel_cap_t permitted)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2008-02-05 14:29:42 +08:00
|
|
|
const kernel_cap_t __cap_fs_set = CAP_FS_SET;
|
|
|
|
return cap_combine(a,
|
|
|
|
cap_intersect(permitted, __cap_fs_set));
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
static inline kernel_cap_t cap_drop_nfsd_set(const kernel_cap_t a)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2008-02-05 14:29:42 +08:00
|
|
|
const kernel_cap_t __cap_fs_set = CAP_NFSD_SET;
|
|
|
|
return cap_drop(a, __cap_fs_set);
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2008-02-05 14:29:42 +08:00
|
|
|
static inline kernel_cap_t cap_raise_nfsd_set(const kernel_cap_t a,
|
|
|
|
const kernel_cap_t permitted)
|
|
|
|
{
|
|
|
|
const kernel_cap_t __cap_nfsd_set = CAP_NFSD_SET;
|
|
|
|
return cap_combine(a,
|
|
|
|
cap_intersect(permitted, __cap_nfsd_set));
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
kernel: conditionally support non-root users, groups and capabilities
There are a lot of embedded systems that run most or all of their
functionality in init, running as root:root. For these systems,
supporting multiple users is not necessary.
This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for
non-root users, non-root groups, and capabilities optional. It is enabled
under CONFIG_EXPERT menu.
When this symbol is not defined, UID and GID are zero in any possible case
and processes always have all capabilities.
The following syscalls are compiled out: setuid, setregid, setgid,
setreuid, setresuid, getresuid, setresgid, getresgid, setgroups,
getgroups, setfsuid, setfsgid, capget, capset.
Also, groups.c is compiled out completely.
In kernel/capability.c, capable function was moved in order to avoid
adding two ifdef blocks.
This change saves about 25 KB on a defconfig build. The most minimal
kernels have total text sizes in the high hundreds of kB rather than
low MB. (The 25k goes down a bit with allnoconfig, but not that much.
The kernel was booted in Qemu. All the common functionalities work.
Adding users/groups is not possible, failing with -ENOSYS.
Bloat-o-meter output:
add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Iulia Manda <iulia.manda21@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-16 07:16:41 +08:00
|
|
|
#ifdef CONFIG_MULTIUSER
|
2011-03-24 07:43:21 +08:00
|
|
|
extern bool has_capability(struct task_struct *t, int cap);
|
|
|
|
extern bool has_ns_capability(struct task_struct *t,
|
|
|
|
struct user_namespace *ns, int cap);
|
|
|
|
extern bool has_capability_noaudit(struct task_struct *t, int cap);
|
2012-01-04 01:25:15 +08:00
|
|
|
extern bool has_ns_capability_noaudit(struct task_struct *t,
|
|
|
|
struct user_namespace *ns, int cap);
|
userns: security: make capabilities relative to the user namespace
- Introduce ns_capable to test for a capability in a non-default
user namespace.
- Teach cap_capable to handle capabilities in a non-default
user namespace.
The motivation is to get to the unprivileged creation of new
namespaces. It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.
I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.
Changelog:
11/05/2010: [serge] add apparmor
12/14/2010: [serge] fix capabilities to created user namespaces
Without this, if user serge creates a user_ns, he won't have
capabilities to the user_ns he created. THis is because we
were first checking whether his effective caps had the caps
he needed and returning -EPERM if not, and THEN checking whether
he was the creator. Reverse those checks.
12/16/2010: [serge] security_real_capable needs ns argument in !security case
01/11/2011: [serge] add task_ns_capable helper
01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
02/16/2011: [serge] fix a logic bug: the root user is always creator of
init_user_ns, but should not always have capabilities to
it! Fix the check in cap_capable().
02/21/2011: Add the required user_ns parameter to security_capable,
fixing a compile failure.
02/23/2011: Convert some macros to functions as per akpm comments. Some
couldn't be converted because we can't easily forward-declare
them (they are inline if !SECURITY, extern if SECURITY). Add
a current_user_ns function so we can use it in capability.h
without #including cred.h. Move all forward declarations
together to the top of the #ifdef __KERNEL__ section, and use
kernel-doc format.
02/23/2011: Per dhowells, clean up comment in cap_capable().
02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.
(Original written and signed off by Eric; latest, modified version
acked by him)
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-24 07:43:17 +08:00
|
|
|
extern bool capable(int cap);
|
|
|
|
extern bool ns_capable(struct user_namespace *ns, int cap);
|
kernel: conditionally support non-root users, groups and capabilities
There are a lot of embedded systems that run most or all of their
functionality in init, running as root:root. For these systems,
supporting multiple users is not necessary.
This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for
non-root users, non-root groups, and capabilities optional. It is enabled
under CONFIG_EXPERT menu.
When this symbol is not defined, UID and GID are zero in any possible case
and processes always have all capabilities.
The following syscalls are compiled out: setuid, setregid, setgid,
setreuid, setresuid, getresuid, setresgid, getresgid, setgroups,
getgroups, setfsuid, setfsgid, capget, capset.
Also, groups.c is compiled out completely.
In kernel/capability.c, capable function was moved in order to avoid
adding two ifdef blocks.
This change saves about 25 KB on a defconfig build. The most minimal
kernels have total text sizes in the high hundreds of kB rather than
low MB. (The 25k goes down a bit with allnoconfig, but not that much.
The kernel was booted in Qemu. All the common functionalities work.
Adding users/groups is not possible, failing with -ENOSYS.
Bloat-o-meter output:
add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Iulia Manda <iulia.manda21@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-16 07:16:41 +08:00
|
|
|
#else
|
|
|
|
static inline bool has_capability(struct task_struct *t, int cap)
|
|
|
|
{
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
static inline bool has_ns_capability(struct task_struct *t,
|
|
|
|
struct user_namespace *ns, int cap)
|
|
|
|
{
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
static inline bool has_capability_noaudit(struct task_struct *t, int cap)
|
|
|
|
{
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
static inline bool has_ns_capability_noaudit(struct task_struct *t,
|
|
|
|
struct user_namespace *ns, int cap)
|
|
|
|
{
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
static inline bool capable(int cap)
|
|
|
|
{
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
static inline bool ns_capable(struct user_namespace *ns, int cap)
|
|
|
|
{
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
#endif /* CONFIG_MULTIUSER */
|
2014-06-11 03:45:42 +08:00
|
|
|
extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
|
2013-04-15 01:06:31 +08:00
|
|
|
extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
|
2006-01-12 04:17:46 +08:00
|
|
|
|
2008-11-11 18:48:14 +08:00
|
|
|
/* audit system wants to get cap info from files as well */
|
|
|
|
extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
#endif /* !_LINUX_CAPABILITY_H */
|