commit 6406887a12ee5dcdaffff1a8508d91113d545559 upstream.
Commit 02a52c5c8c ("selinux: move policy commit after updating
selinuxfs") moved the selinux_policy_commit() call out of
security_load_policy() into sel_write_load(), which caused a subtle yet
rather serious bug.
The problem is that security_load_policy() passes a reference to the
convert_params local variable to sidtab_convert(), which stores it in
the sidtab, where it may be accessed until the policy is swapped over
and RCU synchronized. Before 02a52c5c8c, selinux_policy_commit() was
called directly from security_load_policy(), so the convert_params
pointer remained valid all the way until the old sidtab was destroyed,
but now that's no longer the case and calls to sidtab_context_to_sid()
on the old sidtab after security_load_policy() returns may cause invalid
memory accesses.
This can be easily triggered using the stress test from commit
ee1a84fdfe ("selinux: overhaul sidtab to fix bug and improve
performance"):
```
function rand_cat() {
echo $(( $RANDOM % 1024 ))
}
function do_work() {
while true; do
echo -n "system_u:system_r:kernel_t:s0:c$(rand_cat),c$(rand_cat)" \
>/sys/fs/selinux/context 2>/dev/null || true
done
}
do_work >/dev/null &
do_work >/dev/null &
do_work >/dev/null &
while load_policy; do echo -n .; sleep 0.1; done
kill %1
kill %2
kill %3
```
Fix this by allocating the temporary sidtab convert structures
dynamically and passing them among the
selinux_policy_{load,cancel,commit} functions.
Fixes: 02a52c5c8c ("selinux: move policy commit after updating selinuxfs")
Cc: stable@vger.kernel.org
Tested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
[PM: merge fuzz in security.h and services.c]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
checkreqprot data member in selinux_state struct is accessed directly by
SELinux functions to get and set. This could cause unexpected read or
write access to this data member due to compiler optimizations and/or
compiler's reordering of access to this field.
Add helper functions to get and set checkreqprot data member in
selinux_state struct. These helper functions use READ_ONCE and
WRITE_ONCE macros to ensure atomic read or write of memory for
this data member.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Suggested-by: Paul Moore <paul@paul-moore.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Use READ_ONCE/WRITE_ONCE for all accesses to the
selinux_state.policycaps booleans to prevent compiler
mischief.
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Remove the security_policydb_len() calls from sel_open_policy() and
instead update the inode size from the size returned from
security_read_policy().
Since after this change security_policydb_len() is only called from
security_load_policy(), remove it entirely and just open-code it there.
Also, since security_load_policy() is always called with policy_mutex
held, make it dereference the policy pointer directly and drop the
unnecessary RCU locking.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Move the mutex used to synchronize policy changes (reloads and setting
of booleans) from selinux_fs_info to selinux_state and use it in
lockdep checks for rcu_dereference_protected() calls in the security
server functions. This makes the dependency on the mutex explicit
in the code rather than relying on comments.
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Convert the policy read-write lock to RCU. This is significantly
simplified by the earlier work to encapsulate the policy data
structures and refactor the policy load and boolean setting logic.
Move the latest_granting sequence number into the selinux_policy
structure so that it can be updated atomically with the policy.
Since removing the policy rwlock and moving latest_granting reduces
the selinux_ss structure to nothing more than a wrapper around the
selinux_policy pointer, get rid of the extra layer of indirection.
At present this change merely passes a hardcoded 1 to
rcu_dereference_check() in the cases where we know we do not need to
take rcu_read_lock(), with the preceding comment explaining why.
Alternatively we could pass fsi->mutex down from selinuxfs and
apply a lockdep check on it instead.
Based in part on earlier attempts to convert the policy rwlock
to RCU by Kaigai Kohei [1] and by Peter Enderborg [2].
[1] https://lore.kernel.org/selinux/6e2f9128-e191-ebb3-0e87-74bfccb0767f@tycho.nsa.gov/
[2] https://lore.kernel.org/selinux/20180530141104.28569-1-peter.enderborg@sony.com/
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
With the refactoring of the policy load logic in the security
server from the previous change, it is now possible to split out
the committing of the new policy from security_load_policy() and
perform it only after successful updating of selinuxfs. Change
security_load_policy() to return the newly populated policy
data structures to the caller, export selinux_policy_commit()
for external callers, and introduce selinux_policy_cancel() to
provide a way to cancel the policy load in the event of an error
during updating of the selinuxfs directory tree. Further, rework
the interfaces used by selinuxfs to get information from the policy
when creating the new directory tree to take and act upon the
new policy data structure rather than the current/active policy.
Update selinuxfs to use these updated and new interfaces. While
we are here, stop re-creating the policy_capabilities directory
on each policy load since it does not depend on the policy, and
stop trying to create the booleans and classes directories during
the initial creation of selinuxfs since no information is available
until first policy load.
After this change, a failure while updating the booleans and class
directories will cause the entire policy load to be canceled, leaving
the original policy intact, and policy load notifications to userspace
will only happen after a successful completion of updating those
directories. This does not (yet) provide full atomicity with respect
to the updating of the directory trees themselves.
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Presently mdp does not enable any SELinux policy capabilities
in the dummy policy it generates. Thus, policies derived from
it will by default lack various features commonly used in modern
policies such as open permission, extended socket classes, network
peer controls, etc. Split the policy capability definitions out into
their own headers so that we can include them into mdp without pulling in
other kernel headers and extend mdp generate policycap statements for the
policy capabilities known to the kernel. Policy authors may wish to
selectively remove some of these from the generated policy.
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Implement a new, more space-efficient way of storing filename
transitions in the binary policy. The internal structures have already
been converted to this new representation; this patch just implements
reading/writing an equivalent represntation from/to the binary policy.
This new format reduces the size of Fedora policy from 7.6 MB to only
3.3 MB (with policy optimization enabled in both cases). With the
unconfined module disabled, the size is reduced from 3.3 MB to 2.4 MB.
The time to load policy into kernel is also shorter with the new format.
On Fedora Rawhide x86_64 it dropped from 157 ms to 106 ms; without the
unconfined module from 115 ms to 105 ms.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Currently symlinks on kernel filesystems, like sysfs, are labeled on
creation with the parent filesystem root sid.
Allow symlinks to inherit the parent directory context, so fine-grained
kernfs labeling can be applied to symlinks too and checking contexts
doesn't complain about them.
For backward-compatibility this behavior is contained in a new policy
capability: genfs_seclabel_symlinks
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
It fits more naturally in selinux_state, since it reflects also global
state (the enforcing and policyload fields).
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The disabled/enforcing/initialized flags are all accessed concurrently
by threads so use the appropriate accessors that ensure atomicity and
document that it is expected.
Use smp_load/acquire...() helpers (with memory barriers) for the
initialized flag, since it gates access to the rest of the state
structures.
Note that the disabled flag is currently not used for anything other
than avoiding double disable, but it will be used for bailing out of
hooks once security_delete_hooks() is removed.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Randomize the layout of key selinux data structures.
Initially this is applied to the selinux_state, selinux_ss,
policydb, and task_security_struct data structures.
NB To test/use this mechanism, one must install the
necessary build-time dependencies, e.g. gcc-plugin-devel on Fedora,
and enable CONFIG_GCC_PLUGIN_RANDSTRUCT in the kernel configuration.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Kees Cook <keescook@chromium.org>
[PM: double semi-colon fixed]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Rename selinux_enabled to selinux_enabled_boot to make it clear that
it only reflects whether SELinux was enabled at boot. Replace the
references to it in the MAC_STATUS audit log in sel_write_enforce()
with hardcoded "1" values because this code is only reachable if SELinux
is enabled and does not change its value, and update the corresponding
MAC_STATUS audit log in sel_write_disable(). Stop clearing
selinux_enabled in selinux_disable() since it is not used outside of
initialization code that runs before selinux_disable() can be reached.
Mark both selinux_enabled_boot and selinux_enforcing_boot as __initdata
since they are only used in initialization code.
Wrap the disabled field in the struct selinux_state with
CONFIG_SECURITY_SELINUX_DISABLE since it is only used for
runtime disable.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
This replaces the reverse table lookup and reverse cache with a
hashtable which improves cache-miss reverse-lookup times from
O(n) to O(1)* and maintains the same performance as a reverse
cache hit.
This reduces the time needed to add a new sidtab entry from ~500us
to 5us on a Pixel 3 when there are ~10,000 sidtab entries.
The implementation uses the kernel's generic hashtable API,
It uses the context's string represtation as the hash source,
and the kernels generic string hashing algorithm full_name_hash()
to reduce the string to a 32 bit value.
This change also maintains the improvement introduced in
commit ee1a84fdfe ("selinux: overhaul sidtab to fix bug and improve
performance") which removed the need to keep the current sidtab
locked during policy reload. It does however introduce periodic
locking of the target sidtab while converting the hashtable. Sidtab
entries are never modified or removed, so the context struct stored
in the sid_to_context tree can also be used for the context_to_sid
hashtable to reduce memory usage.
This bug was reported by:
- On the selinux bug tracker.
BUG: kernel softlockup due to too many SIDs/contexts #37https://github.com/SELinuxProject/selinux-kernel/issues/37
- Jovana Knezevic on Android's bugtracker.
Bug: 140252993
"During multi-user performance testing, we create and remove users
many times. selinux_android_restorecon_pkgdir goes from 1ms to over
20ms after about 200 user creations and removals. Accumulated over
~280 packages, that adds a significant time to user creation,
making perf benchmarks unreliable."
* Hashtable lookup is only O(1) when n < the number of buckets.
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Reported-by: Jovana Knezevic <jovanak@google.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: subj tweak, removed changelog from patch description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
A policy developer can now specify glblub as a default_range default and
the computed transition will be the intersection of the mls range of
the two contexts.
The glb (greatest lower bound) lub (lowest upper bound) of a range is calculated
as the greater of the low sensitivities and the lower of the high sensitivities
and the and of each category bitmap.
This can be used by MLS solution developers to compute a context that satisfies,
for example, the range of a network interface and the range of a user logging in.
Some examples are:
User Permitted Range | Network Device Label | Computed Label
---------------------|----------------------|----------------
s0-s1:c0.c12 | s0 | s0
s0-s1:c0.c12 | s0-s1:c0.c1023 | s0-s1:c0.c12
s0-s4:c0.c512 | s1-s1:c0.c1023 | s1-s1:c0.c512
s0-s15:c0,c2 | s4-s6:c0.c128 | s4-s6:c0,c2
s0-s4 | s2-s6 | s2-s4
s0-s4 | s5-s8 | INVALID
s5-s8 | s0-s4 | INVALID
Signed-off-by: Joshua Brindle <joshua.brindle@crunchydata.com>
[PM: subject lines and checkpatch.pl fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Since kernfs supports the security xattr handlers, we can simply use
these to determine the inode's context, dropping the need to update it
from kernfs explicitly using a security_inode_notifysecctx() call.
We achieve this by setting a new sbsec flag SE_SBGENFS_XATTR to all
mounts that are known to use kernfs under the hood and then fetching the
xattrs after determining the fallback genfs sid in
inode_doinit_with_dentry() when this flag is set.
This will allow implementing full security xattr support in kernfs and
removing the ...notifysecctx() call in a subsequent patch.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: more manual merge fixups]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Pull vfs mount infrastructure updates from Al Viro:
"The rest of core infrastructure; no new syscalls in that pile, but the
old parts are switched to new infrastructure. At that point
conversions of individual filesystems can happen independently; some
are done here (afs, cgroup, procfs, etc.), there's also a large series
outside of that pile dealing with NFS (quite a bit of option-parsing
stuff is getting used there - it's one of the most convoluted
filesystems in terms of mount-related logics), but NFS bits are the
next cycle fodder.
It got seriously simplified since the last cycle; documentation is
probably the weakest bit at the moment - I considered dropping the
commit introducing Documentation/filesystems/mount_api.txt (cutting
the size increase by quarter ;-), but decided that it would be better
to fix it up after -rc1 instead.
That pile allows to do followup work in independent branches, which
should make life much easier for the next cycle. fs/super.c size
increase is unpleasant; there's a followup series that allows to
shrink it considerably, but I decided to leave that until the next
cycle"
* 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (41 commits)
afs: Use fs_context to pass parameters over automount
afs: Add fs_context support
vfs: Add some logging to the core users of the fs_context log
vfs: Implement logging through fs_context
vfs: Provide documentation for new mount API
vfs: Remove kern_mount_data()
hugetlbfs: Convert to fs_context
cpuset: Use fs_context
kernfs, sysfs, cgroup, intel_rdt: Support fs_context
cgroup: store a reference to cgroup_ns into cgroup_fs_context
cgroup1_get_tree(): separate "get cgroup_root to use" into a separate helper
cgroup_do_mount(): massage calling conventions
cgroup: stash cgroup_root reference into cgroup_fs_context
cgroup2: switch to option-by-option parsing
cgroup1: switch to option-by-option parsing
cgroup: take options parsing into ->parse_monolithic()
cgroup: fold cgroup1_mount() into cgroup1_get_tree()
cgroup: start switching to fs_context
ipc: Convert mqueue fs to fs_context
proc: Add fs_context support to procfs
...
Implement the new mount API LSM hooks for SELinux. At some point the old
hooks will need to be removed.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paul Moore <paul@paul-moore.com>
cc: Stephen Smalley <sds@tycho.nsa.gov>
cc: selinux@tycho.nsa.gov
cc: linux-security-module@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
In case a file has an invalid context set, in an AVC record generated
upon access to such file, the target context is always reported as
unlabeled. This patch adds new optional fields to the AVC record
(srawcon and trawcon) that report the actual context string if it
differs from the one reported in scontext/tcontext. This is useful for
diagnosing SELinux denials involving invalid contexts.
To trigger an AVC that illustrates this situation:
# setenforce 0
# touch /tmp/testfile
# setfattr -n security.selinux -v system_u:object_r:banana_t:s0 /tmp/testfile
# runcon system_u:system_r:sshd_t:s0 cat /tmp/testfile
AVC before:
type=AVC msg=audit(1547801083.248:11): avc: denied { open } for pid=1149 comm="cat" path="/tmp/testfile" dev="tmpfs" ino=6608 scontext=system_u:system_r:sshd_t:s0 tcontext=system_u:object_r:unlabeled_t:s15:c0.c1023 tclass=file permissive=1
AVC after:
type=AVC msg=audit(1547801083.248:11): avc: denied { open } for pid=1149 comm="cat" path="/tmp/testfile" dev="tmpfs" ino=6608 scontext=system_u:system_r:sshd_t:s0 tcontext=system_u:object_r:unlabeled_t:s15:c0.c1023 tclass=file permissive=1 trawcon=system_u:object_r:banana_t:s0
Note that it is also possible to encounter this situation with the
'scontext' field - e.g. when a new policy is loaded while a process is
running, whose context is not valid in the new policy.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1135683
Cc: Daniel Walsh <dwalsh@redhat.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Wrap the AVC state within the selinux_state structure and
pass it explicitly to all AVC functions. The AVC private state
is encapsulated in a selinux_avc structure that is referenced
from the selinux_state.
This change should have no effect on SELinux behavior or
APIs (userspace or LSM).
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Define a selinux state structure (struct selinux_state) for
global SELinux state and pass it explicitly to all security server
functions. The public portion of the structure contains state
that is used throughout the SELinux code, such as the enforcing mode.
The structure also contains a pointer to a selinux_ss structure whose
definition is private to the security server and contains security
server specific state such as the policy database and SID table.
This change should have no effect on SELinux behavior or APIs
(userspace or LSM). It merely wraps SELinux state and passes it
explicitly as needed.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: minor fixups needed due to collisions with the SCTP patches]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Update my email address since epoch.ncsc.mil no longer exists.
MAINTAINERS and CREDITS are already correct.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
As systemd ramps up enabling NNP (NoNewPrivileges) for system services,
it is increasingly breaking SELinux domain transitions for those services
and their descendants. systemd enables NNP not only for services whose
unit files explicitly specify NoNewPrivileges=yes but also for services
whose unit files specify any of the following options in combination with
running without CAP_SYS_ADMIN (e.g. specifying User= or a
CapabilityBoundingSet= without CAP_SYS_ADMIN): SystemCallFilter=,
SystemCallArchitectures=, RestrictAddressFamilies=, RestrictNamespaces=,
PrivateDevices=, ProtectKernelTunables=, ProtectKernelModules=,
MemoryDenyWriteExecute=, or RestrictRealtime= as per the systemd.exec(5)
man page.
The end result is bad for the security of both SELinux-disabled and
SELinux-enabled systems. Packagers have to turn off these
options in the unit files to preserve SELinux domain transitions. For
users who choose to disable SELinux, this means that they miss out on
at least having the systemd-supported protections. For users who keep
SELinux enabled, they may still be missing out on some protections
because it isn't necessarily guaranteed that the SELinux policy for
that service provides the same protections in all cases.
commit 7b0d0b40cd ("selinux: Permit bounded transitions under
NO_NEW_PRIVS or NOSUID.") allowed bounded transitions under NNP in
order to support limited usage for sandboxing programs. However,
defining typebounds for all of the affected service domains
is impractical to implement in policy, since typebounds requires us
to ensure that each domain is allowed everything all of its descendant
domains are allowed, and this has to be repeated for the entire chain
of domain transitions. There is no way to clone all allow rules from
descendants to their ancestors in policy currently, and doing so would
be undesirable even if it were practical, as it requires leaking
permissions to objects and operations into ancestor domains that could
weaken their own security in order to allow them to the descendants
(e.g. if a descendant requires execmem permission, then so do all of
its ancestors; if a descendant requires execute permission to a file,
then so do all of its ancestors; if a descendant requires read to a
symbolic link or temporary file, then so do all of its ancestors...).
SELinux domains are intentionally not hierarchical / bounded in this
manner normally, and making them so would undermine their protections
and least privilege.
We have long had a similar tension with SELinux transitions and nosuid
mounts, albeit not as severe. Users often have had to choose between
retaining nosuid on a mount and allowing SELinux domain transitions on
files within those mounts. This likewise leads to unfortunate tradeoffs
in security.
Decouple NNP/nosuid from SELinux transitions, so that we don't have to
make a choice between them. Introduce a nnp_nosuid_transition policy
capability that enables transitions under NNP/nosuid to be based on
a permission (nnp_transition for NNP; nosuid_transition for nosuid)
between the old and new contexts in addition to the current support
for bounded transitions. Domain transitions can then be allowed in
policy without requiring the parent to be a strict superset of all of
its children.
With this change, systemd unit files can be left unmodified from upstream.
SELinux-disabled and SELinux-enabled users will benefit from retaining any
of the systemd-provided protections. SELinux policy will only need to
be adapted to enable the new policy capability and to allow the
new permissions between domain pairs as appropriate.
NB: Allowing nnp_transition between two contexts opens up the potential
for the old context to subvert the new context by installing seccomp
filters before the execve. Allowing nosuid_transition between two contexts
opens up the potential for a context transition to occur on a file from
an untrusted filesystem (e.g. removable media or remote filesystem). Use
with care.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add a type for Infiniband ports and an access vector for subnet
management packets. Implement the ib_port_smp hook to check that the
caller has permission to send and receive SMPs on the end port specified
by the device name and port. Add interface to query the SID for a IB
port, which walks the IB_PORT ocontexts to find an entry for the
given name and port.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add a type and access vector for PKeys. Implement the ib_pkey_access
hook to check that the caller has permission to access the PKey on the
given subnet prefix. Add an interface to get the PKey SID. Walk the PKey
ocontexts to find an entry for the given subnet prefix and pkey.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Support for Infiniband requires the addition of two new object contexts,
one for infiniband PKeys and another IB Ports. Added handlers to read
and write the new ocontext types when reading or writing a binary policy
representation.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Log the state of SELinux policy capabilities when a policy is loaded.
For each policy capability known to the kernel, log the policy capability
name and the value set in the policy. For policy capabilities that are
set in the loaded policy but unknown to the kernel, log the policy
capability index, since this is the only information presently available
in the policy.
Sample output with a policy created with a new capability defined
that is not known to the kernel:
SELinux: policy capability network_peer_controls=1
SELinux: policy capability open_perms=1
SELinux: policy capability extended_socket_class=1
SELinux: policy capability always_check_network=0
SELinux: policy capability cgroup_seclabel=0
SELinux: unknown policy capability 5
Resolves: https://github.com/SELinuxProject/selinux-kernel/issues/32
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
commit 1ea0ce4069 ("selinux: allow
changing labels for cgroupfs") broke the Android init program,
which looks up security contexts whenever creating directories
and attempts to assign them via setfscreatecon().
When creating subdirectories in cgroup mounts, this would previously
be ignored since cgroup did not support userspace setting of security
contexts. However, after the commit, SELinux would attempt to honor
the requested context on cgroup directories and fail due to permission
denial. Avoid breaking existing userspace/policy by wrapping this change
with a conditional on a new cgroup_seclabel policy capability. This
preserves existing behavior until/unless a new policy explicitly enables
this capability.
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Extend SELinux to support distinctions among all network address families
implemented by the kernel by defining new socket security classes
and mapping to them. Otherwise, many sockets are mapped to the generic
socket class and are indistinguishable in policy. This has come up
previously with regard to selectively allowing access to bluetooth sockets,
and more recently with regard to selectively allowing access to AF_ALG
sockets. Guido Trentalancia submitted a patch that took a similar approach
to add only support for distinguishing AF_ALG sockets, but this generalizes
his approach to handle all address families implemented by the kernel.
Socket security classes are also added for ICMP and SCTP sockets.
Socket security classes were not defined for AF_* values that are reserved
but unimplemented in the kernel, e.g. AF_NETBEUI, AF_SECURITY, AF_ASH,
AF_ECONET, AF_SNA, AF_WANPIPE.
Backward compatibility is provided by only enabling the finer-grained
socket classes if a new policy capability is set in the policy; older
policies will behave as before. The legacy redhat1 policy capability
that was only ever used in testing within Fedora for ptrace_child
is reclaimed for this purpose; as far as I can tell, this policy
capability is not enabled in any supported distro policy.
Add a pair of conditional compilation guards to detect when new AF_* values
are added so that we can update SELinux accordingly rather than having to
belatedly update it long after new address families are introduced.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Remove the SECURITY_SELINUX_POLICYDB_VERSION_MAX Kconfig option
Per: https://github.com/SELinuxProject/selinux/wiki/Kernel-Todo
This was only needed on Fedora 3 and 4 and just causes issues now,
so drop it.
The MAX and MIN should just be whatever the kernel can support.
Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Make validatetrans decisions available through selinuxfs.
"/validatetrans" is added to selinuxfs for this purpose.
This functionality is needed by file system servers
implemented in userspace or kernelspace without the VFS
layer.
Writing "$oldcontext $newcontext $tclass $taskcontext"
to /validatetrans is expected to return 0 if the transition
is allowed and -EPERM otherwise.
Signed-off-by: Andrew Perepechko <anserper@ya.ru>
CC: andrew.perepechko@seagate.com
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
There seems to be a little confusion as to whether the scontext_len
parameter of security_context_to_sid() includes the nul-byte or
not. Reading security_context_to_sid_core(), it seems that the
expectation is that it does not (both the string copying and the test
for scontext_len being zero hint at that).
Introduce the helper security_context_str_to_sid() to do the strlen()
call and fix all callers.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Add extended permissions logic to selinux. Extended permissions
provides additional permissions in 256 bit increments. Extend the
generic ioctl permission check to use the extended permissions for
per-command filtering. Source/target/class sets including the ioctl
permission may additionally include a set of commands. Example:
allowxperm <source> <target>:<class> ioctl unpriv_app_socket_cmds
auditallowxperm <source> <target>:<class> ioctl priv_gpu_cmds
Where unpriv_app_socket_cmds and priv_gpu_cmds are macros
representing commonly granted sets of ioctl commands.
When ioctl commands are omitted only the permissions are checked.
This feature is intended to provide finer granularity for the ioctl
permission that may be too imprecise. For example, the same driver
may use ioctls to provide important and benign functionality such as
driver version or socket type as well as dangerous capabilities such
as debugging features, read/write/execute to physical memory or
access to sensitive data. Per-command filtering provides a mechanism
to reduce the attack surface of the kernel, and limit applications
to the subset of commands required.
The format of the policy binary has been modified to include ioctl
commands, and the policy version number has been incremented to
POLICYDB_VERSION_XPERMS_IOCTL=30 to account for the format
change.
The extended permissions logic is deliberately generic to allow
components to be reused e.g. netlink filters
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Add support for per-file labeling of debugfs files so that
we can distinguish them in policy. This is particularly
important in Android where certain debugfs files have to be writable
by apps and therefore the debugfs directory tree can be read and
searched by all.
Since debugfs is entirely kernel-generated, the directory tree is
immutable by userspace, and the inodes are pinned in memory, we can
simply use the same approach as with proc and label the inodes from
policy based on pathname from the root of the debugfs filesystem.
Generalize the existing labeling support used for proc and reuse it
for debugfs too.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
To increase compiler portability there is <linux/compiler.h> which
provides convenience macros for various gcc constructs. Eg: __packed
for __attribute__((packed)).
This patch is part of a large task I've taken to clean the gcc
specific attributes and use the the macros instead.
Signed-off-by: Gideon Israel Dsouza <gidisrael@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
security_xfrm_policy_alloc can be called in atomic context so the
allocation should be done with GFP_ATOMIC. Add an argument to let the
callers choose the appropriate way. In order to do so a gfp argument
needs to be added to the method xfrm_policy_alloc_security in struct
security_operations and to the internal function
selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic
callers and leave GFP_KERNEL as before for the rest.
The path that needed the gfp argument addition is:
security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security ->
all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) ->
selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only)
Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also
add it to security_context_to_sid which is used inside and prior to this
patch did only GFP_KERNEL allocation. So add gfp argument to
security_context_to_sid and adjust all of its callers as well.
CC: Paul Moore <paul@paul-moore.com>
CC: Dave Jones <davej@redhat.com>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Fan Du <fan.du@windriver.com>
CC: David S. Miller <davem@davemloft.net>
CC: LSM list <linux-security-module@vger.kernel.org>
CC: SELinux list <selinux@tycho.nsa.gov>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Update the policy version (POLICYDB_VERSION_CONSTRAINT_NAMES) to allow
holding of policy source info for constraints.
Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Conflicts:
security/selinux/hooks.c
Pull Eric's existing SELinux tree as there are a number of patches in
there that are not yet upstream. There was some minor fixup needed to
resolve a conflict in security/selinux/hooks.c:selinux_set_mnt_opts()
between the labeled NFS patches and Eric's security_fs_use()
simplification patch.
This reverts commit 308ab70c46.
It breaks my FC6 test box. /dev/pts is not mounted. dmesg says
SELinux: mount invalid. Same superblock, different security settings
for (dev devpts, type devpts)
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
Currently the packet class in SELinux is not checked if there are no
SECMARK rules in the security or mangle netfilter tables. Some systems
prefer that packets are always checked, for example, to protect the system
should the netfilter rules fail to load or if the nefilter rules
were maliciously flushed.
Add the always_check_network policy capability which, when enabled, treats
SECMARK as enabled, even if there are no netfilter SECMARK rules and
treats peer labeling as enabled, even if there is no Netlabel or
labeled IPSEC configuration.
Includes definition of "redhat1" SELinux policy capability, which
exists in the SELinux userpace library, to keep ordering correct.
The SELinux userpace portion of this was merged last year, but this kernel
change fell on the floor.
Signed-off-by: Chris PeBenito <cpebenito@tresys.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Rather than passing pointers to memory locations, strings, and other
stuff just give up on the separation and give security_fs_use the
superblock. It just makes the code easier to read (even if not easier to
reuse on some other OS)
Signed-off-by: Eric Paris <eparis@redhat.com>
Instead of having special code around the 'non-mount' seclabel mount option
just handle it like the mount options.
Signed-off-by: Eric Paris <eparis@redhat.com>
We only have 6 options, so char is good enough, but use a short as that
packs nicely. This shrinks the superblock_security_struct just a little
bit.
Signed-off-by: Eric Paris <eparis@redhat.com>
Just to make it clear that we have mount time options and flags,
separate them. Since I decided to move the non-mount options above
above 0x10, we need a short instead of a char. (x86 padding says
this takes up no additional space as we have a 3byte whole in the
structure)
Signed-off-by: Eric Paris <eparis@redhat.com>
There currently doesn't exist a labeling type that is adequate for use with
labeled NFS. Since NFS doesn't really support xattrs we can't use the use xattr
labeling behavior. For this we developed a new labeling type. The native
labeling type is used solely by NFS to ensure NFS inodes are labeled at runtime
by the NFS code instead of relying on the SELinux security server on the client
end.
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>