Pull crypto updates from Herbert Xu:
"Here is the crypto update for 4.12:
API:
- Add batch registration for acomp/scomp
- Change acomp testing to non-unique compressed result
- Extend algorithm name limit to 128 bytes
- Require setkey before accept(2) in algif_aead
Algorithms:
- Add support for deflate rfc1950 (zlib)
Drivers:
- Add accelerated crct10dif for powerpc
- Add crc32 in stm32
- Add sha384/sha512 in ccp
- Add 3des/gcm(aes) for v5 devices in ccp
- Add Queue Interface (QI) backend support in caam
- Add new Exynos RNG driver
- Add ThunderX ZIP driver
- Add driver for hardware random generator on MT7623 SoC"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (101 commits)
crypto: stm32 - Fix OF module alias information
crypto: algif_aead - Require setkey before accept(2)
crypto: scomp - add support for deflate rfc1950 (zlib)
crypto: scomp - allow registration of multiple scomps
crypto: ccp - Change ISR handler method for a v5 CCP
crypto: ccp - Change ISR handler method for a v3 CCP
crypto: crypto4xx - rename ce_ring_contol to ce_ring_control
crypto: testmgr - Allow ecb(cipher_null) in FIPS mode
Revert "crypto: arm64/sha - Add constant operand modifier to ASM_EXPORT"
crypto: ccp - Disable interrupts early on unload
crypto: ccp - Use only the relevant interrupt bits
hwrng: mtk - Add driver for hardware random generator on MT7623 SoC
dt-bindings: hwrng: Add Mediatek hardware random generator bindings
crypto: crct10dif-vpmsum - Fix missing preempt_disable()
crypto: testmgr - replace compression known answer test
crypto: acomp - allow registration of multiple acomps
hwrng: n2 - Use devm_kcalloc() in n2rng_probe()
crypto: chcr - Fix error handling related to 'chcr_alloc_shash'
padata: get_next is never NULL
crypto: exynos - Add new Exynos RNG driver
...
Some cipher implementations will crash if you try to use them
without calling setkey first. This patch adds a check so that
the accept(2) call will fail with -ENOKEY if setkey hasn't been
done on the socket yet.
Fixes: 400c40cf78 ("crypto: algif - add AEAD support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add scomp backend for zlib-deflate compression algorithm.
This backend outputs data using the format defined in rfc1950
(raw deflate surrounded by zlib header and footer).
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add crypto_register_scomps and crypto_unregister_scomps to allow
the registration of multiple implementations with one call.
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The cipher_null is not a real cipher, FIPS mode should not restrict its use.
It is used for several tests (for example in cryptsetup testsuite) and also
temporarily for reencryption of not yet encrypted device in cryptsetup-reencrypt tool.
Problem is easily reproducible with
cryptsetup benchmark -c null
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Acked-by: Stephan Müller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Compression implementations might return valid outputs that
do not match what specified in the test vectors.
For this reason, the testmgr might report that a compression
implementation failed the test even if the data produced
by the compressor is correct.
This implements a decompress-and-verify test for acomp
compression tests rather than a known answer test.
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add crypto_register_acomps and crypto_unregister_acomps to allow
the registration of multiple implementations with one call.
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fixes from Herbert Xu:
"This fixes the following problems:
- regression in new XTS/LRW code when used with async crypto
- long-standing bug in ahash API when used with certain algos
- bogus memory dereference in async algif_aead with certain algos"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: algif_aead - Fix bogus request dereference in completion function
crypto: ahash - Fix EINPROGRESS notification callback
crypto: lrw - Fix use-after-free on EINPROGRESS
crypto: xts - Fix use-after-free on EINPROGRESS
Decompress function in LZ4 library is supposed to return an error code or
negative result. But, it returns -1 when any error is detected. Return
error code when the library returns negative value.
Signed-off-by: Myungho Jung <mhjungk@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch removes the hard-coded 64-byte limit on the length
of the algorithm name through bind(2). The address length can
now exceed that. The user-space structure remains unchanged.
In order to use a longer name simply extend the salg_name array
beyond its defined 64 bytes length.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch hard-codes CRYPTO_MAX_NAME in the user-space API to
64, which is the current value of CRYPTO_MAX_ALG_NAME. This patch
also replaces all remaining occurences of CRYPTO_MAX_ALG_NAME
in the user-space API with CRYPTO_MAX_NAME.
This way the user-space API will not be modified when we raise
the value of CRYPTO_MAX_ALG_NAME.
Furthermore, the code has been updated to handle names longer than
the user-space API. They will be truncated.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Tested-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
The algif_aead completion function tries to deduce the aead_request
from the crypto_async_request argument. This is broken because
the API does not guarantee that the same request will be pased to
the completion function. Only the value of req->data can be used
in the completion function.
This patch fixes it by storing a pointer to sk in areq and using
that instead of passing in sk through req->data.
Fixes: 83094e5e9e ("crypto: af_alg - add async support to...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The ahash API modifies the request's callback function in order
to clean up after itself in some corner cases (unaligned final
and missing finup).
When the request is complete ahash will restore the original
callback and everything is fine. However, when the request gets
an EBUSY on a full queue, an EINPROGRESS callback is made while
the request is still ongoing.
In this case the ahash API will incorrectly call its own callback.
This patch fixes the problem by creating a temporary request
object on the stack which is used to relay EINPROGRESS back to
the original completion function.
This patch also adds code to preserve the original flags value.
Fixes: ab6bf4e5e5 ("crypto: hash - Fix the pointer voodoo in...")
Cc: <stable@vger.kernel.org>
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When we get an EINPROGRESS completion in lrw, we will end up marking
the request as done and freeing it. This then blows up when the
request is really completed as we've already freed the memory.
Fixes: 700cb3f5fe ("crypto: lrw - Convert to skcipher")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When we get an EINPROGRESS completion in xts, we will end up marking
the request as done and freeing it. This then blows up when the
request is really completed as we've already freed the memory.
Fixes: f1c131b454 ("crypto: xts - Convert to skcipher")
Cc: <stable@vger.kernel.org>
Reported-by: Nathan Royce <nroycea+kernel@gmail.com>
Reported-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Krzysztof Kozlowski <krzk@kernel.org>
Since the gf128mul_x_ble function used by xts.c is now defined inline
in the header file, the XTS module no longer depends on gf128mul.
Therefore, the 'select CRYPTO_GF128MUL' line can be safely removed.
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Reviewd-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, gf128mul_x_ble works with pointers to be128, even though it
actually interprets the words as little-endian. Consequently, it uses
cpu_to_le64/le64_to_cpu on fields of type __be64, which is incorrect.
This patch fixes that by changing the function to accept pointers to
le128 and updating all users accordingly.
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Reviewd-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The gf128mul_x_ble function is currently defined in gf128mul.c, because
it depends on the gf128mul_table_be multiplication table.
However, since the function is very small and only uses two values from
the table, it is better for it to be defined as inline function in
gf128mul.h. That way, the function can be inlined by the compiler for
better performance.
For consistency, the other gf128mul_x_* functions are also moved to the
header file. In addition, the code is rewritten to be constant-time.
After this change, the speed of the generic 'xts(aes)' implementation
increased from ~225 MiB/s to ~235 MiB/s (measured using 'cryptsetup
benchmark -c aes-xts-plain64' on an Intel system with CRYPTO_AES_X86_64
and CRYPTO_AES_NI_INTEL disabled).
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Reviewd-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- memory corruption when kmalloc fails in xts/lrw
- mark some CCP DMA channels as private
- fix reordering race in padata
- regression in omap-rng DT description"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: xts,lrw - fix out-of-bounds write after kmalloc failure
crypto: ccp - Make some CCP DMA channels private
padata: avoid race in reordering
dt-bindings: rng: clocks property on omap_rng not always mandatory
An SGL to be initialized only once even when its buffers are written
to several times.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
3DES is missing the fips_allowed flag for CTR mode.
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The md5_transform function is no longer used any where in the tree,
except for the crypto api's actual implementation of md5, so we can drop
the function from lib and put it as a static function of the crypto
file, where it belongs. There should be no new users of md5_transform,
anyway, since there are more modern ways of doing what it once achieved.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
vpmsum implementations often don't kick in for short test vectors.
This is a simple test module that does a configurable number of
random tests, each up to 64kB and each with random offsets.
Both CRC-T10DIF and CRC32C are tested.
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
T10DIF is a CRC16 used heavily in NVMe.
It turns out we can accelerate it with a CRC32 library and a few
little tricks.
Provide the accelerator based the refactored CRC32 code.
Cc: Anton Blanchard <anton@samba.org>
Thanks-to: Hong Bo Peng <penghb@cn.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the generic XTS and LRW algorithms, for input data > 128 bytes, a
temporary buffer is allocated to hold the values to be XOR'ed with the
data before and after encryption or decryption. If the allocation
fails, the fixed-size buffer embedded in the request buffer is meant to
be used as a fallback --- resulting in more calls to the ECB algorithm,
but still producing the correct result. However, we weren't correctly
limiting subreq->cryptlen in this case, resulting in pre_crypt()
overrunning the embedded buffer. Fix this by setting subreq->cryptlen
correctly.
Fixes: f1c131b454 ("crypto: xts - Convert to skcipher")
Fixes: 700cb3f5fe ("crypto: lrw - Convert to skcipher")
Cc: stable@vger.kernel.org # v4.10+
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Lockdep issues a circular dependency warning when AFS issues an operation
through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.
The theory lockdep comes up with is as follows:
(1) If the pagefault handler decides it needs to read pages from AFS, it
calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
creating a call requires the socket lock:
mmap_sem must be taken before sk_lock-AF_RXRPC
(2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind()
binds the underlying UDP socket whilst holding its socket lock.
inet_bind() takes its own socket lock:
sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET
(3) Reading from a TCP socket into a userspace buffer might cause a fault
and thus cause the kernel to take the mmap_sem, but the TCP socket is
locked whilst doing this:
sk_lock-AF_INET must be taken before mmap_sem
However, lockdep's theory is wrong in this instance because it deals only
with lock classes and not individual locks. The AF_INET lock in (2) isn't
really equivalent to the AF_INET lock in (3) as the former deals with a
socket entirely internal to the kernel that never sees userspace. This is
a limitation in the design of lockdep.
Fix the general case by:
(1) Double up all the locking keys used in sockets so that one set are
used if the socket is created by userspace and the other set is used
if the socket is created by the kernel.
(2) Store the kern parameter passed to sk_alloc() in a variable in the
sock struct (sk_kern_sock). This informs sock_lock_init(),
sock_init_data() and sk_clone_lock() as to the lock keys to be used.
Note that the child created by sk_clone_lock() inherits the parent's
kern setting.
(3) Add a 'kern' parameter to ->accept() that is analogous to the one
passed in to ->create() that distinguishes whether kernel_accept() or
sys_accept4() was the caller and can be passed to sk_alloc().
Note that a lot of accept functions merely dequeue an already
allocated socket. I haven't touched these as the new socket already
exists before we get the parameter.
Note also that there are a couple of places where I've made the accepted
socket unconditionally kernel-based:
irda_accept()
rds_rcp_accept_one()
tcp_accept_from_sock()
because they follow a sock_create_kern() and accept off of that.
Whilst creating this, I noticed that lustre and ocfs don't create sockets
through sock_create_kern() and thus they aren't marked as for-kernel,
though they appear to be internal. I wonder if these should do that so
that they use the new set of lock keys.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When requesting a fallback algorithm, we should propagate the
NEED_FALLBACK bit when search for the underlying algorithm.
This will prevents drivers from allocating unnecessary fallbacks that
are never called. For instance, currently the vmx-crypto driver will use
the following chain of calls when calling the fallback implementation:
p8_aes_ctr -> ctr(p8_aes) -> aes-generic
However p8_aes will always delegate its calls to aes-generic. With this
patch, p8_aes_ctr will be able to use ctr(aes-generic) directly as its
fallback. The same applies to aes_s390.
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When requesting a fallback algorithm, we should propagate the
NEED_FALLBACK bit when search for the underlying algorithm.
This will prevents drivers from allocating unnecessary fallbacks that
are never called. For instance, currently the vmx-crypto driver will use
the following chain of calls when calling the fallback implementation:
p8_aes_cbc -> cbc(p8_aes) -> aes-generic
However p8_aes will always delegate its calls to aes-generic. With this
patch, p8_aes_cbc will be able to use cbc(aes-generic) directly as its
fallback. The same applies to aes_s390.
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cryptographic test vectors should never be modified, so constify them to
enforce this at both compile-time and run-time. This moves a significant
amount of data from .data to .rodata when the crypto tests are enabled.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Constify the buffer passed to crypto_kpp_set_secret() and
kpp_alg.set_secret, since it is never modified.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
To prevent unnecessary branching, mark the exit condition of the
primary loop as likely(), given that a carry in a 32-bit counter
occurs very rarely.
On arm64, the resulting code is emitted by GCC as
9a8: cmp w1, #0x3
9ac: add x3, x0, w1, uxtw
9b0: b.ls 9e0 <crypto_inc+0x38>
9b4: ldr w2, [x3,#-4]!
9b8: rev w2, w2
9bc: add w2, w2, #0x1
9c0: rev w4, w2
9c4: str w4, [x3]
9c8: cbz w2, 9d0 <crypto_inc+0x28>
9cc: ret
where the two remaining branch conditions (one for size < 4 and one for
the carry) are statically predicted as non-taken, resulting in optimal
execution in the vast majority of cases.
Also, replace the open coded alignment test with IS_ALIGNED().
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Constify the multiplication tables passed to the 4k and 64k
multiplication functions, as they are not modified by these functions.
Cc: Alex Cope <alexcope@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Though the GF(2^128) byte overflow tables were named the "lle" and "bbe"
tables, they are not actually tied to these element formats
specifically, but rather to particular a "bit endianness". For example,
the bbe table is actually used for both bbe and ble multiplication.
Therefore, rename the tables to "le" and "be" and update the comment to
explain this.
Cc: Alex Cope <alexcope@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The xx() macro serves no purpose and can be removed.
Cc: Alex Cope <alexcope@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix incorrect references to GF(128) instead of GF(2^128), as these are
two entirely different fields, and fix a few other incorrect comments.
Cc: Alex Cope <alexcope@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fixes from Herbert Xu:
- vmalloc stack regression in CCM
- Build problem in CRC32 on ARM
- Memory leak in cavium
- Missing Kconfig dependencies in atmel and mediatek
- XTS Regression on some platforms (s390 and ppc)
- Memory overrun in CCM test vector
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: vmx - Use skcipher for xts fallback
crypto: vmx - Use skcipher for cbc fallback
crypto: testmgr - Pad aes_ccm_enc_tv_template vector
crypto: arm/crc32 - add build time test for CRC instruction support
crypto: arm/crc32 - fix build error with outdated binutils
crypto: ccm - move cbcmac input off the stack
crypto: xts - Propagate NEED_FALLBACK bit
crypto: api - Add crypto_requires_off helper
crypto: atmel - CRYPTO_DEV_MEDIATEK should depend on HAS_DMA
crypto: atmel - CRYPTO_DEV_ATMEL_TDES and CRYPTO_DEV_ATMEL_SHA should depend on HAS_DMA
crypto: cavium - fix leak on curr if curr->head fails to be allocated
crypto: cavium - Fix couple of static checker errors
We are going to split <linux/sched/stat.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/stat.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix up affected files that include this signal functionality via sched.h.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We are going to move scheduler ABI details to <uapi/linux/sched/types.h>,
which will be used from a number of .c files.
Create empty placeholder header that maps to <linux/types.h>.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Running with KASAN and crypto tests currently gives
BUG: KASAN: global-out-of-bounds in __test_aead+0x9d9/0x2200 at addr ffffffff8212fca0
Read of size 16 by task cryptomgr_test/1107
Address belongs to variable 0xffffffff8212fca0
CPU: 0 PID: 1107 Comm: cryptomgr_test Not tainted 4.10.0+ #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
Call Trace:
dump_stack+0x63/0x8a
kasan_report.part.1+0x4a7/0x4e0
? __test_aead+0x9d9/0x2200
? crypto_ccm_init_crypt+0x218/0x3c0 [ccm]
kasan_report+0x20/0x30
check_memory_region+0x13c/0x1a0
memcpy+0x23/0x50
__test_aead+0x9d9/0x2200
? kasan_unpoison_shadow+0x35/0x50
? alg_test_akcipher+0xf0/0xf0
? crypto_skcipher_init_tfm+0x2e3/0x310
? crypto_spawn_tfm2+0x37/0x60
? crypto_ccm_init_tfm+0xa9/0xd0 [ccm]
? crypto_aead_init_tfm+0x7b/0x90
? crypto_alloc_tfm+0xc4/0x190
test_aead+0x28/0xc0
alg_test_aead+0x54/0xd0
alg_test+0x1eb/0x3d0
? alg_find_test+0x90/0x90
? __sched_text_start+0x8/0x8
? __wake_up_common+0x70/0xb0
cryptomgr_test+0x4d/0x60
kthread+0x173/0x1c0
? crypto_acomp_scomp_free_ctx+0x60/0x60
? kthread_create_on_node+0xa0/0xa0
ret_from_fork+0x2c/0x40
Memory state around the buggy address:
ffffffff8212fb80: 00 00 00 00 01 fa fa fa fa fa fa fa 00 00 00 00
ffffffff8212fc00: 00 01 fa fa fa fa fa fa 00 00 00 00 01 fa fa fa
>ffffffff8212fc80: fa fa fa fa 00 05 fa fa fa fa fa fa 00 00 00 00
^
ffffffff8212fd00: 01 fa fa fa fa fa fa fa 00 00 00 00 01 fa fa fa
ffffffff8212fd80: fa fa fa fa 00 00 00 00 00 05 fa fa fa fa fa fa
This always happens on the same IV which is less than 16 bytes.
Per Ard,
"CCM IVs are 16 bytes, but due to the way they are constructed
internally, the final couple of bytes of input IV are dont-cares.
Apparently, we do read all 16 bytes, which triggers the KASAN errors."
Fix this by padding the IV with null bytes to be at least 16 bytes.
Cc: stable@vger.kernel.org
Fixes: 0bc5a6c5c7 ("crypto: testmgr - Disable rfc4309 test and convert
test vectors")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit f15f05b0a5 ("crypto: ccm - switch to separate cbcmac driver")
refactored the CCM driver to allow separate implementations of the
underlying MAC to be provided by a platform. However, in doing so, it
moved some data from the linear region to the stack, which violates the
SG constraints when the stack is virtually mapped.
So move idata/odata back to the request ctx struct, of which we can
reasonably expect that it has been allocated using kmalloc() et al.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Fixes: f15f05b0a5 ("crypto: ccm - switch to separate cbcmac driver")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When we're used as a fallback algorithm, we should propagate
the NEED_FALLBACK bit when searching for the underlying ECB mode.
This just happens to fix a hang too because otherwise the search
may end up loading the same module that triggered this XTS creation.
Cc: stable@vger.kernel.org #4.10
Fixes: f1c131b454 ("crypto: xts - Convert to skcipher")
Reported-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Update the crypto modules using LZ4 compression as well as the test
cases in testmgr.h to work with the new LZ4 module version.
Link: http://lkml.kernel.org/r/1486321748-19085-4-git-send-email-4sschmid@informatik.uni-hamburg.de
Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
Cc: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Rui Salvaterra <rsalvaterra@gmail.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since the
commit f1c131b454
crypto: xts - Convert to skcipher
the XTS mode is based on ECB, so the mode must select
ECB otherwise it can fail to initialize.
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The CCM driver forces 32-bit alignment even if the underlying ciphers
don't care about alignment. This is because crypto_xor() used to require
this, but since this is no longer the case, drop the hardcoded minimum
of 32 bits.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The CCM driver was recently updated to defer the MAC part of the algorithm
to a dedicated crypto transform, and a template for instantiating such
transforms was added at the same time.
However, this new cbcmac template fails to take the alignmask of the
encapsulated cipher into account, which may result in buffer addresses
being passed down that are not sufficiently aligned.
So update the code to ensure that the digest buffer in the desc ctx
appears at a sufficiently aligned offset, and tweak the code so that all
calls to crypto_cipher_encrypt_one() operate on this buffer exclusively.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Instead of unconditionally forcing 4 byte alignment for all generic
chaining modes that rely on crypto_xor() or crypto_inc() (which may
result in unnecessary copying of data when the underlying hardware
can perform unaligned accesses efficiently), make those functions
deal with unaligned input explicitly, but only if the Kconfig symbol
HAVE_EFFICIENT_UNALIGNED_ACCESS is set. This will allow us to drop
the alignmasks from the CBC, CMAC, CTR, CTS, PCBC and SEQIV drivers.
For crypto_inc(), this simply involves making the 4-byte stride
conditional on HAVE_EFFICIENT_UNALIGNED_ACCESS being set, given that
it typically operates on 16 byte buffers.
For crypto_xor(), an algorithm is implemented that simply runs through
the input using the largest strides possible if unaligned accesses are
allowed. If they are not, an optimal sequence of memory accesses is
emitted that takes the relative alignment of the input buffers into
account, e.g., if the relative misalignment of dst and src is 4 bytes,
the entire xor operation will be completed using 4 byte loads and stores
(modulo unaligned bits at the start and end). Note that all expressions
involving misalign are simply eliminated by the compiler when
HAVE_EFFICIENT_UNALIGNED_ACCESS is defined.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
An ancient gcc bug (first reported in 2003) has apparently resurfaced
on MIPS, where kernelci.org reports an overly large stack frame in the
whirlpool hash algorithm:
crypto/wp512.c:987:1: warning: the frame size of 1112 bytes is larger than 1024 bytes [-Wframe-larger-than=]
With some testing in different configurations, I'm seeing large
variations in stack frames size up to 1500 bytes for what should have
around 300 bytes at most. I also checked the reference implementation,
which is essentially the same code but also comes with some test and
benchmarking infrastructure.
It seems that recent compiler versions on at least arm, arm64 and powerpc
have a partial fix for this problem, but enabling "-fsched-pressure", but
even with that fix they suffer from the issue to a certain degree. Some
testing on arm64 shows that the time needed to hash a given amount of
data is roughly proportional to the stack frame size here, which makes
sense given that the wp512 implementation is doing lots of loads for
table lookups, and the problem with the overly large stack is a result
of doing a lot more loads and stores for spilled registers (as seen from
inspecting the object code).
Disabling -fschedule-insns consistently fixes the problem for wp512,
in my collection of cross-compilers, the results are consistently better
or identical when comparing the stack sizes in this function, though
some architectures (notable x86) have schedule-insns disabled by
default.
The four columns are:
default: -O2
press: -O2 -fsched-pressure
nopress: -O2 -fschedule-insns -fno-sched-pressure
nosched: -O2 -no-schedule-insns (disables sched-pressure)
default press nopress nosched
alpha-linux-gcc-4.9.3 1136 848 1136 176
am33_2.0-linux-gcc-4.9.3 2100 2076 2100 2104
arm-linux-gnueabi-gcc-4.9.3 848 848 1048 352
cris-linux-gcc-4.9.3 272 272 272 272
frv-linux-gcc-4.9.3 1128 1000 1128 280
hppa64-linux-gcc-4.9.3 1128 336 1128 184
hppa-linux-gcc-4.9.3 644 308 644 276
i386-linux-gcc-4.9.3 352 352 352 352
m32r-linux-gcc-4.9.3 720 656 720 268
microblaze-linux-gcc-4.9.3 1108 604 1108 256
mips64-linux-gcc-4.9.3 1328 592 1328 208
mips-linux-gcc-4.9.3 1096 624 1096 240
powerpc64-linux-gcc-4.9.3 1088 432 1088 160
powerpc-linux-gcc-4.9.3 1080 584 1080 224
s390-linux-gcc-4.9.3 456 456 624 360
sh3-linux-gcc-4.9.3 292 292 292 292
sparc64-linux-gcc-4.9.3 992 240 992 208
sparc-linux-gcc-4.9.3 680 592 680 312
x86_64-linux-gcc-4.9.3 224 240 272 224
xtensa-linux-gcc-4.9.3 1152 704 1152 304
aarch64-linux-gcc-7.0.0 224 224 1104 208
arm-linux-gnueabi-gcc-7.0.1 824 824 1048 352
mips-linux-gcc-7.0.0 1120 648 1120 272
x86_64-linux-gcc-7.0.1 240 240 304 240
arm-linux-gnueabi-gcc-4.4.7 840 392
arm-linux-gnueabi-gcc-4.5.4 784 728 784 320
arm-linux-gnueabi-gcc-4.6.4 736 728 736 304
arm-linux-gnueabi-gcc-4.7.4 944 784 944 352
arm-linux-gnueabi-gcc-4.8.5 464 464 760 352
arm-linux-gnueabi-gcc-4.9.3 848 848 1048 352
arm-linux-gnueabi-gcc-5.3.1 824 824 1064 336
arm-linux-gnueabi-gcc-6.1.1 808 808 1056 344
arm-linux-gnueabi-gcc-7.0.1 824 824 1048 352
Trying the same test for serpent-generic, the picture is a bit different,
and while -fno-schedule-insns is generally better here than the default,
-fsched-pressure wins overall, so I picked that instead.
default press nopress nosched
alpha-linux-gcc-4.9.3 1392 864 1392 960
am33_2.0-linux-gcc-4.9.3 536 524 536 528
arm-linux-gnueabi-gcc-4.9.3 552 552 776 536
cris-linux-gcc-4.9.3 528 528 528 528
frv-linux-gcc-4.9.3 536 400 536 504
hppa64-linux-gcc-4.9.3 524 208 524 480
hppa-linux-gcc-4.9.3 768 472 768 508
i386-linux-gcc-4.9.3 564 564 564 564
m32r-linux-gcc-4.9.3 712 576 712 532
microblaze-linux-gcc-4.9.3 724 392 724 512
mips64-linux-gcc-4.9.3 720 384 720 496
mips-linux-gcc-4.9.3 728 384 728 496
powerpc64-linux-gcc-4.9.3 704 304 704 480
powerpc-linux-gcc-4.9.3 704 296 704 480
s390-linux-gcc-4.9.3 560 560 592 536
sh3-linux-gcc-4.9.3 540 540 540 540
sparc64-linux-gcc-4.9.3 544 352 544 496
sparc-linux-gcc-4.9.3 544 344 544 496
x86_64-linux-gcc-4.9.3 528 536 576 528
xtensa-linux-gcc-4.9.3 752 544 752 544
aarch64-linux-gcc-7.0.0 432 432 656 480
arm-linux-gnueabi-gcc-7.0.1 616 616 808 536
mips-linux-gcc-7.0.0 720 464 720 488
x86_64-linux-gcc-7.0.1 536 528 600 536
arm-linux-gnueabi-gcc-4.4.7 592 440
arm-linux-gnueabi-gcc-4.5.4 776 448 776 544
arm-linux-gnueabi-gcc-4.6.4 776 448 776 544
arm-linux-gnueabi-gcc-4.7.4 768 448 768 544
arm-linux-gnueabi-gcc-4.8.5 488 488 776 544
arm-linux-gnueabi-gcc-4.9.3 552 552 776 536
arm-linux-gnueabi-gcc-5.3.1 552 552 776 536
arm-linux-gnueabi-gcc-6.1.1 560 560 776 536
arm-linux-gnueabi-gcc-7.0.1 616 616 808 536
I did not do any runtime tests with serpent, so it is possible that stack
frame size does not directly correlate with runtime performance here and
it actually makes things worse, but it's more likely to help here, and
the reduced stack frame size is probably enough reason to apply the patch,
especially given that the crypto code is often used in deep call chains.
Link: https://kernelci.org/build/id/58797d7559b5149efdf6c3a9/logs/
Link: http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11488
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>