kernel_optimize_test/samples/bpf/hbm.h

/* SPDX-License-Identifier: GPL-2.0
 *
 * Copyright (c) 2019 Facebook
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of version 2 of the GNU General Public
 * License as published by the Free Software Foundation.
 *
 * Include file for Host Bandwidth Management (HBM) programs
 */
struct hbm_vqueue {
	struct bpf_spin_lock lock;
	/* 4 byte hole */
	unsigned long long lasttime;	/* In ns */
	int credit;			/* In bytes */
	unsigned int rate;		/* In bytes per NS << 20 */
};
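
The `rate` field above is a fixed-point value: bytes per nanosecond, scaled by 2^20. The following is a minimal sketch of how such a value could be derived from a limit in Mbps and used to turn elapsed time into bytes of credit. The helper names `mbps_to_scaled_rate` and `bytes_for_interval` are hypothetical and do not come from this header; the exact conversion used by the HBM programs may differ.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a limit in Mbps into bytes per ns << 20.
 * 1 Mbps = 125000 bytes/s, and there are 10^9 ns in a second.
 */
static uint32_t mbps_to_scaled_rate(uint64_t mbps)
{
	return (uint32_t)(((mbps * 125000ULL) << 20) / 1000000000ULL);
}

/*
 * Hypothetical helper: bytes that may be sent during delta_ns at the
 * given scaled rate. The shift undoes the 2^20 scaling.
 */
static uint64_t bytes_for_interval(uint64_t delta_ns, uint32_t rate)
{
	return (delta_ns * (uint64_t)rate) >> 20;
}
```

Under these assumptions, a 1000 Mbps limit gives a scaled rate of 131072, and one microsecond at that rate earns 125 bytes of credit, which matches 1 Gbps expressed as 125 bytes per microsecond.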

struct hbm_queue_stats {
	unsigned long rate;		/* In Mbps */
	unsigned long stats:1,		/* get HBM stats (marked, dropped,..) */
		loopback:1,		/* also limit flows using loopback */
		no_cn:1;		/* do not use cn flags */
	/* Packet and byte counters, collected when stats is set */
	unsigned long long pkts_marked;
	unsigned long long bytes_marked;
	unsigned long long pkts_dropped;
	unsigned long long bytes_dropped;
	unsigned long long pkts_total;
	unsigned long long bytes_total;
	unsigned long long firstPacketTime;	/* time of first packet seen */
	unsigned long long lastPacketTime;	/* time of last packet seen */
	unsigned long long pkts_ecn_ce;		/* packets that are ECN CE marked */
	unsigned long long returnValCount[4];	/* distribution of return values */
	/* Running sums, used to report averages */
	unsigned long long sum_cwnd;
	unsigned long long sum_rtt;
	unsigned long long sum_cwnd_cnt;	/* number of cwnd samples summed */
	long long sum_credit;			/* signed: credit may be negative */
};
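
The counters in `struct hbm_queue_stats` are raw totals, so a user program has to derive percentages and averages itself. The sketch below shows one way to do that. `struct hbm_stats_view` is a hypothetical, trimmed-down mirror of the fields used here so that the example compiles on its own, and the helper names are illustrative rather than taken from hbm.c.

```c
#include <stdint.h>

/* Hypothetical mirror of the counters this example reads. */
struct hbm_stats_view {
	uint64_t pkts_marked;
	uint64_t pkts_dropped;
	uint64_t pkts_total;
	uint64_t sum_cwnd;
	uint64_t sum_cwnd_cnt;
};

/* Percentage of num over den; returns 0.0 when den is zero. */
static double hbm_percent(uint64_t num, uint64_t den)
{
	return den ? (100.0 * (double)num) / (double)den : 0.0;
}

/* Average cwnd over all samples; returns 0 when no samples exist. */
static uint64_t hbm_avg_cwnd(const struct hbm_stats_view *s)
{
	return s->sum_cwnd_cnt ? s->sum_cwnd / s->sum_cwnd_cnt : 0;
}
```

Both helpers guard against a zero denominator, which is the state of the counters before any packet has been seen.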

bpf: Sample HBM BPF program to limit egress bw

A cgroup skb BPF program to limit cgroup output bandwidth. It uses a
modified virtual token bucket queue to limit average egress bandwidth.
The implementation uses credits instead of tokens. Negative credits
imply that queueing would have happened (this is a virtual queue, so no
queueing is done by it). However, queueing may occur at the actual qdisc
(which is not used for rate limiting).

This implementation uses 3 thresholds, one to start marking packets and
the other two to drop packets:

                                CREDIT
      - <--------------------------|------------------------> +
            |    |          |      0
            |  Large pkt    |
            |  drop thresh  |
      Small pkt drop        Mark threshold
          thresh

The effect of marking depends on the type of packet:
a) If the packet is ECN enabled, then the packet is ECN CE marked.
   The current mark threshold is tuned for DCTCP.
c) Else, it is dropped if it is a large packet.

If the credit is below the drop threshold, the packet is dropped. Note
that dropping a packet through the BPF program does not trigger CWR
(Congestion Window Reduction) in TCP packets. A future patch will add
support for triggering CWR.

This BPF program actually uses 2 drop thresholds, one threshold for
larger packets (>= 120 bytes) and another for smaller packets. This
protects smaller packets such as SYNs, ACKs, etc.

The default bandwidth limit is set at 1 Gbps, but this can be changed by
a user program through a shared BPF map. In addition, by default this
BPF program does not limit connections using loopback. This behavior can
be overridden by the user program. There is also an option to calculate
some statistics, such as the percentage of packets marked or dropped,
which the user program can access.

A later patch provides such a program (hbm.c).

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-02 04:38:48 +08:00

bpf: Add cn support to hbm_out_kern.c

Update hbm_out_kern.c to support returning cn notifications.
Also updates relevant files to allow disabling cn notifications.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 07:59:39 +08:00

bpf: Add more stats to HBM

Adds more stats to HBM, including the average cwnd and rtt of all TCP
flows, the percentage of packets that are ECN CE marked, and the
distribution of return values.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-29 07:59:40 +08:00
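
The credit scheme described in the "Sample HBM BPF program" commit message (a virtual queue with one mark threshold and two drop thresholds) can be sketched in plain user-space C as follows. This is not the code from hbm_out_kern.c. The threshold values, the `VQ_MAX_CREDIT` cap, the one-second clamp on elapsed time, and the names `struct vq`, `enum vq_verdict`, and `vq_classify` are all assumptions made for illustration; only the 120-byte large-packet boundary and the ordering of the thresholds come from the commit message.

```c
#include <stdint.h>

enum vq_verdict { VQ_PASS, VQ_MARK_CE, VQ_DROP };

/* Hypothetical values; the real thresholds are not given in this file. */
#define VQ_MAX_CREDIT		100000
#define VQ_MARK_THRESH		(-40000)
#define VQ_LARGE_DROP_THRESH	(-80000)
#define VQ_SMALL_DROP_THRESH	(-120000)
#define VQ_LARGE_PKT_BYTES	120	/* from the commit message */

/* Hypothetical stand-in for the lock-free part of struct hbm_vqueue. */
struct vq {
	uint64_t lasttime;	/* In ns */
	int credit;		/* In bytes */
	unsigned int rate;	/* In bytes per ns << 20 */
};

static enum vq_verdict vq_classify(struct vq *q, uint64_t now_ns,
				   int len, int ecn_capable)
{
	uint64_t delta = now_ns - q->lasttime;
	int64_t credit;
	int large = len >= VQ_LARGE_PKT_BYTES;

	/* Assumed clamp so that delta * rate cannot overflow 64 bits. */
	if (delta > 1000000000ULL)
		delta = 1000000000ULL;

	/* Earn credit for the elapsed time, up to an assumed cap. */
	credit = q->credit + (int64_t)((delta * q->rate) >> 20);
	if (credit > VQ_MAX_CREDIT)
		credit = VQ_MAX_CREDIT;

	/* Charge the packet; negative credit means a virtual backlog. */
	credit -= len;
	q->lasttime = now_ns;
	q->credit = (int)credit;

	/* Small packets are only dropped past the lower threshold. */
	if (credit < VQ_SMALL_DROP_THRESH ||
	    (large && credit < VQ_LARGE_DROP_THRESH))
		return VQ_DROP;

	/* Between mark and drop thresholds: mark ECN, else drop if large. */
	if (credit < VQ_MARK_THRESH) {
		if (ecn_capable)
			return VQ_MARK_CE;
		if (large)
			return VQ_DROP;
	}
	return VQ_PASS;
}
```

In this sketch a dropped packet is still charged against the credit. Whether the real program does the same is not stated in the commit message, so treat that as a design choice of the example.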