Commit 2b14e1e

Merge branch 'net-sctp-Avoid-allocating-high-order-memory-with-kmalloc'
Konstantin Khorenko says:

====================
net/sctp: Avoid allocating high order memory with kmalloc()

Each SCTP association can have up to 65535 input and output streams.
For each stream type an array of sctp_stream_in or sctp_stream_out
structures is allocated using the kmalloc_array() function. This function
allocates physically contiguous memory regions, so this can lead
to allocation of memory regions of very high order, i.e.:

  sizeof(struct sctp_stream_out) == 24,
  ((65535 * 24) / 4096) == 383 memory pages (4096 bytes per page),
  which means 9th memory order.

This can lead to memory allocation failures on systems
under memory stress.

We actually do not need these arrays of memory to be physically
contiguous. A possible simple solution would be to use kvmalloc()
instead of kmalloc(), as kvmalloc() can allocate physically scattered
pages if contiguous pages are not available. But the problem
is that the allocation can happen in a softirq context with
the GFP_ATOMIC flag set, and kvmalloc() cannot be used in this scenario.

So the other possible solution is to use flexible arrays instead of
contiguous arrays of memory so that the memory would be allocated
on a per-page basis.

This patchset replaces kmalloc_array() with flex_array usage.
It consists of two parts:

  * First patch is preparatory - it mechanically wraps all direct
    access to assoc->stream.out[] and assoc->stream.in[] arrays
    with SCTP_SO() and SCTP_SI() wrappers so that later a direct
    array access could be easily changed to an access to a
    flex_array (or any other possible alternative).
  * Second patch replaces kmalloc_array() with flex_array usage.

v2 changes:
 sctp_stream_in() users are updated to provide stream as an argument,
 sctp_stream_{in,out}_ptr() are now just sctp_stream_{in,out}().

v3 changes:
 Move type changes struct sctp_stream_out -> flex_array to next patch.
 Make sctp_stream_{in,out}() static inline and move them to a header.

Performance results (single stream):
====================================
  * Kernel: v4.18-rc6 - stock and with 2 patches from Oleg (earlier in this thread)
  * Node: CPU (8 cores): Intel(R) Xeon(R) CPU E31230 @ 3.20GHz
          RAM: 32 Gb

  * netperf: taken from https://github.com/HewlettPackard/netperf.git,
             compiled from sources with sctp support
  * netperf server and client are run on the same node
  * ip link set lo mtu 1500

The script used to run tests:
 # cat run_tests.sh
 #!/bin/bash

for test in SCTP_STREAM SCTP_STREAM_MANY SCTP_RR SCTP_RR_MANY; do
  echo "TEST: $test";
  for i in `seq 1 3`; do
    echo "Iteration: $i";
    set -x
    netperf -t $test -H localhost -p 22222 -S 200000,200000 -s 200000,200000 \
            -l 60 -- -m 1452;
    set +x
  done
done
================================================

Results (a bit reformatted to be more readable):

Recv   Send   Send
Socket Socket Message Elapsed
Size   Size   Size    Time    Throughput (10^6bits/sec)
bytes  bytes  bytes   secs.   v4.18-rc7  v4.18-rc7 + fixes

TEST: SCTP_STREAM
212992 212992 1452    60.21   1125.52    1247.04
212992 212992 1452    60.20   1376.38    1149.95
212992 212992 1452    60.20   1131.40    1163.85
TEST: SCTP_STREAM_MANY
212992 212992 1452    60.00   1111.00    1310.05
212992 212992 1452    60.00   1188.55    1130.50
212992 212992 1452    60.00   1108.06    1162.50

===========
Local /Remote
Socket Size   Request Resp.  Elapsed Trans. Rate (per sec)
Send   Recv   Size    Size   Time
bytes  Bytes  bytes   bytes  secs.   v4.18-rc7  v4.18-rc7 + fixes

TEST: SCTP_RR
212992 212992 1       1      60.00   45486.98   46089.43
212992 212992 1       1      60.00   45584.18   45994.21
212992 212992 1       1      60.00   45703.86   45720.84
TEST: SCTP_RR_MANY
212992 212992 1       1      60.00   40.75      40.77
212992 212992 1       1      60.00   40.58      40.08
212992 212992 1       1      60.00   39.98      39.97

Performance results for many streams:
=====================================
  * Kernel: v4.18-rc8 - stock and with 2 patches v3
  * Node: CPU (8 cores): Intel(R) Xeon(R) CPU E31230 @ 3.20GHz
          RAM: 32 Gb

  * sctp_test: https://github.com/sctp/lksctp-tools
  * both server and client are run on the same node
  * ip link set lo mtu 1500
  * sysctl -w vm.max_map_count=65530000 (need it to make memory fragmented)

The script used to run tests:
=============================
 # cat run_sctp_test.sh
 #!/bin/bash

set -x

uname -r
ip link set lo mtu 1500
swapoff -a

free
cat /proc/buddyinfo

./src/apps/sctp_test -H 127.0.0.1 -P 22222 -l -d 0 &
sleep 3

time ./src/apps/sctp_test -H 127.0.0.1 -P 22221 -h 127.0.0.1 -p 22222 \
        -s -c 1 -M 65535 -T -t 1 -x 100000 -d 0 1>/dev/null

killall -9 lt-sctp_test
===============================

Results (a bit reformatted to be more readable):

1) ms stock kernel v4.18-rc8, no memory fragmentation
        test 1          test 2          test 3
real    0m14.715s       0m14.593s       0m15.954s
user    0m0.954s        0m0.955s        0m0.854s
sys     0m13.388s       0m12.537s       0m13.749s

2) kernel with fixes, no memory fragmentation
        test 1          test 2          test 3
real    0m14.959s       0m14.693s       0m14.762s
user    0m0.948s        0m0.921s        0m0.929s
sys     0m13.538s       0m13.225s       0m13.217s

3) kernel with fixes, memory fragmented
'free':
         total     used      free    shared  buff/cache  available
Mem:  32906008  30555200   302740      764     2048068     266452
Mem:  32906008  30379948   541436      764     1984624     442376
Mem:  32906008  30717312   262380      764     1926316     109908

/proc/buddyinfo:
Node 0, zone   Normal  40773   37   34   29   0   0   0   0   0   0   0
Node 0, zone   Normal 100332   68    8    4   2   1   1   0   0   0   0
Node 0, zone   Normal  31113    7    2    1   0   0   0   0   0   0   0

        test 1          test 2          test 3
real    0m14.159s       0m15.252s       0m15.826s
user    0m0.839s        0m1.004s        0m1.048s
sys     0m11.827s       0m14.240s       0m14.778s
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 parents b70f1f3 + 0d493b4 commit 2b14e1e
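
The order arithmetic quoted in the commit message can be checked with a small userspace sketch. It assumes a 4096-byte page, as the message does; the helper names pages_needed() and alloc_order() are hypothetical and are not kernel functions (the kernel has its own get_order() helper). The message's figure of 383 is the truncated quotient; rounding up gives 384 pages, and either value falls in order 9 (512 pages).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size, matching the 4096 bytes used in the commit message. */
#define PAGE_SIZE_BYTES 4096u

/* Number of whole pages needed to hold `bytes`, rounded up. */
static size_t pages_needed(size_t bytes)
{
	return (bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}

/*
 * Smallest order such that (1 << order) pages cover `bytes`.
 * A contiguous allocator hands out power-of-two page blocks, so a
 * request of 384 pages has to be served from a 512-page block.
 */
static unsigned int alloc_order(size_t bytes)
{
	size_t pages = pages_needed(bytes);
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;
	return order;
}
```

Calling alloc_order(65535 * 24) with the 24-byte element size from the message gives the order the commit refers to, while a per-page scheme only ever asks for order-0 blocks.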

9 files changed, +172 -105 lines changed

include/net/sctp/structs.h

Lines changed: 28 additions & 12 deletions
@@ -57,6 +57,7 @@
 #include <linux/atomic.h> /* This gets us atomic counters. */
 #include <linux/skbuff.h> /* We need sk_buff_head. */
 #include <linux/workqueue.h> /* We need tq_struct. */
+#include <linux/flex_array.h> /* We need flex_array. */
 #include <linux/sctp.h> /* We need sctp* header structs. */
 #include <net/sctp/auth.h> /* We need auth specific structs */
 #include <net/ip.h> /* For inet_skb_parm */
@@ -398,37 +399,35 @@ void sctp_stream_update(struct sctp_stream *stream, struct sctp_stream *new);
 
 /* What is the current SSN number for this stream? */
 #define sctp_ssn_peek(stream, type, sid) \
-	((stream)->type[sid].ssn)
+	(sctp_stream_##type((stream), (sid))->ssn)
 
 /* Return the next SSN number for this stream. */
 #define sctp_ssn_next(stream, type, sid) \
-	((stream)->type[sid].ssn++)
+	(sctp_stream_##type((stream), (sid))->ssn++)
 
 /* Skip over this ssn and all below. */
 #define sctp_ssn_skip(stream, type, sid, ssn) \
-	((stream)->type[sid].ssn = ssn + 1)
+	(sctp_stream_##type((stream), (sid))->ssn = ssn + 1)
 
 /* What is the current MID number for this stream? */
 #define sctp_mid_peek(stream, type, sid) \
-	((stream)->type[sid].mid)
+	(sctp_stream_##type((stream), (sid))->mid)
 
 /* Return the next MID number for this stream. */
 #define sctp_mid_next(stream, type, sid) \
-	((stream)->type[sid].mid++)
+	(sctp_stream_##type((stream), (sid))->mid++)
 
 /* Skip over this mid and all below. */
 #define sctp_mid_skip(stream, type, sid, mid) \
-	((stream)->type[sid].mid = mid + 1)
-
-#define sctp_stream_in(asoc, sid) (&(asoc)->stream.in[sid])
+	(sctp_stream_##type((stream), (sid))->mid = mid + 1)
 
 /* What is the current MID_uo number for this stream? */
 #define sctp_mid_uo_peek(stream, type, sid) \
-	((stream)->type[sid].mid_uo)
+	(sctp_stream_##type((stream), (sid))->mid_uo)
 
 /* Return the next MID_uo number for this stream. */
 #define sctp_mid_uo_next(stream, type, sid) \
-	((stream)->type[sid].mid_uo++)
+	(sctp_stream_##type((stream), (sid))->mid_uo++)
 
 /*
  * Pointers to address related SCTP functions.
@@ -1440,8 +1439,8 @@ struct sctp_stream_in {
 };
 
 struct sctp_stream {
-	struct sctp_stream_out *out;
-	struct sctp_stream_in *in;
+	struct flex_array *out;
+	struct flex_array *in;
 	__u16 outcnt;
 	__u16 incnt;
 	/* Current stream being sent, if any */
@@ -1463,6 +1462,23 @@ struct sctp_stream {
 	struct sctp_stream_interleave *si;
 };
 
+static inline struct sctp_stream_out *sctp_stream_out(
+	const struct sctp_stream *stream,
+	__u16 sid)
+{
+	return flex_array_get(stream->out, sid);
+}
+
+static inline struct sctp_stream_in *sctp_stream_in(
+	const struct sctp_stream *stream,
+	__u16 sid)
+{
+	return flex_array_get(stream->in, sid);
+}
+
+#define SCTP_SO(s, i) sctp_stream_out((s), (i))
+#define SCTP_SI(s, i) sctp_stream_in((s), (i))
+
 #define SCTP_STREAM_CLOSED 0x00
 #define SCTP_STREAM_OPEN 0x01
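
To illustrate the idea behind the accessors above, here is a minimal userspace sketch of a per-page array with typed lookup helpers and a token-pasting macro in the style of sctp_ssn_next(). It is not the kernel's flex_array implementation: struct paged_array, paged_array_alloc(), paged_array_get(), the simplified struct stream_out/stream_in, and the 4096-byte chunk size are all assumptions made for this example, and calloc() stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_BYTES 4096u /* assumed page size; each chunk is allocated separately */

/* Hypothetical per-page array: a small table of independently allocated chunks. */
struct paged_array {
	size_t elem_size, nr_elems, elems_per_chunk, nr_chunks;
	void **chunks;
};

static void paged_array_free(struct paged_array *pa)
{
	if (!pa)
		return;
	for (size_t i = 0; pa->chunks && i < pa->nr_chunks; i++)
		free(pa->chunks[i]);
	free(pa->chunks);
	free(pa);
}

static struct paged_array *paged_array_alloc(size_t elem_size, size_t nr_elems)
{
	struct paged_array *pa;

	if (elem_size == 0 || elem_size > CHUNK_BYTES || nr_elems == 0)
		return NULL;
	pa = calloc(1, sizeof(*pa));
	if (!pa)
		return NULL;
	pa->elem_size = elem_size;
	pa->nr_elems = nr_elems;
	pa->elems_per_chunk = CHUNK_BYTES / elem_size;
	pa->nr_chunks = (nr_elems + pa->elems_per_chunk - 1) / pa->elems_per_chunk;
	pa->chunks = calloc(pa->nr_chunks, sizeof(*pa->chunks));
	if (!pa->chunks) {
		paged_array_free(pa);
		return NULL;
	}
	for (size_t i = 0; i < pa->nr_chunks; i++) {
		/* No single allocation is larger than one chunk. */
		pa->chunks[i] = calloc(pa->elems_per_chunk, elem_size);
		if (!pa->chunks[i]) {
			paged_array_free(pa);
			return NULL;
		}
	}
	return pa;
}

/* Map an index to (chunk, offset); returns NULL when idx is out of range. */
static void *paged_array_get(const struct paged_array *pa, size_t idx)
{
	if (idx >= pa->nr_elems)
		return NULL;
	return (char *)pa->chunks[idx / pa->elems_per_chunk] +
	       (idx % pa->elems_per_chunk) * pa->elem_size;
}

/* Simplified stand-ins for the SCTP stream structures (not the kernel layout). */
struct stream_out { uint16_t ssn; };
struct stream_in  { uint16_t ssn; };
struct stream     { struct paged_array *out, *in; };

static inline struct stream_out *stream_out(const struct stream *s, uint16_t sid)
{
	return paged_array_get(s->out, sid);
}

static inline struct stream_in *stream_in(const struct stream *s, uint16_t sid)
{
	return paged_array_get(s->in, sid);
}

/* `type` is pasted onto the accessor name, selecting stream_in or stream_out. */
#define ssn_next(stream, type, sid) (stream_##type((stream), (sid))->ssn++)
```

Because callers only go through the typed accessors, the backing store can change from a plain array to a chunked one without touching call sites, which is the point of the preparatory SCTP_SO()/SCTP_SI() patch described in the commit message.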

net/sctp/chunk.c

Lines changed: 4 additions & 2 deletions
@@ -325,7 +325,8 @@ int sctp_chunk_abandoned(struct sctp_chunk *chunk)
 	if (SCTP_PR_TTL_ENABLED(chunk->sinfo.sinfo_flags) &&
 	    time_after(jiffies, chunk->msg->expires_at)) {
 		struct sctp_stream_out *streamout =
-			&chunk->asoc->stream.out[chunk->sinfo.sinfo_stream];
+			SCTP_SO(&chunk->asoc->stream,
+				chunk->sinfo.sinfo_stream);
 
 		if (chunk->sent_count) {
 			chunk->asoc->abandoned_sent[SCTP_PR_INDEX(TTL)]++;
@@ -339,7 +340,8 @@ int sctp_chunk_abandoned(struct sctp_chunk *chunk)
 	} else if (SCTP_PR_RTX_ENABLED(chunk->sinfo.sinfo_flags) &&
 		   chunk->sent_count > chunk->sinfo.sinfo_timetolive) {
 		struct sctp_stream_out *streamout =
-			&chunk->asoc->stream.out[chunk->sinfo.sinfo_stream];
+			SCTP_SO(&chunk->asoc->stream,
+				chunk->sinfo.sinfo_stream);
 
 		chunk->asoc->abandoned_sent[SCTP_PR_INDEX(RTX)]++;
 		streamout->ext->abandoned_sent[SCTP_PR_INDEX(RTX)]++;

net/sctp/outqueue.c

Lines changed: 6 additions & 5 deletions
@@ -80,7 +80,7 @@ static inline void sctp_outq_head_data(struct sctp_outq *q,
 	q->out_qlen += ch->skb->len;
 
 	stream = sctp_chunk_stream_no(ch);
-	oute = q->asoc->stream.out[stream].ext;
+	oute = SCTP_SO(&q->asoc->stream, stream)->ext;
 	list_add(&ch->stream_list, &oute->outq);
 }
 
@@ -101,7 +101,7 @@ static inline void sctp_outq_tail_data(struct sctp_outq *q,
 	q->out_qlen += ch->skb->len;
 
 	stream = sctp_chunk_stream_no(ch);
-	oute = q->asoc->stream.out[stream].ext;
+	oute = SCTP_SO(&q->asoc->stream, stream)->ext;
 	list_add_tail(&ch->stream_list, &oute->outq);
 }
 
@@ -372,7 +372,7 @@ static int sctp_prsctp_prune_sent(struct sctp_association *asoc,
 		sctp_insert_list(&asoc->outqueue.abandoned,
 				 &chk->transmitted_list);
 
-		streamout = &asoc->stream.out[chk->sinfo.sinfo_stream];
+		streamout = SCTP_SO(&asoc->stream, chk->sinfo.sinfo_stream);
 		asoc->sent_cnt_removable--;
 		asoc->abandoned_sent[SCTP_PR_INDEX(PRIO)]++;
 		streamout->ext->abandoned_sent[SCTP_PR_INDEX(PRIO)]++;
@@ -416,7 +416,7 @@ static int sctp_prsctp_prune_unsent(struct sctp_association *asoc,
 		asoc->abandoned_unsent[SCTP_PR_INDEX(PRIO)]++;
 		if (chk->sinfo.sinfo_stream < asoc->stream.outcnt) {
 			struct sctp_stream_out *streamout =
-				&asoc->stream.out[chk->sinfo.sinfo_stream];
+				SCTP_SO(&asoc->stream, chk->sinfo.sinfo_stream);
 
 			streamout->ext->abandoned_unsent[SCTP_PR_INDEX(PRIO)]++;
 		}
@@ -1082,6 +1082,7 @@ static void sctp_outq_flush_data(struct sctp_flush_ctx *ctx,
 	/* Finally, transmit new packets. */
 	while ((chunk = sctp_outq_dequeue_data(ctx->q)) != NULL) {
 		__u32 sid = ntohs(chunk->subh.data_hdr->stream);
+		__u8 stream_state = SCTP_SO(&ctx->asoc->stream, sid)->state;
 
 		/* Has this chunk expired? */
 		if (sctp_chunk_abandoned(chunk)) {
@@ -1091,7 +1092,7 @@ static void sctp_outq_flush_data(struct sctp_flush_ctx *ctx,
 			continue;
 		}
 
-		if (ctx->asoc->stream.out[sid].state == SCTP_STREAM_CLOSED) {
+		if (stream_state == SCTP_STREAM_CLOSED) {
 			sctp_outq_head_data(ctx->q, chunk);
 			break;
 		}

net/sctp/socket.c

Lines changed: 2 additions & 2 deletions
@@ -1911,7 +1911,7 @@ static int sctp_sendmsg_to_asoc(struct sctp_association *asoc,
 		goto err;
 	}
 
-	if (unlikely(!asoc->stream.out[sinfo->sinfo_stream].ext)) {
+	if (unlikely(!SCTP_SO(&asoc->stream, sinfo->sinfo_stream)->ext)) {
 		err = sctp_stream_init_ext(&asoc->stream, sinfo->sinfo_stream);
 		if (err)
 			goto err;
@@ -7154,7 +7154,7 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
 	if (!asoc || params.sprstat_sid >= asoc->stream.outcnt)
 		goto out;
 
-	streamoute = asoc->stream.out[params.sprstat_sid].ext;
+	streamoute = SCTP_SO(&asoc->stream, params.sprstat_sid)->ext;
 	if (!streamoute) {
 		/* Not allocated yet, means all stats are 0 */
 		params.sprstat_abandoned_unsent = 0;
