984 Commits

Author SHA1 Message Date
Alexander Potapenko faa132ab55 sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
commit 15339e441ec46fbc3bf3486bb1ae4845b0f1bb8d upstream.

KMSAN reported use of uninitialized sctp_addr->v4.sin_addr.s_addr and
sctp_addr->v6.sin6_scope_id in sctp_v6_cmp_addr() (see below).
Make sure all fields of an IPv6 address are initialized, which
guarantees that the IPv4 fields are also initialized.

==================================================================
 BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
 net/sctp/ipv6.c:517
 CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
 01/01/2011
 Call Trace:
  dump_stack+0x172/0x1c0 lib/dump_stack.c:42
  is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
  kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
  native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
  arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
  arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
  __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
  sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
  sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
  sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
  sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
  inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
  sock_sendmsg_nosec net/socket.c:633 [inline]
  sock_sendmsg net/socket.c:643 [inline]
  SYSC_sendto+0x608/0x710 net/socket.c:1696
  SyS_sendto+0x8a/0xb0 net/socket.c:1664
  entry_SYSCALL_64_fastpath+0x13/0x94
 RIP: 0033:0x44b479
 RSP: 002b:00007f6213f21c08 EFLAGS: 00000286 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 000000000044b479
 RDX: 0000000000000041 RSI: 0000000020edd000 RDI: 0000000000000006
 RBP: 00000000007080a8 R08: 0000000020b85fe4 R09: 000000000000001c
 R10: 0000000000040005 R11: 0000000000000286 R12: 00000000ffffffff
 R13: 0000000000003760 R14: 00000000006e5820 R15: 0000000000ff8000
 origin description: ----dst_saddr@sctp_v6_get_dst
 local variable created at:
  sk_fullsock include/net/sock.h:2321 [inline]
  inet6_sk include/linux/ipv6.h:309 [inline]
  sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
==================================================================
 BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
 net/sctp/ipv6.c:517
 CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
 01/01/2011
 Call Trace:
  dump_stack+0x172/0x1c0 lib/dump_stack.c:42
  is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
  kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
  native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
  arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
  arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
  __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
  sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
  sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
  sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
  sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
  inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
  sock_sendmsg_nosec net/socket.c:633 [inline]
  sock_sendmsg net/socket.c:643 [inline]
  SYSC_sendto+0x608/0x710 net/socket.c:1696
  SyS_sendto+0x8a/0xb0 net/socket.c:1664
  entry_SYSCALL_64_fastpath+0x13/0x94
 RIP: 0033:0x44b479
 RSP: 002b:00007f6213f21c08 EFLAGS: 00000286 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 000000000044b479
 RDX: 0000000000000041 RSI: 0000000020edd000 RDI: 0000000000000006
 RBP: 00000000007080a8 R08: 0000000020b85fe4 R09: 000000000000001c
 R10: 0000000000040005 R11: 0000000000000286 R12: 00000000ffffffff
 R13: 0000000000003760 R14: 00000000006e5820 R15: 0000000000ff8000
 origin description: ----dst_saddr@sctp_v6_get_dst
 local variable created at:
  sk_fullsock include/net/sock.h:2321 [inline]
  inet6_sk include/linux/ipv6.h:309 [inline]
  sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
==================================================================

Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-11-02 07:16:22 +01:00
Xin Long 075de21562 sctp: listen on the sock only when it's state is listening or closed
commit 34b2789f1d9bf8dcca9b5cb553d076ca2cd898ee upstream.

Now sctp doesn't check sock's state before listening on it. It could
even cause changing a sock with any state to become a listening sock
when doing sctp_listen.

This patch is to fix it by checking sock's state in sctp_listen, so
that it will listen on the sock with right state.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-06-20 14:04:51 +02:00
Marcelo Ricardo Leitner 620d73a9b7 sctp: deny peeloff operation on asocs with threads sleeping on it
commit dfcb9f4f99f1e9a49e43398a7bfbf56927544af1 upstream.

commit 2dcab5984841 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
attempted to avoid a BUG_ON call when the association being used for a
sendmsg() is blocked waiting for more sndbuf and another thread did a
peeloff operation on such asoc, moving it to another socket.

As Ben Hutchings noticed, then in such case it would return without
locking back the socket and would cause two unlocks in a row.

Further analysis also revealed that it could allow a double free if the
application managed to peeloff the asoc that is created during the
sendmsg call, because then sctp_sendmsg() would try to free the asoc
that was created only for that call.

This patch takes another approach. It will deny the peeloff operation
if there is a thread sleeping on the asoc, so this situation doesn't
exist anymore. This avoids the issues described above and also honors
the syscalls that are already being handled (it can be multiple sendmsg
calls).

Joint work with Xin Long.

Fixes: 2dcab5984841 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-06-08 00:47:10 +02:00
Marcelo Ricardo Leitner 1f7cb732af sctp: avoid BUG_ON on sctp_wait_for_sndbuf
commit 2dcab598484185dea7ec22219c76dcdd59e3cb90 upstream.

Alexander Popov reported that an application may trigger a BUG_ON in
sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
waiting on it to queue more data and meanwhile another thread peels off
the association being used by the first thread.

This patch replaces the BUG_ON call with a proper error handling. It
will return -EPIPE to the original sendmsg call, similarly to what would
have been done if the association wasn't found in the first place.

Acked-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-06-08 00:47:10 +02:00
Daniel Borkmann 8ceaec02b1 net: sctp: rework multihoming retransmission path selection to rfc4960
commit 4c47af4d5eb2c2f78f886079a3920a7078a6f0a0 upstream.

Problem statement: 1) both paths (primary path1 and alternate
path2) are up after the association has been established i.e.,
HB packets are normally exchanged, 2) path2 gets inactive after
path_max_retrans * max_rto timed out (i.e. path2 is down completely),
3) now, if a transmission times out on the only surviving/active
path1 (any ~1sec network service impact could cause this like
a channel bonding failover), then the retransmitted packets are
sent over the inactive path2; this happens with partial failover
and without it.

Besides not being optimal in the above scenario, a small failure
or timeout in the only existing path has the potential to cause
long delays in the retransmission (depending on RTO_MAX) until
the still active path is reselected. Further, when the T3-timeout
occurs, we have active_patch == retrans_path, and even though the
timeout occurred on the initial transmission of data, not a
retransmit, we end up updating retransmit path.

RFC4960, section 6.4. "Multi-Homed SCTP Endpoints" states under
6.4.1. "Failover from an Inactive Destination Address" the
following:

  Some of the transport addresses of a multi-homed SCTP endpoint
  may become inactive due to either the occurrence of certain
  error conditions (see Section 8.2) or adjustments from the
  SCTP user.

  When there is outbound data to send and the primary path
  becomes inactive (e.g., due to failures), or where the SCTP
  user explicitly requests to send data to an inactive
  destination transport address, before reporting an error to
  its ULP, the SCTP endpoint should try to send the data to an
  alternate __active__ destination transport address if one
  exists.

  When retransmitting data that timed out, if the endpoint is
  multihomed, it should consider each source-destination address
  pair in its retransmission selection policy. When retransmitting
  timed-out data, the endpoint should attempt to pick the most
  divergent source-destination pair from the original
  source-destination pair to which the packet was transmitted.

  Note: Rules for picking the most divergent source-destination
  pair are an implementation decision and are not specified
  within this document.

So, we should first reconsider to take the current active
retransmission transport if we cannot find an alternative
active one. If all of that fails, we can still round robin
through unkown, partial failover, and inactive ones in the
hope to find something still suitable.

Commit 4141ddc02a ("sctp: retran_path update bug fix") broke
that behaviour by selecting the next inactive transport when
no other active transport was found besides the current assoc's
peer.retran_path. Before commit 4141ddc02a, we would have
traversed through the list until we reach our peer.retran_path
again, and in case that is still in state SCTP_ACTIVE, we would
take it and return. Only if that is not the case either, we
take the next inactive transport.

Besides all that, another issue is that transports in state
SCTP_UNKNOWN could be preferred over transports in state
SCTP_ACTIVE in case a SCTP_ACTIVE transport appears after
SCTP_UNKNOWN in the transport list yielding a weaker transport
state to be used in retransmission.

This patch mostly reverts 4141ddc02a, but also rewrites
this function to introduce more clarity and strictness into
the code. A strict priority of transport states is enforced
in this patch, hence selection is active > unkown > partial
failover > inactive.

Fixes: 4141ddc02a ("sctp: retran_path update bug fix")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Acked-by: Vlad Yasevich <yasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[wt: picked updated function from 3.12 except the debug statement]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-06-08 00:47:09 +02:00
Marcelo Ricardo Leitner 5b97e5f348 sctp: assign assoc_id earlier in __sctp_connect
commit 7233bc84a3aeda835d334499dc00448373caf5c0 upstream.

sctp_wait_for_connect() currently already holds the asoc to keep it
alive during the sleep, in case another thread release it. But Andrey
Konovalov and Dmitry Vyukov reported an use-after-free in such
situation.

Problem is that __sctp_connect() doesn't get a ref on the asoc and will
do a read on the asoc after calling sctp_wait_for_connect(), but by then
another thread may have closed it and the _put on sctp_wait_for_connect
will actually release it, causing the use-after-free.

Fix is, instead of doing the read after waiting for the connect, do it
before so, and avoid this issue as the socket is still locked by then.
There should be no issue on returning the asoc id in case of failure as
the application shouldn't trust on that number in such situations
anyway.

This issue doesn't exist in sctp_sendmsg() path.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-02-10 11:03:55 +01:00
Marcelo Ricardo Leitner 0058a4c1b6 sctp: validate chunk len before actually using it
commit bf911e985d6bbaa328c20c3e05f4eb03de11fdd6 upstream.

Andrey Konovalov reported that KASAN detected that SCTP was using a slab
beyond the boundaries. It was caused because when handling out of the
blue packets in function sctp_sf_ootb() it was checking the chunk len
only after already processing the first chunk, validating only for the
2nd and subsequent ones.

The fix is to just move the check upwards so it's also validated for the
1st chunk.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-02-10 11:03:53 +01:00
Jiri Slaby 2b8ed05b14 net: sctp, forbid negative length
commit a4b8e71b05c27bae6bad3bdecddbc6b68a3ad8cf upstream.

Most of getsockopt handlers in net/sctp/socket.c check len against
sizeof some structure like:
        if (len < sizeof(int))
                return -EINVAL;

On the first look, the check seems to be correct. But since len is int
and sizeof returns size_t, int gets promoted to unsigned size_t too. So
the test returns false for negative lengths. Yes, (-1 < sizeof(long)) is
false.

Fix this in sctp by explicitly checking len < 0 before any getsockopt
handler is called.

Note that sctp_getsockopt_events already handled the negative case.
Since we added the < 0 check elsewhere, this one can be removed.

If not checked, this is the result:
UBSAN: Undefined behaviour in ../mm/page_alloc.c:2722:19
shift exponent 52 is too large for 32-bit type 'int'
CPU: 1 PID: 24535 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
 0000000000000000 ffff88006d99f2a8 ffffffffb2f7bdea 0000000041b58ab3
 ffffffffb4363c14 ffffffffb2f7bcde ffff88006d99f2d0 ffff88006d99f270
 0000000000000000 0000000000000000 0000000000000034 ffffffffb5096422
Call Trace:
 [<ffffffffb3051498>] ? __ubsan_handle_shift_out_of_bounds+0x29c/0x300
...
 [<ffffffffb273f0e4>] ? kmalloc_order+0x24/0x90
 [<ffffffffb27416a4>] ? kmalloc_order_trace+0x24/0x220
 [<ffffffffb2819a30>] ? __kmalloc+0x330/0x540
 [<ffffffffc18c25f4>] ? sctp_getsockopt_local_addrs+0x174/0xca0 [sctp]
 [<ffffffffc18d2bcd>] ? sctp_getsockopt+0x10d/0x1b0 [sctp]
 [<ffffffffb37c1219>] ? sock_common_getsockopt+0xb9/0x150
 [<ffffffffb37be2f5>] ? SyS_getsockopt+0x1a5/0x270

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-sctp@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2017-02-10 11:03:53 +01:00
Karl Heiss 1cf7c76efb sctp: Prevent soft lockup when sctp_accept() is called during a timeout event
commit 635682a14427d241bab7bbdeebb48a7d7b91638e upstream.

A case can occur when sctp_accept() is called by the user during
a heartbeat timeout event after the 4-way handshake.  Since
sctp_assoc_migrate() changes both assoc->base.sk and assoc->ep, the
bh_sock_lock in sctp_generate_heartbeat_event() will be taken with
the listening socket but released with the new association socket.
The result is a deadlock on any future attempts to take the listening
socket lock.

Note that this race can occur with other SCTP timeouts that take
the bh_lock_sock() in the event sctp_accept() is called.

 BUG: soft lockup - CPU#9 stuck for 67s! [swapper:0]
 ...
 RIP: 0010:[<ffffffff8152d48e>]  [<ffffffff8152d48e>] _spin_lock+0x1e/0x30
 RSP: 0018:ffff880028323b20  EFLAGS: 00000206
 RAX: 0000000000000002 RBX: ffff880028323b20 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffff880028323be0 RDI: ffff8804632c4b48
 RBP: ffffffff8100bb93 R08: 0000000000000000 R09: 0000000000000000
 R10: ffff880610662280 R11: 0000000000000100 R12: ffff880028323aa0
 R13: ffff8804383c3880 R14: ffff880028323a90 R15: ffffffff81534225
 FS:  0000000000000000(0000) GS:ffff880028320000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
 CR2: 00000000006df528 CR3: 0000000001a85000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process swapper (pid: 0, threadinfo ffff880616b70000, task ffff880616b6cab0)
 Stack:
 ffff880028323c40 ffffffffa01c2582 ffff880614cfb020 0000000000000000
 <d> 0100000000000000 00000014383a6c44 ffff8804383c3880 ffff880614e93c00
 <d> ffff880614e93c00 0000000000000000 ffff8804632c4b00 ffff8804383c38b8
 Call Trace:
 <IRQ>
 [<ffffffffa01c2582>] ? sctp_rcv+0x492/0xa10 [sctp]
 [<ffffffff8148c559>] ? nf_iterate+0x69/0xb0
 [<ffffffff814974a0>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff8148c716>] ? nf_hook_slow+0x76/0x120
 [<ffffffff814974a0>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff8149757d>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81497808>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff81496ccd>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff81497255>] ? ip_rcv+0x275/0x350
 [<ffffffff8145cfeb>] ? __netif_receive_skb+0x4ab/0x750
 ...

With lockdep debugging:

 =====================================
 [ BUG: bad unlock balance detected! ]
 -------------------------------------
 CslRx/12087 is trying to release lock (slock-AF_INET) at:
 [<ffffffffa01bcae0>] sctp_generate_timeout_event+0x40/0xe0 [sctp]
 but there are no more locks to release!

 other info that might help us debug this:
 2 locks held by CslRx/12087:
 #0:  (&asoc->timers[i]){+.-...}, at: [<ffffffff8108ce1f>] run_timer_softirq+0x16f/0x3e0
 #1:  (slock-AF_INET){+.-...}, at: [<ffffffffa01bcac3>] sctp_generate_timeout_event+0x23/0xe0 [sctp]

Ensure the socket taken is also the same one that is released by
saving a copy of the socket before entering the timeout event
critical section.

Signed-off-by: Karl Heiss <kheiss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[wt: adjusted, 3.10 uses sctp_bh_unlock_sock() instead of bh_lock_sock()]

Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-08-27 11:40:33 +02:00
Xin Long 9ad2f2e8e1 sctp: lack the check for ports in sctp_v6_cmp_addr
commit 40b4f0fd74e46c017814618d67ec9127ff20f157 upstream.

As the member .cmp_addr of sctp_af_inet6, sctp_v6_cmp_addr should also check
the port of addresses, just like sctp_v4_cmp_addr, cause it's invoked by
sctp_cmp_addr_exact().

Now sctp_v6_cmp_addr just check the port when two addresses have different
family, and lack the port check for two ipv6 addresses. that will make
sctp_hash_cmp() cannot work well.

so fix it by adding ports comparison in sctp_v6_cmp_addr().

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-06-07 10:42:48 +02:00
Xin Long 4a3411cc43 sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close
[ Upstream commit 068d8bd338e855286aea54e70d1c101569284b21 ]

In sctp_close, sctp_make_abort_user may return NULL because of memory
allocation failure. If this happens, it will bypass any state change
and never free the assoc. The assoc has no chance to be freed and it
will be kept in memory with the state it had even after the socket is
closed by sctp_close().

So if sctp_make_abort_user fails to allocate memory, we should abort
the asoc via sctp_primitive_ABORT as well. Just like the annotation in
sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said,
"Even if we can't send the ABORT due to low memory delete the TCB.
This is a departure from our typical NOMEM handling".

But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would
dereference the chunk pointer, and system crash. So we should add
SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other
places where it adds SCTP_CMD_REPLY cmd.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-28 21:49:33 -08:00
Eric Dumazet 62c8fcbdf6 ipv6: sctp: clone options to avoid use after free
[ Upstream commit 9470e24f35ab81574da54e69df90c1eb4a96b43f ]

SCTP is lacking proper np->opt cloning at accept() time.

TCP and DCCP use ipv6_dup_options() helper, do the same
in SCTP.

We might later factorize this code in a common helper to avoid
future mistakes.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-22 19:47:54 -08:00
Marcelo Ricardo Leitner 33fbe78ae8 sctp: update the netstamp_needed counter when copying sockets
[ Upstream commit 01ce63c90170283a9855d1db4fe81934dddce648 ]

Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy
related to disabling sock timestamp.

When SCTP accepts an association or peel one off, it copies sock flags
but forgot to call net_enable_timestamp() if a packet timestamping flag
was copied, leading to extra calls to net_disable_timestamp() whenever
such clones were closed.

The fix is to call net_enable_timestamp() whenever we copy a sock with
that flag on, like tcp does.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-22 19:47:54 -08:00
Eric Dumazet 622af8c8e9 ipv6: sctp: implement sctp_v6_destroy_sock()
[ Upstream commit 602dd62dfbda3e63a2d6a3cbde953ebe82bf5087 ]

Dmitry Vyukov reported a memory leak using IPV6 SCTP sockets.

We need to call inet6_destroy_sock() to properly release
inet6 specific fields.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-22 19:47:52 -08:00
lucien 3fb28c9723 sctp: translate host order to network order when setting a hmacid
[ Upstream commit ed5a377d87dc4c87fb3e1f7f698cba38cd893103 ]

now sctp auth cannot work well when setting a hmacid manually, which
is caused by that we didn't use the network order for hmacid, so fix
it by adding the transformation in sctp_auth_ep_set_hmacs.

even we set hmacid with the network order in userspace, it still
can't work, because of this condition in sctp_auth_ep_set_hmacs():

		if (id > SCTP_AUTH_HMAC_ID_MAX)
			return -EOPNOTSUPP;

so this wasn't working before and thus it won't break compatibility.

Fixes: 65b07e5d0d ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-22 19:47:51 -08:00
Marcelo Ricardo Leitner e7bb902b26 sctp: fix race on protocol/netns initialization
[ Upstream commit 8e2d61e0aed2b7c4ecb35844fe07e0b2b762dee4 ]

Consider sctp module is unloaded and is being requested because an user
is creating a sctp socket.

During initialization, sctp will add the new protocol type and then
initialize pernet subsys:

        status = sctp_v4_protosw_init();
        if (status)
                goto err_protosw_init;

        status = sctp_v6_protosw_init();
        if (status)
                goto err_v6_protosw_init;

        status = register_pernet_subsys(&sctp_net_ops);

The problem is that after those calls to sctp_v{4,6}_protosw_init(), it
is possible for userspace to create SCTP sockets like if the module is
already fully loaded. If that happens, one of the possible effects is
that we will have readers for net->sctp.local_addr_list list earlier
than expected and sctp_net_init() does not take precautions while
dealing with that list, leading to a potential panic but not limited to
that, as sctp_sock_init() will copy a bunch of blank/partially
initialized values from net->sctp.

The race happens like this:

     CPU 0                           |  CPU 1
  socket()                           |
   __sock_create                     | socket()
    inet_create                      |  __sock_create
     list_for_each_entry_rcu(        |
        answer, &inetsw[sock->type], |
        list) {                      |   inet_create
      /* no hits */                  |
     if (unlikely(err)) {            |
      ...                            |
      request_module()               |
      /* socket creation is blocked  |
       * the module is fully loaded  |
       */                            |
       sctp_init                     |
        sctp_v4_protosw_init         |
         inet_register_protosw       |
          list_add_rcu(&p->list,     |
                       last_perm);   |
                                     |  list_for_each_entry_rcu(
                                     |     answer, &inetsw[sock->type],
        sctp_v6_protosw_init         |     list) {
                                     |     /* hit, so assumes protocol
                                     |      * is already loaded
                                     |      */
                                     |  /* socket creation continues
                                     |   * before netns is initialized
                                     |   */
        register_pernet_subsys       |

Simply inverting the initialization order between
register_pernet_subsys() and sctp_v4_protosw_init() is not possible
because register_pernet_subsys() will create a control sctp socket, so
the protocol must be already visible by then. Deferring the socket
creation to a work-queue is not good specially because we loose the
ability to handle its errors.

So, as suggested by Vlad, the fix is to split netns initialization in
two moments: defaults and control socket, so that the defaults are
already loaded by when we register the protocol, while control socket
initialization is kept at the same moment it is today.

Fixes: 4db67e8086 ("sctp: Make the address lists per network namespace")
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-01 12:07:38 +02:00
Marcelo Ricardo Leitner 7bf24986e3 sctp: fix ASCONF list handling
commit 2d45a02d0166caf2627fe91897c6ffc3b19514c4 upstream.

->auto_asconf_splist is per namespace and mangled by functions like
sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization.

Also, the call to inet_sk_copy_descendant() was backuping
->auto_asconf_list through the copy but was not honoring
->do_auto_asconf, which could lead to list corruption if it was
different between both sockets.

This commit thus fixes the list handling by using ->addr_wq_lock
spinlock to protect the list. A special handling is done upon socket
creation and destruction for that. Error handlig on sctp_init_sock()
will never return an error after having initialized asconf, so
sctp_destroy_sock() can be called without addrq_wq_lock. The lock now
will be take on sctp_close_sock(), before locking the socket, so we
don't do it in inverse order compared to sctp_addr_wq_timeout_handler().

Instead of taking the lock on sctp_sock_migrate() for copying and
restoring the list values, it's preferred to avoid rewritting it by
implementing sctp_copy_descendant().

Issue was found with a test application that kept flipping sysctl
default_auto_asconf on and off, but one could trigger it by issuing
simultaneous setsockopt() calls on multiple sockets or by
creating/destroying sockets fast enough. This is only triggerable
locally.

Fixes: 9f7d653b67 ("sctp: Add Auto-ASCONF support (core).")
Reported-by: Ji Jianwen <jiji@redhat.com>
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[wangkai: backport to 3.10: adjust context]
Signed-off-by: Wang Kai <morgan.wang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-01 12:07:34 +02:00
Alexander Sverdlin 59a460c394 sctp: Fix race between OOTB responce and route removal
[ Upstream commit 29c4afc4e98f4dc0ea9df22c631841f9c220b944 ]

There is NULL pointer dereference possible during statistics update if the route
used for OOTB responce is removed at unfortunate time. If the route exists when
we receive OOTB packet and we finally jump into sctp_packet_transmit() to send
ABORT, but in the meantime route is removed under our feet, we take "no_route"
path and try to update stats with IP_INC_STATS(sock_net(asoc->base.sk), ...).

But sctp_ootb_pkt_new() used to prepare responce packet doesn't call
sctp_transport_set_owner() and therefore there is no asoc associated with this
packet. Probably temporary asoc just for OOTB responces is overkill, so just
introduce a check like in all other places in sctp_packet_transmit(), where
"asoc" is dereferenced.

To reproduce this, one needs to
0. ensure that sctp module is loaded (otherwise ABORT is not generated)
1. remove default route on the machine
2. while true; do
     ip route del [interface-specific route]
     ip route add [interface-specific route]
   done
3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT
   responce

On x86_64 the crash looks like this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ...
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O    4.0.5-1-ARCH #1
Hardware name: ...
task: ffffffff818124c0 ti: ffffffff81800000 task.ti: ffffffff81800000
RIP: 0010:[<ffffffffa05ec9ac>]  [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
RSP: 0018:ffff880127c037b8  EFLAGS: 00010296
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000015ff66b480
RDX: 00000015ff66b400 RSI: ffff880127c17200 RDI: ffff880123403700
RBP: ffff880127c03888 R08: 0000000000017200 R09: ffffffff814625af
R10: ffffea00047e4680 R11: 00000000ffffff80 R12: ffff8800b0d38a28
R13: ffff8800b0d38a28 R14: ffff8800b3e88000 R15: ffffffffa05f24e0
FS:  0000000000000000(0000) GS:ffff880127c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 00000000c855b000 CR4: 00000000000007f0
Stack:
 ffff880127c03910 ffff8800b0d38a28 ffffffff8189d240 ffff88011f91b400
 ffff880127c03828 ffffffffa05c94c5 0000000000000000 ffff8800baa1c520
 0000000000000000 0000000000000001 0000000000000000 0000000000000000
Call Trace:
 <IRQ>
 [<ffffffffa05c94c5>] ? sctp_sf_tabort_8_4_8.isra.20+0x85/0x140 [sctp]
 [<ffffffffa05d6b42>] ? sctp_transport_put+0x52/0x80 [sctp]
 [<ffffffffa05d0bfc>] sctp_do_sm+0xb8c/0x19a0 [sctp]
 [<ffffffff810b0e00>] ? trigger_load_balance+0x90/0x210
 [<ffffffff810e0329>] ? update_process_times+0x59/0x60
 [<ffffffff812c7a40>] ? timerqueue_add+0x60/0xb0
 [<ffffffff810e0549>] ? enqueue_hrtimer+0x29/0xa0
 [<ffffffff8101f599>] ? read_tsc+0x9/0x10
 [<ffffffff8116d4b5>] ? put_page+0x55/0x60
 [<ffffffff810ee1ad>] ? clockevents_program_event+0x6d/0x100
 [<ffffffff81462b68>] ? skb_free_head+0x58/0x80
 [<ffffffffa029a10b>] ? chksum_update+0x1b/0x27 [crc32c_generic]
 [<ffffffff81283f3e>] ? crypto_shash_update+0xce/0xf0
 [<ffffffffa05d3993>] sctp_endpoint_bh_rcv+0x113/0x280 [sctp]
 [<ffffffffa05dd4e6>] sctp_inq_push+0x46/0x60 [sctp]
 [<ffffffffa05ed7a0>] sctp_rcv+0x880/0x910 [sctp]
 [<ffffffffa05ecb50>] ? sctp_packet_transmit_chunk+0xb0/0xb0 [sctp]
 [<ffffffffa05ecb70>] ? sctp_csum_update+0x20/0x20 [sctp]
 [<ffffffff814b05a5>] ? ip_route_input_noref+0x235/0xd30
 [<ffffffff81051d6b>] ? ack_ioapic_level+0x7b/0x150
 [<ffffffff814b27be>] ip_local_deliver_finish+0xae/0x210
 [<ffffffff814b2e15>] ip_local_deliver+0x35/0x90
 [<ffffffff814b2a15>] ip_rcv_finish+0xf5/0x370
 [<ffffffff814b3128>] ip_rcv+0x2b8/0x3a0
 [<ffffffff81474193>] __netif_receive_skb_core+0x763/0xa50
 [<ffffffff81476c28>] __netif_receive_skb+0x18/0x60
 [<ffffffff81476cb0>] netif_receive_skb_internal+0x40/0xd0
 [<ffffffff814776c8>] napi_gro_receive+0xe8/0x120
 [<ffffffffa03946aa>] rtl8169_poll+0x2da/0x660 [r8169]
 [<ffffffff8147896a>] net_rx_action+0x21a/0x360
 [<ffffffff81078dc1>] __do_softirq+0xe1/0x2d0
 [<ffffffff8107912d>] irq_exit+0xad/0xb0
 [<ffffffff8157d158>] do_IRQ+0x58/0xf0
 [<ffffffff8157b06d>] common_interrupt+0x6d/0x6d
 <EOI>
 [<ffffffff810e1218>] ? hrtimer_start+0x18/0x20
 [<ffffffffa05d65f9>] ? sctp_transport_destroy_rcu+0x29/0x30 [sctp]
 [<ffffffff81020c50>] ? mwait_idle+0x60/0xa0
 [<ffffffff810216ef>] arch_cpu_idle+0xf/0x20
 [<ffffffff810b731c>] cpu_startup_entry+0x3ec/0x480
 [<ffffffff8156b365>] rest_init+0x85/0x90
 [<ffffffff818eb035>] start_kernel+0x48b/0x4ac
 [<ffffffff818ea120>] ? early_idt_handlers+0x120/0x120
 [<ffffffff818ea339>] x86_64_start_reservations+0x2a/0x2c
 [<ffffffff818ea49c>] x86_64_start_kernel+0x161/0x184
Code: 90 48 8b 80 b8 00 00 00 48 89 85 70 ff ff ff 48 83 bd 70 ff ff ff 00 0f 85 cd fa ff ff 48 89 df 31 db e8 18 63 e7 e0 48 8b 45 80 <48> 8b 40 20 48 8b 40 30 48 8b 80 68 01 00 00 65 48 ff 40 78 e9
RIP  [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
 RSP <ffff880127c037b8>
CR2: 0000000000000020
---[ end trace 5aec7fd2dc983574 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
drm_kms_helper: panic occurred, switching back to text console
---[ end Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-10 10:40:21 -07:00
Saran Maruti Ramanara 572d332c02 net: sctp: fix passing wrong parameter header to param_type2af in sctp_process_param
[ Upstream commit cfbf654efc6d78dc9812e030673b86f235bf677d ]

When making use of RFC5061, section 4.2.4. for setting the primary IP
address, we're passing a wrong parameter header to param_type2af(),
resulting always in NULL being returned.

At this point, param.p points to a sctp_addip_param struct, containing
a sctp_paramhdr (type = 0xc004, length = var), and crr_id as a correlation
id. Followed by that, as also presented in RFC5061 section 4.2.4., comes
the actual sctp_addr_param, which also contains a sctp_paramhdr, but
this time with the correct type SCTP_PARAM_IPV{4,6}_ADDRESS that
param_type2af() can make use of. Since we already hold a pointer to
addr_param from previous line, just reuse it for param_type2af().

Fixes: d6de309759 ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
Signed-off-by: Saran Maruti Ramanara <saran.neti@telus.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-26 17:48:49 -08:00
Daniel Borkmann 727ab4c06a net: sctp: fix slab corruption from use after free on INIT collisions
[ Upstream commit 600ddd6825543962fb807884169e57b580dba208 ]

When hitting an INIT collision case during the 4WHS with AUTH enabled, as
already described in detail in commit 1be9a950c646 ("net: sctp: inherit
auth_capable on INIT collisions"), it can happen that we occasionally
still remotely trigger the following panic on server side which seems to
have been uncovered after the fix from commit 1be9a950c646 ...

[  533.876389] BUG: unable to handle kernel paging request at 00000000ffffffff
[  533.913657] IP: [<ffffffff811ac385>] __kmalloc+0x95/0x230
[  533.940559] PGD 5030f2067 PUD 0
[  533.957104] Oops: 0000 [#1] SMP
[  533.974283] Modules linked in: sctp mlx4_en [...]
[  534.939704] Call Trace:
[  534.951833]  [<ffffffff81294e30>] ? crypto_init_shash_ops+0x60/0xf0
[  534.984213]  [<ffffffff81294e30>] crypto_init_shash_ops+0x60/0xf0
[  535.015025]  [<ffffffff8128c8ed>] __crypto_alloc_tfm+0x6d/0x170
[  535.045661]  [<ffffffff8128d12c>] crypto_alloc_base+0x4c/0xb0
[  535.074593]  [<ffffffff8160bd42>] ? _raw_spin_lock_bh+0x12/0x50
[  535.105239]  [<ffffffffa0418c11>] sctp_inet_listen+0x161/0x1e0 [sctp]
[  535.138606]  [<ffffffff814e43bd>] SyS_listen+0x9d/0xb0
[  535.166848]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b

... or depending on the the application, for example this one:

[ 1370.026490] BUG: unable to handle kernel paging request at 00000000ffffffff
[ 1370.026506] IP: [<ffffffff811ab455>] kmem_cache_alloc+0x75/0x1d0
[ 1370.054568] PGD 633c94067 PUD 0
[ 1370.070446] Oops: 0000 [#1] SMP
[ 1370.085010] Modules linked in: sctp kvm_amd kvm [...]
[ 1370.963431] Call Trace:
[ 1370.974632]  [<ffffffff8120f7cf>] ? SyS_epoll_ctl+0x53f/0x960
[ 1371.000863]  [<ffffffff8120f7cf>] SyS_epoll_ctl+0x53f/0x960
[ 1371.027154]  [<ffffffff812100d3>] ? anon_inode_getfile+0xd3/0x170
[ 1371.054679]  [<ffffffff811e3d67>] ? __alloc_fd+0xa7/0x130
[ 1371.080183]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b

With slab debugging enabled, we can see that the poison has been overwritten:

[  669.826368] BUG kmalloc-128 (Tainted: G        W     ): Poison overwritten
[  669.826385] INFO: 0xffff880228b32e50-0xffff880228b32e50. First byte 0x6a instead of 0x6b
[  669.826414] INFO: Allocated in sctp_auth_create_key+0x23/0x50 [sctp] age=3 cpu=0 pid=18494
[  669.826424]  __slab_alloc+0x4bf/0x566
[  669.826433]  __kmalloc+0x280/0x310
[  669.826453]  sctp_auth_create_key+0x23/0x50 [sctp]
[  669.826471]  sctp_auth_asoc_create_secret+0xcb/0x1e0 [sctp]
[  669.826488]  sctp_auth_asoc_init_active_key+0x68/0xa0 [sctp]
[  669.826505]  sctp_do_sm+0x29d/0x17c0 [sctp] [...]
[  669.826629] INFO: Freed in kzfree+0x31/0x40 age=1 cpu=0 pid=18494
[  669.826635]  __slab_free+0x39/0x2a8
[  669.826643]  kfree+0x1d6/0x230
[  669.826650]  kzfree+0x31/0x40
[  669.826666]  sctp_auth_key_put+0x19/0x20 [sctp]
[  669.826681]  sctp_assoc_update+0x1ee/0x2d0 [sctp]
[  669.826695]  sctp_do_sm+0x674/0x17c0 [sctp]

Since this only triggers in some collision-cases with AUTH, the problem at
heart is that sctp_auth_key_put() on asoc->asoc_shared_key is called twice
when having refcnt 1, once directly in sctp_assoc_update() and yet again
from within sctp_auth_asoc_init_active_key() via sctp_assoc_update() on
the already kzfree'd memory, which is also consistent with the observation
of the poison decrease from 0x6b to 0x6a (note: the overwrite is detected
at a later point in time when poison is checked on new allocation).

Reference counting of auth keys revisited:

Shared keys for AUTH chunks are being stored in endpoints and associations
in endpoint_shared_keys list. On endpoint creation, a null key is being
added; on association creation, all endpoint shared keys are being cached
and thus cloned over to the association. struct sctp_shared_key only holds
a pointer to the actual key bytes, that is, struct sctp_auth_bytes which
keeps track of users internally through refcounting. Naturally, on assoc
or enpoint destruction, sctp_shared_key are being destroyed directly and
the reference on sctp_auth_bytes dropped.

User space can add keys to either list via setsockopt(2) through struct
sctp_authkey and by passing that to sctp_auth_set_key() which replaces or
adds a new auth key. There, sctp_auth_create_key() creates a new sctp_auth_bytes
with refcount 1 and in case of replacement drops the reference on the old
sctp_auth_bytes. A key can be set active from user space through setsockopt()
on the id via sctp_auth_set_active_key(), which iterates through either
endpoint_shared_keys and in case of an assoc, invokes (one of various places)
sctp_auth_asoc_init_active_key().

sctp_auth_asoc_init_active_key() computes the actual secret from local's
and peer's random, hmac and shared key parameters and returns a new key
directly as sctp_auth_bytes, that is asoc->asoc_shared_key, plus drops
the reference if there was a previous one. The secret, which where we
eventually double drop the ref comes from sctp_auth_asoc_set_secret() with
intitial refcount of 1, which also stays unchanged eventually in
sctp_assoc_update(). This key is later being used for crypto layer to
set the key for the hash in crypto_hash_setkey() from sctp_auth_calculate_hmac().

To close the loop: asoc->asoc_shared_key is freshly allocated secret
material and independant of the sctp_shared_key management keeping track
of only shared keys in endpoints and assocs. Hence, also commit 4184b2a79a76
("net: sctp: fix memory leak in auth key management") is independant of
this bug here since it concerns a different layer (though same structures
being used eventually). asoc->asoc_shared_key is reference dropped correctly
on assoc destruction in sctp_association_free() and when active keys are
being replaced in sctp_auth_asoc_init_active_key(), it always has a refcount
of 1. Hence, it's freed prematurely in sctp_assoc_update(). Simple fix is
to remove that sctp_auth_key_put() from there which fixes these panics.

Fixes: 730fc3d05c ("[SCTP]: Implete SCTP-AUTH parameter processing")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-26 17:48:48 -08:00
Daniel Borkmann 751e562491 net: sctp: use MAX_HEADER for headroom reserve in output path
[ Upstream commit 9772b54c55266ce80c639a80aa68eeb908f8ecf5 ]

To accomodate for enough headroom for tunnels, use MAX_HEADER instead
of LL_MAX_HEADER. Robert reported that he has hit after roughly 40hrs
of trinity an skb_under_panic() via SCTP output path (see reference).
I couldn't reproduce it from here, but not using MAX_HEADER as elsewhere
in other protocols might be one possible cause for this.

In any case, it looks like accounting on chunks themself seems to look
good as the skb already passed the SCTP output path and did not hit
any skb_over_panic(). Given tunneling was enabled in his .config, the
headroom would have been expanded by MAX_HEADER in this case.

Reported-by: Robert Święcki <robert@swiecki.net>
Reference: https://lkml.org/lkml/2014/12/1/507
Fixes: 594ccc14df ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-12-16 09:09:43 -08:00
Daniel Borkmann cda702df47 net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks
commit 9de7922bc709eee2f609cd01d98aaedc4cf5ea74 upstream.

Commit 6f4c618ddb ("SCTP : Add paramters validity check for
ASCONF chunk") added basic verification of ASCONF chunks, however,
it is still possible to remotely crash a server by sending a
special crafted ASCONF chunk, even up to pre 2.6.12 kernels:

skb_over_panic: text:ffffffffa01ea1c3 len:31056 put:30768
 head:ffff88011bd81800 data:ffff88011bd81800 tail:0x7950
 end:0x440 dev:<NULL>
 ------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:129!
[...]
Call Trace:
 <IRQ>
 [<ffffffff8144fb1c>] skb_put+0x5c/0x70
 [<ffffffffa01ea1c3>] sctp_addto_chunk+0x63/0xd0 [sctp]
 [<ffffffffa01eadaf>] sctp_process_asconf+0x1af/0x540 [sctp]
 [<ffffffff8152d025>] ? _read_unlock_bh+0x15/0x20
 [<ffffffffa01e0038>] sctp_sf_do_asconf+0x168/0x240 [sctp]
 [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
 [<ffffffff8147645d>] ? fib_rules_lookup+0xad/0xf0
 [<ffffffffa01e6b22>] ? sctp_cmp_addr_exact+0x32/0x40 [sctp]
 [<ffffffffa01e8393>] sctp_assoc_bh_rcv+0xd3/0x180 [sctp]
 [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
 [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
 [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
 [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff81496ded>] ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81497078>] ip_local_deliver+0x98/0xa0
 [<ffffffff8149653d>] ip_rcv_finish+0x12d/0x440
 [<ffffffff81496ac5>] ip_rcv+0x275/0x350
 [<ffffffff8145c88b>] __netif_receive_skb+0x4ab/0x750
 [<ffffffff81460588>] netif_receive_skb+0x58/0x60

This can be triggered e.g., through a simple scripted nmap
connection scan injecting the chunk after the handshake, for
example, ...

  -------------- INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ------------------ ASCONF; UNKNOWN ------------------>

... where ASCONF chunk of length 280 contains 2 parameters ...

  1) Add IP address parameter (param length: 16)
  2) Add/del IP address parameter (param length: 255)

... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the
Address Parameter in the ASCONF chunk is even missing, too.
This is just an example and similarly-crafted ASCONF chunks
could be used just as well.

The ASCONF chunk passes through sctp_verify_asconf() as all
parameters passed sanity checks, and after walking, we ended
up successfully at the chunk end boundary, and thus may invoke
sctp_process_asconf(). Parameter walking is done with
WORD_ROUND() to take padding into account.

In sctp_process_asconf()'s TLV processing, we may fail in
sctp_process_asconf_param() e.g., due to removal of the IP
address that is also the source address of the packet containing
the ASCONF chunk, and thus we need to add all TLVs after the
failure to our ASCONF response to remote via helper function
sctp_add_asconf_response(), which basically invokes a
sctp_addto_chunk() adding the error parameters to the given
skb.

When walking to the next parameter this time, we proceed
with ...

  length = ntohs(asconf_param->param_hdr.length);
  asconf_param = (void *)asconf_param + length;

... instead of the WORD_ROUND()'ed length, thus resulting here
in an off-by-one that leads to reading the follow-up garbage
parameter length of 12336, and thus throwing an skb_over_panic
for the reply when trying to sctp_addto_chunk() next time,
which implicitly calls the skb_put() with that length.

Fix it by using sctp_walk_params() [ which is also used in
INIT parameter processing ] macro in the verification *and*
in ASCONF processing: it will make sure we don't spill over,
that we walk parameters WORD_ROUND()'ed. Moreover, we're being
more defensive and guard against unknown parameter types and
missized addresses.

Joint work with Vlad Yasevich.

Fixes: b896b82be4ae ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-21 09:22:55 -08:00
Daniel Borkmann 3329125539 net: sctp: fix panic on duplicate ASCONF chunks
commit b69040d8e39f20d5215a03502a8e8b4c6ab78395 upstream.

When receiving a e.g. semi-good formed connection scan in the
form of ...

  -------------- INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ---------------- ASCONF_a; ASCONF_b ----------------->

... where ASCONF_a equals ASCONF_b chunk (at least both serials
need to be equal), we panic an SCTP server!

The problem is that good-formed ASCONF chunks that we reply with
ASCONF_ACK chunks are cached per serial. Thus, when we receive a
same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do
not need to process them again on the server side (that was the
idea, also proposed in the RFC). Instead, we know it was cached
and we just resend the cached chunk instead. So far, so good.

Where things get nasty is in SCTP's side effect interpreter, that
is, sctp_cmd_interpreter():

While incoming ASCONF_a (chunk = event_arg) is being marked
!end_of_packet and !singleton, and we have an association context,
we do not flush the outqueue the first time after processing the
ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it
queued up, although we set local_cork to 1. Commit 2e3216cd54
changed the precedence, so that as long as we get bundled, incoming
chunks we try possible bundling on outgoing queue as well. Before
this commit, we would just flush the output queue.

Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we
continue to process the same ASCONF_b chunk from the packet. As
we have cached the previous ASCONF_ACK, we find it, grab it and
do another SCTP_CMD_REPLY command on it. So, effectively, we rip
the chunk->list pointers and requeue the same ASCONF_ACK chunk
another time. Since we process ASCONF_b, it's correctly marked
with end_of_packet and we enforce an uncork, and thus flush, thus
crashing the kernel.

Fix it by testing if the ASCONF_ACK is currently pending and if
that is the case, do not requeue it. When flushing the output
queue we may relink the chunk for preparing an outgoing packet,
but eventually unlink it when it's copied into the skb right
before transmission.

Joint work with Vlad Yasevich.

Fixes: 2e3216cd54 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-21 09:22:55 -08:00
Daniel Borkmann bf53932bce net: sctp: fix remote memory pressure from excessive queueing
commit 26b87c7881006311828bb0ab271a551a62dcceb4 upstream.

This scenario is not limited to ASCONF, just taken as one
example triggering the issue. When receiving ASCONF probes
in the form of ...

  -------------- INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ---- ASCONF_a; [ASCONF_b; ...; ASCONF_n;] JUNK ------>
  [...]
  ---- ASCONF_m; [ASCONF_o; ...; ASCONF_z;] JUNK ------>

... where ASCONF_a, ASCONF_b, ..., ASCONF_z are good-formed
ASCONFs and have increasing serial numbers, we process such
ASCONF chunk(s) marked with !end_of_packet and !singleton,
since we have not yet reached the SCTP packet end. SCTP does
only do verification on a chunk by chunk basis, as an SCTP
packet is nothing more than just a container of a stream of
chunks which it eats up one by one.

We could run into the case that we receive a packet with a
malformed tail, above marked as trailing JUNK. All previous
chunks are here goodformed, so the stack will eat up all
previous chunks up to this point. In case JUNK does not fit
into a chunk header and there are no more other chunks in
the input queue, or in case JUNK contains a garbage chunk
header, but the encoded chunk length would exceed the skb
tail, or we came here from an entirely different scenario
and the chunk has pdiscard=1 mark (without having had a flush
point), it will happen, that we will excessively queue up
the association's output queue (a correct final chunk may
then turn it into a response flood when flushing the
queue ;)): I ran a simple script with incremental ASCONF
serial numbers and could see the server side consuming
excessive amount of RAM [before/after: up to 2GB and more].

The issue at heart is that the chunk train basically ends
with !end_of_packet and !singleton markers and since commit
2e3216cd54 ("sctp: Follow security requirement of responding
with 1 packet") therefore preventing an output queue flush
point in sctp_do_sm() -> sctp_cmd_interpreter() on the input
chunk (chunk = event_arg) even though local_cork is set,
but its precedence has changed since then. In the normal
case, the last chunk with end_of_packet=1 would trigger the
queue flush to accommodate possible outgoing bundling.

In the input queue, sctp_inq_pop() seems to do the right thing
in terms of discarding invalid chunks. So, above JUNK will
not enter the state machine and instead be released and exit
the sctp_assoc_bh_rcv() chunk processing loop. It's simply
the flush point being missing at loop exit. Adding a try-flush
approach on the output queue might not work as the underlying
infrastructure might be long gone at this point due to the
side-effect interpreter run.

One possibility, albeit a bit of a kludge, would be to defer
invalid chunk freeing into the state machine in order to
possibly trigger packet discards and thus indirectly a queue
flush on error. It would surely be better to discard chunks
as in the current, perhaps better controlled environment, but
going back and forth, it's simply architecturally not possible.
I tried various trailing JUNK attack cases and it seems to
look good now.

Joint work with Vlad Yasevich.

Fixes: 2e3216cd54 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-21 09:22:55 -08:00
Daniel Borkmann e79c2487e4 net: sctp: fix memory leak in auth key management
[ Upstream commit 4184b2a79a7612a9272ce20d639934584a1f3786 ]

A very minimal and simple user space application allocating an SCTP
socket, setting SCTP_AUTH_KEY setsockopt(2) on it and then closing
the socket again will leak the memory containing the authentication
key from user space:

unreferenced object 0xffff8800837047c0 (size 16):
  comm "a.out", pid 2789, jiffies 4296954322 (age 192.258s)
  hex dump (first 16 bytes):
    01 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff816d7e8e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff811c88d8>] __kmalloc+0xe8/0x270
    [<ffffffffa0870c23>] sctp_auth_create_key+0x23/0x50 [sctp]
    [<ffffffffa08718b1>] sctp_auth_set_key+0xa1/0x140 [sctp]
    [<ffffffffa086b383>] sctp_setsockopt+0xd03/0x1180 [sctp]
    [<ffffffff815bfd94>] sock_common_setsockopt+0x14/0x20
    [<ffffffff815beb61>] SyS_setsockopt+0x71/0xd0
    [<ffffffff816e58a9>] system_call_fastpath+0x12/0x17
    [<ffffffffffffffff>] 0xffffffffffffffff

This is bad because of two things, we can bring down a machine from
user space when auth_enable=1, but also we would leave security sensitive
keying material in memory without clearing it after use. The issue is
that sctp_auth_create_key() already sets the refcount to 1, but after
allocation sctp_auth_set_key() does an additional refcount on it, and
thus leaving it around when we free the socket.

Fixes: 65b07e5d0d ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-21 09:22:51 -08:00
Daniel Borkmann 7031dcb018 net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet
[ Upstream commit e40607cbe270a9e8360907cb1e62ddf0736e4864 ]

An SCTP server doing ASCONF will panic on malformed INIT ping-of-death
in the form of:

  ------------ INIT[PARAM: SET_PRIMARY_IP] ------------>

While the INIT chunk parameter verification dissects through many things
in order to detect malformed input, it misses to actually check parameters
inside of parameters. E.g. RFC5061, section 4.2.4 proposes a 'set primary
IP address' parameter in ASCONF, which has as a subparameter an address
parameter.

So an attacker may send a parameter type other than SCTP_PARAM_IPV4_ADDRESS
or SCTP_PARAM_IPV6_ADDRESS, param_type2af() will subsequently return 0
and thus sctp_get_af_specific() returns NULL, too, which we then happily
dereference unconditionally through af->from_addr_param().

The trace for the log:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
IP: [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
PGD 0
Oops: 0000 [#1] SMP
[...]
Pid: 0, comm: swapper Not tainted 2.6.32-504.el6.x86_64 #1 Bochs Bochs
RIP: 0010:[<ffffffffa01e9c62>]  [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
[...]
Call Trace:
 <IRQ>
 [<ffffffffa01f2add>] ? sctp_bind_addr_copy+0x5d/0xe0 [sctp]
 [<ffffffffa01e1fcb>] sctp_sf_do_5_1B_init+0x21b/0x340 [sctp]
 [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
 [<ffffffffa01e5c09>] ? sctp_endpoint_lookup_assoc+0xc9/0xf0 [sctp]
 [<ffffffffa01e61f6>] sctp_endpoint_bh_rcv+0x116/0x230 [sctp]
 [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
 [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
 [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
 [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[...]

A minimal way to address this is to check for NULL as we do on all
other such occasions where we know sctp_get_af_specific() could
possibly return with NULL.

Fixes: d6de309759 ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-21 09:22:51 -08:00
Vlad Yasevich 2d435f096d sctp: handle association restarts when the socket is closed.
[ Upstream commit bdf6fa52f01b941d4a80372d56de465bdbbd1d23 ]

Currently association restarts do not take into consideration the
state of the socket.  When a restart happens, the current assocation
simply transitions into established state.  This creates a condition
where a remote system, through a the restart procedure, may create a
local association that is no way reachable by user.  The conditions
to trigger this are as follows:
  1) Remote does not acknoledge some data causing data to remain
     outstanding.
  2) Local application calls close() on the socket.  Since data
     is still outstanding, the association is placed in SHUTDOWN_PENDING
     state.  However, the socket is closed.
  3) The remote tries to create a new association, triggering a restart
     on the local system.  The association moves from SHUTDOWN_PENDING
     to ESTABLISHED.  At this point, it is no longer reachable by
     any socket on the local system.

This patch addresses the above situation by moving the newly ESTABLISHED
association into SHUTDOWN-SENT state and bundling a SHUTDOWN after
the COOKIE-ACK chunk.  This way, the restarted associate immidiately
enters the shutdown procedure and forces the termination of the
unreachable association.

Reported-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15 08:31:57 +02:00
Eric Dumazet 6e5f6266d6 sctp: fix possible seqlock seadlock in sctp_packet_transmit()
[ Upstream commit 757efd32d5ce31f67193cc0e6a56e4dffcc42fb1 ]

Dave reported following splat, caused by improper use of
IP_INC_STATS_BH() in process context.

BUG: using __this_cpu_add() in preemptible [00000000] code: trinity-c117/14551
caller is __this_cpu_preempt_check+0x13/0x20
CPU: 3 PID: 14551 Comm: trinity-c117 Not tainted 3.16.0+ #33
 ffffffff9ec898f0 0000000047ea7e23 ffff88022d32f7f0 ffffffff9e7ee207
 0000000000000003 ffff88022d32f818 ffffffff9e397eaa ffff88023ee70b40
 ffff88022d32f970 ffff8801c026d580 ffff88022d32f828 ffffffff9e397ee3
Call Trace:
 [<ffffffff9e7ee207>] dump_stack+0x4e/0x7a
 [<ffffffff9e397eaa>] check_preemption_disabled+0xfa/0x100
 [<ffffffff9e397ee3>] __this_cpu_preempt_check+0x13/0x20
 [<ffffffffc0839872>] sctp_packet_transmit+0x692/0x710 [sctp]
 [<ffffffffc082a7f2>] sctp_outq_flush+0x2a2/0xc30 [sctp]
 [<ffffffff9e0d985c>] ? mark_held_locks+0x7c/0xb0
 [<ffffffff9e7f8c6d>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [<ffffffffc082b99a>] sctp_outq_uncork+0x1a/0x20 [sctp]
 [<ffffffffc081e112>] sctp_cmd_interpreter.isra.23+0x1142/0x13f0 [sctp]
 [<ffffffffc081c86b>] sctp_do_sm+0xdb/0x330 [sctp]
 [<ffffffff9e0b8f1b>] ? preempt_count_sub+0xab/0x100
 [<ffffffffc083b350>] ? sctp_cname+0x70/0x70 [sctp]
 [<ffffffffc08389ca>] sctp_primitive_ASSOCIATE+0x3a/0x50 [sctp]
 [<ffffffffc083358f>] sctp_sendmsg+0x88f/0xe30 [sctp]
 [<ffffffff9e0d673a>] ? lock_release_holdtime.part.28+0x9a/0x160
 [<ffffffff9e0d62ce>] ? put_lock_stats.isra.27+0xe/0x30
 [<ffffffff9e73b624>] inet_sendmsg+0x104/0x220
 [<ffffffff9e73b525>] ? inet_sendmsg+0x5/0x220
 [<ffffffff9e68ac4e>] sock_sendmsg+0x9e/0xe0
 [<ffffffff9e1c0c09>] ? might_fault+0xb9/0xc0
 [<ffffffff9e1c0bae>] ? might_fault+0x5e/0xc0
 [<ffffffff9e68b234>] SYSC_sendto+0x124/0x1c0
 [<ffffffff9e0136b0>] ? syscall_trace_enter+0x250/0x330
 [<ffffffff9e68c3ce>] SyS_sendto+0xe/0x10
 [<ffffffff9e7f9be4>] tracesys+0xdd/0xe2

This is a followup of commits f1d8cba61c3c4b ("inet: fix possible
seqlock deadlocks") and 7f88c6b23afbd315 ("ipv6: fix possible seqlock
deadlock in ip6_finish_output2")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-14 09:24:15 +08:00
Daniel Borkmann 495d049f3e net: sctp: inherit auth_capable on INIT collisions
[ Upstream commit 1be9a950c646c9092fb3618197f7b6bfb50e82aa ]

Jason reported an oops caused by SCTP on his ARM machine with
SCTP authentication enabled:

Internal error: Oops: 17 [#1] ARM
CPU: 0 PID: 104 Comm: sctp-test Not tainted 3.13.0-68744-g3632f30c9b20-dirty #1
task: c6eefa40 ti: c6f52000 task.ti: c6f52000
PC is at sctp_auth_calculate_hmac+0xc4/0x10c
LR is at sg_init_table+0x20/0x38
pc : [<c024bb80>]    lr : [<c00f32dc>]    psr: 40000013
sp : c6f538e8  ip : 00000000  fp : c6f53924
r10: c6f50d80  r9 : 00000000  r8 : 00010000
r7 : 00000000  r6 : c7be4000  r5 : 00000000  r4 : c6f56254
r3 : c00c8170  r2 : 00000001  r1 : 00000008  r0 : c6f1e660
Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 0005397f  Table: 06f28000  DAC: 00000015
Process sctp-test (pid: 104, stack limit = 0xc6f521c0)
Stack: (0xc6f538e8 to 0xc6f54000)
[...]
Backtrace:
[<c024babc>] (sctp_auth_calculate_hmac+0x0/0x10c) from [<c0249af8>] (sctp_packet_transmit+0x33c/0x5c8)
[<c02497bc>] (sctp_packet_transmit+0x0/0x5c8) from [<c023e96c>] (sctp_outq_flush+0x7fc/0x844)
[<c023e170>] (sctp_outq_flush+0x0/0x844) from [<c023ef78>] (sctp_outq_uncork+0x24/0x28)
[<c023ef54>] (sctp_outq_uncork+0x0/0x28) from [<c0234364>] (sctp_side_effects+0x1134/0x1220)
[<c0233230>] (sctp_side_effects+0x0/0x1220) from [<c02330b0>] (sctp_do_sm+0xac/0xd4)
[<c0233004>] (sctp_do_sm+0x0/0xd4) from [<c023675c>] (sctp_assoc_bh_rcv+0x118/0x160)
[<c0236644>] (sctp_assoc_bh_rcv+0x0/0x160) from [<c023d5bc>] (sctp_inq_push+0x6c/0x74)
[<c023d550>] (sctp_inq_push+0x0/0x74) from [<c024a6b0>] (sctp_rcv+0x7d8/0x888)

While we already had various kind of bugs in that area
ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if
we/peer is AUTH capable") and b14878ccb7fa ("net: sctp: cache
auth_enable per endpoint"), this one is a bit of a different
kind.

Giving a bit more background on why SCTP authentication is
needed can be found in RFC4895:

  SCTP uses 32-bit verification tags to protect itself against
  blind attackers. These values are not changed during the
  lifetime of an SCTP association.

  Looking at new SCTP extensions, there is the need to have a
  method of proving that an SCTP chunk(s) was really sent by
  the original peer that started the association and not by a
  malicious attacker.

To cause this bug, we're triggering an INIT collision between
peers; normal SCTP handshake where both sides intent to
authenticate packets contains RANDOM; CHUNKS; HMAC-ALGO
parameters that are being negotiated among peers:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------

RFC4895 says that each endpoint therefore knows its own random
number and the peer's random number *after* the association
has been established. The local and peer's random number along
with the shared key are then part of the secret used for
calculating the HMAC in the AUTH chunk.

Now, in our scenario, we have 2 threads with 1 non-blocking
SEQ_PACKET socket each, setting up common shared SCTP_AUTH_KEY
and SCTP_AUTH_ACTIVE_KEY properly, and each of them calling
sctp_bindx(3), listen(2) and connect(2) against each other,
thus the handshake looks similar to this, e.g.:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  <--------- INIT[RANDOM; CHUNKS; HMAC-ALGO] -----------
  -------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] -------->
  ...

Since such collisions can also happen with verification tags,
the RFC4895 for AUTH rather vaguely says under section 6.1:

  In case of INIT collision, the rules governing the handling
  of this Random Number follow the same pattern as those for
  the Verification Tag, as explained in Section 5.2.4 of
  RFC 2960 [5]. Therefore, each endpoint knows its own Random
  Number and the peer's Random Number after the association
  has been established.

In RFC2960, section 5.2.4, we're eventually hitting Action B:

  B) In this case, both sides may be attempting to start an
     association at about the same time but the peer endpoint
     started its INIT after responding to the local endpoint's
     INIT. Thus it may have picked a new Verification Tag not
     being aware of the previous Tag it had sent this endpoint.
     The endpoint should stay in or enter the ESTABLISHED
     state but it MUST update its peer's Verification Tag from
     the State Cookie, stop any init or cookie timers that may
     running and send a COOKIE ACK.

In other words, the handling of the Random parameter is the
same as behavior for the Verification Tag as described in
Action B of section 5.2.4.

Looking at the code, we exactly hit the sctp_sf_do_dupcook_b()
case which triggers an SCTP_CMD_UPDATE_ASSOC command to the
side effect interpreter, and in fact it properly copies over
peer_{random, hmacs, chunks} parameters from the newly created
association to update the existing one.

Also, the old asoc_shared_key is being released and based on
the new params, sctp_auth_asoc_init_active_key() updated.
However, the issue observed in this case is that the previous
asoc->peer.auth_capable was 0, and has *not* been updated, so
that instead of creating a new secret, we're doing an early
return from the function sctp_auth_asoc_init_active_key()
leaving asoc->asoc_shared_key as NULL. However, we now have to
authenticate chunks from the updated chunk list (e.g. COOKIE-ACK).

That in fact causes the server side when responding with ...

  <------------------ AUTH; COOKIE-ACK -----------------

... to trigger a NULL pointer dereference, since in
sctp_packet_transmit(), it discovers that an AUTH chunk is
being queued for xmit, and thus it calls sctp_auth_calculate_hmac().

Since the asoc->active_key_id is still inherited from the
endpoint, and the same as encoded into the chunk, it uses
asoc->asoc_shared_key, which is still NULL, as an asoc_key
and dereferences it in ...

  crypto_hash_setkey(desc.tfm, &asoc_key->data[0], asoc_key->len)

... causing an oops. All this happens because sctp_make_cookie_ack()
called with the *new* association has the peer.auth_capable=1
and therefore marks the chunk with auth=1 after checking
sctp_auth_send_cid(), but it is *actually* sent later on over
the then *updated* association's transport that didn't initialize
its shared key due to peer.auth_capable=0. Since control chunks
in that case are not sent by the temporary association which
are scheduled for deletion, they are issued for xmit via
SCTP_CMD_REPLY in the interpreter with the context of the
*updated* association. peer.auth_capable was 0 in the updated
association (which went from COOKIE_WAIT into ESTABLISHED state),
since all previous processing that performed sctp_process_init()
was being done on temporary associations, that we eventually
throw away each time.

The correct fix is to update to the new peer.auth_capable
value as well in the collision case via sctp_assoc_update(),
so that in case the collision migrated from 0 -> 1,
sctp_auth_asoc_init_active_key() can properly recalculate
the secret. This therefore fixes the observed server panic.

Fixes: 730fc3d05c ("[SCTP]: Implete SCTP-AUTH parameter processing")
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-14 09:24:15 +08:00
Daniel Borkmann b3b3ba5714 net: sctp: fix information leaks in ulpevent layer
[ Upstream commit 8f2e5ae40ec193bc0a0ed99e95315c3eebca84ea ]

While working on some other SCTP code, I noticed that some
structures shared with user space are leaking uninitialized
stack or heap buffer. In particular, struct sctp_sndrcvinfo
has a 2 bytes hole between .sinfo_flags and .sinfo_ppid that
remains unfilled by us in sctp_ulpevent_read_sndrcvinfo() when
putting this into cmsg. But also struct sctp_remote_error
contains a 2 bytes hole that we don't fill but place into a skb
through skb_copy_expand() via sctp_ulpevent_make_remote_error().

Both structures are defined by the IETF in RFC6458:

* Section 5.3.2. SCTP Header Information Structure:

  The sctp_sndrcvinfo structure is defined below:

  struct sctp_sndrcvinfo {
    uint16_t sinfo_stream;
    uint16_t sinfo_ssn;
    uint16_t sinfo_flags;
    <-- 2 bytes hole  -->
    uint32_t sinfo_ppid;
    uint32_t sinfo_context;
    uint32_t sinfo_timetolive;
    uint32_t sinfo_tsn;
    uint32_t sinfo_cumtsn;
    sctp_assoc_t sinfo_assoc_id;
  };

* 6.1.3. SCTP_REMOTE_ERROR:

  A remote peer may send an Operation Error message to its peer.
  This message indicates a variety of error conditions on an
  association. The entire ERROR chunk as it appears on the wire
  is included in an SCTP_REMOTE_ERROR event. Please refer to the
  SCTP specification [RFC4960] and any extensions for a list of
  possible error formats. An SCTP error notification has the
  following format:

  struct sctp_remote_error {
    uint16_t sre_type;
    uint16_t sre_flags;
    uint32_t sre_length;
    uint16_t sre_error;
    <-- 2 bytes hole  -->
    sctp_assoc_t sre_assoc_id;
    uint8_t  sre_data[];
  };

Fix this by setting both to 0 before filling them out. We also
have other structures shared between user and kernel space in
SCTP that contains holes (e.g. struct sctp_paddrthlds), but we
copy that buffer over from user space first and thus don't need
to care about it in that cases.

While at it, we can also remove lengthy comments copied from
the draft, instead, we update the comment with the correct RFC
number where one can look it up.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-28 08:00:05 -07:00
Daniel Borkmann e9013d0f0f net: sctp: check proc_dointvec result in proc_sctp_do_auth
[ Upstream commit 24599e61b7552673dd85971cf5a35369cd8c119e ]

When writing to the sysctl field net.sctp.auth_enable, it can well
be that the user buffer we handed over to proc_dointvec() via
proc_sctp_do_auth() handler contains something other than integers.

In that case, we would set an uninitialized 4-byte value from the
stack to net->sctp.auth_enable that can be leaked back when reading
the sysctl variable, and it can unintentionally turn auth_enable
on/off based on the stack content since auth_enable is interpreted
as a boolean.

Fix it up by making sure proc_dointvec() returned sucessfully.

Fixes: b14878ccb7fa ("net: sctp: cache auth_enable per endpoint")
Reported-by: Florian Westphal <fwestpha@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-28 08:00:04 -07:00
Xufeng Zhang a6987bca80 sctp: Fix sk_ack_backlog wrap-around problem
[ Upstream commit d3217b15a19a4779c39b212358a5c71d725822ee ]

Consider the scenario:
For a TCP-style socket, while processing the COOKIE_ECHO chunk in
sctp_sf_do_5_1D_ce(), after it has passed a series of sanity check,
a new association would be created in sctp_unpack_cookie(), but afterwards,
some processing maybe failed, and sctp_association_free() will be called to
free the previously allocated association, in sctp_association_free(),
sk_ack_backlog value is decremented for this socket, since the initial
value for sk_ack_backlog is 0, after the decrement, it will be 65535,
a wrap-around problem happens, and if we want to establish new associations
afterward in the same socket, ABORT would be triggered since sctp deem the
accept queue as full.
Fix this issue by only decrementing sk_ack_backlog for associations in
the endpoint's list.

Fix-suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-26 15:12:38 -04:00
Xufeng Zhang 4de3e298aa sctp: reset flowi4_oif parameter on route lookup
[ Upstream commit 85350871317a5adb35519d9dc6fc9e80809d42ad ]

commit 813b3b5db8 (ipv4: Use caller's on-stack flowi as-is
in output route lookups.) introduces another regression which
is very similar to the problem of commit e6b45241c (ipv4: reset
flowi parameters on route connect) wants to fix:
Before we call ip_route_output_key() in sctp_v4_get_dst() to
get a dst that matches a bind address as the source address,
we have already called this function previously and the flowi
parameters have been initialized including flowi4_oif, so when
we call this function again, the process in __ip_route_output_key()
will be different because of the setting of flowi4_oif, and we'll
get a networking device which corresponds to the inputted flowi4_oif
as the output device, this is wrong because we'll never hit this
place if the previously returned source address of dst match one
of the bound addresses.

To reproduce this problem, a vlan setting is enough:
  # ifconfig eth0 up
  # route del default
  # vconfig add eth0 2
  # vconfig add eth0 3
  # ifconfig eth0.2 10.0.1.14 netmask 255.255.255.0
  # route add default gw 10.0.1.254 dev eth0.2
  # ifconfig eth0.3 10.0.0.14 netmask 255.255.255.0
  # ip rule add from 10.0.0.14 table 4
  # ip route add table 4 default via 10.0.0.254 src 10.0.0.14 dev eth0.3
  # sctp_darn -H 10.0.0.14 -P 36422 -h 10.1.4.134 -p 36422 -s -I
You'll detect that all the flow are routed to eth0.2(10.0.1.254).

Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-30 21:52:16 -07:00
Vlad Yasevich e5eae4a051 net: sctp: cache auth_enable per endpoint
[ Upstream commit b14878ccb7fac0242db82720b784ab62c467c0dc ]

Currently, it is possible to create an SCTP socket, then switch
auth_enable via sysctl setting to 1 and crash the system on connect:

Oops[#1]:
CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1
task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000
[...]
Call Trace:
[<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80
[<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4
[<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c
[<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8
[<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214
[<ffffffff8043af68>] sctp_rcv+0x588/0x630
[<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24
[<ffffffff803acb50>] ip6_input+0x2c0/0x440
[<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564
[<ffffffff80310650>] process_backlog+0xb4/0x18c
[<ffffffff80313cbc>] net_rx_action+0x12c/0x210
[<ffffffff80034254>] __do_softirq+0x17c/0x2ac
[<ffffffff800345e0>] irq_exit+0x54/0xb0
[<ffffffff800075a4>] ret_from_irq+0x0/0x4
[<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48
[<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148
[<ffffffff805a88b0>] start_kernel+0x37c/0x398
Code: dd0900b8  000330f8  0126302d <dcc60000> 50c0fff1  0047182a  a48306a0
03e00008  00000000
---[ end trace b530b0551467f2fd ]---
Kernel panic - not syncing: Fatal exception in interrupt

What happens while auth_enable=0 in that case is, that
ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs()
when endpoint is being created.

After that point, if an admin switches over to auth_enable=1,
the machine can crash due to NULL pointer dereference during
reception of an INIT chunk. When we enter sctp_process_init()
via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk,
the INIT verification succeeds and while we walk and process
all INIT params via sctp_process_param() we find that
net->sctp.auth_enable is set, therefore do not fall through,
but invoke sctp_auth_asoc_set_default_hmac() instead, and thus,
dereference what we have set to NULL during endpoint
initialization phase.

The fix is to make auth_enable immutable by caching its value
during endpoint initialization, so that its original value is
being carried along until destruction. The bug seems to originate
from the very first days.

Fix in joint work with Daniel Borkmann.

Reported-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-30 21:52:15 -07:00
Daniel Borkmann 3947574d9c net: sctp: test if association is dead in sctp_wake_up_waiters
[ Upstream commit 1e1cdf8ac78793e0875465e98a648df64694a8d0 ]

In function sctp_wake_up_waiters(), we need to involve a test
if the association is declared dead. If so, we don't have any
reference to a possible sibling association anymore and need
to invoke sctp_write_space() instead, and normally walk the
socket's associations and notify them of new wmem space. The
reason for special casing is that otherwise, we could run
into the following issue when a sctp_primitive_SEND() call
from sctp_sendmsg() fails, and tries to flush an association's
outq, i.e. in the following way:

sctp_association_free()
`-> list_del(&asoc->asocs)         <-- poisons list pointer
    asoc->base.dead = true
    sctp_outq_free(&asoc->outqueue)
    `-> __sctp_outq_teardown()
     `-> sctp_chunk_free()
      `-> consume_skb()
       `-> sctp_wfree()
        `-> sctp_wake_up_waiters() <-- dereferences poisoned pointers
                                       if asoc->ep->sndbuf_policy=0

Therefore, only walk the list in an 'optimized' way if we find
that the current association is still active. We could also use
list_del_init() in addition when we call sctp_association_free(),
but as Vlad suggests, we want to trap such bugs and thus leave
it poisoned as is.

Why is it safe to resolve the issue by testing for asoc->base.dead?
Parallel calls to sctp_sendmsg() are protected under socket lock,
that is lock_sock()/release_sock(). Only within that path under
lock held, we're setting skb/chunk owner via sctp_set_owner_w().
Eventually, chunks are freed directly by an association still
under that lock. So when traversing association list on destruction
time from sctp_wake_up_waiters() via sctp_wfree(), a different
CPU can't be running sctp_wfree() while another one calls
sctp_association_free() as both happens under the same lock.
Therefore, this can also not race with setting/testing against
asoc->base.dead as we are guaranteed for this to happen in order,
under lock. Further, Vlad says: the times we check asoc->base.dead
is when we've cached an association pointer for later processing.
In between cache and processing, the association may have been
freed and is simply still around due to reference counts. We check
asoc->base.dead under a lock, so it should always be safe to check
and not race against sctp_association_free(). Stress-testing seems
fine now, too.

Fixes: cd253f9f357d ("net: sctp: wake up all assocs if sndbuf policy is per socket")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-30 21:52:14 -07:00
Daniel Borkmann 83bd973b67 net: sctp: wake up all assocs if sndbuf policy is per socket
[ Upstream commit 52c35befb69b005c3fc5afdaae3a5717ad013411 ]

SCTP charges chunks for wmem accounting via skb->truesize in
sctp_set_owner_w(), and sctp_wfree() respectively as the
reverse operation. If a sender runs out of wmem, it needs to
wait via sctp_wait_for_sndbuf(), and gets woken up by a call
to __sctp_write_space() mostly via sctp_wfree().

__sctp_write_space() is being called per association. Although
we assign sk->sk_write_space() to sctp_write_space(), which
is then being done per socket, it is only used if send space
is increased per socket option (SO_SNDBUF), as SOCK_USE_WRITE_QUEUE
is set and therefore not invoked in sock_wfree().

Commit 4c3a5bdae2 ("sctp: Don't charge for data in sndbuf
again when transmitting packet") fixed an issue where in case
sctp_packet_transmit() manages to queue up more than sndbuf
bytes, sctp_wait_for_sndbuf() will never be woken up again
unless it is interrupted by a signal. However, a still
remaining issue is that if net.sctp.sndbuf_policy=0, that is
accounting per socket, and one-to-many sockets are in use,
the reclaimed write space from sctp_wfree() is 'unfairly'
handed back on the server to the association that is the lucky
one to be woken up again via __sctp_write_space(), while
the remaining associations are never be woken up again
(unless by a signal).

The effect disappears with net.sctp.sndbuf_policy=1, that
is wmem accounting per association, as it guarantees a fair
share of wmem among associations.

Therefore, if we have reclaimed memory in case of per socket
accounting, wake all related associations to a socket in a
fair manner, that is, traverse the socket association list
starting from the current neighbour of the association and
issue a __sctp_write_space() to everyone until we end up
waking ourselves. This guarantees that no association is
preferred over another and even if more associations are
taken into the one-to-many session, all receivers will get
messages from the server and are not stalled forever on
high load. This setting still leaves the advantage of per
socket accounting in touch as an association can still use
up global limits if unused by others.

Fixes: 4eb701dfc6 ("[SCTP] Fix SCTP sendbuffer accouting.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-30 21:52:14 -07:00
Daniel Borkmann ec494e1004 net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk
[ Upstream commit c485658bae87faccd7aed540fd2ca3ab37992310 ]

While working on ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to
verify if we/peer is AUTH capable"), we noticed that there's a skb
memory leakage in the error path.

Running the same reproducer as in ec0223ec48a9 and by unconditionally
jumping to the error label (to simulate an error condition) in
sctp_sf_do_5_1D_ce() receive path lets kmemleak detector bark about
the unfreed chunk->auth_chunk skb clone:

Unreferenced object 0xffff8800b8f3a000 (size 256):
  comm "softirq", pid 0, jiffies 4294769856 (age 110.757s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    89 ab 75 5e d4 01 58 13 00 00 00 00 00 00 00 00  ..u^..X.........
  backtrace:
    [<ffffffff816660be>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8119f328>] kmem_cache_alloc+0xc8/0x210
    [<ffffffff81566929>] skb_clone+0x49/0xb0
    [<ffffffffa0467459>] sctp_endpoint_bh_rcv+0x1d9/0x230 [sctp]
    [<ffffffffa046fdbc>] sctp_inq_push+0x4c/0x70 [sctp]
    [<ffffffffa047e8de>] sctp_rcv+0x82e/0x9a0 [sctp]
    [<ffffffff815abd38>] ip_local_deliver_finish+0xa8/0x210
    [<ffffffff815a64af>] nf_reinject+0xbf/0x180
    [<ffffffffa04b4762>] nfqnl_recv_verdict+0x1d2/0x2b0 [nfnetlink_queue]
    [<ffffffffa04aa40b>] nfnetlink_rcv_msg+0x14b/0x250 [nfnetlink]
    [<ffffffff815a3269>] netlink_rcv_skb+0xa9/0xc0
    [<ffffffffa04aa7cf>] nfnetlink_rcv+0x23f/0x408 [nfnetlink]
    [<ffffffff815a2bd8>] netlink_unicast+0x168/0x250
    [<ffffffff815a2fa1>] netlink_sendmsg+0x2e1/0x3f0
    [<ffffffff8155cc6b>] sock_sendmsg+0x8b/0xc0
    [<ffffffff8155d449>] ___sys_sendmsg+0x369/0x380

What happens is that commit bbd0d59809 clones the skb containing
the AUTH chunk in sctp_endpoint_bh_rcv() when having the edge case
that an endpoint requires COOKIE-ECHO chunks to be authenticated:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  ------------------ AUTH; COOKIE-ECHO ---------------->
  <-------------------- COOKIE-ACK ---------------------

When we enter sctp_sf_do_5_1D_ce() and before we actually get to
the point where we process (and subsequently free) a non-NULL
chunk->auth_chunk, we could hit the "goto nomem_init" path from
an error condition and thus leave the cloned skb around w/o
freeing it.

The fix is to centrally free such clones in sctp_chunk_destroy()
handler that is invoked from sctp_chunk_free() after all refs have
dropped; and also move both kfree_skb(chunk->auth_chunk) there,
so that chunk->auth_chunk is either NULL (since sctp_chunkify()
allocs new chunks through kmem_cache_zalloc()) or non-NULL with
a valid skb pointer. chunk->skb and chunk->auth_chunk are the
only skbs in the sctp_chunk structure that need to be handeled.

While at it, we should use consume_skb() for both. It is the same
as dev_kfree_skb() but more appropriately named as we are not
a device but a protocol. Also, this effectively replaces the
kfree_skb() from both invocations into consume_skb(). Functions
are the same only that kfree_skb() assumes that the frame was
being dropped after a failure (e.g. for tools like drop monitor),
usage of consume_skb() seems more appropriate in function
sctp_chunk_destroy() though.

Fixes: bbd0d59809 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <yasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14 06:42:15 -07:00
Daniel Borkmann 892b46a6a6 net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable
[ Upstream commit ec0223ec48a90cb605244b45f7c62de856403729 ]

RFC4895 introduced AUTH chunks for SCTP; during the SCTP
handshake RANDOM; CHUNKS; HMAC-ALGO are negotiated (CHUNKS
being optional though):

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------

A special case is when an endpoint requires COOKIE-ECHO
chunks to be authenticated:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  ------------------ AUTH; COOKIE-ECHO ---------------->
  <-------------------- COOKIE-ACK ---------------------

RFC4895, section 6.3. Receiving Authenticated Chunks says:

  The receiver MUST use the HMAC algorithm indicated in
  the HMAC Identifier field. If this algorithm was not
  specified by the receiver in the HMAC-ALGO parameter in
  the INIT or INIT-ACK chunk during association setup, the
  AUTH chunk and all the chunks after it MUST be discarded
  and an ERROR chunk SHOULD be sent with the error cause
  defined in Section 4.1. [...] If no endpoint pair shared
  key has been configured for that Shared Key Identifier,
  all authenticated chunks MUST be silently discarded. [...]

  When an endpoint requires COOKIE-ECHO chunks to be
  authenticated, some special procedures have to be followed
  because the reception of a COOKIE-ECHO chunk might result
  in the creation of an SCTP association. If a packet arrives
  containing an AUTH chunk as a first chunk, a COOKIE-ECHO
  chunk as the second chunk, and possibly more chunks after
  them, and the receiver does not have an STCB for that
  packet, then authentication is based on the contents of
  the COOKIE-ECHO chunk. In this situation, the receiver MUST
  authenticate the chunks in the packet by using the RANDOM
  parameters, CHUNKS parameters and HMAC_ALGO parameters
  obtained from the COOKIE-ECHO chunk, and possibly a local
  shared secret as inputs to the authentication procedure
  specified in Section 6.3. If authentication fails, then
  the packet is discarded. If the authentication is successful,
  the COOKIE-ECHO and all the chunks after the COOKIE-ECHO
  MUST be processed. If the receiver has an STCB, it MUST
  process the AUTH chunk as described above using the STCB
  from the existing association to authenticate the
  COOKIE-ECHO chunk and all the chunks after it. [...]

Commit bbd0d59809 introduced the possibility to receive
and verification of AUTH chunk, including the edge case for
authenticated COOKIE-ECHO. On reception of COOKIE-ECHO,
the function sctp_sf_do_5_1D_ce() handles processing,
unpacks and creates a new association if it passed sanity
checks and also tests for authentication chunks being
present. After a new association has been processed, it
invokes sctp_process_init() on the new association and
walks through the parameter list it received from the INIT
chunk. It checks SCTP_PARAM_RANDOM, SCTP_PARAM_HMAC_ALGO
and SCTP_PARAM_CHUNKS, and copies them into asoc->peer
meta data (peer_random, peer_hmacs, peer_chunks) in case
sysctl -w net.sctp.auth_enable=1 is set. If in INIT's
SCTP_PARAM_SUPPORTED_EXT parameter SCTP_CID_AUTH is set,
peer_random != NULL and peer_hmacs != NULL the peer is to be
assumed asoc->peer.auth_capable=1, in any other case
asoc->peer.auth_capable=0.

Now, if in sctp_sf_do_5_1D_ce() chunk->auth_chunk is
available, we set up a fake auth chunk and pass that on to
sctp_sf_authenticate(), which at latest in
sctp_auth_calculate_hmac() reliably dereferences a NULL pointer
at position 0..0008 when setting up the crypto key in
crypto_hash_setkey() by using asoc->asoc_shared_key that is
NULL as condition key_id == asoc->active_key_id is true if
the AUTH chunk was injected correctly from remote. This
happens no matter what net.sctp.auth_enable sysctl says.

The fix is to check for net->sctp.auth_enable and for
asoc->peer.auth_capable before doing any operations like
sctp_sf_authenticate() as no key is activated in
sctp_auth_asoc_init_active_key() for each case.

Now as RFC4895 section 6.3 states that if the used HMAC-ALGO
passed from the INIT chunk was not used in the AUTH chunk, we
SHOULD send an error; however in this case it would be better
to just silently discard such a maliciously prepared handshake
as we didn't even receive a parameter at all. Also, as our
endpoint has no shared key configured, section 6.3 says that
MUST silently discard, which we are doing from now onwards.

Before calling sctp_sf_pdiscard(), we need not only to free
the association, but also the chunk->auth_chunk skb, as
commit bbd0d59809 created a skb clone in that case.

I have tested this locally by using netfilter's nfqueue and
re-injecting packets into the local stack after maliciously
modifying the INIT chunk (removing RANDOM; HMAC-ALGO param)
and the SCTP packet containing the COOKIE_ECHO (injecting
AUTH chunk before COOKIE_ECHO). Fixed with this patch applied.

Fixes: bbd0d59809 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <yasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-23 21:38:11 -07:00
Daniel Borkmann b306713b63 net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode
[ Upstream commit ffd5939381c609056b33b7585fb05a77b4c695f3 ]

SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit
emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of
'struct sctp_getaddrs_old' which includes a struct sockaddr pointer,
sizeof(param) check will always fail in kernel as the structure in
64bit kernel space is 4bytes larger than for user binaries compiled
in 32bit mode. Thus, applications making use of sctp_connectx() won't
be able to run under such circumstances.

Introduce a compat interface in the kernel to deal with such
situations by using a 'struct compat_sctp_getaddrs_old' structure
where user data is copied into it, and then sucessively transformed
into a 'struct sctp_getaddrs_old' structure with the help of
compat_ptr(). That fixes sctp_connectx() abi without any changes
needed in user space, and lets the SCTP test suite pass when compiled
in 32bit and run on 64bit kernels.

Fixes: f9c67811eb ("sctp: Fix regression introduced by new sctp_connectx api")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-06 21:30:05 -08:00
Vlad Yasevich c69ee66768 sctp: Perform software checksum if packet has to be fragmented.
[ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ]

IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum.
This causes problems if SCTP packets has to be fragmented and
ipsummed has been set to PARTIAL due to checksum offload support.
This condition can happen when retransmitting after MTU discover,
or when INIT or other control chunks are larger then MTU.
Check for the rare fragmentation condition in SCTP and use software
checksum calculation in this case.

CC: Fan Du <fan.du@windriver.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-04 04:31:04 -08:00
Fan Du 935be1dc2c sctp: Use software crc32 checksum when xfrm transform will happen.
[ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ]

igb/ixgbe have hardware sctp checksum support, when this feature is enabled
and also IPsec is armed to protect sctp traffic, ugly things happened as
xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing
up and pack the 16bits result in the checksum field). The result is fail
establishment of sctp communication.

Signed-off-by: Fan Du <fan.du@windriver.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-04 04:31:04 -08:00
Daniel Borkmann e4e7ba12fa net: sctp: rfc4443: do not report ICMP redirects to user space
[ Upstream commit 3f96a532113131d5a65ac9e00fc83cfa31b0295f ]

Adapt the same behaviour for SCTP as present in TCP for ICMP redirect
messages. For IPv6, RFC4443, section 2.4. says:

  ...
  (e) An ICMPv6 error message MUST NOT be originated as a result of
      receiving the following:
  ...
       (e.2) An ICMPv6 redirect message [IPv6-DISC].
  ...

Therefore, do not report an error to user space, just invoke dst's redirect
callback and leave, same for IPv4 as done in TCP as well. The implication
w/o having this patch could be that the reception of such packets would
generate a poll notification and in worst case it could even tear down the
whole connection. Therefore, stop updating sk_err on redirects.

Reported-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-13 16:08:29 -07:00
Daniel Borkmann 872b11be53 net: sctp: fix ipv6 ipsec encryption bug in sctp_v6_xmit
[ Upstream commit 95ee62083cb6453e056562d91f597552021e6ae7 ]

Alan Chester reported an issue with IPv6 on SCTP that IPsec traffic is not
being encrypted, whereas on IPv4 it is. Setting up an AH + ESP transport
does not seem to have the desired effect:

SCTP + IPv4:

  22:14:20.809645 IP (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto AH (51), length 116)
    192.168.0.2 > 192.168.0.5: AH(spi=0x00000042,sumlen=16,seq=0x1): ESP(spi=0x00000044,seq=0x1), length 72
  22:14:20.813270 IP (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto AH (51), length 340)
    192.168.0.5 > 192.168.0.2: AH(spi=0x00000043,sumlen=16,seq=0x1):

SCTP + IPv6:

  22:31:19.215029 IP6 (class 0x02, hlim 64, next-header SCTP (132) payload length: 364)
    fe80::222:15ff:fe87:7fc.3333 > fe80::92e6:baff:fe0d:5a54.36767: sctp
    1) [INIT ACK] [init tag: 747759530] [rwnd: 62464] [OS: 10] [MIS: 10]

Moreover, Alan says:

  This problem was seen with both Racoon and Racoon2. Other people have seen
  this with OpenSwan. When IPsec is configured to encrypt all upper layer
  protocols the SCTP connection does not initialize. After using Wireshark to
  follow packets, this is because the SCTP packet leaves Box A unencrypted and
  Box B believes all upper layer protocols are to be encrypted so it drops
  this packet, causing the SCTP connection to fail to initialize. When IPsec
  is configured to encrypt just SCTP, the SCTP packets are observed unencrypted.

In fact, using `socat sctp6-listen:3333 -` on one end and transferring "plaintext"
string on the other end, results in cleartext on the wire where SCTP eventually
does not report any errors, thus in the latter case that Alan reports, the
non-paranoid user might think he's communicating over an encrypted transport on
SCTP although he's not (tcpdump ... -X):

  ...
  0x0030: 5d70 8e1a 0003 001a 177d eb6c 0000 0000  ]p.......}.l....
  0x0040: 0000 0000 706c 6169 6e74 6578 740a 0000  ....plaintext...

Only in /proc/net/xfrm_stat we can see XfrmInTmplMismatch increasing on the
receiver side. Initial follow-up analysis from Alan's bug report was done by
Alexey Dobriyan. Also thanks to Vlad Yasevich for feedback on this.

SCTP has its own implementation of sctp_v6_xmit() not calling inet6_csk_xmit().
This has the implication that it probably never really got updated along with
changes in inet6_csk_xmit() and therefore does not seem to invoke xfrm handlers.

SCTP's IPv4 xmit however, properly calls ip_queue_xmit() to do the work. Since
a call to inet6_csk_xmit() would solve this problem, but result in unecessary
route lookups, let us just use the cached flowi6 instead that we got through
sctp_v6_get_dst(). Since all SCTP packets are being sent through sctp_packet_transmit(),
we do the route lookup / flow caching in sctp_transport_route(), hold it in
tp->dst and skb_dst_set() right after that. If we would alter fl6->daddr in
sctp_v6_xmit() to np->opt->srcrt, we possibly could run into the same effect
of not having xfrm layer pick it up, hence, use fl6_update_dst() in sctp_v6_get_dst()
instead to get the correct source routed dst entry, which we assign to the skb.

Also source address routing example from 625034113 ("sctp: fix sctp to work with
ipv6 source address routing") still works with this patch! Nevertheless, in RFC5095
it is actually 'recommended' to not use that anyway due to traffic amplification [1].
So it seems we're not supposed to do that anyway in sctp_v6_xmit(). Moreover, if
we overwrite the flow destination here, the lower IPv6 layer will be unable to
put the correct destination address into IP header, as routing header is added in
ipv6_push_nfrag_opts() but then probably with wrong final destination. Things aside,
result of this patch is that we do not have any XfrmInTmplMismatch increase plus on
the wire with this patch it now looks like:

SCTP + IPv6:

  08:17:47.074080 IP6 2620:52:0:102f:7a2b:cbff:fe27:1b0a > 2620:52:0:102f:213:72ff:fe32:7eba:
    AH(spi=0x00005fb4,seq=0x1): ESP(spi=0x00005fb5,seq=0x1), length 72
  08:17:47.074264 IP6 2620:52:0:102f:213:72ff:fe32:7eba > 2620:52:0:102f:7a2b:cbff:fe27:1b0a:
    AH(spi=0x00003d54,seq=0x1): ESP(spi=0x00003d55,seq=0x1), length 296

This fixes Kernel Bugzilla 24412. This security issue seems to be present since
2.6.18 kernels. Lets just hope some big passive adversary in the wild didn't have
its fun with that. lksctp-tools IPv6 regression test suite passes as well with
this patch.

 [1] http://www.secdev.org/conf/IPv6_RH_security-csw07.pdf

Reported-by: Alan Chester <alan.chester@tekelec.com>
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-13 16:08:29 -07:00
Daniel Borkmann 8ee62e8c94 net: sctp: fix smatch warning in sctp_send_asconf_del_ip
[ Upstream commit 88362ad8f9a6cea787420b57cc27ccacef000dbe ]

This was originally reported in [1] and posted by Neil Horman [2], he said:

  Fix up a missed null pointer check in the asconf code. If we don't find
  a local address, but we pass in an address length of more than 1, we may
  dereference a NULL laddr pointer. Currently this can't happen, as the only
  users of the function pass in the value 1 as the addrcnt parameter, but
  its not hot path, and it doesn't hurt to check for NULL should that ever
  be the case.

The callpath from sctp_asconf_mgmt() looks okay. But this could be triggered
from sctp_setsockopt_bindx() call with SCTP_BINDX_REM_ADDR and addrcnt > 1
while passing all possible addresses from the bind list to SCTP_BINDX_REM_ADDR
so that we do *not* find a single address in the association's bind address
list that is not in the packed array of addresses. If this happens when we
have an established association with ASCONF-capable peers, then we could get
a NULL pointer dereference as we only check for laddr == NULL && addrcnt == 1
and call later sctp_make_asconf_update_ip() with NULL laddr.

BUT: this actually won't happen as sctp_bindx_rem() will catch such a case
and return with an error earlier. As this is incredably unintuitive and error
prone, add a check to catch at least future bugs here. As Neil says, its not
hot path. Introduced by 8a07eb0a5 ("sctp: Add ASCONF operation on the
single-homed host").

 [1] http://www.spinics.net/lists/linux-sctp/msg02132.html
 [2] http://www.spinics.net/lists/linux-sctp/msg02133.html

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Michio Honda <micchie@sfc.wide.ad.jp>
Acked-By: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-13 16:08:28 -07:00
Daniel Borkmann 8b1ada4971 net: sctp: fix bug in sctp_poll for SOCK_SELECT_ERR_QUEUE
[ Upstream commit a0fb05d1aef0f5df936f80b726d1b3bfd4275f95 ]

If we do not add braces around ...

  mask |= POLLERR |
          sock_flag(sk, SOCK_SELECT_ERR_QUEUE) ? POLLPRI : 0;

... then this condition always evaluates to true as POLLERR is
defined as 8 and binary or'd with whatever result comes out of
sock_flag(). Hence instead of (X | Y) ? A : B, transform it into
X | (Y ? A : B). Unfortunatelty, commit 8facd5fb73 ("net: fix
smatch warnings inside datagram_poll") forgot about SCTP. :-(

Introduced by 7d4c04fc17 ("net: add option to enable error queue
packets waking select").

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-13 16:08:28 -07:00
Neil Horman c5c7774d7e sctp: fully initialize sctp_outq in sctp_outq_init
In commit 2f94aabd9f
(refactor sctp_outq_teardown to insure proper re-initalization)
we modified sctp_outq_teardown to use sctp_outq_init to fully re-initalize the
outq structure.  Steve West recently asked me why I removed the q->error = 0
initalization from sctp_outq_teardown.  I did so because I was operating under
the impression that sctp_outq_init would properly initalize that value for us,
but it doesn't.  sctp_outq_init operates under the assumption that the outq
struct is all 0's (as it is when called from sctp_association_init), but using
it in __sctp_outq_teardown violates that assumption. We should do a memset in
sctp_outq_init to ensure that the entire structure is in a known state there
instead.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: "West, Steve (NSN - US/Fort Worth)" <steve.west@nsn.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: netdev@vger.kernel.org
CC: davem@davemloft.net
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 18:05:24 -07:00
Daniel Borkmann 1abd165ed7 net: sctp: fix NULL pointer dereference in socket destruction
While stress testing sctp sockets, I hit the following panic:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
PGD 7cead067 PUD 7ce76067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: sctp(F) libcrc32c(F) [...]
CPU: 7 PID: 2950 Comm: acc Tainted: GF            3.10.0-rc2+ #1
Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011
task: ffff88007ce0e0c0 ti: ffff88007b568000 task.ti: ffff88007b568000
RIP: 0010:[<ffffffffa0490c4e>]  [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
RSP: 0018:ffff88007b569e08  EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff88007db78a00 RCX: dead000000200200
RDX: ffffffffa049fdb0 RSI: ffff8800379baf38 RDI: 0000000000000000
RBP: ffff88007b569e18 R08: ffff88007c230da0 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880077990d00 R14: 0000000000000084 R15: ffff88007db78a00
FS:  00007fc18ab61700(0000) GS:ffff88007fc60000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000007cf9d000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff88007b569e38 ffff88007db78a00 ffff88007b569e38 ffffffffa049fded
 ffffffff81abf0c0 ffff88007db78a00 ffff88007b569e58 ffffffff8145b60e
 0000000000000000 0000000000000000 ffff88007b569eb8 ffffffff814df36e
Call Trace:
 [<ffffffffa049fded>] sctp_destroy_sock+0x3d/0x80 [sctp]
 [<ffffffff8145b60e>] sk_common_release+0x1e/0xf0
 [<ffffffff814df36e>] inet_create+0x2ae/0x350
 [<ffffffff81455a6f>] __sock_create+0x11f/0x240
 [<ffffffff81455bf0>] sock_create+0x30/0x40
 [<ffffffff8145696c>] SyS_socket+0x4c/0xc0
 [<ffffffff815403be>] ? do_page_fault+0xe/0x10
 [<ffffffff8153cb32>] ? page_fault+0x22/0x30
 [<ffffffff81544e02>] system_call_fastpath+0x16/0x1b
Code: 0c c9 c3 66 2e 0f 1f 84 00 00 00 00 00 e8 fb fe ff ff c9 c3 66 0f
      1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <48>
      8b 47 20 48 89 fb c6 47 1c 01 c6 40 12 07 e8 9e 68 01 00 48
RIP  [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
 RSP <ffff88007b569e08>
CR2: 0000000000000020
---[ end trace e0d71ec1108c1dd9 ]---

I did not hit this with the lksctp-tools functional tests, but with a
small, multi-threaded test program, that heavily allocates, binds,
listens and waits in accept on sctp sockets, and then randomly kills
some of them (no need for an actual client in this case to hit this).
Then, again, allocating, binding, etc, and then killing child processes.

This panic then only occurs when ``echo 1 > /proc/sys/net/sctp/auth_enable''
is set. The cause for that is actually very simple: in sctp_endpoint_init()
we enter the path of sctp_auth_init_hmacs(). There, we try to allocate
our crypto transforms through crypto_alloc_hash(). In our scenario,
it then can happen that crypto_alloc_hash() fails with -EINTR from
crypto_larval_wait(), thus we bail out and release the socket via
sk_common_release(), sctp_destroy_sock() and hit the NULL pointer
dereference as soon as we try to access members in the endpoint during
sctp_endpoint_free(), since endpoint at that time is still NULL. Now,
if we have that case, we do not need to do any cleanup work and just
leave the destruction handler.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:49:05 -07:00
Linus Torvalds 73287a43cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights (1721 non-merge commits, this has to be a record of some
  sort):

   1) Add 'random' mode to team driver, from Jiri Pirko and Eric
      Dumazet.

   2) Make it so that any driver that supports configuration of multiple
      MAC addresses can provide the forwarding database add and del
      calls by providing a default implementation and hooking that up if
      the driver doesn't have an explicit set of handlers.  From Vlad
      Yasevich.

   3) Support GSO segmentation over tunnels and other encapsulating
      devices such as VXLAN, from Pravin B Shelar.

   4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.

   5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
      Dukkipati.

   6) In the PHY layer, allow supporting wake-on-lan in situations where
      the PHY registers have to be written for it to be configured.

      Use it to support wake-on-lan in mv643xx_eth.

      From Michael Stapelberg.

   7) Significantly improve firewire IPV6 support, from YOSHIFUJI
      Hideaki.

   8) Allow multiple packets to be sent in a single transmission using
      network coding in batman-adv, from Martin Hundebøll.

   9) Add support for T5 cxgb4 chips, from Santosh Rastapur.

  10) Generalize the VXLAN forwarding tables so that there is more
      flexibility in configurating various aspects of the endpoints.
      From David Stevens.

  11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver,
      from Dmitry Kravkov.

  12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo
      Neira Ayuso.

  13) Start adding networking selftests.

  14) In situations of overload on the same AF_PACKET fanout socket, or
      per-cpu packet receive queue, minimize drop by distributing the
      load to other cpus/fanouts.  From Willem de Bruijn and Eric
      Dumazet.

  15) Add support for new payload offset BPF instruction, from Daniel
      Borkmann.

  16) Convert several drivers over to mdoule_platform_driver(), from
      Sachin Kamat.

  17) Provide a minimal BPF JIT image disassembler userspace tool, from
      Daniel Borkmann.

  18) Rewrite F-RTO implementation in TCP to match the final
      specification of it in RFC4138 and RFC5682.  From Yuchung Cheng.

  19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
      you like netlink, so I implemented netlink dumping of netlink
      sockets.") From Andrey Vagin.

  20) Remove ugly passing of rtnetlink attributes into rtnl_doit
      functions, from Thomas Graf.

  21) Allow userspace to be able to see if a configuration change occurs
      in the middle of an address or device list dump, from Nicolas
      Dichtel.

  22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
      Frederic Sowa.

  23) Increase accuracy of packet length used by packet scheduler, from
      Jason Wang.

  24) Beginning set of changes to make ipv4/ipv6 fragment handling more
      scalable and less susceptible to overload and locking contention,
      from Jesper Dangaard Brouer.

  25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
      instead.  From Hong Zhiguo.

  26) Optimize route usage in IPVS by avoiding reference counting where
      possible, from Julian Anastasov.

  27) Convert IPVS schedulers to RCU, also from Julian Anastasov.

  28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
      Eitzenberger.

  29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
      nfnetlink_log, and nfnetlink_queue.  From Gao feng.

  30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.

  31) Support several new r8169 chips, from Hayes Wang.

  32) Support tokenized interface identifiers in ipv6, from Daniel
      Borkmann.

  33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.

  34) Add 802.1ad vlan offload support, from Patrick McHardy.

  35) Support mmap() based netlink communication, also from Patrick
      McHardy.

  36) Support HW timestamping in mlx4 driver, from Amir Vadai.

  37) Rationalize AF_PACKET packet timestamping when transmitting, from
      Willem de Bruijn and Daniel Borkmann.

  38) Bring parity to what's provided by /proc/net/packet socket dumping
      and the info provided by netlink socket dumping of AF_PACKET
      sockets.  From Nicolas Dichtel.

  39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
      Poirier"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
  filter: fix va_list build error
  af_unix: fix a fatal race with bit fields
  bnx2x: Prevent memory leak when cnic is absent
  bnx2x: correct reading of speed capabilities
  net: sctp: attribute printl with __printf for gcc fmt checks
  netlink: kconfig: move mmap i/o into netlink kconfig
  netpoll: convert mutex into a semaphore
  netlink: Fix skb ref counting.
  net_sched: act_ipt forward compat with xtables
  mlx4_en: fix a build error on 32bit arches
  Revert "bnx2x: allow nvram test to run when device is down"
  bridge: avoid OOPS if root port not found
  drivers: net: cpsw: fix kernel warn on cpsw irq enable
  sh_eth: use random MAC address if no valid one supplied
  3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
  tg3: fix to append hardware time stamping flags
  unix/stream: fix peeking with an offset larger than data in queue
  unix/dgram: fix peeking with an offset larger than data in queue
  unix/dgram: peek beyond 0-sized skbs
  openvswitch: Remove unneeded ovs_netdev_get_ifindex()
  ...
2013-05-01 14:08:52 -07:00
Daniel Borkmann be3e45810b net: sctp: attribute printl with __printf for gcc fmt checks
Let GCC check for format string errors in sctp's probe printl
function. This patch fixes the warning when compiled with W=1:

net/sctp/probe.c:73:2: warning: function might be possible candidate
for 'gnu_printf' format attribute [-Wmissing-format-attribute]

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-01 15:04:10 -04:00
Linus Torvalds 5d434fcb25 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual stuff, mostly comment fixes, typo fixes, printk fixes and small
  code cleanups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (45 commits)
  mm: Convert print_symbol to %pSR
  gfs2: Convert print_symbol to %pSR
  m32r: Convert print_symbol to %pSR
  iostats.txt: add easy-to-find description for field 6
  x86 cmpxchg.h: fix wrong comment
  treewide: Fix typo in printk and comments
  doc: devicetree: Fix various typos
  docbook: fix 8250 naming in device-drivers
  pata_pdc2027x: Fix compiler warning
  treewide: Fix typo in printks
  mei: Fix comments in drivers/misc/mei
  treewide: Fix typos in kernel messages
  pm44xx: Fix comment for "CONFIG_CPU_IDLE"
  doc: Fix typo "CONFIG_CGROUP_CGROUP_MEMCG_SWAP"
  mmzone: correct "pags" to "pages" in comment.
  kernel-parameters: remove outdated 'noresidual' parameter
  Remove spurious _H suffixes from ifdef comments
  sound: Remove stray pluses from Kconfig file
  radio-shark: Fix printk "CONFIG_LED_CLASS"
  doc: put proper reference to CONFIG_MODULE_SIG_ENFORCE
  ...
2013-04-30 09:36:50 -07:00