commit 7620164e85e48ea381a52864d729a7731be8d7f2 Author: Greg Kroah-Hartman Date: Sat May 26 08:49:01 2018 +0200 Linux 4.4.133 commit eef045e7f67e01be5747af23ba77a06d7f5bcdbe Author: Tetsuo Handa Date: Wed May 9 19:42:20 2018 +0900 x86/kexec: Avoid double free_page() upon do_kexec_load() failure commit a466ef76b815b86748d9870ef2a430af7b39c710 upstream. >From ff82bedd3e12f0d3353282054ae48c3bd8c72012 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa Date: Wed, 9 May 2018 12:12:39 +0900 Subject: x86/kexec: Avoid double free_page() upon do_kexec_load() failure syzbot is reporting crashes after memory allocation failure inside do_kexec_load() [1]. This is because free_transition_pgtable() is called by both init_transition_pgtable() and machine_kexec_cleanup() when memory allocation failed inside init_transition_pgtable(). Regarding 32bit code, machine_kexec_free_page_tables() is called by both machine_kexec_alloc_page_tables() and machine_kexec_cleanup() when memory allocation failed inside machine_kexec_alloc_page_tables(). Fix this by leaving the error handling to machine_kexec_cleanup() (and optionally setting NULL after free_page()). [1] https://syzkaller.appspot.com/bug?id=91e52396168cf2bdd572fe1e1bc0bc645c1c6b40 Fixes: f5deb79679af6eb4 ("x86: kexec: Use one page table in x86_64 machine_kexec") Fixes: 92be3d6bdf2cb349 ("kexec/i386: allocate page table pages dynamically") Reported-by: syzbot Signed-off-by: Tetsuo Handa Signed-off-by: Thomas Gleixner Acked-by: Baoquan He Cc: thomas.lendacky@amd.com Cc: prudo@linux.vnet.ibm.com Cc: Huang Ying Cc: syzkaller-bugs@googlegroups.com Cc: takahiro.akashi@linaro.org Cc: H. Peter Anvin Cc: akpm@linux-foundation.org Cc: dyoung@redhat.com Cc: kirill.shutemov@linux.intel.com Link: https://lkml.kernel.org/r/201805091942.DGG12448.tMFVFSJFQOOLHO@I-love.SAKURA.ne.jp Signed-off-by: Greg Kroah-Hartman commit 338762ca45d072bce8e9c38ba07a194516a58dba Author: Tetsuo Handa Date: Fri May 18 16:09:16 2018 -0700 hfsplus: stop workqueue when fill_super() failed commit 66072c29328717072fd84aaff3e070e3f008ba77 upstream. syzbot is reporting ODEBUG messages at hfsplus_fill_super() [1]. This is because hfsplus_fill_super() forgot to call cancel_delayed_work_sync(). As far as I can see, it is hfsplus_mark_mdb_dirty() from hfsplus_new_inode() in hfsplus_fill_super() that calls queue_delayed_work(). Therefore, I assume that hfsplus_new_inode() does not fail if queue_delayed_work() was called, and the out_put_hidden_dir label is the appropriate location to call cancel_delayed_work_sync(). [1] https://syzkaller.appspot.com/bug?id=a66f45e96fdbeb76b796bf46eb25ea878c42a6c9 Link: http://lkml.kernel.org/r/964a8b27-cd69-357c-fe78-76b066056201@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa Reported-by: syzbot Cc: Al Viro Cc: David Howells Cc: Ernesto A. Fernandez Cc: Vyacheslav Dubeyko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 87c807f1eff39a5b0fc440fac931fcaeba257395 Author: Johannes Berg Date: Tue Apr 3 14:33:49 2018 +0200 cfg80211: limit wiphy names to 128 bytes commit a7cfebcb7594a24609268f91299ab85ba064bf82 upstream. There's currently no limit on wiphy names, other than netlink message size and memory limitations, but that causes issues when, for example, the wiphy name is used in a uevent, e.g. in rfkill where we use the same name for the rfkill instance, and then the buffer there is "only" 2k for the environment variables. This was reported by syzkaller, which used a 4k name. Limit the name to something reasonable, I randomly picked 128. Reported-by: syzbot+230d9e642a85d3fec29c@syzkaller.appspotmail.com Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit c610b0cbb263787564290447452ab84719092f6f Author: Geert Uytterhoeven Date: Thu Feb 18 17:06:30 2016 +0100 gpio: rcar: Add Runtime PM handling for interrupts commit b26a719bdba9aa926ceaadecc66e07623d2b8a53 upstream. The R-Car GPIO driver handles Runtime PM for requested GPIOs only. When using a GPIO purely as an interrupt source, no Runtime PM handling is done, and the GPIO module's clock may not be enabled. To fix this: - Add .irq_request_resources() and .irq_release_resources() callbacks to handle Runtime PM when an interrupt is requested, - Add irq_bus_lock() and sync_unlock() callbacks to handle Runtime PM when e.g. disabling/enabling an interrupt, or configuring the interrupt type. Fixes: d5c3d84657db57bd "net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS" Signed-off-by: Geert Uytterhoeven Signed-off-by: Linus Walleij [fabrizio: cherry-pick to v4.4.y. Use container_of instead of gpiochip_get_data.] Signed-off-by: Fabrizio Castro Reviewed-by: Biju Das Signed-off-by: Greg Kroah-Hartman commit 09f7ebaa436c9bdad4a21c24fed5057604e709c1 Author: John Stultz Date: Thu Jun 8 16:44:21 2017 -0700 time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting commit 3d88d56c5873f6eebe23e05c3da701960146b801 upstream. Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest. This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed. Signed-off-by: John Stultz Tested-by: Daniel Mentz Cc: Prarit Bhargava Cc: Kevin Brodsky Cc: Richard Cochran Cc: Stephen Boyd Cc: Will Deacon Cc: "stable #4 . 8+" Cc: Miroslav Lichvar Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner [fabrizio: cherry-pick to 4.4. Kept cycle_t type for function logarithmic_accumulation local variable "interval". Dropped casting of "interval" variable] Signed-off-by: Fabrizio Castro Signed-off-by: Biju Das Signed-off-by: Greg Kroah-Hartman commit 92cffdc98efb1dea5d6f6911616f05dc6338d75f Author: Vinod Koul Date: Tue Apr 12 21:07:06 2016 +0530 dmaengine: ensure dmaengine helpers check valid callback commit 757d12e5849be549076901b0d33c60d5f360269c upstream. dmaengine has various device callbacks and exposes helper functions to invoke these. These helpers should check if channel, device and callback is valid or not before invoking them. Reported-by: Jon Hunter Signed-off-by: Vinod Koul Signed-off-by: Fabrizio Castro Signed-off-by: Jianming Qiao Signed-off-by: Greg Kroah-Hartman commit 367971340270db5b3856172ed2072209ea9575dc Author: Jens Remus Date: Thu May 3 13:52:47 2018 +0200 scsi: zfcp: fix infinite iteration on ERP ready list commit fa89adba1941e4f3b213399b81732a5c12fd9131 upstream. zfcp_erp_adapter_reopen() schedules blocking of all of the adapter's rports via zfcp_scsi_schedule_rports_block() and enqueues a reopen adapter ERP action via zfcp_erp_action_enqueue(). Both are separately processed asynchronously and concurrently. Blocking of rports is done in a kworker by zfcp_scsi_rport_work(). It calls zfcp_scsi_rport_block(), which then traces a DBF REC "scpdely" via zfcp_dbf_rec_trig(). zfcp_dbf_rec_trig() acquires the DBF REC spin lock and then iterates with list_for_each() over the adapter's ERP ready list without holding the ERP lock. This opens a race window in which the current list entry can be moved to another list, causing list_for_each() to iterate forever on the wrong list, as the erp_ready_head is never encountered as terminal condition. Meanwhile the ERP action can be processed in the ERP thread by zfcp_erp_thread(). It calls zfcp_erp_strategy(), which acquires the ERP lock and then calls zfcp_erp_action_to_running() to move the ERP action from the ready to the running list. zfcp_erp_action_to_running() can move the ERP action using list_move() just during the aforementioned race window. It then traces a REC RUN "erator1" via zfcp_dbf_rec_run(). zfcp_dbf_rec_run() tries to acquire the DBF REC spin lock. If this is held by the infinitely looping kworker, it effectively spins forever. Example Sequence Diagram: Process ERP Thread rport_work ------------------- ------------------- ------------------- zfcp_erp_adapter_reopen() zfcp_erp_adapter_block() zfcp_scsi_schedule_rports_block() lock ERP zfcp_scsi_rport_work() zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_ADAPTER) list_add_tail() on ready !(rport_task==RPORT_ADD) wake_up() ERP thread zfcp_scsi_rport_block() zfcp_dbf_rec_trig() zfcp_erp_strategy() zfcp_dbf_rec_trig() unlock ERP lock DBF REC zfcp_erp_wait() lock ERP | zfcp_erp_action_to_running() | list_for_each() ready | list_move() current entry | ready to running | zfcp_dbf_rec_run() endless loop over running | zfcp_dbf_rec_run_lvl() | lock DBF REC spins forever Any adapter recovery can trigger this, such as setting the device offline or reboot. V4.9 commit 4eeaa4f3f1d6 ("zfcp: close window with unblocked rport during rport gone") introduced additional tracing of (un)blocking of rports. It missed that the adapter->erp_lock must be held when calling zfcp_dbf_rec_trig(). This fix uses the approach formerly introduced by commit aa0fec62391c ("[SCSI] zfcp: Fix sparse warning by providing new entry in dbf") that got later removed by commit ae0904f60fab ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions."). Introduce zfcp_dbf_rec_trig_lock(), a wrapper for zfcp_dbf_rec_trig() that acquires and releases the adapter->erp_lock for read. Reported-by: Sebastian Ott Signed-off-by: Jens Remus Fixes: 4eeaa4f3f1d6 ("zfcp: close window with unblocked rport during rport gone") Cc: # 2.6.32+ Reviewed-by: Benjamin Block Signed-off-by: Steffen Maier Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 93314640426ddb6af618d0802e622f6fa771792c Author: Alexander Potapenko Date: Fri May 18 16:23:18 2018 +0200 scsi: sg: allocate with __GFP_ZERO in sg_build_indirect() commit a45b599ad808c3c982fdcdc12b0b8611c2f92824 upstream. This shall help avoid copying uninitialized memory to the userspace when calling ioctl(fd, SG_IO) with an empty command. Reported-by: syzbot+7d26fc1eea198488deab@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Alexander Potapenko Acked-by: Douglas Gilbert Reviewed-by: Johannes Thumshirn Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 6efcc74e1b0c16aebf5d8107543ce63475af35c1 Author: Jason Yan Date: Thu Mar 8 10:34:53 2018 +0800 scsi: libsas: defer ata device eh commands to libata commit 318aaf34f1179b39fa9c30fa0f3288b645beee39 upstream. When ata device doing EH, some commands still attached with tasks are not passed to libata when abort failed or recover failed, so libata did not handle these commands. After these commands done, sas task is freed, but ata qc is not freed. This will cause ata qc leak and trigger a warning like below: WARNING: CPU: 0 PID: 28512 at drivers/ata/libata-eh.c:4037 ata_eh_finish+0xb4/0xcc CPU: 0 PID: 28512 Comm: kworker/u32:2 Tainted: G W OE 4.14.0#1 ...... Call trace: [] ata_eh_finish+0xb4/0xcc [] ata_do_eh+0xc4/0xd8 [] ata_std_error_handler+0x44/0x8c [] ata_scsi_port_error_handler+0x480/0x694 [] async_sas_ata_eh+0x4c/0x80 [] async_run_entry_fn+0x4c/0x170 [] process_one_work+0x144/0x390 [] worker_thread+0x144/0x418 [] kthread+0x10c/0x138 [] ret_from_fork+0x10/0x18 If ata qc leaked too many, ata tag allocation will fail and io blocked for ever. As suggested by Dan Williams, defer ata device commands to libata and merge sas_eh_finish_cmd() with sas_eh_defer_cmd(). libata will handle ata qcs correctly after this. Signed-off-by: Jason Yan CC: Xiaofei Tan CC: John Garry CC: Dan Williams Reviewed-by: Dan Williams Signed-off-by: Martin K. Petersen Cc: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 8c6d306fad557b6f3a8b80c9cc54611fefc3b3c3 Author: Martin Schwidefsky Date: Wed May 23 18:21:36 2018 +0200 s390: use expoline thunks in the BPF JIT [ Upstream commit de5cb6eb514ebe241e3edeb290cb41deb380b81d ] The BPF JIT need safe guarding against spectre v2 in the sk_load_xxx assembler stubs and the indirect branches generated by the JIT itself need to be converted to expolines. Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit f436cb969ab314b25edcabfc4b4f1442c1b83eb7 Author: Martin Schwidefsky Date: Wed May 23 18:21:35 2018 +0200 s390: extend expoline to BC instructions [ Upstream commit 6deaa3bbca804b2a3627fd685f75de64da7be535 ] The BPF JIT uses a 'b (%r)' instruction in the definition of the sk_load_word and sk_load_half functions. Add support for branch-on-condition instructions contained in the thunk code of an expoline. Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit c617e74f5b3e027f622eb109d94ab909cd69cdfa Author: Martin Schwidefsky Date: Wed May 23 18:21:33 2018 +0200 s390: move spectre sysfs attribute code [ Upstream commit 4253b0e0627ee3461e64c2495c616f1c8f6b127b ] The nospec-branch.c file is compiled without the gcc options to generate expoline thunks. The return branch of the sysfs show functions cpu_show_spectre_v1 and cpu_show_spectre_v2 is an indirect branch as well. These need to be compiled with expolines. Move the sysfs functions for spectre reporting to a separate file and loose an '.' for one of the messages. Cc: stable@vger.kernel.org # 4.16 Fixes: d424986f1d ("s390: add sysfs attributes for spectre") Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit 90305465afd422e50afdaa1d614dc335418cd19e Author: Martin Schwidefsky Date: Wed May 23 18:21:32 2018 +0200 s390/kernel: use expoline for indirect branches [ Upstream commit c50c84c3ac4d5db683904bdb3257798b6ef980ae ] The assember code in arch/s390/kernel uses a few more indirect branches which need to be done with execute trampolines for CONFIG_EXPOLINE=y. Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Reviewed-by: Hendrik Brueckner Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit 5ce9dc0f76519ac7ef2d861172ae280fc9d6c3e2 Author: Martin Schwidefsky Date: Wed May 23 18:21:30 2018 +0200 s390/lib: use expoline for indirect branches [ Upstream commit 97489e0663fa700d6e7febddc43b58df98d7bcda ] The return from the memmove, memset, memcpy, __memset16, __memset32 and __memset64 functions are done with "br %r14". These are indirect branches as well and need to use execute trampolines for CONFIG_EXPOLINE=y. Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Reviewed-by: Hendrik Brueckner Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit 73bf2b1c5b3219f3497c1023e7add82eb1d1bd4b Author: Martin Schwidefsky Date: Wed May 23 18:21:29 2018 +0200 s390: move expoline assembler macros to a header [ Upstream commit 6dd85fbb87d1d6b87a3b1f02ca28d7b2abd2e7ba ] To be able to use the expoline branches in different assembler files move the associated macros from entry.S to a new header nospec-insn.h. While we are at it make the macros a bit nicer to use. Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit 2685d33c6ff9f7ca700e7f0c9ad218cd515dcc5a Author: Martin Schwidefsky Date: Wed May 23 18:21:28 2018 +0200 s390: add assembler macros for CPU alternatives [ Upstream commit fba9eb7946251d6e420df3bdf7bc45195be7be9a ] Add a header with macros usable in assembler files to emit alternative code sequences. It works analog to the alternatives for inline assmeblies in C files, with the same restrictions and capabilities. The syntax is ALTERNATIVE "", \ "", \ "" and ALTERNATIVE_2 "", \ "", \ "", "", \ "" Reviewed-by: Vasily Gorbik Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit 3cd868dc9b68709d7274120f7ef1c6063948910d Author: Al Viro Date: Thu May 17 17:18:30 2018 -0400 ext2: fix a block leak commit 5aa1437d2d9a068c0334bd7c9dafa8ec4f97f13b upstream. open file, unlink it, then use ioctl(2) to make it immutable or append only. Now close it and watch the blocks *not* freed... Immutable/append-only checks belong in ->setattr(). Note: the bug is old and backport to anything prior to 737f2e93b972 ("ext2: convert to use the new truncate convention") will need these checks lifted into ext2_setattr(). Cc: stable@kernel.org Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 5bbe138a250e795b32f29fc91af88c69851a12d3 Author: Eric Dumazet Date: Mon May 14 21:14:26 2018 -0700 tcp: purge write queue in tcp_connect_init() [ Upstream commit 7f582b248d0a86bae5788c548d7bb5bca6f7691a ] syzkaller found a reliable way to crash the host, hitting a BUG() in __tcp_retransmit_skb() Malicous MSG_FASTOPEN is the root cause. We need to purge write queue in tcp_connect_init() at the point we init snd_una/write_seq. This patch also replaces the BUG() by a less intrusive WARN_ON_ONCE() kernel BUG at net/ipv4/tcp_output.c:2837! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 5276 Comm: syz-executor0 Not tainted 4.17.0-rc3+ #51 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__tcp_retransmit_skb+0x2992/0x2eb0 net/ipv4/tcp_output.c:2837 RSP: 0000:ffff8801dae06ff8 EFLAGS: 00010206 RAX: ffff8801b9fe61c0 RBX: 00000000ffc18a16 RCX: ffffffff864e1a49 RDX: 0000000000000100 RSI: ffffffff864e2e12 RDI: 0000000000000005 RBP: ffff8801dae073a0 R08: ffff8801b9fe61c0 R09: ffffed0039c40dd2 R10: ffffed0039c40dd2 R11: ffff8801ce206e93 R12: 00000000421eeaad R13: ffff8801ce206d4e R14: ffff8801ce206cc0 R15: ffff8801cd4f4a80 FS: 0000000000000000(0000) GS:ffff8801dae00000(0063) knlGS:00000000096bc900 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000020000000 CR3: 00000001c47b6000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tcp_retransmit_skb+0x2e/0x250 net/ipv4/tcp_output.c:2923 tcp_retransmit_timer+0xc50/0x3060 net/ipv4/tcp_timer.c:488 tcp_write_timer_handler+0x339/0x960 net/ipv4/tcp_timer.c:573 tcp_write_timer+0x111/0x1d0 net/ipv4/tcp_timer.c:593 call_timer_fn+0x230/0x940 kernel/time/timer.c:1326 expire_timers kernel/time/timer.c:1363 [inline] __run_timers+0x79e/0xc50 kernel/time/timer.c:1666 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1d1/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:525 [inline] smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863 Fixes: cf60af03ca4e ("net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)") Signed-off-by: Eric Dumazet Cc: Yuchung Cheng Cc: Neal Cardwell Reported-by: syzbot Acked-by: Neal Cardwell Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8e299f7ae900e3dd8053ec81b405abdfa6ba18e2 Author: Eric Dumazet Date: Fri May 18 04:47:55 2018 -0700 sock_diag: fix use-after-free read in __sk_free [ Upstream commit 9709020c86f6bf8439ca3effc58cfca49a5de192 ] We must not call sock_diag_has_destroy_listeners(sk) on a socket that has no reference on net structure. BUG: KASAN: use-after-free in sock_diag_has_destroy_listeners include/linux/sock_diag.h:75 [inline] BUG: KASAN: use-after-free in __sk_free+0x329/0x340 net/core/sock.c:1609 Read of size 8 at addr ffff88018a02e3a0 by task swapper/1/0 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.17.0-rc5+ #54 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 sock_diag_has_destroy_listeners include/linux/sock_diag.h:75 [inline] __sk_free+0x329/0x340 net/core/sock.c:1609 sk_free+0x42/0x50 net/core/sock.c:1623 sock_put include/net/sock.h:1664 [inline] reqsk_free include/net/request_sock.h:116 [inline] reqsk_put include/net/request_sock.h:124 [inline] inet_csk_reqsk_queue_drop_and_put net/ipv4/inet_connection_sock.c:672 [inline] reqsk_timer_handler+0xe27/0x10e0 net/ipv4/inet_connection_sock.c:739 call_timer_fn+0x230/0x940 kernel/time/timer.c:1326 expire_timers kernel/time/timer.c:1363 [inline] __run_timers+0x79e/0xc50 kernel/time/timer.c:1666 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1d1/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:525 [inline] smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863 RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:54 RSP: 0018:ffff8801d9ae7c38 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13 RAX: dffffc0000000000 RBX: 1ffff1003b35cf8a RCX: 0000000000000000 RDX: 1ffffffff11a30d0 RSI: 0000000000000001 RDI: ffffffff88d18680 RBP: ffff8801d9ae7c38 R08: ffffed003b5e46c3 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 R13: ffff8801d9ae7cf0 R14: ffffffff897bef20 R15: 0000000000000000 arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline] default_idle+0xc2/0x440 arch/x86/kernel/process.c:354 arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:345 default_idle_call+0x6d/0x90 kernel/sched/idle.c:93 cpuidle_idle_call kernel/sched/idle.c:153 [inline] do_idle+0x395/0x560 kernel/sched/idle.c:262 cpu_startup_entry+0x104/0x120 kernel/sched/idle.c:368 start_secondary+0x426/0x5b0 arch/x86/kernel/smpboot.c:269 secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:242 Allocated by task 4557: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554 kmem_cache_zalloc include/linux/slab.h:691 [inline] net_alloc net/core/net_namespace.c:383 [inline] copy_net_ns+0x159/0x4c0 net/core/net_namespace.c:423 create_new_namespaces+0x69d/0x8f0 kernel/nsproxy.c:107 unshare_nsproxy_namespaces+0xc3/0x1f0 kernel/nsproxy.c:206 ksys_unshare+0x708/0xf90 kernel/fork.c:2408 __do_sys_unshare kernel/fork.c:2476 [inline] __se_sys_unshare kernel/fork.c:2474 [inline] __x64_sys_unshare+0x31/0x40 kernel/fork.c:2474 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 69: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kmem_cache_free+0x86/0x2d0 mm/slab.c:3756 net_free net/core/net_namespace.c:399 [inline] net_drop_ns.part.14+0x11a/0x130 net/core/net_namespace.c:406 net_drop_ns net/core/net_namespace.c:405 [inline] cleanup_net+0x6a1/0xb20 net/core/net_namespace.c:541 process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145 worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279 kthread+0x345/0x410 kernel/kthread.c:240 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412 The buggy address belongs to the object at ffff88018a02c140 which belongs to the cache net_namespace of size 8832 The buggy address is located 8800 bytes inside of 8832-byte region [ffff88018a02c140, ffff88018a02e3c0) The buggy address belongs to the page: page:ffffea0006280b00 count:1 mapcount:0 mapping:ffff88018a02c140 index:0x0 compound_mapcount: 0 flags: 0x2fffc0000008100(slab|head) raw: 02fffc0000008100 ffff88018a02c140 0000000000000000 0000000100000001 raw: ffffea00062a1320 ffffea0006268020 ffff8801d9bdde40 0000000000000000 page dumped because: kasan: bad access detected Fixes: b922622ec6ef ("sock_diag: don't broadcast kernel sockets") Signed-off-by: Eric Dumazet Cc: Craig Gallek Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d9fb8cc230b2a4757e9fe4f81468f81212d4deaa Author: Willem de Bruijn Date: Fri May 11 13:24:25 2018 -0400 packet: in packet_snd start writing at link layer allocation [ Upstream commit b84bbaf7a6c8cca24f8acf25a2c8e46913a947ba ] Packet sockets allow construction of packets shorter than dev->hard_header_len to accommodate protocols with variable length link layer headers. These packets are padded to dev->hard_header_len, because some device drivers interpret that as a minimum packet size. packet_snd reserves dev->hard_header_len bytes on allocation. SOCK_DGRAM sockets call skb_push in dev_hard_header() to ensure that link layer headers are stored in the reserved range. SOCK_RAW sockets do the same in tpacket_snd, but not in packet_snd. Syzbot was able to send a zero byte packet to a device with massive 116B link layer header, causing padding to cross over into skb_shinfo. Fix this by writing from the start of the llheader reserved range also in the case of packet_snd/SOCK_RAW. Update skb_set_network_header to the new offset. This also corrects it for SOCK_DGRAM, where it incorrectly double counted reserve due to the skb_push in dev_hard_header. Fixes: 9ed988cd5915 ("packet: validate variable length ll headers") Reported-by: syzbot+71d74a5406d02057d559@syzkaller.appspotmail.com Signed-off-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 671cf50f4c2f55652495d98fa4306aa8641b2860 Author: Willem de Bruijn Date: Thu May 17 13:13:29 2018 -0400 net: test tailroom before appending to linear skb [ Upstream commit 113f99c3358564a0647d444c2ae34e8b1abfd5b9 ] Device features may change during transmission. In particular with corking, a device may toggle scatter-gather in between allocating and writing to an skb. Do not unconditionally assume that !NETIF_F_SG at write time implies that the same held at alloc time and thus the skb has sufficient tailroom. This issue predates git history. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Eric Dumazet Signed-off-by: Willem de Bruijn Reviewed-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e2da30c54e49d7c989484865b838aeccad708d00 Author: Liu Bo Date: Wed May 16 01:37:36 2018 +0800 btrfs: fix reading stale metadata blocks after degraded raid1 mounts commit 02a3307aa9c20b4f6626255b028f07f6cfa16feb upstream. If a btree block, aka. extent buffer, is not available in the extent buffer cache, it'll be read out from the disk instead, i.e. btrfs_search_slot() read_block_for_search() # hold parent and its lock, go to read child btrfs_release_path() read_tree_block() # read child Unfortunately, the parent lock got released before reading child, so commit 5bdd3536cbbe ("Btrfs: Fix block generation verification race") had used 0 as parent transid to read the child block. It forces read_tree_block() not to check if parent transid is different with the generation id of the child that it reads out from disk. A simple PoC is included in btrfs/124, 0. A two-disk raid1 btrfs, 1. Right after mkfs.btrfs, block A is allocated to be device tree's root. 2. Mount this filesystem and put it in use, after a while, device tree's root got COW but block A hasn't been allocated/overwritten yet. 3. Umount it and reload the btrfs module to remove both disks from the global @fs_devices list. 4. mount -odegraded dev1 and write some data, so now block A is allocated to be a leaf in checksum tree. Note that only dev1 has the latest metadata of this filesystem. 5. Umount it and mount it again normally (with both disks), since raid1 can pick up one disk by the writer task's pid, if btrfs_search_slot() needs to read block A, dev2 which does NOT have the latest metadata might be read for block A, then we got a stale block A. 6. As parent transid is not checked, block A is marked as uptodate and put into the extent buffer cache, so the future search won't bother to read disk again, which means it'll make changes on this stale one and make it dirty and flush it onto disk. To avoid the problem, parent transid needs to be passed to read_tree_block(). In order to get a valid parent transid, we need to hold the parent's lock until finishing reading child. This patch needs to be slightly adapted for stable kernels, the &first_key parameter added to read_tree_block() is from 4.16+ (581c1760415c4). The fix is to replace 0 by 'gen'. Fixes: 5bdd3536cbbe ("Btrfs: Fix block generation verification race") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Liu Bo Reviewed-by: Filipe Manana Reviewed-by: Qu Wenruo [ update changelog ] Signed-off-by: David Sterba Signed-off-by: Nikolay Borisov Signed-off-by: Greg Kroah-Hartman commit 68dea4bd04f34e53771f6e143322cadc2f4beb48 Author: Anand Jain Date: Thu May 17 15:16:51 2018 +0800 btrfs: fix crash when trying to resume balance without the resume flag commit 02ee654d3a04563c67bfe658a05384548b9bb105 upstream. We set the BTRFS_BALANCE_RESUME flag in the btrfs_recover_balance() only, which isn't called during the remount. So when resuming from the paused balance we hit the bug: kernel: kernel BUG at fs/btrfs/volumes.c:3890! :: kernel: balance_kthread+0x51/0x60 [btrfs] kernel: kthread+0x111/0x130 :: kernel: RIP: btrfs_balance+0x12e1/0x1570 [btrfs] RSP: ffffba7d0090bde8 Reproducer: On a mounted filesystem: btrfs balance start --full-balance /btrfs btrfs balance pause /btrfs mount -o remount,ro /dev/sdb /btrfs mount -o remount,rw /dev/sdb /btrfs To fix this set the BTRFS_BALANCE_RESUME flag in btrfs_resume_balance_async(). CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Anand Jain Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 72d1df8a8921d801904cc71428351c2960b7155b Author: Filipe Manana Date: Fri May 11 16:42:42 2018 +0100 Btrfs: fix xattr loss after power failure commit 9a8fca62aacc1599fea8e813d01e1955513e4fad upstream. If a file has xattrs, we fsync it, to ensure we clear the flags BTRFS_INODE_NEEDS_FULL_SYNC and BTRFS_INODE_COPY_EVERYTHING from its inode, the current transaction commits and then we fsync it (without either of those bits being set in its inode), we end up not logging all its xattrs. This results in deleting all xattrs when replying the log after a power failure. Trivial reproducer $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ touch /mnt/foobar $ setfattr -n user.xa -v qwerty /mnt/foobar $ xfs_io -c "fsync" /mnt/foobar $ sync $ xfs_io -c "pwrite -S 0xab 0 64K" /mnt/foobar $ xfs_io -c "fsync" /mnt/foobar $ mount /dev/sdb /mnt $ getfattr --absolute-names --dump /mnt/foobar $ So fix this by making sure all xattrs are logged if we log a file's inode item and neither the flags BTRFS_INODE_NEEDS_FULL_SYNC nor BTRFS_INODE_COPY_EVERYTHING were set in the inode. Fixes: 36283bf777d9 ("Btrfs: fix fsync xattr loss in the fast fsync path") Cc: # 4.2+ Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 104cff91618e6073e428a8a549f69f70d6454f08 Author: Masami Hiramatsu Date: Sun May 13 05:04:29 2018 +0100 ARM: 8772/1: kprobes: Prohibit kprobes on get_user functions commit 0d73c3f8e7f6ee2aab1bb350f60c180f5ae21a2c upstream. Since do_undefinstr() uses get_user to get the undefined instruction, it can be called before kprobes processes recursive check. This can cause an infinit recursive exception. Prohibit probing on get_user functions. Fixes: 24ba613c9d6c ("ARM kprobes: core code") Signed-off-by: Masami Hiramatsu Cc: stable@vger.kernel.org Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit 51425b7554a92040e8dc9fe42b2694a6df2125c0 Author: Masami Hiramatsu Date: Sun May 13 05:04:10 2018 +0100 ARM: 8770/1: kprobes: Prohibit probing on optimized_callback commit 70948c05fdde0aac32f9667856a88725c192fa40 upstream. Prohibit probing on optimized_callback() because it is called from kprobes itself. If we put a kprobes on it, that will cause a recursive call loop. Mark it NOKPROBE_SYMBOL. Fixes: 0dc016dbd820 ("ARM: kprobes: enable OPTPROBES for ARM 32") Signed-off-by: Masami Hiramatsu Cc: stable@vger.kernel.org Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit 0ec84ae5aba19bf79653371f0bd45c12631b49b6 Author: Masami Hiramatsu Date: Sun May 13 05:03:54 2018 +0100 ARM: 8769/1: kprobes: Fix to use get_kprobe_ctlblk after irq-disabed commit 69af7e23a6870df2ea6fa79ca16493d59b3eebeb upstream. Since get_kprobe_ctlblk() uses smp_processor_id() to access per-cpu variable, it hits smp_processor_id sanity check as below. [ 7.006928] BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1 [ 7.007859] caller is debug_smp_processor_id+0x20/0x24 [ 7.008438] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1-00192-g4eb17253e4b5 #1 [ 7.008890] Hardware name: Generic DT based system [ 7.009917] [] (unwind_backtrace) from [] (show_stack+0x20/0x24) [ 7.010473] [] (show_stack) from [] (dump_stack+0x84/0x98) [ 7.010990] [] (dump_stack) from [] (check_preemption_disabled+0x138/0x13c) [ 7.011592] [] (check_preemption_disabled) from [] (debug_smp_processor_id+0x20/0x24) [ 7.012214] [] (debug_smp_processor_id) from [] (optimized_callback+0x2c/0xe4) [ 7.013077] [] (optimized_callback) from [] (0xbf0021b0) To fix this issue, call get_kprobe_ctlblk() right after irq-disabled since that disables preemption. Fixes: 0dc016dbd820 ("ARM: kprobes: enable OPTPROBES for ARM 32") Signed-off-by: Masami Hiramatsu Cc: stable@vger.kernel.org Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit 2c4c0ab87d4d72ba07f8a01b51bd15b9b6a46516 Author: Dexuan Cui Date: Tue May 15 19:52:50 2018 +0000 tick/broadcast: Use for_each_cpu() specially on UP kernels commit 5596fe34495cf0f645f417eb928ef224df3e3cb4 upstream. for_each_cpu() unintuitively reports CPU0 as set independent of the actual cpumask content on UP kernels. This causes an unexpected PIT interrupt storm on a UP kernel running in an SMP virtual machine on Hyper-V, and as a result, the virtual machine can suffer from a strange random delay of 1~20 minutes during boot-up, and sometimes it can hang forever. Protect if by checking whether the cpumask is empty before entering the for_each_cpu() loop. [ tglx: Use !IS_ENABLED(CONFIG_SMP) instead of #ifdeffery ] Signed-off-by: Dexuan Cui Signed-off-by: Thomas Gleixner Cc: Josh Poulson Cc: "Michael Kelley (EOSG)" Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: stable@vger.kernel.org Cc: Rakib Mullick Cc: Jork Loeser Cc: Greg Kroah-Hartman Cc: Andrew Morton Cc: KY Srinivasan Cc: Linus Torvalds Cc: Alexey Dobriyan Cc: Dmitry Vyukov Link: https://lkml.kernel.org/r/KL1P15301MB000678289FE55BA365B3279ABF990@KL1P15301MB0006.APCP153.PROD.OUTLOOK.COM Link: https://lkml.kernel.org/r/KL1P15301MB0006FA63BC22BEB64902EAA0BF930@KL1P15301MB0006.APCP153.PROD.OUTLOOK.COM Signed-off-by: Greg Kroah-Hartman commit e4f821810c2856c856c28884cd6fcb1b528412bb Author: Masami Hiramatsu Date: Sun May 13 05:04:16 2018 +0100 ARM: 8771/1: kprobes: Prohibit kprobes on do_undefinstr commit eb0146daefdde65665b7f076fbff7b49dade95b9 upstream. Prohibit kprobes on do_undefinstr because kprobes on arm is implemented by undefined instruction. This means if we probe do_undefinstr(), it can cause infinit recursive exception. Fixes: 24ba613c9d6c ("ARM kprobes: core code") Signed-off-by: Masami Hiramatsu Cc: stable@vger.kernel.org Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit 5f1d5f781555718ceebded2d9a7659d3766d86b8 Author: Ard Biesheuvel Date: Fri May 4 07:59:58 2018 +0200 efi: Avoid potential crashes, fix the 'struct efi_pci_io_protocol_32' definition for mixed mode commit 0b3225ab9407f557a8e20f23f37aa7236c10a9b1 upstream. Mixed mode allows a kernel built for x86_64 to interact with 32-bit EFI firmware, but requires us to define all struct definitions carefully when it comes to pointer sizes. 'struct efi_pci_io_protocol_32' currently uses a 'void *' for the 'romimage' field, which will be interpreted as a 64-bit field on such kernels, potentially resulting in bogus memory references and subsequent crashes. Tested-by: Hans de Goede Signed-off-by: Ard Biesheuvel Cc: Cc: Linus Torvalds Cc: Matt Fleming Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180504060003.19618-13-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit fa376a12da18f847d9f76695a6f3d891fc6224de Author: Martin Schwidefsky Date: Tue Apr 24 11:18:49 2018 +0200 s390: remove indirect branch from do_softirq_own_stack commit 9f18fff63cfd6f559daa1eaae60640372c65f84b upstream. The inline assembly to call __do_softirq on the irq stack uses an indirect branch. This can be replaced with a normal relative branch. Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Reviewed-by: Hendrik Brueckner Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit b26157de3368201ca3cc08007ee4dfe7daae2b92 Author: Julian Wiedmann Date: Wed May 2 08:28:34 2018 +0200 s390/qdio: don't release memory in qdio_setup_irq() commit 2e68adcd2fb21b7188ba449f0fab3bee2910e500 upstream. Calling qdio_release_memory() on error is just plain wrong. It frees the main qdio_irq struct, when following code still uses it. Also, no other error path in qdio_establish() does this. So trust callers to clean up via qdio_free() if some step of the QDIO initialization fails. Fixes: 779e6e1c724d ("[S390] qdio: new qdio driver.") Cc: #v2.6.27+ Signed-off-by: Julian Wiedmann Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit 55b21399d8d94ec124cc5ceed2e11f9be53d9e13 Author: Hendrik Brueckner Date: Thu May 3 15:56:15 2018 +0200 s390/cpum_sf: ensure sample frequency of perf event attributes is non-zero commit 4bbaf2584b86b0772413edeac22ff448f36351b1 upstream. Correct a trinity finding for the perf_event_open() system call with a perf event attribute structure that uses a frequency but has the sampling frequency set to zero. This causes a FP divide exception during the sample rate initialization for the hardware sampling facility. Fixes: 8c069ff4bd606 ("s390/perf: add support for the CPU-Measurement Sampling Facility") Cc: stable@vger.kernel.org # 3.14+ Reviewed-by: Heiko Carstens Signed-off-by: Hendrik Brueckner Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit 9c32e0d3655efc6f5b556614a6713870da477d05 Author: Julian Wiedmann Date: Wed May 2 08:48:43 2018 +0200 s390/qdio: fix access to uninitialized qdio_q fields commit e521813468f786271a87e78e8644243bead48fad upstream. Ever since CQ/QAOB support was added, calling qdio_free() straight after qdio_alloc() results in qdio_release_memory() accessing uninitialized memory (ie. q->u.out.use_cq and q->u.out.aobs). Followed by a kmem_cache_free() on the random AOB addresses. For older kernels that don't have 6e30c549f6ca, the same applies if qdio_establish() fails in the DEV_STATE_ONLINE check. While initializing q->u.out.use_cq would be enough to fix this particular bug, the more future-proof change is to just zero-alloc the whole struct. Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks") Cc: #v3.2+ Signed-off-by: Julian Wiedmann Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman commit ec8d6e953ad179be174b59513d8392ef48f566d5 Author: Pavel Tatashin Date: Fri May 18 16:09:13 2018 -0700 mm: don't allow deferred pages with NEED_PER_CPU_KM commit ab1e8d8960b68f54af42b6484b5950bd13a4054b upstream. It is unsafe to do virtual to physical translations before mm_init() is called if struct page is needed in order to determine the memory section number (see SECTION_IN_PAGE_FLAGS). This is because only in mm_init() we initialize struct pages for all the allocated memory when deferred struct pages are used. My recent fix in commit c9e97a1997 ("mm: initialize pages on demand during boot") exposed this problem, because it greatly reduced number of pages that are initialized before mm_init(), but the problem existed even before my fix, as Fengguang Wu found. Below is a more detailed explanation of the problem. We initialize struct pages in four places: 1. Early in boot a small set of struct pages is initialized to fill the first section, and lower zones. 2. During mm_init() we initialize "struct pages" for all the memory that is allocated, i.e reserved in memblock. 3. Using on-demand logic when pages are allocated after mm_init call (when memblock is finished) 4. After smp_init() when the rest free deferred pages are initialized. The problem occurs if we try to do va to phys translation of a memory between steps 1 and 2. Because we have not yet initialized struct pages for all the reserved pages, it is inherently unsafe to do va to phys if the translation itself requires access of "struct page" as in case of this combination: CONFIG_SPARSE && !CONFIG_SPARSE_VMEMMAP The following path exposes the problem: start_kernel() trap_init() setup_cpu_entry_areas() setup_cpu_entry_area(cpu) get_cpu_gdt_paddr(cpu) per_cpu_ptr_to_phys(addr) pcpu_addr_to_page(addr) virt_to_page(addr) pfn_to_page(__pa(addr) >> PAGE_SHIFT) We disable this path by not allowing NEED_PER_CPU_KM with deferred struct pages feature. The problems are discussed in these threads: http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel.com http://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel.com http://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com Link: http://lkml.kernel.org/r/20180515175124.1770-1-pasha.tatashin@oracle.com Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") Signed-off-by: Pavel Tatashin Acked-by: Michal Hocko Reviewed-by: Andrew Morton Cc: Steven Sistare Cc: Daniel Jordan Cc: Mel Gorman Cc: Fengguang Wu Cc: Dennis Zhou Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 380efa6a71c5aba6dec2a1432a29644bb0b220a8 Author: Nicholas Piggin Date: Tue May 15 01:59:47 2018 +1000 powerpc/powernv: Fix NVRAM sleep in invalid context when crashing commit c1d2a31397ec51f0370f6bd17b19b39152c263cb upstream. Similarly to opal_event_shutdown, opal_nvram_write can be called in the crash path with irqs disabled. Special case the delay to avoid sleeping in invalid context. Fixes: 3b8070335f75 ("powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops") Cc: stable@vger.kernel.org # v3.2 Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit b8c71ce9e00d7aa60b847a5822fd18a716a52332 Author: Janis Danisevskis Date: Fri May 20 17:00:08 2016 -0700 procfs: fix pthread cross-thread naming if !PR_DUMPABLE commit 1b3044e39a89cb1d4d5313da477e8dfea2b5232d upstream. The PR_DUMPABLE flag causes the pid related paths of the proc file system to be owned by ROOT. The implementation of pthread_set/getname_np however needs access to /proc//task//comm. If PR_DUMPABLE is false this implementation is locked out. This patch installs a special permission function for the file "comm" that grants read and write access to all threads of the same group regardless of the ownership of the inode. For all other threads the function falls back to the generic inode permission check. [akpm@linux-foundation.org: fix spello in comment] Signed-off-by: Janis Danisevskis Acked-by: Kees Cook Cc: Al Viro Cc: Cyrill Gorcunov Cc: Alexey Dobriyan Cc: Colin Ian King Cc: David Rientjes Cc: Minfei Huang Cc: John Stultz Cc: Calvin Owens Cc: Jann Horn Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit a58c00657ec02755fe322eebd11dbf6a24da1d91 Author: Mateusz Guzik Date: Wed Jan 20 15:01:05 2016 -0800 proc read mm's {arg,env}_{start,end} with mmap semaphore taken. commit a3b609ef9f8b1dbfe97034ccad6cd3fe71fbe7ab upstream. Only functions doing more than one read are modified. Consumeres happened to deal with possibly changing data, but it does not seem like a good thing to rely on. Signed-off-by: Mateusz Guzik Acked-by: Cyrill Gorcunov Cc: Alexey Dobriyan Cc: Jarod Wilson Cc: Jan Stancek Cc: Al Viro Cc: Anshuman Khandual Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 111cb9f5df8a649cd0e57bbacf8d5254a2a8c712 Author: Steven Rostedt (VMware) Date: Wed May 9 14:36:09 2018 -0400 tracing/x86/xen: Remove zero data size trace events trace_xen_mmu_flush_tlb{_all} commit 45dd9b0666a162f8e4be76096716670cf1741f0e upstream. Doing an audit of trace events, I discovered two trace events in the xen subsystem that use a hack to create zero data size trace events. This is not what trace events are for. Trace events add memory footprint overhead, and if all you need to do is see if a function is hit or not, simply make that function noinline and use function tracer filtering. Worse yet, the hack used was: __array(char, x, 0) Which creates a static string of zero in length. There's assumptions about such constructs in ftrace that this is a dynamic string that is nul terminated. This is not the case with these tracepoints and can cause problems in various parts of ftrace. Nuke the trace events! Link: http://lkml.kernel.org/r/20180509144605.5a220327@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 95a7d76897c1e ("xen/mmu: Use Xen specific TLB flush instead of the generic one.") Reviewed-by: Juergen Gross Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit f5a80c5d9ca98045bae32ef29a9ab031066047ba Author: Srinivas Pandruvada Date: Thu Feb 25 15:09:19 2016 -0800 cpufreq: intel_pstate: Enable HWP by default commit 7791e4aa59ad724e0b4c8b4dea547a5735108972 upstream. If the processor supports HWP, enable it by default without checking for the cpu model. This will allow to enable HWP in all supported processors without driver change. Signed-off-by: Srinivas Pandruvada Signed-off-by: Rafael J. Wysocki Signed-off-by: Thomas Renninger Signed-off-by: Greg Kroah-Hartman commit eae31bc183db0281c51e09a2830d70ca20e22562 Author: Waiman Long Date: Wed Dec 14 15:04:10 2016 -0800 signals: avoid unnecessary taking of sighand->siglock commit c7be96af89d4b53211862d8599b2430e8900ed92 upstream. When running certain database workload on a high-end system with many CPUs, it was found that spinlock contention in the sigprocmask syscalls became a significant portion of the overall CPU cycles as shown below. 9.30% 9.30% 905387 dataserver /proc/kcore 0x7fff8163f4d2 [k] _raw_spin_lock_irq | ---_raw_spin_lock_irq | |--99.34%-- __set_current_blocked | sigprocmask | sys_rt_sigprocmask | system_call_fastpath | | | |--50.63%-- __swapcontext | | | | | |--99.91%-- upsleepgeneric | | | |--49.36%-- __setcontext | | ktskRun Looking further into the swapcontext function in glibc, it was found that the function always call sigprocmask() without checking if there are changes in the signal mask. A check was added to the __set_current_blocked() function to avoid taking the sighand->siglock spinlock if there is no change in the signal mask. This will prevent unneeded spinlock contention when many threads are trying to call sigprocmask(). With this patch applied, the spinlock contention in sigprocmask() was gone. Link: http://lkml.kernel.org/r/1474979209-11867-1-git-send-email-Waiman.Long@hpe.com Signed-off-by: Waiman Long Acked-by: Oleg Nesterov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Stas Sergeev Cc: Scott J Norton Cc: Douglas Hatch Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Mel Gorman Signed-off-by: Greg Kroah-Hartman commit 5dd02a8952118538308b3bf5170d5db52d3ed477 Author: Mel Gorman Date: Tue Mar 15 14:55:39 2016 -0700 mm: filemap: avoid unnecessary calls to lock_page when waiting for IO to complete during a read commit ebded02788b5d7c7600f8cff26ae07896d568649 upstream. In the generic read paths the kernel looks up a page in the page cache and if it's up to date, it is used. If not, the page lock is acquired to wait for IO to complete and then check the page. If multiple processes are waiting on IO, they all serialise against the lock and duplicate the checks. This is unnecessary. The page lock in itself does not give any guarantees to the callers about the page state as it can be immediately truncated or reclaimed after the page is unlocked. It's sufficient to wait_on_page_locked and then continue if the page is up to date on wakeup. It is possible that a truncated but up-to-date page is returned but the reference taken during read prevents it disappearing underneath the caller and the data is still valid if PageUptodate. The overall impact is small as even if processes serialise on the lock, the lock section is tiny once the IO is complete. Profiles indicated that unlock_page and friends are generally a tiny portion of a read-intensive workload. An artificial test was created that had instances of dd access a cache-cold file on an ext4 filesystem and measure how long the read took. paralleldd 4.4.0 4.4.0 vanilla avoidlock Amean Elapsd-1 5.28 ( 0.00%) 5.15 ( 2.50%) Amean Elapsd-4 5.29 ( 0.00%) 5.17 ( 2.12%) Amean Elapsd-7 5.28 ( 0.00%) 5.18 ( 1.78%) Amean Elapsd-12 5.20 ( 0.00%) 5.33 ( -2.50%) Amean Elapsd-21 5.14 ( 0.00%) 5.21 ( -1.41%) Amean Elapsd-30 5.30 ( 0.00%) 5.12 ( 3.38%) Amean Elapsd-48 5.78 ( 0.00%) 5.42 ( 6.21%) Amean Elapsd-79 6.78 ( 0.00%) 6.62 ( 2.46%) Amean Elapsd-110 9.09 ( 0.00%) 8.99 ( 1.15%) Amean Elapsd-128 10.60 ( 0.00%) 10.43 ( 1.66%) The impact is small but intuitively, it makes sense to avoid unnecessary calls to lock_page. Signed-off-by: Mel Gorman Reviewed-by: Jan Kara Cc: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Mel Gorman Signed-off-by: Greg Kroah-Hartman commit 9c9a6bbaf85bef862609f98809853ab1a1c69c98 Author: Mel Gorman Date: Tue Mar 15 14:55:36 2016 -0700 mm: filemap: remove redundant code in do_read_cache_page commit 32b635298ff4e991d8d8f64dc23782b02eec29c3 upstream. do_read_cache_page and __read_cache_page duplicate page filler code when filling the page for the first time. This patch simply removes the duplicate logic. Signed-off-by: Mel Gorman Reviewed-by: Jan Kara Cc: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Mel Gorman Signed-off-by: Greg Kroah-Hartman commit 106253c4c9bca7880000d44aa2248b1ca2071c92 Author: Johannes Weiner Date: Thu Jan 14 15:20:18 2016 -0800 proc: meminfo: estimate available memory more conservatively commit 84ad5802a33a4964a49b8f7d24d80a214a096b19 upstream. The MemAvailable item in /proc/meminfo is to give users a hint of how much memory is allocatable without causing swapping, so it excludes the zones' low watermarks as unavailable to userspace. However, for a userspace allocation, kswapd will actually reclaim until the free pages hit a combination of the high watermark and the page allocator's lowmem protection that keeps a certain amount of DMA and DMA32 memory from userspace as well. Subtract the full amount we know to be unavailable to userspace from the number of free pages when calculating MemAvailable. Signed-off-by: Johannes Weiner Cc: Rik van Riel Cc: Mel Gorman Acked-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Mel Gorman Signed-off-by: Greg Kroah-Hartman commit a4134dfc6b4c38b3b6b35d867f9f09ddb2a2b2ec Author: Vladimir Davydov Date: Thu Jan 14 15:19:38 2016 -0800 vmscan: do not force-scan file lru if its absolute size is small commit 316bda0e6cc5f36f94b4af8bded16d642c90ad75 upstream. We assume there is enough inactive page cache if the size of inactive file lru is greater than the size of active file lru, in which case we force-scan file lru ignoring anonymous pages. While this logic works fine when there are plenty of page cache pages, it fails if the size of file lru is small (several MB): in this case (lru_size >> prio) will be 0 for normal scan priorities, as a result, if inactive file lru happens to be larger than active file lru, anonymous pages of a cgroup will never get evicted unless the system experiences severe memory pressure, even if there are gigabytes of unused anonymous memory there, which is unfair in respect to other cgroups, whose workloads might be page cache oriented. This patch attempts to fix this by elaborating the "enough inactive page cache" check: it makes it not only check that inactive lru size > active lru size, but also that we will scan something from the cgroup at the current scan priority. If these conditions do not hold, we proceed to SCAN_FRACT as usual. Signed-off-by: Vladimir Davydov Acked-by: Johannes Weiner Acked-by: Michal Hocko Cc: Vlastimil Babka Cc: Mel Gorman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Mel Gorman Signed-off-by: Greg Kroah-Hartman commit 90c2e2e1a88ab8b8ca7387b49539722268ecd96e Author: Benjamin Herrenschmidt Date: Wed Jan 10 17:10:12 2018 +1100 powerpc: Don't preempt_disable() in show_cpuinfo() commit 349524bc0da698ec77f2057cf4a4948eb6349265 upstream. This causes warnings from cpufreq mutex code. This is also rather unnecessary and ineffective. If we really want to prevent concurrent unplug, we could take the unplug read lock but I don't see this being critical. Fixes: cd77b5ce208c ("powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo") Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Michael Ellerman Acked-by: Michal Suchanek Signed-off-by: Greg Kroah-Hartman commit 605902e923584c2884ef869ef2cd4e4172be275e Author: Anders Roxell Date: Wed Jan 27 20:26:54 2016 +0100 cpuidle: coupled: remove unused define cpuidle_coupled_lock commit 75274b33e779ae40a750bcb4bd0b07c4dfef4746 upstream. This was found with the -RT patch enabled, but the fix should apply to non-RT also. Used multi_v7_defconfig+PREEMPT_RT_FULL=y and this caused a compilation warning without this fix: ../drivers/cpuidle/coupled.c:122:21: warning: 'cpuidle_coupled_lock' defined but not used [-Wunused-variable] Signed-off-by: Anders Roxell Signed-off-by: Rafael J. Wysocki Signed-off-by: Mike Galbraith Signed-off-by: Greg Kroah-Hartman commit 662de929795e61ce845285145ccc0a8c8330f75b Author: Stewart Smith Date: Wed Dec 9 17:18:20 2015 +1100 powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL commit e4d54f71d29997344b4c4c8d47708240f9f23a5c upstream. Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is just OPALv3, with nobody ever expecting anything on pre-OPALv3 to be cared about or supported by mainline kernels. So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL exclusively. Signed-off-by: Stewart Smith Signed-off-by: Michael Ellerman Signed-off-by: Mike Galbraith Signed-off-by: Greg Kroah-Hartman commit fb0e6fa225a455f3070921455455202d03ace691 Author: Stewart Smith Date: Wed Dec 9 17:18:19 2015 +1100 powerpc/powernv: Remove OPALv2 firmware define and references commit 7261aafc095763b119136a562540dea7b1ccf657 upstream. OPALv2 only ever existed in the lab and didn't escape to the world. All OPAL systems in the wild are OPALv3. The probability of there being an OPALv2 system still powered on anywhere inside IBM is approximately zero, let alone anyone expecting to run mainline kernels. So, start to remove references to OPALv2. Signed-off-by: Stewart Smith Signed-off-by: Michael Ellerman Signed-off-by: Mike Galbraith Signed-off-by: Greg Kroah-Hartman commit bffb5017b10003d0513dd8806ccd87facbf82de6 Author: Stewart Smith Date: Wed Dec 9 17:18:18 2015 +1100 powerpc/powernv: panic() on OPAL < V3 commit 786842b62f81f20d14894925e8c225328ee8144b upstream. The OpenPower Abstraction Layer firmware went through a couple of iterations in the lab before being released. What we now know as OPAL advertises itself as OPALv3. OPALv2 and OPALv1 never made it outside the lab, and the possibility of anyone at all ever building a mainline kernel today and expecting it to boot on such hardware is zero. Signed-off-by: Stewart Smith Signed-off-by: Michael Ellerman Signed-off-by: Mike Galbraith Signed-off-by: Greg Kroah-Hartman commit 503ef637f94de26dc88241634f213f42194367d6 Author: Andy Shevchenko Date: Thu Apr 19 19:53:32 2018 +0300 spi: pxa2xx: Allow 64-bit DMA commit efc4a13724b852ddaa3358402a8dec024ffbcb17 upstream. Currently the 32-bit device address only is supported for DMA. However, starting from Intel Sunrisepoint PCH the DMA address of the device FIFO can be 64-bit. Change the respective variable to be compatible with DMA engine expectations, i.e. to phys_addr_t. Fixes: 34cadd9c1bcb ("spi: pxa2xx: Add support for Intel Sunrisepoint") Signed-off-by: Andy Shevchenko Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 59e3b28998558e3fdaf96ab37141c4d24d520f69 Author: Wenwen Wang Date: Sat May 5 13:38:03 2018 -0500 ALSA: control: fix a redundant-copy issue commit 3f12888dfae2a48741c4caa9214885b3aaf350f9 upstream. In snd_ctl_elem_add_compat(), the fields of the struct 'data' need to be copied from the corresponding fields of the struct 'data32' in userspace. This is achieved by invoking copy_from_user() and get_user() functions. The problem here is that the 'type' field is copied twice. One is by copy_from_user() and one is by get_user(). Given that the 'type' field is not used between the two copies, the second copy is *completely* redundant and should be removed for better performance and cleanup. Also, these two copies can cause inconsistent data: as the struct 'data32' resides in userspace and a malicious userspace process can race to change the 'type' field between the two copies to cause inconsistent data. Depending on how the data is used in the future, such an inconsistency may cause potential security risks. For above reasons, we should take out the second copy. Signed-off-by: Wenwen Wang Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 1b406c0d49383d8474dd343ca463137ee8a89584 Author: Hans de Goede Date: Tue May 8 09:27:46 2018 +0200 ALSA: hda: Add Lenovo C50 All in one to the power_save blacklist commit c8beccc19b92f5172994c0732db689c08f4f98e5 upstream. Power-saving is causing loud plops on the Lenovo C50 All in one, add it to the blacklist. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1572975 Signed-off-by: Hans de Goede Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 2444bdbebcf9fc71b9177965c8117523f91b5170 Author: Federico Cuello Date: Wed May 9 00:13:38 2018 +0200 ALSA: usb: mixer: volume quirk for CM102-A+/102S+ commit 21493316a3c4598f308d5a9fa31cc74639c4caff upstream. Currently it's not possible to set volume lower than 26% (it just mutes). Also fixes this warning: Warning! Unlikely big volume range (=9472), cval->res is probably wrong. [13] FU [PCM Playback Volume] ch = 2, val = -9473/-1/1 , and volume works fine for full range. Signed-off-by: Federico Cuello Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 294c6cc3364ab82b8717db938d123fbf968b8c0f Author: Shuah Khan (Samsung OSG) Date: Tue May 15 17:57:23 2018 -0600 usbip: usbip_host: fix bad unlock balance during stub_probe() commit c171654caa875919be3c533d3518da8be5be966e upstream. stub_probe() calls put_busid_priv() in an error path when device isn't found in the busid_table. Fix it by making put_busid_priv() safe to be called with null struct bus_id_priv pointer. This problem happens when "usbip bind" is run without loading usbip_host driver and then running modprobe. The first failed bind attempt unbinds the device from the original driver and when usbip_host is modprobed, stub_probe() runs and doesn't find the device in its busid table and calls put_busid_priv(0 with null bus_id_priv pointer. usbip-host 3-10.2: 3-10.2 is not in match_busid table... skip! [ 367.359679] ===================================== [ 367.359681] WARNING: bad unlock balance detected! [ 367.359683] 4.17.0-rc4+ #5 Not tainted [ 367.359685] ------------------------------------- [ 367.359688] modprobe/2768 is trying to release lock ( [ 367.359689] ================================================================== [ 367.359696] BUG: KASAN: null-ptr-deref in print_unlock_imbalance_bug+0x99/0x110 [ 367.359699] Read of size 8 at addr 0000000000000058 by task modprobe/2768 [ 367.359705] CPU: 4 PID: 2768 Comm: modprobe Not tainted 4.17.0-rc4+ #5 Fixes: 22076557b07c ("usbip: usbip_host: fix NULL-ptr deref and use-after-free errors") in usb-linus Signed-off-by: Shuah Khan (Samsung OSG) Cc: stable Signed-off-by: Greg Kroah-Hartman commit 02995a5882371a9fca3033fd356598a805d46040 Author: Shuah Khan (Samsung OSG) Date: Mon May 14 20:49:58 2018 -0600 usbip: usbip_host: fix NULL-ptr deref and use-after-free errors commit 22076557b07c12086eeb16b8ce2b0b735f7a27e7 upstream. usbip_host updates device status without holding lock from stub probe, disconnect and rebind code paths. When multiple requests to import a device are received, these unprotected code paths step all over each other and drive fails with NULL-ptr deref and use-after-free errors. The driver uses a table lock to protect the busid array for adding and deleting busids to the table. However, the probe, disconnect and rebind paths get the busid table entry and update the status without holding the busid table lock. Add a new finer grain lock to protect the busid entry. This new lock will be held to search and update the busid entry fields from get_busid_idx(), add_match_busid() and del_match_busid(). match_busid_show() does the same to access the busid entry fields. get_busid_priv() changed to return the pointer to the busid entry holding the busid lock. stub_probe(), stub_disconnect() and stub_device_rebind() call put_busid_priv() to release the busid lock before returning. This changes fixes the unprotected code paths eliminating the race conditions in updating the busid entries. Reported-by: Jakub Jirasek Signed-off-by: Shuah Khan (Samsung OSG) Cc: stable Signed-off-by: Greg Kroah-Hartman commit 22d4a89efe86b8710d3f0436a1d68979207719b8 Author: Shuah Khan (Samsung OSG) Date: Mon Apr 30 16:17:20 2018 -0600 usbip: usbip_host: run rebind from exit when module is removed commit 7510df3f29d44685bab7b1918b61a8ccd57126a9 upstream. After removing usbip_host module, devices it releases are left without a driver. For example, when a keyboard or a mass storage device are bound to usbip_host when it is removed, these devices are no longer bound to any driver. Fix it to run device_attach() from the module exit routine to restore the devices to their original drivers. This includes cleanup changes and moving device_attach() code to a common routine to be called from rebind_store() and usbip_host_exit(). Signed-off-by: Shuah Khan (Samsung OSG) Cc: stable Signed-off-by: Greg Kroah-Hartman commit fce529ec9971384bf0fe57b39ffc65226bc3a1a5 Author: Shuah Khan (Samsung OSG) Date: Mon Apr 30 16:17:19 2018 -0600 usbip: usbip_host: delete device from busid_table after rebind commit 1e180f167d4e413afccbbb4a421b48b2de832549 upstream. Device is left in the busid_table after unbind and rebind. Rebind initiates usb bus scan and the original driver claims the device. After rescan the device should be deleted from the busid_table as it no longer belongs to usbip_host. Fix it to delete the device after device_attach() succeeds. Signed-off-by: Shuah Khan (Samsung OSG) Cc: stable Signed-off-by: Greg Kroah-Hartman commit 39cfc006fb1378f92faf91b94844d33e4d819f91 Author: Shuah Khan Date: Wed Apr 11 18:13:30 2018 -0600 usbip: usbip_host: refine probe and disconnect debug msgs to be useful commit 28b68acc4a88dcf91fd1dcf2577371dc9bf574cc upstream. Refine probe and disconnect debug msgs to be useful and say what is in progress. Signed-off-by: Shuah Khan Cc: stable Signed-off-by: Greg Kroah-Hartman commit ea00b22b02f228cb58ee6c6707c86ec270e37fba Author: zhongjiang Date: Mon Jul 10 15:53:01 2017 -0700 kernel/exit.c: avoid undefined behaviour when calling wait4() commit dd83c161fbcc5d8be637ab159c0de015cbff5ba4 upstream. wait4(-2147483648, 0x20, 0, 0xdd0000) triggers: UBSAN: Undefined behaviour in kernel/exit.c:1651:9 The related calltrace is as follows: negation of -2147483648 cannot be represented in type 'int': CPU: 9 PID: 16482 Comm: zj Tainted: G B ---- ------- 3.10.0-327.53.58.71.x86_64+ #66 Hardware name: Huawei Technologies Co., Ltd. Tecal RH2285 /BC11BTSA , BIOS CTSAV036 04/27/2011 Call Trace: dump_stack+0x19/0x1b ubsan_epilogue+0xd/0x50 __ubsan_handle_negate_overflow+0x109/0x14e SyS_wait4+0x1cb/0x1e0 system_call_fastpath+0x16/0x1b Exclude the overflow to avoid the UBSAN warning. Link: http://lkml.kernel.org/r/1497264618-20212-1-git-send-email-zhongjiang@huawei.com Signed-off-by: zhongjiang Cc: Oleg Nesterov Cc: David Rientjes Cc: Aneesh Kumar K.V Cc: Kirill A. Shutemov Cc: Xishi Qiu Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit cde6d68b1a4f67b8b1c7f71c710fd8590ef725a3 Author: Jiri Slaby Date: Thu Nov 30 15:35:44 2017 +0100 futex: futex_wake_op, fix sign_extend32 sign bits commit d70ef22892ed6c066e51e118b225923c9b74af34 upstream. sign_extend32 counts the sign bit parameter from 0, not from 1. So we have to use "11" for 12th bit, not "12". This mistake means we have not allowed negative op and cmp args since commit 30d6e0a4190d ("futex: Remove duplicated code and fix undefined behaviour") till now. Fixes: 30d6e0a4190d ("futex: Remove duplicated code and fix undefined behaviour") Signed-off-by: Jiri Slaby Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Darren Hart Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 7d56aed52b52450181c7ba58cb680739cb7e5737 Author: Michael Kerrisk (man-pages) Date: Tue Oct 11 13:53:43 2016 -0700 pipe: cap initial pipe capacity according to pipe-max-size limit commit 086e774a57fba4695f14383c0818994c0b31da7c upstream. This is a patch that provides behavior that is more consistent, and probably less surprising to users. I consider the change optional, and welcome opinions about whether it should be applied. By default, pipes are created with a capacity of 64 kiB. However, /proc/sys/fs/pipe-max-size may be set smaller than this value. In this scenario, an unprivileged user could thus create a pipe whose initial capacity exceeds the limit. Therefore, it seems logical to cap the initial pipe capacity according to the value of pipe-max-size. The test program shown earlier in this patch series can be used to demonstrate the effect of the change brought about with this patch: # cat /proc/sys/fs/pipe-max-size 1048576 # sudo -u mtk ./test_F_SETPIPE_SZ 1 Initial pipe capacity: 65536 # echo 10000 > /proc/sys/fs/pipe-max-size # cat /proc/sys/fs/pipe-max-size 16384 # sudo -u mtk ./test_F_SETPIPE_SZ 1 Initial pipe capacity: 16384 # ./test_F_SETPIPE_SZ 1 Initial pipe capacity: 65536 The last two executions of 'test_F_SETPIPE_SZ' show that pipe-max-size caps the initial allocation for a new pipe for unprivileged users, but not for privileged users. Link: http://lkml.kernel.org/r/31dc7064-2a17-9c5b-1df1-4e3012ee992c@gmail.com Signed-off-by: Michael Kerrisk Reviewed-by: Vegard Nossum Cc: Willy Tarreau Cc: Cc: Tetsuo Handa Cc: Jens Axboe Cc: Al Viro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Daniel Sangorrin Signed-off-by: Greg Kroah-Hartman commit e28debec24f5c40ac5752bd2fc3b2e686e9010d0 Author: James Chapman Date: Wed Jan 3 22:48:05 2018 +0000 l2tp: revert "l2tp: fix missing print session offset info" commit de3b58bc359a861d5132300f53f95e83f71954b3 upstream. Revert commit 820da5357572 ("l2tp: fix missing print session offset info"). The peer_offset parameter is removed. Signed-off-by: James Chapman Signed-off-by: David S. Miller Cc: Guillaume Nault Signed-off-by: Greg Kroah-Hartman commit 59d96688ed7c62f71b881ece31efe89a0c3fe664 Author: Greg Kroah-Hartman Date: Thu May 17 11:44:48 2018 +0200 Revert "ARM: dts: imx6qdl-wandboard: Fix audio channel swap" This reverts commit 9de3a3bfed892608dc30a6bc3fd8bdbeae5b51a5 which was commit 79935915300c5eb88a0e94fa9148a7505c14a02a upstream. As Ben points out: This depends on: commit 570c70a60f53ca737ead4e5966c446bf0d39fac9 Author: Fabio Estevam Date: Wed Apr 5 11:32:34 2017 -0300 ASoC: sgtl5000: Allow LRCLK pad drive strength to be changed which did not show up until 4.13, so this makes no sense to have in this stable branch. Reported-by: Ben Hutchings Cc: Fabio Estevam Cc: Shawn Guo Cc: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 415a85ce93fdc114b0ca37577882569481d11c4e Author: Vasily Averin Date: Thu Nov 2 13:03:42 2017 +0300 lockd: lost rollback of set_grace_period() in lockd_down_net() commit 3a2b19d1ee5633f76ae8a88da7bc039a5d1732aa upstream. Commit efda760fe95ea ("lockd: fix lockd shutdown race") is incorrect, it removes lockd_manager and disarm grace_period_end for init_net only. If nfsd was started from another net namespace lockd_up_net() calls set_grace_period() that adds lockd_manager into per-netns list and queues grace_period_end delayed work. These action should be reverted in lockd_down_net(). Otherwise it can lead to double list_add on after restart nfsd in netns, and to use-after-free if non-disarmed delayed work will be executed after netns destroy. Fixes: efda760fe95e ("lockd: fix lockd shutdown race") Cc: stable@vger.kernel.org Signed-off-by: Vasily Averin Signed-off-by: J. Bruce Fields Cc: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit ee551b8e00b0368cb1188dbf0238a3a8989e9845 Author: Antony Antony Date: Thu Dec 7 21:54:27 2017 +0100 xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM) commit 75bf50f4aaa1c78d769d854ab3d975884909e4fb upstream. copy geniv when cloning the xfrm state. x->geniv was not copied to the new state and migration would fail. xfrm_do_migrate .. xfrm_state_clone() .. .. esp_init_aead() crypto_alloc_aead() crypto_alloc_tfm() crypto_find_alg() return EAGAIN and failed Signed-off-by: Antony Antony Signed-off-by: Steffen Klassert Cc: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 177a981885cf9588ca5cbcaf2ce65ab0d4b4abb3 Author: Jiri Slaby Date: Thu Aug 24 09:31:05 2017 +0200 futex: Remove duplicated code and fix undefined behaviour commit 30d6e0a4190d37740e9447e4e4815f06992dd8c3 upstream. There is code duplicated over all architecture's headers for futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr, and comparison of the result. Remove this duplication and leave up to the arches only the needed assembly which is now in arch_futex_atomic_op_inuser. This effectively distributes the Will Deacon's arm64 fix for undefined behaviour reported by UBSAN to all architectures. The fix was done in commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump. And as suggested by Thomas, check for negative oparg too, because it was also reported to cause undefined behaviour report. Note that s390 removed access_ok check in d12a29703 ("s390/uaccess: remove pointless access_ok() checks") as access_ok there returns true. We introduce it back to the helper for the sake of simplicity (it gets optimized away anyway). Signed-off-by: Jiri Slaby Signed-off-by: Thomas Gleixner Acked-by: Russell King Acked-by: Michael Ellerman (powerpc) Acked-by: Heiko Carstens [s390] Acked-by: Chris Metcalf [for tile] Reviewed-by: Darren Hart (VMware) Reviewed-by: Will Deacon [core/arm64] Cc: linux-mips@linux-mips.org Cc: Rich Felker Cc: linux-ia64@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: peterz@infradead.org Cc: Benjamin Herrenschmidt Cc: Max Filippov Cc: Paul Mackerras Cc: sparclinux@vger.kernel.org Cc: Jonas Bonn Cc: linux-s390@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: Yoshinori Sato Cc: linux-hexagon@vger.kernel.org Cc: Helge Deller Cc: "James E.J. Bottomley" Cc: Catalin Marinas Cc: Matt Turner Cc: linux-snps-arc@lists.infradead.org Cc: Fenghua Yu Cc: Arnd Bergmann Cc: linux-xtensa@linux-xtensa.org Cc: Stefan Kristiansson Cc: openrisc@lists.librecores.org Cc: Ivan Kokshaysky Cc: Stafford Horne Cc: linux-arm-kernel@lists.infradead.org Cc: Richard Henderson Cc: Chris Zankel Cc: Michal Simek Cc: Tony Luck Cc: linux-parisc@vger.kernel.org Cc: Vineet Gupta Cc: Ralf Baechle Cc: Richard Kuo Cc: linux-alpha@vger.kernel.org Cc: Martin Schwidefsky Cc: linuxppc-dev@lists.ozlabs.org Cc: "David S. Miller" Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz Cc: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 2fc127737e300bba00d4ed878dac4c78be42522b Author: Mel Gorman Date: Wed Aug 9 08:27:11 2017 +0100 futex: Remove unnecessary warning from get_futex_key commit 48fb6f4db940e92cfb16cd878cddd59ea6120d06 upstream. Commit 65d8fc777f6d ("futex: Remove requirement for lock_page() in get_futex_key()") removed an unnecessary lock_page() with the side-effect that page->mapping needed to be treated very carefully. Two defensive warnings were added in case any assumption was missed and the first warning assumed a correct application would not alter a mapping backing a futex key. Since merging, it has not triggered for any unexpected case but Mark Rutland reported the following bug triggering due to the first warning. kernel BUG at kernel/futex.c:679! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3 Hardware name: linux,dummy-virt (DT) task: ffff80001e271780 task.stack: ffff000010908000 PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679 LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679 pc : [] lr : [] pstate: 80000145 The fact that it's a bug instead of a warning was due to an unrelated arm64 problem, but the warning itself triggered because the underlying mapping changed. This is an application issue but from a kernel perspective it's a recoverable situation and the warning is unnecessary so this patch removes the warning. The warning may potentially be triggered with the following test program from Mark although it may be necessary to adjust NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the system. #include #include #include #include #include #include #include #include #define NR_FUTEX_THREADS 16 pthread_t threads[NR_FUTEX_THREADS]; void *mem; #define MEM_PROT (PROT_READ | PROT_WRITE) #define MEM_SIZE 65536 static int futex_wrapper(int *uaddr, int op, int val, const struct timespec *timeout, int *uaddr2, int val3) { syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3); } void *poll_futex(void *unused) { for (;;) { futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1); } } int main(int argc, char *argv[]) { int i; mem = mmap(NULL, MEM_SIZE, MEM_PROT, MAP_SHARED | MAP_ANONYMOUS, -1, 0); printf("Mapping @ %p\n", mem); printf("Creating futex threads...\n"); for (i = 0; i < NR_FUTEX_THREADS; i++) pthread_create(&threads[i], NULL, poll_futex, NULL); printf("Flipping mapping...\n"); for (;;) { mmap(mem, MEM_SIZE, MEM_PROT, MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0); } return 0; } Reported-and-tested-by: Mark Rutland Signed-off-by: Mel Gorman Acked-by: Peter Zijlstra (Intel) Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Linus Torvalds Cc: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 348f043ab6c6ab15f9e49b7905ba932dd260cf93 Author: Suzuki K Poulose Date: Mon Mar 26 15:12:49 2018 +0100 arm64: Add work around for Arm Cortex-A55 Erratum 1024718 commit ece1397cbc89c51914fae1aec729539cfd8bd62b upstream. Some variants of the Arm Cortex-55 cores (r0p0, r0p1, r1p0) suffer from an erratum 1024718, which causes incorrect updates when DBM/AP bits in a page table entry is modified without a break-before-make sequence. The work around is to skip enabling the hardware DBM feature on the affected cores. The hardware Access Flag management features is not affected. There are some other cores suffering from this errata, which could be added to the midr_list to trigger the work around. Cc: Catalin Marinas Cc: ckadabi@codeaurora.org Reviewed-by: Dave Martin Signed-off-by: Suzuki K Poulose Signed-off-by: Will Deacon Signed-off-by: Suzuki K Poulose Signed-off-by: Greg Kroah-Hartman commit 45eecfb9c5f5bc2f7c254c8bcc4219fa4b31008c Author: Ard Biesheuvel Date: Mon Apr 18 17:09:44 2016 +0200 arm64: introduce mov_q macro to move a constant into a 64-bit register commit 30b5ba5cf333cc650e474eaf2cc1ae91bc7cf89f upstream. Implement a macro mov_q that can be used to move an immediate constant into a 64-bit register, using between 2 and 4 movz/movk instructions (depending on the operand) Acked-by: Catalin Marinas Signed-off-by: Ard Biesheuvel Signed-off-by: Will Deacon Signed-off-by: Suzuki K Poulose Signed-off-by: Greg Kroah-Hartman commit 9bb698bedebf437ae3de8bae8e1aa8bca02664da Author: Richard Guy Briggs Date: Tue Jun 28 12:06:58 2016 -0400 audit: move calcs after alloc and check when logging set loginuid commit 76a658c20efd541a62838d9ff68ce94170d7a549 upstream. Move the calculations of values after the allocation in case the allocation fails. This avoids wasting effort in the rare case that it fails, but more importantly saves us extra logic to release the tty ref. Signed-off-by: Richard Guy Briggs Signed-off-by: Paul Moore Cc: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit bea8a671cee74232bbeef3980354cb9b305f6a9f Author: Takashi Iwai Date: Wed Feb 10 12:47:03 2016 +0100 ALSA: timer: Call notifier in the same spinlock commit f65e0d299807d8a11812845c972493c3f9a18e10 upstream. snd_timer_notify1() is called outside the spinlock and it retakes the lock after the unlock. This is rather racy, and it's safer to move snd_timer_notify() call inside the main spinlock. The patch also contains a slight refactoring / cleanup of the code. Now all start/stop/continue/pause look more symmetric and a bit better readable. Signed-off-by: Takashi Iwai Cc: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 957adaae7d228f5be44ed0e4ea926dc8e3d45caf Author: Xin Long Date: Sat May 5 14:59:47 2018 +0800 sctp: delay the authentication for the duplicated cookie-echo chunk [ Upstream commit 59d8d4434f429b4fa8a346fd889058bda427a837 ] Now sctp only delays the authentication for the normal cookie-echo chunk by setting chunk->auth_chunk in sctp_endpoint_bh_rcv(). But for the duplicated one with auth, in sctp_assoc_bh_rcv(), it does authentication first based on the old asoc, which will definitely fail due to the different auth info in the old asoc. The duplicated cookie-echo chunk will create a new asoc with the auth info from this chunk, and the authentication should also be done with the new asoc's auth info for all of the collision 'A', 'B' and 'D'. Otherwise, the duplicated cookie-echo chunk with auth will never pass the authentication and create the new connection. This issue exists since very beginning, and this fix is to make sctp_assoc_bh_rcv() follow the way sctp_endpoint_bh_rcv() does for the normal cookie-echo chunk to delay the authentication. While at it, remove the unused params from sctp_sf_authenticate() and define sctp_auth_chunk_verify() used for all the places that do the delayed authentication. v1->v2: fix the typo in changelog as Marcelo noticed. Acked-by: Marcelo Ricardo Leitner Signed-off-by: Xin Long Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e0a532d3d4562da65cac8bf74bac3f4190d73a57 Author: Xin Long Date: Wed May 2 13:45:12 2018 +0800 sctp: fix the issue that the cookie-ack with auth can't get processed [ Upstream commit ce402f044e4e432c296f90eaabb8dbe8f3624391 ] When auth is enabled for cookie-ack chunk, in sctp_inq_pop, sctp processes auth chunk first, then continues to the next chunk in this packet if chunk_end + chunk_hdr size < skb_tail_pointer(). Otherwise, it will go to the next packet or discard this chunk. However, it missed the fact that cookie-ack chunk's size is equal to chunk_hdr size, which couldn't match that check, and thus this chunk would not get processed. This patch fixes it by changing the check to chunk_end + chunk_hdr size <= skb_tail_pointer(). Fixes: 26b87c788100 ("net: sctp: fix remote memory pressure from excessive queueing") Signed-off-by: Xin Long Acked-by: Neil Horman Acked-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fc36a2a38f1e87ad0c1d753bbb00bd03ceb75bf1 Author: Yuchung Cheng Date: Wed Apr 25 11:33:08 2018 -0700 tcp: ignore Fast Open on repair mode [ Upstream commit 16ae6aa1705299789f71fdea59bfb119c1fbd9c0 ] The TCP repair sequence of operation is to first set the socket in repair mode, then inject the TCP stats into the socket with repair socket options, then call connect() to re-activate the socket. The connect syscall simply returns and set state to ESTABLISHED mode. As a result Fast Open is meaningless for TCP repair. However allowing sendto() system call with MSG_FASTOPEN flag half-way during the repair operation could unexpectedly cause data to be sent, before the operation finishes changing the internal TCP stats (e.g. MSS). This in turn triggers TCP warnings on inconsistent packet accounting. The fix is to simply disallow Fast Open operation once the socket is in the repair mode. Reported-by: syzbot Signed-off-by: Yuchung Cheng Reviewed-by: Neal Cardwell Reviewed-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8ff936814817daeefceeb1ce26e80b46439a137d Author: Debabrata Banerjee Date: Wed May 9 19:32:10 2018 -0400 bonding: do not allow rlb updates to invalid mac [ Upstream commit 4fa8667ca3989ce14cf66301fa251544fbddbdd0 ] Make sure multicast, broadcast, and zero mac's cannot be the output of rlb updates, which should all be directed arps. Receive load balancing will be collapsed if any of these happen, as the switch will broadcast. Signed-off-by: Debabrata Banerjee Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit afad8f840026e7011d58f75a0c2c30772b27ed96 Author: Michael Chan Date: Thu May 3 20:04:27 2018 -0400 tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent(). [ Upstream commit d89a2adb8bfe6f8949ff389acdb9fa298b6e8e12 ] tg3_free_consistent() calls dma_free_coherent() to free tp->hw_stats under spinlock and can trigger BUG_ON() in vunmap() because vunmap() may sleep. Fix it by removing the spinlock and relying on the TG3_FLAG_INIT_COMPLETE flag to prevent race conditions between tg3_get_stats64() and tg3_free_consistent(). TG3_FLAG_INIT_COMPLETE is always cleared under tp->lock before tg3_free_consistent() and therefore tg3_get_stats64() can safely access tp->hw_stats under tp->lock if TG3_FLAG_INIT_COMPLETE is set. Fixes: f5992b72ebe0 ("tg3: Fix race condition in tg3_get_stats64().") Reported-by: Zumeng Chen Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 00002fa69ffdfc0038e7277bf9de1662098f85cc Author: Xin Long Date: Wed May 2 13:39:46 2018 +0800 sctp: use the old asoc when making the cookie-ack chunk in dupcook_d [ Upstream commit 46e16d4b956867013e0bbd7f2bad206f4aa55752 ] When processing a duplicate cookie-echo chunk, for case 'D', sctp will not process the param from this chunk. It means old asoc has nothing to be updated, and the new temp asoc doesn't have the complete info. So there's no reason to use the new asoc when creating the cookie-ack chunk. Otherwise, like when auth is enabled for cookie-ack, the chunk can not be set with auth, and it will definitely be dropped by peer. This issue is there since very beginning, and we fix it by using the old asoc instead. Signed-off-by: Xin Long Acked-by: Neil Horman Acked-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9982c6090d7e23782542f919ebbfdb2f72c1c1d1 Author: Xin Long Date: Thu Apr 26 14:13:57 2018 +0800 sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr [ Upstream commit d625329b06e46bd20baf9ee40847d11982569204 ] Since sctp ipv6 socket also supports v4 addrs, it's possible to compare two v4 addrs in pf v6 .cmp_addr, sctp_inet6_cmp_addr. However after Commit 1071ec9d453a ("sctp: do not check port in sctp_inet6_cmp_addr"), it no longer calls af1->cmp_addr, which in this case is sctp_v4_cmp_addr, but calls __sctp_v6_cmp_addr where it handles them as two v6 addrs. It would cause a out of bounds crash. syzbot found this crash when trying to bind two v4 addrs to a v6 socket. This patch fixes it by adding the process for two v4 addrs in sctp_inet6_cmp_addr. Fixes: 1071ec9d453a ("sctp: do not check port in sctp_inet6_cmp_addr") Reported-by: syzbot+cd494c1dd681d4d93ebb@syzkaller.appspotmail.com Signed-off-by: Xin Long Acked-by: Neil Horman Acked-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit cd59b55080f5fa48a9cf7049dd3dcccb492a50d7 Author: Heiner Kallweit Date: Mon May 7 21:11:21 2018 +0200 r8169: fix powering up RTL8168h [ Upstream commit 3148dedfe79e422f448a10250d3e2cdf8b7ee617 ] Since commit a92a08499b1f "r8169: improve runtime pm in general and suspend unused ports" interfaces w/o link are runtime-suspended after 10s. On systems where drivers take longer to load this can lead to the situation that the interface is runtime-suspended already when it's initially brought up. This shouldn't be a problem because rtl_open() resumes MAC/PHY. However with at least one chip version the interface doesn't properly come up, as reported here: https://bugzilla.kernel.org/show_bug.cgi?id=199549 The vendor driver uses a delay to give certain chip versions some time to resume before starting the PHY configuration. So let's do the same. I don't know which chip versions may be affected, therefore apply this delay always. This patch was reported to fix the issue for RTL8168h. I was able to reproduce the issue on an Asus H310I-Plus which also uses a RTL8168h. Also in my case the patch fixed the issue. Reported-by: Slava Kardakov Tested-by: Slava Kardakov Signed-off-by: Heiner Kallweit Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 10bffef7fb2614f17319198146feee1769270f1e Author: Bjørn Mork Date: Wed May 2 22:22:54 2018 +0200 qmi_wwan: do not steal interfaces from class drivers [ Upstream commit 5697db4a696c41601a1d15c1922150b4dbf5726c ] The USB_DEVICE_INTERFACE_NUMBER matching macro assumes that the { vendorid, productid, interfacenumber } set uniquely identifies one specific function. This has proven to fail for some configurable devices. One example is the Quectel EM06/EP06 where the same interface number can be either QMI or MBIM, without the device ID changing either. Fix by requiring the vendor-specific class for interface number based matching. Functions of other classes can and should use class based matching instead. Fixes: 03304bcb5ec4 ("net: qmi_wwan: use fixed interface number matching") Signed-off-by: Bjørn Mork Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 02ce46ba5c009d926c7a07fb9f79d39d7aa89fbc Author: Stefano Brivio Date: Thu May 3 18:13:25 2018 +0200 openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found [ Upstream commit 72f17baf2352ded6a1d3f4bb2d15da8c678cd2cb ] If an OVS_ATTR_NESTED attribute type is found while walking through netlink attributes, we call nlattr_set() recursively passing the length table for the following nested attributes, if different from the current one. However, once we're done with those sub-nested attributes, we should continue walking through attributes using the current table, instead of using the one related to the sub-nested attributes. For example, given this sequence: 1 OVS_KEY_ATTR_PRIORITY 2 OVS_KEY_ATTR_TUNNEL 3 OVS_TUNNEL_KEY_ATTR_ID 4 OVS_TUNNEL_KEY_ATTR_IPV4_SRC 5 OVS_TUNNEL_KEY_ATTR_IPV4_DST 6 OVS_TUNNEL_KEY_ATTR_TTL 7 OVS_TUNNEL_KEY_ATTR_TP_SRC 8 OVS_TUNNEL_KEY_ATTR_TP_DST 9 OVS_KEY_ATTR_IN_PORT 10 OVS_KEY_ATTR_SKB_MARK 11 OVS_KEY_ATTR_MPLS we switch to the 'ovs_tunnel_key_lens' table on attribute #3, and we don't switch back to 'ovs_key_lens' while setting attributes #9 to #11 in the sequence. As OVS_KEY_ATTR_MPLS evaluates to 21, and the array size of 'ovs_tunnel_key_lens' is 15, we also get this kind of KASan splat while accessing the wrong table: [ 7654.586496] ================================================================== [ 7654.594573] BUG: KASAN: global-out-of-bounds in nlattr_set+0x164/0xde9 [openvswitch] [ 7654.603214] Read of size 4 at addr ffffffffc169ecf0 by task handler29/87430 [ 7654.610983] [ 7654.612644] CPU: 21 PID: 87430 Comm: handler29 Kdump: loaded Not tainted 3.10.0-866.el7.test.x86_64 #1 [ 7654.623030] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016 [ 7654.631379] Call Trace: [ 7654.634108] [] dump_stack+0x19/0x1b [ 7654.639843] [] print_address_description+0x33/0x290 [ 7654.647129] [] ? nlattr_set+0x164/0xde9 [openvswitch] [ 7654.654607] [] kasan_report.part.3+0x242/0x330 [ 7654.661406] [] __asan_report_load4_noabort+0x34/0x40 [ 7654.668789] [] nlattr_set+0x164/0xde9 [openvswitch] [ 7654.676076] [] ovs_nla_get_match+0x10c8/0x1900 [openvswitch] [ 7654.684234] [] ? genl_rcv+0x28/0x40 [ 7654.689968] [] ? netlink_unicast+0x3f3/0x590 [ 7654.696574] [] ? ovs_nla_put_tunnel_info+0xb0/0xb0 [openvswitch] [ 7654.705122] [] ? unwind_get_return_address+0xb0/0xb0 [ 7654.712503] [] ? system_call_fastpath+0x1c/0x21 [ 7654.719401] [] ? update_stack_state+0x229/0x370 [ 7654.726298] [] ? update_stack_state+0x229/0x370 [ 7654.733195] [] ? kasan_unpoison_shadow+0x35/0x50 [ 7654.740187] [] ? kasan_kmalloc+0xaa/0xe0 [ 7654.746406] [] ? kasan_slab_alloc+0x12/0x20 [ 7654.752914] [] ? memset+0x31/0x40 [ 7654.758456] [] ovs_flow_cmd_new+0x2b2/0xf00 [openvswitch] [snip] [ 7655.132484] The buggy address belongs to the variable: [ 7655.138226] ovs_tunnel_key_lens+0xf0/0xffffffffffffd400 [openvswitch] [ 7655.145507] [ 7655.147166] Memory state around the buggy address: [ 7655.152514] ffffffffc169eb80: 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa [ 7655.160585] ffffffffc169ec00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 7655.168644] >ffffffffc169ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa fa [ 7655.176701] ^ [ 7655.184372] ffffffffc169ed00: fa fa fa fa 00 00 00 00 fa fa fa fa 00 00 00 05 [ 7655.192431] ffffffffc169ed80: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00 [ 7655.200490] ================================================================== Reported-by: Hangbin Liu Fixes: 982b52700482 ("openvswitch: Fix mask generation for nested attributes.") Signed-off-by: Stefano Brivio Reviewed-by: Sabrina Dubroca Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b19f6e41a3db6f5e547545537430bc7305f82a73 Author: Lance Richardson Date: Wed Apr 25 10:21:54 2018 -0400 net: support compat 64-bit time in {s,g}etsockopt [ Upstream commit 988bf7243e03ef69238381594e0334a79cef74a6 ] For the x32 ABI, struct timeval has two 64-bit fields. However the kernel currently interprets the user-space values used for the SO_RCVTIMEO and SO_SNDTIMEO socket options as having a pair of 32-bit fields. When the seconds portion of the requested timeout is less than 2**32, the seconds portion of the effective timeout is correct but the microseconds portion is zero. When the seconds portion of the requested timeout is zero and the microseconds portion is non-zero, the kernel interprets the timeout as zero (never timeout). Fix by using 64-bit time for SO_RCVTIMEO/SO_SNDTIMEO as required for the ABI. The code included below demonstrates the problem. Results before patch: $ gcc -m64 -Wall -O2 -o socktmo socktmo.c && ./socktmo recv time: 2.008181 seconds send time: 2.015985 seconds $ gcc -m32 -Wall -O2 -o socktmo socktmo.c && ./socktmo recv time: 2.016763 seconds send time: 2.016062 seconds $ gcc -mx32 -Wall -O2 -o socktmo socktmo.c && ./socktmo recv time: 1.007239 seconds send time: 1.023890 seconds Results after patch: $ gcc -m64 -O2 -Wall -o socktmo socktmo.c && ./socktmo recv time: 2.010062 seconds send time: 2.015836 seconds $ gcc -m32 -O2 -Wall -o socktmo socktmo.c && ./socktmo recv time: 2.013974 seconds send time: 2.015981 seconds $ gcc -mx32 -O2 -Wall -o socktmo socktmo.c && ./socktmo recv time: 2.030257 seconds send time: 2.013383 seconds #include #include #include #include #include void checkrc(char *str, int rc) { if (rc >= 0) return; perror(str); exit(1); } static char buf[1024]; int main(int argc, char **argv) { int rc; int socks[2]; struct timeval tv; struct timeval start, end, delta; rc = socketpair(AF_UNIX, SOCK_STREAM, 0, socks); checkrc("socketpair", rc); /* set timeout to 1.999999 seconds */ tv.tv_sec = 1; tv.tv_usec = 999999; rc = setsockopt(socks[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv); rc = setsockopt(socks[0], SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv); checkrc("setsockopt", rc); /* measure actual receive timeout */ gettimeofday(&start, NULL); rc = recv(socks[0], buf, sizeof buf, 0); gettimeofday(&end, NULL); timersub(&end, &start, &delta); printf("recv time: %ld.%06ld seconds\n", (long)delta.tv_sec, (long)delta.tv_usec); /* fill send buffer */ do { rc = send(socks[0], buf, sizeof buf, 0); } while (rc > 0); /* measure actual send timeout */ gettimeofday(&start, NULL); rc = send(socks[0], buf, sizeof buf, 0); gettimeofday(&end, NULL); timersub(&end, &start, &delta); printf("send time: %ld.%06ld seconds\n", (long)delta.tv_sec, (long)delta.tv_usec); exit(0); } Fixes: 515c7af85ed9 ("x32: Use compat shims for {g,s}etsockopt") Reported-by: Gopal RajagopalSai Signed-off-by: Lance Richardson Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 541d81ab682aed633052e7312e104d99cdeb1554 Author: Eric Dumazet Date: Wed May 2 10:03:30 2018 -0700 net_sched: fq: take care of throttled flows before reuse [ Upstream commit 7df40c2673a1307c3260aab6f9d4b9bf97ca8fd7 ] Normally, a socket can not be freed/reused unless all its TX packets left qdisc and were TX-completed. However connect(AF_UNSPEC) allows this to happen. With commit fc59d5bdf1e3 ("pkt_sched: fq: clear time_next_packet for reused flows") we cleared f->time_next_packet but took no special action if the flow was still in the throttled rb-tree. Since f->time_next_packet is the key used in the rb-tree searches, blindly clearing it might break rb-tree integrity. We need to make sure the flow is no longer in the rb-tree to avoid this problem. Fixes: fc59d5bdf1e3 ("pkt_sched: fq: clear time_next_packet for reused flows") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7e2d8aef33e46adf2b588a588980a505ab5f8c76 Author: Moshe Shemesh Date: Wed May 9 18:35:13 2018 +0300 net/mlx4_en: Verify coalescing parameters are in range [ Upstream commit 6ad4e91c6d796b38a7f0e724db1de28eeb122bad ] Add check of coalescing parameters received through ethtool are within range of values supported by the HW. Driver gets the coalescing rx/tx-usecs and rx/tx-frames as set by the users through ethtool. The ethtool support up to 32 bit value for each. However, mlx4 modify cq limits the coalescing time parameter and coalescing frames parameters to 16 bits. Return out of range error if user tries to set these parameters to higher values. Change type of sample-interval and adaptive_rx_coal parameters in mlx4 driver to u32 as the ethtool holds them as u32 and these parameters are not limited due to mlx4 HW. Fixes: c27a02cd94d6 ('mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC') Signed-off-by: Moshe Shemesh Signed-off-by: Tariq Toukan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c9264b0a7e133df20e6c9d27d5cc08b2a01f2575 Author: Rob Taglang Date: Thu May 3 17:13:06 2018 -0400 net: ethernet: sun: niu set correct packet size in skb [ Upstream commit 14224923c3600bae2ac4dcae3bf0c3d4dc2812be ] Currently, skb->len and skb->data_len are set to the page size, not the packet size. This causes the frame check sequence to not be located at the "end" of the packet resulting in ethernet frame check errors. The driver does work currently, but stricter kernel facing networking solutions like OpenVSwitch will drop these packets as invalid. These changes set the packet size correctly so that these errors no longer occur. The length does not include the frame check sequence, so that subtraction was removed. Tested on Oracle/SUN Multithreaded 10-Gigabit Ethernet Network Controller [108e:abcd] and validated in wireshark. Signed-off-by: Rob Taglang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 418e529ef418ecf31019de0034d63b226dd92937 Author: Eric Dumazet Date: Mon May 7 09:02:25 2018 -0700 llc: better deal with too small mtu [ Upstream commit 2c5d5b13c6eb79f5677e206b8aad59b3a2097f60 ] syzbot loves to set very small mtu on devices, since it brings joy. We must make llc_ui_sendmsg() fool proof. usercopy: Kernel memory overwrite attempt detected to wrapped address (offset 0, size 18446612139802320068)! kernel BUG at mm/usercopy.c:100! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 17464 Comm: syz-executor1 Not tainted 4.17.0-rc3+ #36 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:usercopy_abort+0xbb/0xbd mm/usercopy.c:88 RSP: 0018:ffff8801868bf800 EFLAGS: 00010282 RAX: 000000000000006c RBX: ffffffff87d2fb00 RCX: 0000000000000000 RDX: 000000000000006c RSI: ffffffff81610731 RDI: ffffed0030d17ef6 RBP: ffff8801868bf858 R08: ffff88018daa4200 R09: ffffed003b5c4fb0 R10: ffffed003b5c4fb0 R11: ffff8801dae27d87 R12: ffffffff87d2f8e0 R13: ffffffff87d2f7a0 R14: ffffffff87d2f7a0 R15: ffffffff87d2f7a0 FS: 00007f56a14ac700(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2bc21000 CR3: 00000001abeb1000 CR4: 00000000001426f0 DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000030602 Call Trace: check_bogus_address mm/usercopy.c:153 [inline] __check_object_size+0x5d9/0x5d9 mm/usercopy.c:256 check_object_size include/linux/thread_info.h:108 [inline] check_copy_size include/linux/thread_info.h:139 [inline] copy_from_iter_full include/linux/uio.h:121 [inline] memcpy_from_msg include/linux/skbuff.h:3305 [inline] llc_ui_sendmsg+0x4b1/0x1530 net/llc/af_llc.c:941 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:639 __sys_sendto+0x3d7/0x670 net/socket.c:1789 __do_sys_sendto net/socket.c:1801 [inline] __se_sys_sendto net/socket.c:1797 [inline] __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x455979 RSP: 002b:00007f56a14abc68 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007f56a14ac6d4 RCX: 0000000000455979 RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000018 RBP: 000000000072bea0 R08: 00000000200012c0 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000548 R14: 00000000006fbf60 R15: 0000000000000000 Code: 55 c0 e8 c0 55 bb ff ff 75 c8 48 8b 55 c0 4d 89 f9 ff 75 d0 4d 89 e8 48 89 d9 4c 89 e6 41 56 48 c7 c7 80 fa d2 87 e8 a0 0b a3 ff <0f> 0b e8 95 55 bb ff e8 c0 a8 f7 ff 8b 95 14 ff ff ff 4d 89 e8 RIP: usercopy_abort+0xbb/0xbd mm/usercopy.c:88 RSP: ffff8801868bf800 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 87bd2aca94cc3a3d0a48fa75a7532ff19f1549cc Author: Andrey Ignatov Date: Thu May 10 10:59:34 2018 -0700 ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg [ Upstream commit 1b97013bfb11d66f041de691de6f0fec748ce016 ] Fix more memory leaks in ip_cmsg_send() callers. Part of them were fixed earlier in 919483096bfe. * udp_sendmsg one was there since the beginning when linux sources were first added to git; * ping_v4_sendmsg one was copy/pasted in c319b4d76b9e. Whenever return happens in udp_sendmsg() or ping_v4_sendmsg() IP options have to be freed if they were allocated previously. Add label so that future callers (if any) can use it instead of kfree() before return that is easy to forget. Fixes: c319b4d76b9e (net: ipv4: add IPPROTO_ICMP socket kind) Signed-off-by: Andrey Ignatov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7233fad00faa51a04a074a6ed7e507f408c034ec Author: Eric Dumazet Date: Thu May 3 09:39:20 2018 -0700 dccp: fix tasklet usage [ Upstream commit a8d7aa17bbc970971ccdf71988ea19230ab368b1 ] syzbot reported a crash in tasklet_action_common() caused by dccp. dccp needs to make sure socket wont disappear before tasklet handler has completed. This patch takes a reference on the socket when arming the tasklet, and moves the sock_put() from dccp_write_xmit_timer() to dccp_write_xmitlet() kernel BUG at kernel/softirq.c:514! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 17 Comm: ksoftirqd/1 Not tainted 4.17.0-rc3+ #30 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515 RSP: 0018:ffff8801d9b3faf8 EFLAGS: 00010246 dccp_close: ABORT with 65423 bytes unread RAX: 1ffff1003b367f6b RBX: ffff8801daf1f3f0 RCX: 0000000000000000 RDX: ffff8801cf895498 RSI: 0000000000000004 RDI: 0000000000000000 RBP: ffff8801d9b3fc40 R08: ffffed0039f12a95 R09: ffffed0039f12a94 dccp_close: ABORT with 65423 bytes unread R10: ffffed0039f12a94 R11: ffff8801cf8954a3 R12: 0000000000000000 R13: ffff8801d9b3fc18 R14: dffffc0000000000 R15: ffff8801cf895490 FS: 0000000000000000(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2bc28000 CR3: 00000001a08a9000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tasklet_action+0x1d/0x20 kernel/softirq.c:533 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285 dccp_close: ABORT with 65423 bytes unread run_ksoftirqd+0x86/0x100 kernel/softirq.c:646 smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164 kthread+0x345/0x410 kernel/kthread.c:238 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412 Code: 48 8b 85 e8 fe ff ff 48 8b 95 f0 fe ff ff e9 94 fb ff ff 48 89 95 f0 fe ff ff e8 81 53 6e 00 48 8b 95 f0 fe ff ff e9 62 fb ff ff <0f> 0b 48 89 cf 48 89 8d e8 fe ff ff e8 64 53 6e 00 48 8b 8d e8 RIP: tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515 RSP: ffff8801d9b3faf8 Fixes: dc841e30eaea ("dccp: Extend CCID packet dequeueing interface") Signed-off-by: Eric Dumazet Reported-by: syzbot Cc: Gerrit Renker Cc: dccp@vger.kernel.org Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2a188d15b21814af58d6a038023d0d8cbd9d6065 Author: Hangbin Liu Date: Fri Apr 27 20:59:24 2018 +0800 bridge: check iface upper dev when setting master via ioctl [ Upstream commit e8238fc2bd7b4c3c7554fa2df067e796610212fc ] When we set a bond slave's master to bridge via ioctl, we only check the IFF_BRIDGE_PORT flag. Although we will find the slave's real master at netdev_master_upper_dev_link() later, it already does some settings and allocates some resources. It would be better to return as early as possible. v1 -> v2: use netdev_master_upper_dev_get() instead of netdev_has_any_upper_dev() to check if we have a master, because not all upper devs are masters, e.g. vlan device. Reported-by: syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com Signed-off-by: Hangbin Liu Acked-by: Nikolay Aleksandrov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a221a0c1db1e343ba64a41e21a39aea73d35ab7f Author: Ingo Molnar Date: Wed May 2 13:30:57 2018 +0200 8139too: Use disable_irq_nosync() in rtl8139_poll_controller() [ Upstream commit af3e0fcf78879f718c5f73df0814951bd7057d34 ] Use disable_irq_nosync() instead of disable_irq() as this might be called in atomic context with netpoll. Signed-off-by: Ingo Molnar Signed-off-by: Thomas Gleixner Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman