DLPX-82827 Fix for Solaris NFSv4 client mounts by don-brady · Pull Request #19 · delphix/linux-kernel-aws

don-brady · 2022-09-02T17:16:16Z

Problem:

We are unable to mount timeflows on Solaris targets when using NFSv4.

Solaris client mounts use a single compound operation that contains 4 operations per path element in the mount path. So with a path like "/domain0/group-2/oracle_db_container-6/oracle_timeflow-7/datafile", the op count reaches 20 and exceeds the NFSD_MAX_OPS_PER_COMPOUND limit of 16.
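The arithmetic can be sketched as below. This is a rough model, not the server's actual check: it assumes a flat cost of 4 operations per path element, the figure implied by the 20-operation total for this five-element path, and the helper names are hypothetical.

```python
# Rough model of how a Solaris mount COMPOUND grows with path depth.
# OPS_PER_ELEMENT is an assumption inferred from the figures in this PR
# (20 operations for a five-element path); it is not taken from client
# or server source code.
OPS_PER_ELEMENT = 4
OLD_LIMIT = 16  # NFSD_MAX_OPS_PER_COMPOUND before this change


def path_elements(path):
    """Split an export path into its non-empty components."""
    return [part for part in path.split("/") if part]


def compound_op_count(path, ops_per_element=OPS_PER_ELEMENT):
    """Estimate the number of operations in the mount COMPOUND for path."""
    return len(path_elements(path)) * ops_per_element


path = "/domain0/group-2/oracle_db_container-6/oracle_timeflow-7/datafile"
ops = compound_op_count(path)
print(f"{len(path_elements(path))} elements -> {ops} ops; "
      f"exceeds {OLD_LIMIT}: {ops > OLD_LIMIT}")
```

Under that assumption, any path deeper than four elements overflows the old limit.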

Solution:

We bump NFSD_MAX_OPS_PER_COMPOUND from 16 to 40 so that a Solaris client can mount over both NFSv4.0 and NFSv4.1.

Testing:

Manually tested with a five-element path for both NFSv4.0 and NFSv4.1.
Provisioned an Oracle VDB on a Solaris target and confirmed that it was using an NFSv4 mount:

NFS Protocol: NFS Version 4
NFS Reason: Default
Environment: djb-sol11u4-ora19900-tgt

And as seen on the client:

oracle@djb-sol11u4-ora19900-tgt:~$ nfsstat -m
/mnt/provision/VDBOMSRB27FBB_NM8 from 10.110.216.69:/domain0/group-2/oracle_db_container-6/oracle_timeflow-7
 Flags:         vers=4,mvers=1,proto=tcp,sec=sys,hard,nointr,link,symlink,forcedirectio,rsize=1048576,wsize=1048576,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

/mnt/provision/VDBOMSRB27FBB_NM8/datafile from 10.110.216.69:/domain0/group-2/oracle_db_container-6/oracle_timeflow-7/datafile
 Flags:         vers=4,mvers=1,proto=tcp,sec=sys,hard,nointr,link,symlink,forcedirectio,rsize=1048576,wsize=1048576,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

/mnt/provision/VDBOMSRB27FBB_NM8/archive from 10.110.216.69:/domain0/group-2/oracle_db_container-6/oracle_timeflow-7/archive
 Flags:         vers=4,mvers=1,proto=tcp,sec=sys,hard,nointr,link,symlink,forcedirectio,rsize=1048576,wsize=1048576,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

/mnt/provision/VDBOMSRB27FBB_NM8/external from 10.110.216.69:/domain0/group-2/oracle_db_container-6/oracle_timeflow-7/external
 Flags:         vers=4,mvers=1,proto=tcp,sec=sys,hard,nointr,link,symlink,forcedirectio,rsize=1048576,wsize=1048576,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

/mnt/provision/VDBOMSRB27FBB_NM8/temp from 10.110.216.69:/domain0/group-2/oracle_db_container-6/oracle_timeflow-7/temp
 Flags:         vers=4,mvers=1,proto=tcp,sec=sys,hard,nointr,link,symlink,forcedirectio,rsize=1048576,wsize=1048576,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

/mnt/provision/VDBOMSRB27FBB_NM8/source-archive from 10.110.216.69:/domain0/group-2/oracle_db_container-6/oracle_timeflow-7/source-archive
 Flags:         vers=4,mvers=1,proto=tcp,sec=sys,hard,nointr,link,symlink,forcedirectio,rsize=1048576,wsize=1048576,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=60,acdirmin=30,acdirmax=60
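A check like the one above could be scripted. The sketch below parses `nfsstat -m` text of the shape shown here and reports the protocol version per mount. The parser and its function name are hypothetical, the sample is a two-mount excerpt of the output above, and reading `mvers` as the NFSv4 minor version is an assumption.

```python
# Hypothetical helper: parse Solaris `nfsstat -m` output of the shape shown
# above into {mount_point: {option: value}}. Flag-only options such as "hard"
# map to an empty string.
SAMPLE = """\
/mnt/provision/VDBOMSRB27FBB_NM8 from 10.110.216.69:/domain0/group-2/oracle_db_container-6/oracle_timeflow-7
 Flags:         vers=4,mvers=1,proto=tcp,sec=sys,hard,nointr,link,symlink,forcedirectio,rsize=1048576,wsize=1048576,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

/mnt/provision/VDBOMSRB27FBB_NM8/datafile from 10.110.216.69:/domain0/group-2/oracle_db_container-6/oracle_timeflow-7/datafile
 Flags:         vers=4,mvers=1,proto=tcp,sec=sys,hard,nointr,link,symlink,forcedirectio,rsize=1048576,wsize=1048576,retrans=5,timeo=600
 Attr cache:    acregmin=3,acregmax=60,acdirmin=30,acdirmax=60
"""


def parse_nfsstat_m(text):
    mounts = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace() and " from " in line:
            # Header line: "<mount point> from <server>:<export path>"
            current = line.split(" from ", 1)[0]
            mounts[current] = {}
        elif current is not None and line.strip().startswith("Flags:"):
            options = line.split(":", 1)[1].strip()
            for option in options.split(","):
                key, _, value = option.partition("=")
                mounts[current][key] = value
    return mounts


mounts = parse_nfsstat_m(SAMPLE)
for mount_point, flags in mounts.items():
    # Assumption: mvers is the NFSv4 minor version.
    print(f"{mount_point}: NFSv{flags['vers']}.{flags.get('mvers', '0')}")
```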

Implementation:

We bumped NFSD_MAX_OPS_PER_COMPOUND to 40 operations, which allows for a 10-element path (5 levels below the datafile mount).
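The capacity figure can be reproduced with a small calculation. This is a sketch only: the per-element cost of 4 operations is an assumption implied by the 40-operation and 10-element figures, and the names are hypothetical.

```python
# How deep a path fits under a given NFSD_MAX_OPS_PER_COMPOUND value.
# The per-element cost of 4 is an assumption implied by the PR's own figures
# (40 operations for a 10-element path); real clients may differ.
ASSUMED_OPS_PER_ELEMENT = 4

# /domain0/group-2/oracle_db_container-6/oracle_timeflow-7/datafile
DATAFILE_DEPTH = 5


def max_path_elements(max_ops, ops_per_element=ASSUMED_OPS_PER_ELEMENT):
    """Largest number of path elements whose COMPOUND fits in max_ops."""
    return max_ops // ops_per_element


for limit in (16, 40):
    depth = max_path_elements(limit)
    headroom = depth - DATAFILE_DEPTH
    print(f"limit {limit}: up to {depth} elements, "
          f"headroom relative to the datafile mount: {headroom}")
```

A negative headroom means the datafile mount itself does not fit, which is the case for the old limit of 16.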

Deployment Plan:

This kernel change should land first, followed by an app-gate change to remove the Solaris guardrail that was forcing NFSv3 mounts on Solaris targets.

Future work:

An upstream bug was filed, and we will continue to monitor progress on a potentially different solution to this problem:
https://bugzilla.kernel.org/show_bug.cgi?id=216383

don-brady requested review from pcd1193182 and sebroy September 2, 2022 17:29

sebroy approved these changes Sep 2, 2022

pcd1193182 approved these changes Sep 2, 2022

don-brady changed the title ~~DLPX-82827 Fix for Solaris NFSv4 client mounts~~ DLPX-82827 Fix for Solaris NFSv4 client mounts Sep 2, 2022

delphix-devops-bot force-pushed the 6.0/stage branch from 92114b7 to 945c734 Compare September 4, 2022 12:21

DLPX-82827 Fix for Solaris NFSv4 client mounts

8dcebe6

don-brady force-pushed the dlpx-82827-aws branch from 2c333ee to 8dcebe6 Compare September 6, 2022 04:06

don-brady merged commit 19e833d into delphix:6.0/stage Sep 6, 2022

don-brady deleted the dlpx-82827-aws branch September 6, 2022 16:46

delphix-devops-bot pushed a commit that referenced this pull request Sep 22, 2022

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

4aeb601

delphix-devops-bot pushed a commit that referenced this pull request Sep 23, 2022

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

dc7054d

delphix-devops-bot pushed a commit that referenced this pull request Sep 24, 2022

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

805ba86

delphix-devops-bot pushed a commit that referenced this pull request Oct 12, 2022

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

2ebf60e

delphix-devops-bot pushed a commit that referenced this pull request Oct 13, 2022

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

ab9e4e4

delphix-devops-bot pushed a commit that referenced this pull request Nov 4, 2022

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

b3c92d1

delphix-devops-bot pushed a commit that referenced this pull request Nov 5, 2022

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

85ef182

delphix-devops-bot pushed a commit that referenced this pull request Nov 17, 2022

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

4fce9c6

delphix-devops-bot pushed a commit that referenced this pull request Nov 18, 2022

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

b8a20cc

delphix-devops-bot pushed a commit that referenced this pull request Dec 15, 2022

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

72d02cb

delphix-devops-bot pushed a commit that referenced this pull request Jan 7, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

cd452a4

delphix-devops-bot pushed a commit that referenced this pull request Jan 16, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

1435a15

delphix-devops-bot pushed a commit that referenced this pull request Feb 10, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

b913d0f

delphix-devops-bot pushed a commit that referenced this pull request Mar 4, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

fb79039

prakashsurya pushed a commit that referenced this pull request Mar 11, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

f817929

prakashsurya pushed a commit that referenced this pull request Mar 14, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

32f1e83

prakashsurya pushed a commit that referenced this pull request Mar 14, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

43636a9

delphix-devops-bot pushed a commit that referenced this pull request Aug 19, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

0330378

delphix-devops-bot pushed a commit that referenced this pull request Aug 20, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

7c10877

delphix-devops-bot pushed a commit that referenced this pull request Aug 21, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

8616d6c

delphix-devops-bot pushed a commit that referenced this pull request Aug 22, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

23efbc1

delphix-devops-bot pushed a commit that referenced this pull request Aug 23, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

3105379

delphix-devops-bot pushed a commit that referenced this pull request Aug 24, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

648e885

delphix-devops-bot pushed a commit that referenced this pull request Aug 25, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

3f5c10f

delphix-devops-bot pushed a commit that referenced this pull request Aug 26, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

b2d3c8b

delphix-devops-bot pushed a commit that referenced this pull request Aug 27, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

83c570c

delphix-devops-bot pushed a commit that referenced this pull request Aug 30, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

f2062b3

delphix-devops-bot pushed a commit that referenced this pull request Aug 31, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

4bc7756

delphix-devops-bot pushed a commit that referenced this pull request Sep 1, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

5c10d99

delphix-devops-bot pushed a commit that referenced this pull request Sep 2, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

7253ceb

delphix-devops-bot pushed a commit that referenced this pull request Sep 3, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

d2ceb58

delphix-devops-bot pushed a commit that referenced this pull request Sep 6, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

3b212c6

delphix-devops-bot pushed a commit that referenced this pull request Sep 7, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

1d5661f

delphix-devops-bot pushed a commit that referenced this pull request Sep 8, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

2ae1786

delphix-devops-bot pushed a commit that referenced this pull request Sep 20, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

c740e8a

delphix-devops-bot pushed a commit that referenced this pull request Oct 6, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

7b66f1c

delphix-devops-bot pushed a commit that referenced this pull request Oct 7, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

f41a5b9

delphix-devops-bot pushed a commit that referenced this pull request Oct 8, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

21b3ef3

delphix-devops-bot pushed a commit that referenced this pull request Oct 9, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

12698de

delphix-devops-bot pushed a commit that referenced this pull request Oct 10, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

426c684

delphix-devops-bot pushed a commit that referenced this pull request Oct 21, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

197eb89

delphix-devops-bot pushed a commit that referenced this pull request Nov 1, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

bdf0f4d

delphix-devops-bot pushed a commit that referenced this pull request Nov 22, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

91852a7

delphix-devops-bot pushed a commit that referenced this pull request Dec 9, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

c1af053

delphix-devops-bot pushed a commit that referenced this pull request Dec 10, 2023

DLPX-82827 Fix for Solaris NFSv4 client mounts (#19)

e0347ef

manoj-joseph mentioned this pull request Jul 31, 2024

DLPX-91780 Merge conflict in linux-kernel-aws after DLPX-91748 #51

Closed
