Commit d333f6b

liwenkailinmineswincomputing
authored and committed
feat:rdna graphics card support
Changelogs:

1. Backport RDNA graphics card support and optimize PLT/GOT entry counting.
   SiFiveHolland and others added 9 commits May 26, 2025 14:50, from: #11
   - riscv: module: Optimize PLT/GOT entry counting (@SiFiveHolland, @RevySR)
   - riscv: module: fix compilation error of kvrealloc (@Pritesh201192, @RevySR)
   - arch: add ARCH_HAS_KERNEL_FPU_SUPPORT (@SiFiveHolland, @RevySR)
   - riscv: add support for kernel-mode FPU (@SiFiveHolland, @RevySR)
   - drm/amd/display: Remove migrate_en/dis from dc_fpu_begin(). (@RevySR)
   - drm/amd/display: Simplify the per-CPU usage. (@RevySR)
   - drm/amd/display: Add a warning if the FPU is used outside from task c… (@RevySR)
   - drm/amd/display: only use hard-float, not altivec on powerpc (@mpe, @RevySR)
   - drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT (@SiFiveHolland, @RevySR)
2. The original 9 commits include zhangyizhong's optimization code for
   arch/riscv/kernel/module-sections.c, so zhangyizhong's code is reverted
   and the 9 commits are applied as a patch.

Signed-off-by: liwenkai <liwenkai@eswincomputing.com>
1 parent d1a7166 commit d333f6b

14 files changed: +173 -98 lines changed

Documentation/core-api/floating-point.rst

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+Floating-point API
+==================
+
+Kernel code is normally prohibited from using floating-point (FP) registers or
+instructions, including the C float and double data types. This rule reduces
+system call overhead, because the kernel does not need to save and restore the
+userspace floating-point register state.
+
+However, occasionally drivers or library functions may need to include FP code.
+This is supported by isolating the functions containing FP code to a separate
+translation unit (a separate source file), and saving/restoring the FP register
+state around calls to those functions. This creates "critical sections" of
+floating-point usage.
+
+The reason for this isolation is to prevent the compiler from generating code
+touching the FP registers outside these critical sections. Compilers sometimes
+use FP registers to optimize inlined ``memcpy`` or variable assignment, as
+floating-point registers may be wider than general-purpose registers.
+
+Usability of floating-point code within the kernel is architecture-specific.
+Additionally, because a single kernel may be configured to support platforms
+both with and without a floating-point unit, FPU availability must be checked
+both at build time and at run time.
+
+Several architectures implement the generic kernel floating-point API from
+``linux/fpu.h``, as described below. Some other architectures implement their
+own unique APIs, which are documented separately.
+
+Build-time API
+--------------
+
+Floating-point code may be built if the option ``ARCH_HAS_KERNEL_FPU_SUPPORT``
+is enabled. For C code, such code must be placed in a separate file, and that
+file must have its compilation flags adjusted using the following pattern::
+
+    CFLAGS_foo.o += $(CC_FLAGS_FPU)
+    CFLAGS_REMOVE_foo.o += $(CC_FLAGS_NO_FPU)
+
+Architectures are expected to define one or both of these variables in their
+top-level Makefile as needed. For example::
+
+    CC_FLAGS_FPU := -mhard-float
+
+or::
+
+    CC_FLAGS_NO_FPU := -msoft-float
+
+Normal kernel code is assumed to use the equivalent of ``CC_FLAGS_NO_FPU``.
+
+Runtime API
+-----------
+
+The runtime API is provided in ``linux/fpu.h``. This header cannot be included
+from files implementing FP code (those with their compilation flags adjusted as
+above). Instead, it must be included when defining the FP critical sections.
+
+.. c:function:: bool kernel_fpu_available( void )
+
+        This function reports if floating-point code can be used on this CPU or
+        platform. The value returned by this function is not expected to change
+        at runtime, so it only needs to be called once, not before every
+        critical section.
+
+.. c:function:: void kernel_fpu_begin( void )
+                void kernel_fpu_end( void )
+
+        These functions create a floating-point critical section. It is only
+        valid to call ``kernel_fpu_begin()`` after a previous call to
+        ``kernel_fpu_available()`` returned ``true``. These functions are only
+        guaranteed to be callable from (preemptible or non-preemptible) process
+        context.
+
+        Preemption may be disabled inside critical sections, so their size
+        should be minimized. They are *not* required to be reentrant. If the
+        caller expects to nest critical sections, it must implement its own
+        reference counting.
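
The runtime API described in this file can be illustrated with a small user-space sketch. The three `kernel_fpu_*` functions below are mocks that only record state, and `scale_permille_fp()` and `scale_permille()` are hypothetical names. In a real driver the FP helper would live in its own file built with `$(CC_FLAGS_FPU)`, and only the caller would include `linux/fpu.h`. Only integers cross the boundary between the two, so the caller itself never needs FP registers.

```c
#include <assert.h>
#include <stdbool.h>

/* Mocks standing in for the real API; they only record state. */
bool mock_fpu_present = true;
bool mock_fpu_enabled;

bool kernel_fpu_available(void) { return mock_fpu_present; }
void kernel_fpu_begin(void) { mock_fpu_enabled = true; }
void kernel_fpu_end(void) { mock_fpu_enabled = false; }

/*
 * Hypothetical FP helper. In the kernel this would sit in a separate
 * source file compiled with $(CC_FLAGS_FPU). It takes and returns
 * integers so that no FP value crosses into the caller.
 */
int scale_permille_fp(int value, int permille)
{
	double scaled;

	assert(mock_fpu_enabled);	/* only valid inside a critical section */
	scaled = (double)value * permille / 1000.0;
	return (int)(scaled + 0.5);	/* rounds to nearest for non-negative input */
}

/* Hypothetical caller: checks availability, then wraps the FP call. */
int scale_permille(int value, int permille)
{
	int result;

	/* Integer fallback for platforms without a usable FPU. */
	if (!kernel_fpu_available())
		return (int)(((long long)value * permille + 500) / 1000);

	kernel_fpu_begin();
	result = scale_permille_fp(value, permille);
	kernel_fpu_end();

	return result;
}
```

Because the value returned by `kernel_fpu_available()` is not expected to change at run time, a driver could also check it once at probe time instead of on every call, as the documentation suggests.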

Documentation/core-api/index.rst

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@ Library functionality that is used throughout the kernel.
    errseq
    wrappers/atomic_t
    wrappers/atomic_bitops
+   floating-point
 
 Low level entry and exit
 ========================

Makefile

Lines changed: 5 additions & 0 deletions
@@ -981,6 +981,11 @@ KBUILD_CFLAGS += $(CC_FLAGS_CFI)
 export CC_FLAGS_CFI
 endif
 
+# Architectures can define flags to add/remove for floating-point support
+CC_FLAGS_FPU += -D_LINUX_FPU_COMPILATION_UNIT
+export CC_FLAGS_FPU
+export CC_FLAGS_NO_FPU
+
 ifneq ($(CONFIG_FUNCTION_ALIGNMENT),0)
 KBUILD_CFLAGS += -falign-functions=$(CONFIG_FUNCTION_ALIGNMENT)
 endif
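
The `-D_LINUX_FPU_COMPILATION_UNIT` define gives a header a way to detect that it is being pulled into a file built with `$(CC_FLAGS_FPU)`, which the documentation says must not include `linux/fpu.h`. The `linux/fpu.h` header itself is not shown in this excerpt, so the following is only a sketch of how such a check could look, with a hypothetical guard name; the three prototypes are taken from the documentation above.

```c
/* Sketch of a header that refuses to be included from FP code. */
#ifndef LINUX_FPU_H_SKETCH
#define LINUX_FPU_H_SKETCH 1

/* Set by the top-level Makefile for every file built with CC_FLAGS_FPU. */
#ifdef _LINUX_FPU_COMPILATION_UNIT
#error "FP code must be compiled separately from the code that opens the critical section"
#endif

#include <stdbool.h>

/* Runtime API as seen by the non-FP caller. */
bool kernel_fpu_available(void);
void kernel_fpu_begin(void);
void kernel_fpu_end(void);

#endif /* LINUX_FPU_H_SKETCH */
```

Compiling this with `-D_LINUX_FPU_COMPILATION_UNIT` stops at the `#error`, so a misplaced include fails at build time rather than silently producing FP instructions outside a critical section.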

arch/Kconfig

Lines changed: 6 additions & 0 deletions
@@ -1480,6 +1480,12 @@ config ARCH_HAS_NONLEAF_PMD_YOUNG
 	  address translations. Page table walkers that clear the accessed bit
 	  may use this capability to reduce their search space.
 
+config ARCH_HAS_KERNEL_FPU_SUPPORT
+	bool
+	help
+	  Architectures that select this option can run floating-point code in
+	  the kernel, as described in Documentation/core-api/floating-point.rst.
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"

arch/riscv/Kconfig

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ config RISCV
 	select ARCH_HAS_KCOV
 	select ARCH_HAS_MEMBARRIER_CALLBACKS
 	select ARCH_HAS_MEMBARRIER_SYNC_CORE
+	select ARCH_HAS_KERNEL_FPU_SUPPORT if 64BIT && FPU
 	select ARCH_HAS_MMIOWB
 	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
 	select ARCH_HAS_PMEM_API

arch/riscv/Makefile

Lines changed: 3 additions & 0 deletions
@@ -77,6 +77,9 @@ KBUILD_CFLAGS += -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64i
 
 KBUILD_AFLAGS += -march=$(riscv-march-y)
 
+# For C code built with floating-point support, exclude V but keep F and D.
+CC_FLAGS_FPU := -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)([^v_]*)v?/\1\2/')
+
 KBUILD_CFLAGS += -mno-save-restore
 KBUILD_CFLAGS += -DCONFIG_PAGE_OFFSET=$(CONFIG_PAGE_OFFSET)
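
The `sed` expression in `CC_FLAGS_FPU` keeps the base ISA string and the single-letter extensions, drops one `v` if it follows them directly, and leaves everything from the first `_` onward untouched. As a rough illustration, the same rewrite can be written out in C. `strip_vector_ext()` is a hypothetical helper, not part of the commit, and it assumes the `-march` string starts with `rv32ima` or `rv64ima`; any other input is copied unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper mirroring s/(rv32ima|rv64ima)([^v_]*)v?/\1\2/
 * for strings that begin with the base ISA prefix.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int strip_vector_ext(const char *march, char *out, size_t out_len)
{
	const size_t prefix_len = 7;	/* length of "rv32ima" and "rv64ima" */
	size_t i = 0, o = 0;

	assert(march != NULL && out != NULL);

	/* The result is never longer than the input. */
	if (strlen(march) + 1 > out_len)
		return -1;

	if (strncmp(march, "rv32ima", prefix_len) == 0 ||
	    strncmp(march, "rv64ima", prefix_len) == 0) {
		/* \1: keep the base ISA string. */
		memcpy(out, march, prefix_len);
		i = o = prefix_len;

		/* \2: keep single-letter extensions up to the first 'v' or '_'. */
		while (march[i] != '\0' && march[i] != 'v' && march[i] != '_')
			out[o++] = march[i++];

		/* v?: drop one 'v' if it follows directly. */
		if (march[i] == 'v')
			i++;
	}

	/* Everything after the match is copied unchanged. */
	while (march[i] != '\0')
		out[o++] = march[i++];
	out[o] = '\0';

	return 0;
}
```

For an input such as `rv64imafdcv_zicsr_zifencei`, this yields `rv64imafdc_zicsr_zifencei`: F and D stay, V is removed.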

arch/riscv/include/asm/fpu.h

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 SiFive
+ */
+
+#ifndef _ASM_RISCV_FPU_H
+#define _ASM_RISCV_FPU_H
+
+#include <asm/switch_to.h>
+
+#define kernel_fpu_available() has_fpu()
+
+void kernel_fpu_begin(void);
+void kernel_fpu_end(void);
+
+#endif /* ! _ASM_RISCV_FPU_H */

arch/riscv/kernel/Makefile

Lines changed: 1 addition & 0 deletions
@@ -61,6 +61,7 @@ obj-$(CONFIG_MMU) += vdso.o vdso/
 
 obj-$(CONFIG_RISCV_M_MODE) += traps_misaligned.o
 obj-$(CONFIG_FPU) += fpu.o
+obj-$(CONFIG_FPU) += kernel_mode_fpu.o
 obj-$(CONFIG_RISCV_ISA_V) += vector.o
 obj-$(CONFIG_SMP) += smpboot.o
 obj-$(CONFIG_SMP) += smp.o

arch/riscv/kernel/kernel_mode_fpu.c

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 SiFive
+ */
+
+#include <linux/export.h>
+#include <linux/preempt.h>
+
+#include <asm/csr.h>
+#include <asm/fpu.h>
+#include <asm/processor.h>
+#include <asm/switch_to.h>
+
+void kernel_fpu_begin(void)
+{
+	preempt_disable();
+	fstate_save(current, task_pt_regs(current));
+	csr_set(CSR_SSTATUS, SR_FS);
+}
+EXPORT_SYMBOL_GPL(kernel_fpu_begin);
+
+void kernel_fpu_end(void)
+{
+	csr_clear(CSR_SSTATUS, SR_FS);
+	fstate_restore(current, task_pt_regs(current));
+	preempt_enable();
+}
+EXPORT_SYMBOL_GPL(kernel_fpu_end);
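
`kernel_fpu_end()` here unconditionally clears `SR_FS` and re-enables preemption, so calling it for an inner section would appear to switch the FPU off while an outer section is still running. The documentation leaves nesting to the caller. A minimal sketch of such a wrapper follows; the `drv_fpu_*` names and the plain global counter are hypothetical, and the `kernel_fpu_*` functions are user-space mocks that assert they are never nested.

```c
#include <assert.h>

/* Mocks: model the FPU as a single on/off flag and forbid nesting. */
int mock_fpu_on;

void kernel_fpu_begin(void)
{
	assert(!mock_fpu_on);	/* the real API is not required to be reentrant */
	mock_fpu_on = 1;
}

void kernel_fpu_end(void)
{
	assert(mock_fpu_on);
	mock_fpu_on = 0;
}

/*
 * Hypothetical nesting depth. A single global is enough for this
 * single-threaded sketch; a real driver would need per-CPU or
 * per-task storage.
 */
int drv_fpu_depth;

/* Open the real critical section only for the outermost caller. */
void drv_fpu_begin(void)
{
	if (drv_fpu_depth++ == 0)
		kernel_fpu_begin();
}

/* Close the real critical section only when the last user leaves. */
void drv_fpu_end(void)
{
	assert(drv_fpu_depth > 0);
	if (--drv_fpu_depth == 0)
		kernel_fpu_end();
}
```

With this wrapper, a helper that calls `drv_fpu_begin()` while its caller already holds the section only bumps the counter, and the FPU stays enabled until the outermost `drv_fpu_end()`.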

arch/riscv/kernel/module-sections.c

Lines changed: 2 additions & 4 deletions
@@ -123,7 +123,6 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	unsigned int num_gots = 0;
 	Elf_Rela *scratch = NULL;
 	size_t scratch_size = 0;
-	size_t old_size = 0;
 	int i;
 
 	/*
@@ -169,11 +168,10 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 		 * close together, so sort a copy of the section to avoid interfering.
 		 */
 		if (sechdrs[i].sh_size > scratch_size) {
-			old_size = scratch_size;
-			scratch_size = sechdrs[i].sh_size;
-			scratch = kvrealloc(scratch, old_size, scratch_size, GFP_KERNEL);
+			scratch = kvrealloc(scratch, scratch_size, sechdrs[i].sh_size, GFP_KERNEL);
 			if (!scratch)
 				return -ENOMEM;
+			scratch_size = sechdrs[i].sh_size;
 		}
 
 		/* sort relocations requiring a PLT or GOT entry so duplicates are adjacent */
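
The last context line describes the idea behind the entry counting: once the relocations are sorted, duplicates sit next to each other, so the number of distinct entries can be found in one linear pass. The following user-space sketch shows that technique in isolation. `struct reloc` is a simplified stand-in for `Elf_Rela`, and `cmp_reloc()` and `count_unique_relocs()` are hypothetical names, not the functions used in `module-sections.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for Elf_Rela: only the fields that identify an entry. */
struct reloc {
	uint64_t info;		/* symbol index and relocation type */
	int64_t addend;
};

/* Total order on (info, addend) so equal entries end up adjacent. */
int cmp_reloc(const void *a, const void *b)
{
	const struct reloc *x = a, *y = b;

	if (x->info != y->info)
		return x->info < y->info ? -1 : 1;
	if (x->addend != y->addend)
		return x->addend < y->addend ? -1 : 1;
	return 0;
}

/* Sorts relocs in place and returns the number of distinct entries. */
size_t count_unique_relocs(struct reloc *relocs, size_t n)
{
	size_t i, unique;

	assert(relocs != NULL || n == 0);
	if (n == 0)
		return 0;

	qsort(relocs, n, sizeof(*relocs), cmp_reloc);

	/* After sorting, a new entry starts wherever two neighbours differ. */
	unique = 1;
	for (i = 1; i < n; i++)
		if (cmp_reloc(&relocs[i - 1], &relocs[i]) != 0)
			unique++;

	return unique;
}
```

The sketch sorts in place for brevity. The kernel code instead sorts a scratch copy, because, as the comment in the hunk notes, `apply_relocate_add()` relies on the original ordering of HI20 and LO12 pairs.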
