cmake: linker: make passing -no-pie configurable#54445
cmake: linker: make passing -no-pie configurable#54445nashif merged 12 commits intozephyrproject-rtos:mainfrom
Conversation
andyross
left a comment
There was a problem hiding this comment.
This definitely gets my vote
tejlmand
left a comment
There was a problem hiding this comment.
Need a bit more description to fully understand the use-case.
Also, this should be having corresponding Kconfig setting, and not rely on board testing directly in CMake.
There was a problem hiding this comment.
first comment.
We should not filter this based on a board type.
It's to limiting regarding users ability to enable / disable this flag.
As minimum this should have a Kconfig and with a proper default value.
A given board can then define a different default.
The setting may be promptless to avoid users accidentially changing the setting.
There was a problem hiding this comment.
your commit message states:
Zephyr SDK and others are, currently, not to produce PIE binaries by default.
I get that if Zephyr SDK and other currently do not produce PIE binaries per default, but what is the problem in specifying the flag in those cases then ?
Also, we are explicitly requesting the compiler to not do position independent code for C code here:
Lines 376 to 377 in ab764aa
so having -no-pie for the linker seems logical to me in this case, even though the toolchain you mention does so per default.
Is your intention to be able to actually use -pie on the linker in some use-cases ?
Or should we actually change the handling of pie / -no-pie, so that it compiler and linker flags are set as properties and then applied based on configuration.
Similar to how things are done for coverage, where the are both compiler and linker flags specified under the coverage property, which is then applied based on Kconfig.
Ref:
zephyr/arch/common/CMakeLists.txt
Lines 72 to 75 in d598e26
There was a problem hiding this comment.
so having
-no-piefor the linker seems logical to me in this case, even though the toolchain you mention does so per default.Is your intention to be able to actually use
-pieon the linker in some use-cases ?
Answering my own question.
I see the need for not passing the flag here: #54333 (comment)
So atm it seems to be specific to the Xtensa toolchain, xcc.
Which means the best approach forward is to have a no_position_independent property also on the linker, and the Xtensa toolchain can then set this property flag to empty in https://github.com/zephyrproject-rtos/zephyr/tree/main/cmake/linker/ld/xcc/linker_flags.cmake (assuming the xcc uses the ld linker)
There was a problem hiding this comment.
I believe we need this with LLVM as well. So it might not be xcc specific.
The Kconfig part may not be required. Of course if a better user-facing handling of when to apply the properties is desired, then those should go through Kconfig as described. But the immediate problem described in #54333 doesn't seem to need it. |
Others can comment with more expertise, but the core issue is that this flag isn't supported compatibly across all binutils/gcc/clang/lld variants. And we don't really want it anyway, it's not a feature we enable at code generation so we don't want the linker to have to care about it. The sole exception (I think! @yperess caught this originally) is on Debian when using the distro-provided clang/lld to produce native_posix binaries. There, the toolchain is hard-wired to generate pic/pie by default, so we need the flag to the linker to unbreak things. But that requires per-toolchain hackery. I think. |
fd56ba0
2f0b6aa to
fd56ba0
Compare
|
@tejlmand I have updated the PR to use linker property. I tried this previously but was having issue with using |
Stating the obvious: this was done to increase security A lot of background information on these pages, including this (not) fun bit:
Similar headaches: |
|
I think something is wrong in cmake scripts... the error said it was using |
Yes, because that function was made to be able to check if a flag is valid, so it only handles simply flags and not generator expressions.
yes, and this is the function to use now, because code is updated to use linker properties.
Sounds strange. Will see if I can re-produce. |
fd56ba0 to
34e2251
Compare
Turns out if you only have LLVM/clang/lld 15 installed, the symlink However, another issue is that |
|
Errr... the new commit doesn't seem to work with the Clang/LLVM workflow. |
34e2251 to
5c1f16a
Compare
|
I have updated the PR to use |
b2fba48 to
3d0e221
Compare
tejlmand
left a comment
There was a problem hiding this comment.
Thanks, nice improvement.
Some general comments, but I've found no major issues.
There was a problem hiding this comment.
Is this really needed ?
A choice will always have a value, so the only way that both CONFIG_LLVM_USE_LD and CONFIG_LLVM_USE_LLD could n is in case someone extended the choice with an extra config option.
And in case someone does that there probably have a very good reason to do so.
Finally, fatal errors should be used with absolute caution when it comes to Kconfig settings because of #9573
| else() | |
| message(FATAL_ERROR | |
| "Need to enable either CONFIG_LLVM_USE_LD or CONFIG_LLVM_USE_LLD!") |
There was a problem hiding this comment.
Is this really needed ? A choice will always have a value,...
Unless when editing the .config file manually or generating from fragments and some non-Zephyr script?
so the only way that both CONFIG_LLVM_USE_LD and CONFIG_LLVM_USE_LLD could n is in case someone extended the choice with an extra config option.
Could someone really add an extra KConfig option without editing this code?
Finally, fatal errors should be used with absolute caution when it comes to Kconfig settings because of #9573
I only skimmed 9573 sorry but the main message seems to be "don't generate invalid .config files that menuconfig" can't read again. It will of course always be possible to make invalid .config edits manually.
Can this FATAL_ERROR really cause invalid .config files to be generated?
In my experience CMake and Kconfig errors tend to be a nightmare to debug + no one wants to go anywhere near them. Very understandable when you look at major design mistakes like COMMAND_ERROR_IS_FATAL false by default, MYPROGRAM_NOT-FOUND and other crazy and super time-consuming CMake stuff like https://samthursfield.wordpress.com/2015/11/21/cmake-dependencies-between-targets-and-files-and-custom-commands/
So in general we need MORE user-friendly error messages, not fewer but maybe this particular case is different? Not convinced of that yet.
There was a problem hiding this comment.
If someone adds a new linker kconfig, this acts as a reminder that this also needs to be changed. I will change it to a warning instead.
There was a problem hiding this comment.
Unless when editing the .config file manually or generating from fragments and some non-Zephyr script?
No, it will always be set, try this for yourself:
$ cat build/zephyr/.config |grep LLVM_USE
CONFIG_LLVM_USE_LD=y
# CONFIG_LLVM_USE_LLD is not set
$ sed 's/\(CONFIG_LLVM_USE_LD\)=y/# \1 is not set/g' build/zephyr/.config > tmp; mv tmp build/zephyr/.config
$ cat build/zephyr/.config |grep LLVM_USE
# CONFIG_LLVM_USE_LD is not set
# CONFIG_LLVM_USE_LLD is not set
.... Let's do the build:
$ ninja -Cbuild
ninja: Entering directory `build'
[0/1] Re-running CMake...
Loading Zephyr default modules (Zephyr base (cached)).
....
$ cat build/zephyr/.config |grep LLVM_USE
CONFIG_LLVM_USE_LD=y
# CONFIG_LLVM_USE_LLD is not set
As you see, touching .config will cause a CMake re-run which triggers a kconfig run that ensures all choices has a value.
Could someone really add an extra KConfig option without editing this code?
You can extend choices with new entries downstream, for example like it's done here:
https://github.com/nrfconnect/sdk-nrf/blob/5f58fc4ae964e24e5b432bcfcca865f857286f62/Kconfig.nrf#L23-L38
where the Zephyr provided options for an mbedTLS implementation is extended with an extra choice entry in a downstream project.
Can this FATAL_ERROR really cause invalid .config files to be generated?
No, but a FATAL_ERROR in CMake will prevent running menuconfig, cause the build file (which contains the menuconfig build target) has never been created.
Therefore, mis-configured Kconfig files should not result in fatal errors in CMake cause users will not be able to open menuconfig to fix the error.
So in this case, there is no value in throwing an error that a user cannot fix.
Plus it prevents downstream projects to provide extra choice options, and that's making it harder to extend Zephyr on a general basis.
In this particular case, there is probably little risk, but users will copy this pattern else-where, not realizing the damage / limitations it may introduce at other places.
There was a problem hiding this comment.
that's the reason for the split in generic.cmake and target.cmake, so no reason for such comment.
| # Since kconfigs are not available when generic.cmake | |
| # is parsed. Setting the target triple for x86 needs to | |
| # be done here instead. |
There was a problem hiding this comment.
that's the reason for the split in generic.cmake and target.cmake, so no reason for such comment.
OK but how many other people can easily understand that besides yourself + 1 or 2 others in the entire world? I know it's very hard but please try to put yourself in the shoes of non domain experts. I do NOT have the answer, this is a genuine question, not a rhetorical question. I do not have the answer but if @dcpleung who just spent weeks working on this PR - which makes him much, much more expert than the average - felt the need for this comment then I would tend to trust him that it is useful. Also: how does it hurt? It's only 3 lines and very far from a CMake crash course.
There was a problem hiding this comment.
I have removed this but kept the comment on setting LINKER in generic.cmake, where it mentioned about kconfig not being available during parsing of generic.cmake.
There was a problem hiding this comment.
OK but how many other people can easily understand that besides yourself + 1 or 2 others in the entire world?
Then the comment should be the purpose of the generic.cmake / target.cmake files, and not why the triple (in this file) or why setting the linker is deferred to target.cmake (comment in the generic file).
I would be in favor of general comments in top of the file, for example like this:
+++ b/cmake/toolchain/<toolchain>/generic.cmake
@@ -1,5 +1,9 @@
# SPDX-License-Identifier: Apache-2.0
+# Purpose of the generic.cmake is to define a generic C compiler which can be
+# used for devicetree pre-processing and other pre-processing tasks which must
+# be performed before the target can be determined.
+There was a problem hiding this comment.
Added this and removed the other comments.
There was a problem hiding this comment.
To me this looks as a very fragile test.
Could we do this another / cleaner way ?
There was a problem hiding this comment.
I could change it to test if arch is posix and CONFIG_LLVM_USE_LD is enabled. I have only see it for native_posix as llvm links crt{being,end}S.o automatically. Let me try that.
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Changed this to test if LIBGCC_DIR is defined. Since posix arch does not have it, it skips including crtbegin.o and crtend.o.
There was a problem hiding this comment.
Maybe a bit stricter?
| "GNU ld \\(.+\\) ([0-9]+[.][0-9]+[.]?[0-9]*).*" | |
| "GNU ld \\(.+\\) ([0-9]+[.][0-9]+[.]?[0-9]*)$" |
What happens when there's no match? Did you locally break the regex and see how that's reported?
There was a problem hiding this comment.
The version string would simply be empty. Cmake will not print the version, and any version comparisons will be false.
There was a problem hiding this comment.
Is this really needed ? A choice will always have a value,...
Unless when editing the .config file manually or generating from fragments and some non-Zephyr script?
so the only way that both CONFIG_LLVM_USE_LD and CONFIG_LLVM_USE_LLD could n is in case someone extended the choice with an extra config option.
Could someone really add an extra KConfig option without editing this code?
Finally, fatal errors should be used with absolute caution when it comes to Kconfig settings because of #9573
I only skimmed 9573 sorry but the main message seems to be "don't generate invalid .config files that menuconfig" can't read again. It will of course always be possible to make invalid .config edits manually.
Can this FATAL_ERROR really cause invalid .config files to be generated?
In my experience CMake and Kconfig errors tend to be a nightmare to debug + no one wants to go anywhere near them. Very understandable when you look at major design mistakes like COMMAND_ERROR_IS_FATAL false by default, MYPROGRAM_NOT-FOUND and other crazy and super time-consuming CMake stuff like https://samthursfield.wordpress.com/2015/11/21/cmake-dependencies-between-targets-and-files-and-custom-commands/
So in general we need MORE user-friendly error messages, not fewer but maybe this particular case is different? Not convinced of that yet.
There was a problem hiding this comment.
that's the reason for the split in generic.cmake and target.cmake, so no reason for such comment.
OK but how many other people can easily understand that besides yourself + 1 or 2 others in the entire world? I know it's very hard but please try to put yourself in the shoes of non domain experts. I do NOT have the answer, this is a genuine question, not a rhetorical question. I do not have the answer but if @dcpleung who just spent weeks working on this PR - which makes him much, much more expert than the average - felt the need for this comment then I would tend to trust him that it is useful. Also: how does it hurt? It's only 3 lines and very far from a CMake crash course.
Some distros may provide config files for clang to change its default behavior. We need to override that, or else developers may be using different defaults and we will have confusing bug reports in the future. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2e11342 to
2759602
Compare
tejlmand
left a comment
There was a problem hiding this comment.
Looks good.
A minor concern to reply to before a +1
There was a problem hiding this comment.
is that always true, or just true 99% of the time ?
If we start reporting this, then we should really be sure that it's correct.
In linux, there will almost always an ld.bfd and ld will usually be a link to the ld.bfd exe.
The almost is to take into account any obscure distro / home built toolchain where ld could be an ld.bfd without ld.bfd existing (or not in path).
But what about other OS'es and pre-built cross tool-chains ?
Are we sure all conforms to the <triple>-ld.bfd or could some be lazy and provide only <triple>-ld, for example on Windows where links (the unix way) are not supported.
Such that <triple>-ld.exe is in fact a bfd.
I agree that most likely your assumption is correct, but worried about edge-cases.
There was a problem hiding this comment.
This is more of a case where we can be certain that the linker is bfd compatible. We do ask the compilers for their preferred ld.bfd binary, and if they say there is one, we assume it is bfd compatible. This is simply a case where if we have any doubts, assume the linker is not bfd compatible.
|
Compliance failure is due to the kconfig file being selectively included (based on which toolchain being used), and thus any documentation on those kconfigs generated check errors as clang is not used by default (and thus not the checked). |
This moves CONFIG_LLVM_USE_LD into cmake/toolchain/llvm as this is a toolchain kconfig. Also make it a choice to allow the use of LLVM's lld as linker. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This introduces a new cmake module FindGnuLd.cmake to do the work to discover GNU ld (of binutils). Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This introduces a new cmake module FindLlvmLld.cmake to do the work to discover LLVM lld linker. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The variable LINKER is dependent on CONFIG_LLVM_USE_LLD or CONFIG_LLVM_USE_LD, and these kconfigs are not available when toolchain/llvm/generic.cmake is parsed. So setting LINKER needs to be deferred to target.cmake where kconfigs are available. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This asks the compiler if it has its own preference for ld.bfd. This is useful for LLVM (when CONFIG_LLVM_USE_LD=y) so we know which linker clang is using. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This asks the clang if it has its own preference for ld.lld. This is to mirror what we are doing to find GNU ld, and to make sure we are using the linker clang is using. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Since kconfigs are not available when generic.cmake is parsed. Setting the target triple for x86 needs to be deferred to target.cmake as it needs to know whether CONFIG_64BIT is enabled. This also moves the ARM triple to target.cmake as triple is needed for target tools. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds a new output variable to FindGnuLd.cmake to indicate
if ld.bfd is found. Since we now ask the compilers for their
preferred ld.bfd linker, it may not match using the existing
string equal test to ${CROSS_COMPILE}ld.bfd. So set the new
variable GNULD_LINKER_IS_BFD to true if ld.bfd, and use it to
pass an extra argument to compiler to make it use ld.bfd.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds a new linker property specifically for passing "-no-pie" to linker. Older binutils' LD (<= 2.36) do not support this flag and will behave erratically if set. It would parse "-no-pie" separately as "-n" and "-o-pie", which would result in the output file being "-pie" instead of "zephyr*.elf". Moreover, LLVM lld does not support -no-pie but --no-pie (note the extra hyphen). By having no-pie as a linker property, we can pass correct no-pie flag to these linkers (or none at all). Signed-off-by: Daniel Leung <daniel.leung@intel.com>
GNU ld and LLVM lld both complain under C++: error: section: init_array is not contiguous with other relro sections So do not create RELRO program header. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Only include crtbegin.o and crtend.o when LIBGCC_DIR is defined.
Since LIBGCC_DIR is not defined when compiling for posix
architecture, crt{begin,end}.o cannot be referred to via
LIBGCC_DIR.
Also note that, when using llvm/clang, crt{begin,end}S.o are
automatically for native_posix which collide with symbols in
crt{begin,end}.o. So there is no point in making LIBGCC_DIR
available for native_posix under llvm/clang.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2759602 to
9a8acac
Compare
|
The last change was simply adding the kconfigs to the compliance check ignore list. |
|
In some conditions, the commit that adds |
(See individual commits for details.)
The main issue is to properly pass
-no-pie(or--no-pie) to the linker. Old binutils' ld (<= 2.36, used in Ubuntu 20.04) and XCC's xt-ld both do not support '-no-pie' in linker. If you pass '-no-pie' to them, they will be parsed separately as-nand-o-pie, which results in output file named-pieinstead of the usualzephyr*.elf(and thus failing intermediate build steps). The rest of the commits are to make llvm/clang works in CI and to fix a partially non-functionalCONFIG_LLVM_USE_LD.