xtensa: userspace: handle DTLB multihit and incorrect load/store ring exceptions#93947
Conversation
fd6d3e0 to
2a4e384
Compare
18cc903 to
be8dff4
Compare
be8dff4 to
09db043
Compare
|
Compliance does not like labels in assembly. |
09db043 to
e3bc863
Compare
|
I removed the part to stash permissions into SW bits on L1 PTEs. Come to think of it, it is not very useful. L1 PTEs need only binary access permission at this point: pointing to L2 table, or not pointer anywhere. So we are always reset back to not pointing anywhere. |
6258141 to
23fcd7d
Compare
23fcd7d to
310fa4a
Compare
310fa4a to
ca3c125
Compare
|
Plain rebase. |
This tests zephyrproject-rtos/zephyr#93947 Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
lyakh
left a comment
There was a problem hiding this comment.
while waiting for thesofproject/sof#10242 to finish the test, I confirm, that this PR fixes the zephyr/samples/userspace/prod_consumer test on PTL ADSP
This is the problem with trying to adhere to some rules we can't actually see :-/ everything is conjecture. |
|
I think that's MISRA 15.5 |
The software bits inside PTE are used to store original PTE attributes and ring value, and those bits are used to restore PTE to previous state. This modifies reset_region() to properly restore attributes and ring value when resetting memory regions to the same as when the system boots. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This saves the EXCCAUSE and EXCVADDR into BSA during exception. These will then be used during exception handling. The reason for doing this instead of reading from both registers during exception handing is that another exception will override the value in these register (e.g. DTLB miss). When returning to the previous exception handler, these register no longer contains the original exception cause and address. We need to save it so that we are actually handling the orignal exception. Coredump also now uses the EXCCAUSE and EXCVADDR from BSA instead of reading the registers. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
e28b01e to
efce11a
Compare
|
Fixed PTE restoration that I found working on something else. And I have changed to early returns. The source already has bunch of them in the file anyway. |
| ring = (bsa->ps & XCHAL_PS_RING_MASK) >> XCHAL_PS_RING_SHIFT; | ||
|
|
||
| if (ring != XTENSA_MMU_USER_RING) { | ||
| return false; |
There was a problem hiding this comment.
@dcpleung this case was returning true before, was that wrong? I suppose it was, right? If access was performed from the kernel ring, then it isn't an access violation exception? So this seems to actually be a fix too.
There was a problem hiding this comment.
This needs to return true. So fixed that.
| return false; | ||
| } | ||
|
|
||
| xtensa_dtlb_entry_invalidate_sync(dtlb_entry); |
There was a problem hiding this comment.
This function isn't just checking, it's also performing some non-trivial action. Can we update its doxygen description? Can be a follow-up PR
There was a problem hiding this comment.
I updated the wording on the doxygen.
| uint32_t restored_pte; | ||
|
|
||
| uint32_t original_sw = XTENSA_MMU_PTE_SW_GET(pte); | ||
| uint32_t original_attr = XTENSA_MMU_PTE_SW_ATTR_GET(original_sw); |
There was a problem hiding this comment.
an earlier version of this commit had
uint32_t original_attr = XTENSA_MMU_PTE_SW_ATTR_GET(pte);
here and that worked for SOF. This version breaks SOF user-space. Is this something that I have to change in SOF user-space implementation?
There was a problem hiding this comment.
The earlier version was incorrect as it used the wrong input to get the backup attributes. The current version (with input as original_sw) has the correct behavior.
There was a problem hiding this comment.
Maybe there is a missing mapping function somewhere on your side?
When a page can be accessed by both kernel and user threads, the autofill DTLB may contain an entry for kernel thread. This will result in load/store ring exception when it is accessed by user thread later. In this case, we need to invalidate all associated TLBs related to kernel access so hardware can reload the page table the correct permission for user thread. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds a function to handle DTLB multihit exception when userspace is enabled, as this exception may be raised due to the same page being able to accessed by both kernel and user threads. The auto-refill DTLBs may contain entries for same page, one for kernel and one for user under some situations. We need to invalidate those existing DTLB entries so that hardware can reload from the page table. This is an alternative to the kconfig invalidating DTLBs on every swap: CONFIG_XTENSA_MMU_FLUSH_AUTOREFILL_DTLBS_ON_SWAP. Although this does not have extra processing per context switching, exception handling itself has a high cost. So care needs to be taken on whether to enable that kconfig. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This moves the block of FPU and HIFI registers in the Base Save Area (BSA), used during interrupts and exceptions, to the end of the block. This is done to minimize code usage in the vector text section. During interrupt entrance, the code stores a small set of registers first before jumping to the long handler. When the offset to these registers from the beginning of BSA is small enough, the compiler will emit S32I.N instead of S32I. This saves us one byte per store (2 vs 3 bytes). This is mainly done to avoid overflowing the vector text area. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
xtensa_asm2_s.h is not exactly a C header. Rename it to xtensa_asm2.inc.S to clearly state that it is has assembly code in it as its main purpose is to declare assembly macros. This is being done to keep checkpatch.pl from treating it as C file and complaining about non-C syntax. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This is simply done to conserve code space in the vector text areas. These vector text areas are very small and we should only put code that is absolutely necessary for interrupt and exception entrance. The saving of PS into the thread struct can be deferred a bit. So do that. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The assert error messages when l2_page_table_map() fails are not correct. It returns false when it cannot allocate new L2 table, and it is not able the address having already mapped. So correct the error message. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
In functions which manipulate both L1 and L2 tables, make the variable names obvious by prefixing them with l1_ or l2_. This is mainly done to avoid confusion when reading through those functions. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
arch_mem_map() takes in some flags to describe the to-be mapped memory regions' permissions and cache status. When the flags are translated, they become attributes in PTEs. So for functions being called by arch_mem_map() and beyond, rename flags to attrs to better describe its purpose. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The priv_stack_ptr for reading and writing tests was not set correctly on Xtensa, and was always 0. Test would pass since the NULL page should always generate access faults on most Xtensa platforms. However, those tests are supposed to test reading and writing to the privilege stack. So fix that by populating the priv_stack_ptr correctly. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds a few tests to test switching between threads where they are under different memory domains. This is mainly to test if permissions are set correctly during context switch. By default the new tests are disabled and must be explicitly enabled. This is due to some architectures requiring some config changes (for example, x86 needing more reserved memory domains, and ARM64 needing more translation tables). To avoid causing mass CI failures, only enable platforms that have been tested, for now. Signed-off-by: Daniel Leung <daniel.leung@intel.com>
efce11a to
d8b8697
Compare
|
I suppose the change from an earlier version is correct, it's still not working with my current SOF state, but we'll find a solution for that.



This adds handlers for DTLB multihit and load/store ring exceptions. These are usually the result of having both kernel and user threads accessing the same memory page.
To test this, some context switching tests have been added to the userspace test suite.