c8f765a6 | 29-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(TLB): explicitly specify the signal width when truncated (#4471)
As in https://github.com/OpenXiangShan/XiangShan/pull/4455, ppn_s1 is 44 bits wide and ppn_s2 is 38 bits wide, and both are ultimately assigned to the 36-bit signal `ppn(d)`.
The original code does not cause a functional error, but designers should be aware of the bit-widths of these signals and of whether they can safely be truncated on assignment. This PR therefore makes the truncation explicit.
|
e6595665 | 28-Mar-2025 |
Xu, Zefan <[email protected]> |
fix(MMU): incorrect check for pre af/pf/gpf (#4358)
Pre af/pf/gpf refers to exceptions raised before the actual translation, such as high-bits exceptions. There were some mistakes in choosing the translation mode, which may cause problems when different modes are used for the non-virt S-stage and the virt VS-stage.
This patch fixes the problems with the smallest possible change. There is too much technical debt in the L1TLB.
|
1f23fd0f | 27-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(TLB): L1 TLB will not save the high bit of PPN (#4455)
Actually, the bit-width of s1ppn is sectorppnLen, which is defined as PaddrBits(48) - Offset(12) - TLB compression(3). In contrast, the L2 TLB returns item.s1.entry.ppn with a bit-width of sectorptePPNLen, which equals ptePaddrLen(56) - Offset(12) - TLB compression(3).
The part of the PPN beyond PaddrBits is only used to generate gpaddr when a guest page fault occurs, so it isn't stored in the L1 TLB entry. Here, we simply assign the lower (sectorppnLen - 1, 0) bits. In fact, the original implementation would also work correctly, as the assignment implicitly drops the upper bits of item.s1.entry.ppn and assigns only the lower bits to s1ppn.
TODO: Not storing gpaddr (the upper PPN bits) in the L1 TLB currently provides minimal benefit while significantly hurting the complexity, scalability, and maintainability of the TLB. In the new architecture, we need to store gpaddr in the L1 TLB to avoid all the additional handling related to guest page faults (gpf).
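The width arithmetic above can be checked with a small sketch. It is a plain Python model, not the Chisel source: the constant names and the `truncate_low` helper are hypothetical, and only the numbers (48, 56, 12, 3) come from the commit message.

```python
# Values quoted in the commit message; the Python names are illustrative only.
PADDR_BITS = 48        # PaddrBits
PTE_PADDR_LEN = 56     # ptePaddrLen
OFFSET_BITS = 12       # page offset
COMPRESSION_BITS = 3   # "TLB compression(3)"

# Width of s1ppn as stored in the L1 TLB.
sector_ppn_len = PADDR_BITS - OFFSET_BITS - COMPRESSION_BITS
# Width of item.s1.entry.ppn as returned by the L2 TLB.
sector_pte_ppn_len = PTE_PADDR_LEN - OFFSET_BITS - COMPRESSION_BITS


def truncate_low(value: int, width: int) -> int:
    """Keep bits (width - 1, 0) of value and drop everything above them."""
    return value & ((1 << width) - 1)


# Hypothetical L2 TLB PPN with every bit set, to make the dropped bits visible.
l2_ppn = (1 << sector_pte_ppn_len) - 1
s1ppn = truncate_low(l2_ppn, sector_ppn_len)
dropped_bits = sector_pte_ppn_len - sector_ppn_len

print(sector_ppn_len, sector_pte_ppn_len, dropped_bits)
```

Under these numbers the L1 entry keeps 33 bits and the explicit slice drops the upper 8 bits, which is the same result the implicit truncation produced.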
|
5ffa384a | 24-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(TLB): fix a typo about napot scenario (#4454) |
776b48db | 24-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(PTWCache): length of PPN should be gvpnLen when hypervisor (#4453)
ppnLen: PaddrBits - Offset = 48 - 12 = 36
gvpnLen: GVaddrBits - Offset = 50 (Sv48x4) - 12 = 38
When the hypervisor extension is implemented, the PPN length should be gvpnLen rather than ppnLen.
|
fc4ea7c0 | 24-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(MMU): Stage1Gpf should use hgatp instead of vsatp (#4448) |
f8c4173d | 23-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(PTW): Fix exception handle logic when both pf and af occur (#4422)
In the previous design, every time a page table entry was fetched from memory, a PMP check was performed on the physical address of the next-level page table. However, for cases requiring Stage-2 address translation or when a page fault occurs during the page table fetch, a PMP check is unnecessary. Performing a PMP check in these cases could lead to false access fault reports. This commit fixes the issue.
Future work: The current exception handling logic is messy and unsustainable. A complete refactor of this code is needed in the future, rather than continuing to add patches.
|
ebe07d61 | 20-Mar-2025 |
梁森 Liang Sen <[email protected]> |
feat(dfx): reuse dcache data sram read data register as mbist pipeline (#4371)
Co-authored-by: sfencevma <[email protected]> |
648f5569 | 17-Mar-2025 |
cz4e <[email protected]> |
fix(MainPipe): `error` and `writeback` addr generate logic (#4394)
There were errors in the previous design:
* `writeback` generated the wrong addr
  * `writeback`'s addr is generated from `s3_tag`; there is no need to use `s3_tag_error` to select.
* `error` generated the wrong addr
  * `error` must be generated from `s3_tag`, not from `s3_req.addr`,
  * because the enable condition of `s3_req.addr` is different from that of `s3_error`,
  * so it should use the address corresponding to the accessed cacheline
|
12931efe | 14-Mar-2025 |
Yanqin Li <[email protected]> |
fix(uncache): if can merge, it can enter even if buffer is full (#4408) |
39e2cc5b | 13-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(L2TLB): Napot entries in LLPTW should not be compressed (#4396) |
b9bfce82 | 13-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(PTWCache): Should refill full GVPN to Page Cache (#4407) |
6aa6d737 | 13-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(PTW): High bits of GVPN should not be truncated (#4406) |
e5429136 | 13-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(LLPTW): Fix exception judgement for different virtualisation stages (#4404)
In the previous exception handling in LLPTW, both isAf and isGpf were checked for all cases, including allStage, onlyStage1, and noS2xlate.
In fact, for allStage, only isPf & isGpf need to be checked, while for onlyStage1 and noS2xlate, only isPf & isAf need to be checked.
This commit fixes this issue.
|
f9395f72 | 11-Mar-2025 |
Haoyuan Feng <[email protected]> |
feat(PTWCache): Support a more precise flush for L2 TLB entries (#4390)
In the previous design, since we stored asid/vmid and vaddr information in SRAM, it was not possible to simply read out all the information in a single cycle for comparison with the parameters of sfence/hfence. As a result, for L2 TLB entries, we ignored the rs1/rs2 parameters passed by sfence/hfence and instead flushed all valid entries, regardless of asid/vmid or vaddr.
However, this caused unnecessary flushing of a large number of entries during process switching in virtualized environments, leading to L2 TLB misses after a process switch. This forced the processor to perform a page table walk in memory again, negatively impacting performance.
In this commit, asid/vmid and vaddr are hashed and stored in the register file. When an sfence/hfence signal is received, these stored values are compared against the incoming parameters, allowing for a more precise TLB flush.
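The idea of comparing stored hashes against the sfence/hfence parameters can be sketched as follows. This is a simplified Python model under stated assumptions, not the PTWCache implementation: the XOR-fold hash, the 4-bit `HASH_BITS` width, the `Entry` layout, and the `sfence` signature are all hypothetical, and vmid and global-page handling are left out.

```python
from dataclasses import dataclass

HASH_BITS = 4  # hypothetical width of the hash kept in registers


def fold_hash(value: int, bits: int = HASH_BITS) -> int:
    """XOR-fold value down to `bits` bits (a stand-in hash, chosen for simplicity)."""
    mask = (1 << bits) - 1
    h = 0
    while value:
        h ^= value & mask
        value >>= bits
    return h


@dataclass
class Entry:
    valid: bool
    asid_hash: int
    vpn_hash: int


def make_entry(asid: int, vpn: int) -> Entry:
    """Record only the hashes, as a register-file side structure might."""
    return Entry(True, fold_hash(asid), fold_hash(vpn))


def sfence(entries, asid=None, vpn=None):
    """Invalidate entries whose stored hashes match; None acts as a wildcard."""
    for e in entries:
        asid_hit = asid is None or e.asid_hash == fold_hash(asid)
        vpn_hit = vpn is None or e.vpn_hash == fold_hash(vpn)
        if asid_hit and vpn_hit:
            e.valid = False


entries = [make_entry(1, 0x100), make_entry(2, 0x100), make_entry(1, 0x200)]
sfence(entries, asid=1)
print([e.valid for e in entries])
```

Because only hashes are compared, two different asid values that fold to the same hash are flushed together. In this model a collision can only cause extra flushing, never a missed flush, so it stays conservative while still sparing most unrelated entries.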
|
c1eb2883 | 11-Mar-2025 |
Haoyuan Feng <[email protected]> |
fix(MMU): Unify latency in different CSR bundles (#4389)
In the previous design, since the pipeline was not flushed after modifying `satp`, we had to decide separately for each signal in the csr bundle, such as priv and satp, whether to add delay.
Now the CSR sends a redirect signal to flush the pipeline after `satp` is modified, so the latency of the different CSR bundles can be unified to fenceDelay.
|
fad7c425 | 10-Mar-2025 |
Anzo <[email protected]> |
fix(MainPipe): `DCache` meta is not changed when sc/cas fails (#4217)
An `sc` miss is now treated directly as a failure and is no longer sent to the `missqueue`. The meta is no longer updated when `sc` fails.
|
d6b0a27f | 09-Mar-2025 |
LMiaoH <[email protected]> |
fix(Bitmap): Fix early `s_llptw_req` trigger in PTW (#4375)
- This applies when `HasBitmapCheck` is enabled in `CVMConfig`.
- After receiving a `mem.resp`, the PTW module checks whether a bitmap check (`whether_need_bitmap_check`) is needed, then sets `s_llptw_req` valid and sends a request to LLPTW.
- However, the `s_llptw_req` signal becomes valid one cycle too early.
Co-authored-by: LMiao <[email protected]>
|
591ae1c5 | 09-Mar-2025 |
Xin Tian <[email protected]> |
fix(L2TLB): fix `hptw_bypassed` wrong used in `refill_valid` (#4366)
The Reg `hptw_bypassed` indicates that an hptw's mem response does not need to refill the PTWCache. This commit adds the condition `from_hptw(mem.d.bits.source)` where `hptw_bypassed` is used in `refill_valid`, fixing a bug in which `hptw_bypassed` wrongly killed a refill request from another ptw.
|
11269ca7 | 09-Mar-2025 |
Tang Haojin <[email protected]> |
chore: fix several deprecation warning (#4352) |
455e3b53 | 06-Mar-2025 |
cz4e <[email protected]> |
fix(MainPipe): fix `s3_l2_error` and `s3_error` enable signal (#4345)
* `s2_fire` and `s2_can_to_s3` are different
* `io.error.valid` uses `s2_fire`, but `s3_l2_error` uses `s2_can_to_s3`, so `io.error.valid` could be updated while `s3_l2_error` was not.
|
10cfb21d | 03-Mar-2025 |
cz4e <[email protected]> |
fix(DCache): use `ParallelMux` instead of `Mux1H` (#4340)
* When there are multiple errors, `Mux1H` is equivalent to using `|`, for example
  * error 0, valid = 1, addr0 = 0x1000
  * error 1, valid = 1, addr1 = 0x0ffff
  * the result is `io.error.valid == 1`, but `io.error.bits.addr == (addr0 | addr1)`, because `Mux1H` will generate a circuit like this:
    ```
    addr = (valid0 ? addr0 : 'h0) | (valid1 ? addr1 : 'h0)
    ```
* This problem can be avoided by using `ParallelMux`
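The difference between the two selection styles can be modelled in a few lines of Python. This is a behavioural sketch, not the Chisel code: `mux1h` follows the AND-OR circuit quoted above, while `first_valid_mux` stands in for a mux that forwards exactly one input (picking the first valid one is an assumption about selection order, not a statement about how `ParallelMux` is implemented). The addresses are hypothetical and chosen so that the OR differs from both inputs.

```python
def mux1h(valids, values):
    """AND-OR one-hot mux model: OR together every value whose select is set."""
    out = 0
    for valid, value in zip(valids, values):
        if valid:
            out |= value
    return out


def first_valid_mux(valids, values):
    """Forward exactly one input: the first valid one, or 0 if none is valid."""
    for valid, value in zip(valids, values):
        if valid:
            return value
    return 0


# Two errors reported in the same cycle (hypothetical addresses).
valids = [True, True]
addrs = [0x1000, 0x0FF0]

merged = mux1h(valids, addrs)             # 0x1FF0: matches neither input
selected = first_valid_mux(valids, addrs)  # 0x1000: one real address

print(hex(merged), hex(selected))
```

When the select is truly one-hot the two models agree, so the difference only shows up in the multi-error case the commit describes.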
|
f5d5a4f3 | 03-Mar-2025 |
CharlieLiu <[email protected]> |
fix(DCache): fix wrong condition for blocking lr (#4337)
A following lr should be blocked while the previous lr's resv_set is still valid, which means `lrsc_count > 0`.
In previous PRs #3017 and #4117, `lrsc_count > 8` was used as the blocking condition, and `lrsc_count` stopped updating once it reached 8. This commit fixes that.
|
4b2c87ba | 27-Feb-2025 |
梁森 Liang Sen <[email protected]> |
feat(dfx): integrate dfx components (#4312) |
b6c14329 | 24-Feb-2025 |
cz4e <[email protected]> |
timing(MainPipe): remove `set_conflict` for tag/meta read (#4295)
* simplify the logic that generates the meta/tag read enable
* when `set_conflict == 1`, the request cannot go to `s1`, and `write` has higher priority than `read`
|