#
522c7f99 |
| 07-Mar-2025 |
Anzo <[email protected]> |
fix(LSU): misaligned violation detection stuck (#4369)
Since a load instruction that crosses a 16-byte boundary needs to be split and accessed twice, it must enter the `RAR Queue` twice but occupies only one `virtual load queue` entry. In the extreme case, 36 load instructions that each span a 16-byte boundary can fill all 72 `RAR Queue` entries.
---
Our previous handling had a problem: if the oldest load instruction spanning a 16-byte boundary enters the `replayqueue` while the `loadmisalignbuffer` holds an instruction that cannot finish executing because the `RAR Queue` is full, then the oldest load instruction can never be issued, because the `loadmisalignbuffer` is always occupied.
---
Therefore, we use a more aggressive scheme. When the RAR is full, the misaligned load generates a rollback, and the next load instruction that the `loadmisalignbuffer` accepts must be the oldest one (if it is misaligned).
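The capacity arithmetic behind this fix can be checked with a toy model. The sketch below is not XiangShan code: the function names are hypothetical, and the only inputs taken from the message above are the 72-entry RAR queue and the rule that a load crossing a 16-byte boundary is split into two accesses, each taking its own RAR entry.

```python
RAR_QUEUE_SIZE = 72  # figure quoted in the commit message


def rar_entries_needed(addr: int, size: int) -> int:
    """Return how many RAR entries a load needs in this toy model.

    A load whose bytes cross a 16-byte boundary is split into two
    accesses, and each access takes its own RAR entry.
    """
    first_chunk = addr // 16
    last_chunk = (addr + size - 1) // 16
    return 2 if last_chunk != first_chunk else 1


def loads_until_full(loads):
    """Count how many loads fit before the RAR queue has no room left."""
    used = 0
    for accepted, (addr, size) in enumerate(loads):
        need = rar_entries_needed(addr, size)
        if used + need > RAR_QUEUE_SIZE:
            return accepted
        used += need
    return len(loads)


# 8-byte loads at offset 12 of each 16-byte block always cross a boundary.
crossing = [(16 * i + 12, 8) for i in range(100)]
# 8-byte loads at offset 0 never cross one.
aligned = [(16 * i, 8) for i in range(100)]

print(loads_until_full(crossing))  # 36
print(loads_until_full(aligned))   # 72
```

In this model only 36 boundary-crossing loads are accepted before the queue is full, half the number of aligned loads, which is the extreme case the message describes.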
|
#
90f8d3cf |
| 06-Mar-2025 |
cz4e <[email protected]> |
fix(LoadUnit): exclude prefetch requests (#4367)
* To meet timing, a compromise is needed on the RAR enqueue conditions; the worst timing paths come from `pmp` and `missQueue`.
* If `LoadQueueRARSize` == `VirtualLoadQueueSize`, only prefetch requests need to be excluded.
* If `LoadQueueRARSize` < `VirtualLoadQueueSize`, `s2_can_query` also needs to be considered.
|
#
0d55e1db |
| 28-Feb-2025 |
cz4e <[email protected]> |
timing(LoadQueueRAR, LoadUnit): adjust rar/raw query logic (#4297)
* Because `LoadQueueRARSize == VirtualLoadQueueSize`, no additional logic is needed for RAR enqueue.
* When no fast replay is needed, the load unit allocates a RAW entry.
|
#
3c808de0 |
| 17-Feb-2025 |
Anzo <[email protected]> |
fix(LSU): fix cbo instr exceptions and implementation (#4262)
1. Typo fixes.
2. `cbo` instructions do not produce misaligned exceptions.
3. `cbo zero` instructions need to flush the `sbuffer`.
4. `cbo zero` sets the mask correctly.
5. Add RAW checks to `cbo zero`.
6. Add trigger (Debug Mode) checks to `cbo zero`.
7. Fix several issues with the CBO instruction in NEMU.
----
To avoid ambiguity with `io.mmioStout`, a new `StoreQueue` port is
introduced to write back `cbo zero` after the sbuffer flush.
Arbitration is performed in `MemBlock`, and `cbo zero` currently has
higher priority by default.
`cbo zero` should not be written back at the same time as `mmio`.
---
A `CacheLine` check has been added to `RAWQueue` to ensure memory
consistency when executing `cbo zero`.
See https://github.com/OpenXiangShan/XiangShan/issues/4240 for details.
---
The `cbo` instruction requires a trigger check.
---------
Co-authored-by: zhanglinjuan <[email protected]>
|
#
9e12e8ed |
| 08-Feb-2025 |
cz4e <[email protected]> |
style(Bundles): move bundles to Bundles.scala (#4247)
|
#
9b12a106 |
| 25-Dec-2024 |
Anzo <[email protected]> |
area(LoadQueue): remove useless regs (#4062)
The additional release logic for vector loads in the `RAR/RAW Queue` appears to be unnecessary, and it causes the `RAR/RAW Queue` to store redundant `regs` for `uopidx`.
|
#
8b33cd30 |
| 13-Dec-2024 |
klin02 <[email protected]> |
feat(XSLog): move all XSLog outside WhenContext for collection
Data inside a WhenContext is not accessible from another module, so to support XSLog collection we move all XSLog calls and related signals outside WhenContext. For example, `when(cond1){XSDebug(cond2, pable)}` becomes `XSDebug(cond1 && cond2, pable)`.
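The rewrite relies on the nested condition and the conjoined condition firing in exactly the same cases. The following is a minimal software model of that equivalence, not Chisel: `nested_log` and `flattened_log` are hypothetical stand-ins for the two forms, and each returns whether the log statement would fire.

```python
from itertools import product


def nested_log(cond1: bool, cond2: bool) -> bool:
    """Model of when(cond1) { XSDebug(cond2, ...) }."""
    fired = False
    if cond1:        # enclosing when-context
        if cond2:    # the log's own condition
            fired = True
    return fired


def flattened_log(cond1: bool, cond2: bool) -> bool:
    """Model of XSDebug(cond1 && cond2, ...) with no enclosing context."""
    return cond1 and cond2


# Exhaustively compare the two forms over all four input combinations.
equivalent = all(
    nested_log(a, b) == flattened_log(a, b)
    for a, b in product([False, True], repeat=2)
)
print(equivalent)  # True
```

Because the two forms agree on every input, hoisting the log out of the when-context does not change when it fires; it only changes where the condition is visible.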
|
#
549073a0 |
| 10-Dec-2024 |
cz4e <[email protected]> |
area(Lsq): compress rar/raw paddr and remove sq useless regs (#3976)
* LoadQueueRAR PAddr hash function, total 16bits:

* LoadQueueRAW use PAddr[29:6], total 24bits
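To make the `PAddr[29:6]` slice concrete, here is a small sketch of what keeping only those 24 bits does to an address. It is an illustration rather than the project's code: `bits` and `raw_tag` are hypothetical helper names, the example addresses are arbitrary, and the RAR hash function is not reproduced because its definition is not given in the text above.

```python
def bits(value: int, hi: int, lo: int) -> int:
    """Return value[hi:lo] (both ends inclusive), like a hardware bit slice."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def raw_tag(paddr: int) -> int:
    """Compressed address kept per entry in this sketch: PAddr[29:6]."""
    return bits(paddr, 29, 6)


a = 0x8000_1040
b = 0x8000_107F      # differs from a only in bits [5:0]
c = 0x8000_1080      # differs from a in bit 7
d = a + (1 << 30)    # differs from a only in bit 30

print(raw_tag(a) == raw_tag(b))  # True: the low 6 bits are dropped
print(raw_tag(a) == raw_tag(c))  # False: bits inside [29:6] differ
print(raw_tag(a) == raw_tag(d))  # True: bits above 29 are dropped
print(raw_tag(a) < (1 << 24))    # True: the result fits in 24 bits
```

The third comparison shows the cost of the compression: addresses that differ only above bit 29 become indistinguishable after slicing, so a comparison on the compressed value can report a match for two different physical addresses.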
|
#
e10e20c6 |
| 27-Nov-2024 |
Yanqin Li <[email protected]> |
style(pbmt): remove the useless and standardize code
* style(pbmt): remove outstanding constant which is just for self-test
* fix(uncache): added mask comparison for `addrMatch`
* style(mem): code normalization
* fix(pbmt): handle cases where the load unit is byte, word, etc
* style(uncache): fix an import
* fix(uncache): address match should use the non-offset address when forwarding
In this case, to ensure correct forwarding, stores with the same address but overlapping masks cannot be entered at the same time.
* style(RAR): remove redundant design of `nc` reg
|
#
c7353d05 |
| 03-Sep-2024 |
Yanqin Li <[email protected]> |
feat(NCld): support WMO access for NC ld
* feat(LDU): add support for NC in LoadUnit
* feat(LQ,UB): add support for NC in load queue and uncache buffer
* chore(pbmt): add xsperf for nc ld statistic
|
#
5003e6f8 |
| 23-Jul-2024 |
Huijin Li <[email protected]> |
LSQ: optimize static clock gating coverage and fix x_value in vcs (#3176)
optimize LSQ static clock gating coverage, fix x_value in vcs
|
#
a7828dc1 |
| 12-Jun-2024 |
Tang Haojin <[email protected]> |
Revert "LSQ: optimize static clock gating coverage (#3023)" (#3055)
|
#
ff9b84b9 |
| 11-Jun-2024 |
lwd <[email protected]> |
LSQ: refactor vector load/store commit judging logic to fix X in vcs (#3048)
|
#
3b94d5d7 |
| 07-Jun-2024 |
Xuan Hu <[email protected]> |
LSQ: use RegNextWithEnable when RegEnable.next contains RegEnable.enable (#3046)
|
#
31fae68e |
| 03-Jun-2024 |
Yanqin Li <[email protected]> |
clockgate: set default initialization with 0 to fix X in vcs (#3031)
|
#
082b30d1 |
| 31-May-2024 |
Huijin Li <[email protected]> |
LSQ: optimize static clock gating coverage (#3023)
|
#
627be78b |
| 23-Apr-2024 |
good-circle <[email protected]> |
VLSU, lsq: support more than one vector pipeline
|
#
26af847e |
| 25-Mar-2024 |
good-circle <[email protected]> |
rv64v: implement lsu & lsq vector datapath
|
#
aab688f4 |
| 27-Dec-2023 |
Xuan Hu <[email protected]> |
Merge remote-tracking branch 'upstream/kunminghu' into tmp-backend-merge-master
|
#
30f5dffa |
| 18-Dec-2023 |
sfencevma <[email protected]> |
LQ: Fixed the bug that the load did not detect RAR violation (#2555)
Bug description:
LoadQueueRAR requires 2 cycles to store the paddr; when a probe request arrives in the previous cycle, `released` is not updated correctly.
Bug fix:
Add a bypass register to store the paddr temporarily.
|
#
8241cb85 |
| 17-Dec-2023 |
Xuan Hu <[email protected]> |
Merge remote-tracking branch 'upstream/master' into backendq
|
#
cd2ff98b |
| 01-Dec-2023 |
happy-lx <[email protected]> |
Rebase Timing Fix of Memblock from fix-timing branch (#2501)
* fix LQ timing
* l1pf: fix pf queue to ldu timing
* disable ecc path for timing analysis
* TODO: remove this
* fix pipeline
* memblock: add a Reg between inner/outer reset_vec
* missqueue: make mem_grant always ready
* Enable ECC path again
* remove fast replay reorder logic
* l1pf: use chosen of arbiter to improve timing
* remove reorder remain logic
* mq: use ParallelORR instead of orR
* Strengthen the conditions for load to load path for timing
* fix load to load data select for timing
* refactoring lq replay valid logic
* fix replay port
* fix load unit s0 arbiter logic
* add topdown wiring
* fix ldu ecc path
* remove lateKill
* ecc: physically remove ecc in DataArray
* loadpipe: use ParallelORR and ParallelMux for timing
* mainpipe: use ParallelMux and ParallelORR for timing
* fix fast replay is killed at s1
* fix replay cancel logic
* fix mq nack feedback logic
* sms: fix pf queue tlb req logic for timing
* kill load at s1
* fix loadqueuereplay enq logic
* opt raw rollback arbiter logic
* fix ecc_delayed writeback logic
* train all l1 pf and sms at load s3 for better timing
* disable load to load forward
* Revert "kill load at s1"
This reverts commit 56d47582ad4dd9c83373fb2db2a0709075485d4d.
* fix s0 kill logic
* ITLBRepeater: Add one more buffer when PTW resp
* remove trigger
* fix feedback_slow logic
* add latch in uncachebuffer rollback
* remove trigger in port
* fast replay: use dcache ready
* fix replay logic at s1
* uncache: fix uncache writeback
* fix delay kill logic
* fix clean exception logic at s3
* fix ldu rollback logic
* fix ldu rollback valid logic
---------
Co-authored-by: sfencevma <[email protected]>
Co-authored-by: XiChen <[email protected]>
Co-authored-by: Lyn <[email protected]>
Co-authored-by: good-circle <[email protected]>
|
#
4b0d80d8 |
| 11-Oct-2023 |
Xuan Hu <[email protected]> |
Merge upstream/master into tmp-backend-merge-master
|
#
8891a219 |
| 08-Oct-2023 |
Yinan Xu <[email protected]> |
Bump rocket-chip (#2353)
|
#
f275998a |
| 09-Aug-2023 |
sfencevma <[email protected]> |
MemBlock: fix timing (#2223)
* fix probe_ttob_check_resp timing
* move probe_ttob check to mainpipe s2, get resp in s3
* fix main_pipe_req timing
* remove fastarbiter
* fix prefetcher timing
* remove select invalid way first
* MemBlock: fix timing
* add redirectCancelCount
* correct canAccept
* fix loadQueueReplay select timing
* rename sleepIndex
* rename selectIndexOH
---------
Co-authored-by: lixin <[email protected]>
|