History log of /XiangShan/src/main/scala/xiangshan/mem/lsqueue/LoadQueueRAR.scala (Results 1 – 25 of 31)
Revision Date Author Comments
# 522c7f99 07-Mar-2025 Anzo <[email protected]>

fix(LSU): misaligned violation detection stuck (#4369)

Since a load instruction that crosses a 16-byte boundary must be split and
accessed twice, it enters the `RAR Queue` twice but occupies only one
`virtual load queue` entry. In the extreme case, 36 load instructions that
span a 16-byte boundary can therefore fill all 72 `RAR Queue` entries.
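
The occupancy arithmetic above can be sketched in plain Scala. This is only an illustrative model, not the hardware logic: the object and function names are hypothetical, and the queue size of 72 is taken from the figure quoted in this message.

```scala
object RarOccupancy {
  // Sizes assumed from the commit message: 72 RAR entries,
  // two entries per load that crosses a 16-byte boundary.
  val rarQueueSize = 72
  val entriesPerSplitLoad = 2

  /** True if an access of `size` bytes starting at `addr` crosses a 16-byte boundary. */
  def crosses16B(addr: Long, size: Int): Boolean = {
    require(size > 0, "size must be positive")
    (addr >> 4) != ((addr + size - 1) >> 4)
  }

  /** Number of RAR entries a single load needs in this model. */
  def rarEntries(addr: Long, size: Int): Int =
    if (crosses16B(addr, size)) entriesPerSplitLoad else 1

  /** How many boundary-crossing loads it takes to fill the whole RAR queue. */
  val splitLoadsToFill: Int = rarQueueSize / entriesPerSplitLoad
}
```

For example, an 8-byte load at address `0x1c` covers bytes `0x1c` to `0x23`, so it crosses the boundary at `0x20` and needs two entries in this model.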

---

Our previous handling had a problem: if the oldest load instruction
spanning a 16-byte boundary enters the `replayqueue` while an instruction
in the `loadmisalignbuffer` cannot finish executing because the
`RAR Queue` is full, then the oldest load instruction can never be issued,
because the `loadmisalignbuffer` always has instructions in it.

---

Therefore, we use a more aggressive scheme.
When the `RAR Queue` is full, we let the misaligned load generate a rollback,
and the next load instruction that the `loadmisalignbuffer` accepts
must be the oldest one (if it is misaligned).


# 90f8d3cf 06-Mar-2025 cz4e <[email protected]>

fix(LoadUnit): exclude prefetch requests (#4367)

* To meet timing, the RAR enqueue conditions have to be relaxed; the
worst timing paths come from `pmp` and `missQueue`.

* If `LoadQueueRARSize` == `VirtualLoadQueueSize`, only prefetch requests
need to be excluded.

* If `LoadQueueRARSize` < `VirtualLoadQueueSize`, `s2_can_query` also
needs to be taken into account.
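
The two cases above can be read as a simple predicate. The sketch below is a plain-Scala model under stated assumptions, not the actual `LoadUnit` code: the function name and parameters are hypothetical, and `canQuery` stands in for `s2_can_query`, whose exact definition is not given in this message.

```scala
object RarEnqueue {
  /** Hypothetical model of the RAR enqueue decision described above.
    *
    * @param isPrefetch the request is a prefetch (never enqueued)
    * @param canQuery   stand-in for `s2_can_query` (assumed meaning)
    * @param rarSize    stand-in for `LoadQueueRARSize`
    * @param vlqSize    stand-in for `VirtualLoadQueueSize`
    */
  def shouldEnqueue(isPrefetch: Boolean, canQuery: Boolean,
                    rarSize: Int, vlqSize: Int): Boolean = {
    require(rarSize <= vlqSize, "only the == and < cases are described")
    if (rarSize == vlqSize) !isPrefetch   // equal sizes: only exclude prefetches
    else !isPrefetch && canQuery          // smaller RAR: also consult canQuery
  }
}
```

With equal sizes the result does not depend on `canQuery` at all, which matches the message's statement that only prefetching needs to be excluded in that case.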


# 0d55e1db 28-Feb-2025 cz4e <[email protected]>

timing(LoadQueueRAR, LoadUnit): adjust rar/raw query logic (#4297)

* Because `LoadQueueRARSize == VirtualLoadQueueSize`, no additional logic
is needed for RAR enqueue.
* When fast replay is not needed, the load unit allocates a RAW entry.


# 3c808de0 17-Feb-2025 Anzo <[email protected]>

fix(LSU): fix cbo instr exceptions and implementation (#4262)

1. Fix a typo.
2. `cbo` instructions do not produce misaligned exceptions.
3. `cbo zero` needs to flush the `sbuffer`.
4. `cbo zero` sets the mask correctly.
5. Add RAW checks to `cbo zero`.
6. Add trigger (Debug Mode) checks to `cbo zero`.
7. Fix several issues with the CBO instruction in NEMU.
----

To avoid ambiguity with `io.mmioStout`, a new `StoreQueue` port is
introduced to write back `cbo zero` after the sbuffer has been flushed.
Arbitration is performed in `MemBlock`, and `cbo zero` currently has
higher priority by default.
`cbo zero` should not be written back at the same time as `mmio`.

---
A check on `CacheLine` has been added to `RAWQueue` to ensure memory
consistency when executing `cbo zero`.
See this issue for details: https://github.com/OpenXiangShan/XiangShan/issues/4240

---
The `cbo` instruction requires a trigger check.

---------

Co-authored-by: zhanglinjuan <[email protected]>


# 9e12e8ed 08-Feb-2025 cz4e <[email protected]>

style(Bundles): move bundles to Bundles.scala (#4247)


# 9b12a106 25-Dec-2024 Anzo <[email protected]>

area(LoadQueue): remove useless regs (#4062)

The additional release logic for vector loads in the `RAR/RAW Queue`
appears unnecessary; it causes the `RAR/RAW Queue` to store redundant
`regs` for `uopidx`.


# 8b33cd30 13-Dec-2024 klin02 <[email protected]>

feat(XSLog): move all XSLog outside WhenContext for collection

Data in a WhenContext is not accessible from another module. To support
XSLog collection, we move all XSLog calls and related signals outside
WhenContext. For example, `when(cond1){XSDebug(cond2, pable)}` becomes
`XSDebug(cond1 && cond2, pable)`.
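
The equivalence behind this rewrite can be checked with a small plain-Scala model. It is only a sketch: it does not use Chisel or the real `XSDebug`, the `when` block is modeled as an ordinary `if`, and the object and function names are hypothetical.

```scala
object HoistedLog {
  /** Nested form: the log call sits inside the guarded region
    * (models `when(cond1){ XSDebug(cond2, msg) }`). */
  def nested(cond1: Boolean, cond2: Boolean, msg: String): Option[String] =
    if (cond1) {
      if (cond2) Some(msg) else None
    } else None

  /** Hoisted form: the guard is folded into the log condition
    * (models `XSDebug(cond1 && cond2, msg)`). */
  def hoisted(cond1: Boolean, cond2: Boolean, msg: String): Option[String] =
    if (cond1 && cond2) Some(msg) else None
}
```

Both forms emit the message for exactly the same inputs, so folding the guard into the log condition preserves what gets logged while leaving the call outside the guarded block.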


# 549073a0 10-Dec-2024 cz4e <[email protected]>

area(Lsq): compress rar/raw paddr and remove sq useless regs (#3976)

* LoadQueueRAR uses a PAddr hash function, 16 bits in total:
![vaddr_compress](https://github.com/user-attachments/assets/6b87fb4d-7080-4b59-bf20-0e0f991ab141)
* LoadQueueRAW uses PAddr[29:6], 24 bits in total
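
As a rough illustration of the RAW case only, the `PAddr[29:6]` slice can be modeled in plain Scala as below. The helper names are hypothetical, and the RAR hash function is not reproduced here because it is only given in the linked image.

```scala
object PAddrSlice {
  /** Extract bits [hi:lo] (inclusive) of `value`. */
  def bits(value: Long, hi: Int, lo: Int): Long = {
    require(hi >= lo && lo >= 0 && hi < 63, "invalid bit range")
    val width = hi - lo + 1
    (value >>> lo) & ((1L << width) - 1)
  }

  /** Compressed address in this model: PAddr[29:6], i.e. 29 - 6 + 1 = 24 bits. */
  def rawTag(paddr: Long): Long = bits(paddr, 29, 6)
}
```

In this model, addresses that differ only in bits 5:0 or only above bit 29 map to the same 24-bit value, which is the cost of storing fewer address bits.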


# e10e20c6 27-Nov-2024 Yanqin Li <[email protected]>

style(pbmt): remove the useless and standardize code

* style(pbmt): remove outstanding constant which is just for self-test

* fix(uncache): added mask comparison for `addrMatch`

* style(mem): code normalization

* fix(pbmt): handle cases where the load unit is byte, word, etc

* style(uncache): fix an import

* fix(uncache): address match should use non-offset address when forwarding

In this case, to ensure correct forwarding, stores with the same address but overlapping masks cannot be entered at the same time.

* style(RAR): remove redundant design of `nc` reg


# c7353d05 03-Sep-2024 Yanqin Li <[email protected]>

feat(NCld): support WMO access for NC ld

* feat(LDU): add support for NC in LoadUnit

* feat(LQ,UB): add support for NC in load queue and uncache buffer

* chore(pbmt): add xsperf for nc ld statistic


# 5003e6f8 23-Jul-2024 Huijin Li <[email protected]>

LSQ: optimize static clock gating coverage and fix x_value in vcs (#3176)

optimize LSQ static clock gating coverage, fix x_value in vcs


# a7828dc1 12-Jun-2024 Tang Haojin <[email protected]>

Revert "LSQ: optimize static clock gating coverage (#3023)" (#3055)


# ff9b84b9 11-Jun-2024 lwd <[email protected]>

LSQ: refactor vector load/store commit judging logic to fix X in vcs (#3048)


# 3b94d5d7 07-Jun-2024 Xuan Hu <[email protected]>

LSQ: use RegNextWithEnable when RegEnable.next contains RegEnable.enable (#3046)


# 31fae68e 03-Jun-2024 Yanqin Li <[email protected]>

clockgate: set default initialization with 0 to fix X in vcs (#3031)


# 082b30d1 31-May-2024 Huijin Li <[email protected]>

LSQ: optimize static clock gating coverage (#3023)


# 627be78b 23-Apr-2024 good-circle <[email protected]>

VLSU, lsq: support more than one vector pipeline


# 26af847e 25-Mar-2024 good-circle <[email protected]>

rv64v: implement lsu & lsq vector datapath


# aab688f4 27-Dec-2023 Xuan Hu <[email protected]>

Merge remote-tracking branch 'upstream/kunminghu' into tmp-backend-merge-master


# 30f5dffa 18-Dec-2023 sfencevma <[email protected]>

LQ: Fixed the bug that the load did not detect RAR violation (#2555)

Bug description:
LoadQueueRAR requires 2 cycles to store paddr; when a probe request arrives in the previous cycle, `released` is not updated correctly.

Bug fix:
Add a bypass register to store paddr temporarily.


# 8241cb85 17-Dec-2023 Xuan Hu <[email protected]>

Merge remote-tracking branch 'upstream/master' into backendq


# cd2ff98b 01-Dec-2023 happy-lx <[email protected]>

Rebase Timing Fix of Memblock from fix-timing branch (#2501)

* fix LQ timing

* l1pf: fix pf queue to ldu timing

* disable ecc path for timing analysis

* TODO: remove this

* fix pipeline

* memblock: add a Reg between inner/outer reset_vec

* missqueue: make mem_grant always ready

* Enable ECC path again

* remove fast replay reorder logic

* l1pf: use chosen of arbiter to improve timing

* remove reorder remain logic

* mq: use ParallelORR instead of orR

* Strengthen the conditions for load to load path for timing

* fix load to load data select for timing

* refactoring lq replay valid logic

* fix replay port

* fix load unit s0 arbiter logic

* add topdown wiring

* fix ldu ecc path

* remove lateKill

* ecc: physically remove ecc in DataArray

* loadpipe: use ParallelORR and ParallelMux for timing

* mainpipe: use ParallelMux and ParallelORR for timing

* fix fast replay is killed at s1

* fix replay cancel logic

* fix mq nack feedback logic

* sms: fix pf queue tlb req logic for timing

* kill load at s1

* fix loadqueuereplay enq logic

* opt raw rollback arbiter logic

* fix ecc_delayed writeback logic

* train all l1 pf and sms at load s3 for better timing

* disable load to load forward

* Revert "kill load at s1"

This reverts commit 56d47582ad4dd9c83373fb2db2a0709075485d4d.

* fix s0 kill logic

* ITLBRepeater: Add one more buffer when PTW resp

* remove trigger

* fix feedback_slow logic

* add latch in uncachebuffer rollback

* remove trigger in port

* fast replay: use dcache ready

* fix replay logic at s1

* uncache: fix uncache writeback

* fix delay kill logic

* fix clean exception logic at s3

* fix ldu rollback logic

* fix ldu rollback valid logic

---------

Co-authored-by: sfencevma <[email protected]>
Co-authored-by: XiChen <[email protected]>
Co-authored-by: Lyn <[email protected]>
Co-authored-by: good-circle <[email protected]>


# 4b0d80d8 11-Oct-2023 Xuan Hu <[email protected]>

Merge upstream/master into tmp-backend-merge-master


# 8891a219 08-Oct-2023 Yinan Xu <[email protected]>

Bump rocket-chip (#2353)


# f275998a 09-Aug-2023 sfencevma <[email protected]>

MemBlock: fix timing (#2223)

* fix probe_ttob_check_resp timing

* move probe_ttb check to mainpipe s2, get resp in s3

* fix main_pipe_req timing

* remove fastarbiter

* fix prefetcher timing

* remove select invalid way first

* MemBlock: fix timing

* add redirectCancelCount

* correct canAccept

* fix loadQueueReplay select timing

* rename sleepIndex

* rename selectIndexOH

---------

Co-authored-by: lixin <[email protected]>
