History log of /XiangShan/src/main/scala/xiangshan/cache/dcache/loadpipe/LoadPipe.scala (Results 1 – 25 of 75)
Revision Date Author Comments
# 51f9a957 21-Feb-2025 cz4e <[email protected]>

style(LoadPipe): use `miss_req.bits.cancel` instead of `mq_enq_cancel` (#4296)


# fa5e530d 21-Jan-2025 cz4e <[email protected]>

timing(VSegmentUnit): duplicate latchVAddr (#4209)

* `latchVAddr` needs to index all dcache data SRAMs from top to bottom,
which causes a large fanout, so duplicate `latchVAddr`


# fb49ae6b 25-Dec-2024 cz4e <[email protected]>

fix(LoadPipe): enable s1 permission check for prefetch_w (#4056)


# 8b33cd30 13-Dec-2024 klin02 <[email protected]>

feat(XSLog): move all XSLog outside WhenContext for collection

Data inside a WhenContext is not accessible from another module. To support
XSLog collection, we move all XSLog calls and related signals outside
WhenContext. For example, `when(cond1){XSDebug(cond2, pable)}` becomes
`XSDebug(cond1 && cond2, pable)`.


# 72dab974 16-Dec-2024 cz4e <[email protected]>

feat(CtrlUnit, DCache): support L1 DCache RAS (#4009)

# L1 DCache RAS extension support

The L1 DCache supports part of the Reliability, Availability, and
Serviceability (RAS) Extension.
* L1 DCache protection with Single Error Correct Double Error Detect
(SECDED) ECC on the RAMs. This includes the L1 DCache tag and data RAMs.
Erroneous tags or data are not recovered.
* Fault Handling Interrupt (Bus Error Unit Interrupt, BEU, 65)
* Error injection

## ECC Error Detect
An error might be triggered when accessing the L1 DCache.
* **Error Report**:
  * Tag ECC Error: As soon as an ECC error occurs on any path, a tag ECC
    error is considered to have occurred.
  * Data ECC Error: If an ECC error occurs in the hit line, a data ECC
    error is considered to have occurred. If the access does not hit, the
    error is not processed.
  * If an instruction access triggers an ECC error, it is treated as a
    hardware error and an exception is reported.
  * Whenever there is an error in starting, an error message needs to
    be sent to the BEU.
  * When the hardware detects an error, it reports it to the BEU and
    triggers the NMI external interrupt (65).

* **Load instruction**:
  * Only ECC errors of tags or data will be triggered during execution;
    the errors are reported to the BEU and a `Hardware Error` is reported.

* **Probe/Snoop**:
  * If a tag ECC error occurs, there is no need to change the cache
    status, and a `ProbeAck` with `corrupt=1` needs to be returned to L2.
  * If a data ECC error occurs, change the cache status according to
    the rules. If data needs to be returned, `ProbeAckData` with
    `corrupt=1` needs to be returned to L2.

* **Replace/Evict**:
  * `ReleaseData` with `corrupt=1` needs to be returned to L2.

* **Store to L1 DCache**:
  * If a tag ECC error occurs, the cacheline is released according to the
    `Replace/Evict` process and the data is written to the L1 DCache
    without reporting errors to L2.
  * If a data ECC error occurs, the data is written directly without
    reporting the error to L2.

* **Atomics**:
  * Report `Hardware Error`; do not report errors to L2.
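
The error-report rules above can be sketched as a small software model. This is only an illustration, not the RTL: it assumes that "path" means a cache way, that each way carries one tag-error flag and one data-error flag, and the names `EccReport` and `detect_ecc_error` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EccReport:
    tag_error: bool
    data_error: bool

    @property
    def report_to_beu(self) -> bool:
        # Any detected error is reported to the BEU.
        return self.tag_error or self.data_error


def detect_ecc_error(tag_err: List[bool],
                     data_err: List[bool],
                     hit: List[bool]) -> EccReport:
    """Apply the report rules to the per-way flags of one cache set.

    Assumption: index i of each list describes way i.
    """
    if not (len(tag_err) == len(data_err) == len(hit)):
        raise ValueError("per-way lists must have the same length")
    # Tag rule: an error on any way counts, hit or not.
    tag_error = any(tag_err)
    # Data rule: only an error in the hit line counts.
    data_error = any(h and d for h, d in zip(hit, data_err))
    return EccReport(tag_error, data_error)
```

A data error on a way that does not hit is ignored, while a tag error on any way is always reported.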

## Error Inject
Each core's L1 DCache is configured with a controller driven by
memory-mapped registers, and each hardware unit that supports ECC is
configured with a control bank. After the bank registers are configured,
the L1 DCache will trigger an ECC error on the first access to the L1 DCache.
<div style="text-align: center;">
<img
src="https://github.com/user-attachments/assets/8c4d23c5-0324-4e52-bcf4-29b47a282d72"
alt="err_inject" width="200" />
</div>

### Address Space
The address space is `0x38022000`-`0x3802207F`, a total of 128 bytes.
This space is local to each hart.
<div style="text-align: center;">
<img width="292" alt="ctl_bank"
src="https://github.com/user-attachments/assets/89f88b24-37a4-4786-a192-401759eb95cf">
</div>

### L1 DCache Control Bank
Each Control Bank contains the registers `ECCCTL`, `ECCEID`, and `ECCMASK`;
each register is 8 bytes.
<img width="414" alt="eccctl"
src="https://github.com/user-attachments/assets/b22ff437-d05d-4b3c-a353-dbea1afdc156">
* ECCCTL (ECC Control): ECC injection control register.
  * `ese` (error signaling enable): Indicates that the injection is valid;
    initialized to 0. When the injection succeeds and `pst==0`, `ese` is
    cleared.
  * `pst` (persist): Inject continuously. When `pst==1`, after the `ECCEID`
    counter decreases to 0 and the injection succeeds, the injection timer
    is restored to the last set `ECCEID` value and the error is injected
    again; when `pst==0`, the error is injected only once.
  * `ede` (error delay enable): Indicates that the counter is valid;
    initialized to 0. If
    * `ese==1` and `ede==0`, error injection takes effect immediately.
    * `ese==1` and `ede==1`, injection takes effect only after `ECCEID`
      decrements to 0.
  * `cmp` (component): Injection target, initialized to 0.
    * 1'b0: The injection target is the tag.
    * 1'b1: The injection target is the data.
  * `bank`: Bank valid signal, initialized to 0. When a bit in `bank` is
    set, the corresponding mask is valid.

<img width="414" alt="ecceid"
src="https://github.com/user-attachments/assets/8cea0d8d-2540-44b1-b1f9-c1ed6ec5341e">

* ECCEID (ECC Error Inject Delay): ECC injection delay controller.
  * When `ese==1` and `ede==1`, it starts to decrease until it reaches 0.
    Currently it is clocked at the core frequency, which can also be
    divided. Since ECC injection relies on an L1 DCache access, the time
    at which `EID` expires and the time at which the ECC error is
    triggered may not coincide.

<img width="414" alt="eccmask"
src="https://github.com/user-attachments/assets/b1be83fd-17a6-4324-8aa6-45858249c476">

* ECCMASK (ECC Mask): ECC injection mask register.
  * 0 means no inversion, 1 means flip.
    Tag injection only uses the bits in `ECCMASK0` corresponding to
    the tag length.
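
The interplay of `ese`, `pst`, `ede`, `ECCEID` and `ECCMASK` can be sketched as a behavioural model. This is a simplified illustration under stated assumptions, not the hardware: the register bit layout is not modelled, the `cmp` and `bank` fields are left out, and the class `EccInjector` and its methods are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class EccInjector:
    ese: bool = False   # error signaling enable
    pst: bool = False   # persist: re-arm after each injection
    ede: bool = False   # error delay enable
    eid: int = 0        # last value written to ECCEID
    mask: int = 0       # ECCMASK: bits set to 1 are flipped
    counter: int = 0    # running delay counter (assumed internal state)

    def write_eid(self, value: int) -> None:
        """Model a store to ECCEID: remember the value and load the counter."""
        self.eid = value
        self.counter = value

    def tick(self) -> None:
        """One clock cycle: count down while injection and delay are enabled."""
        if self.ese and self.ede and self.counter > 0:
            self.counter -= 1

    def armed(self) -> bool:
        """Injection is effective immediately, or once the delay has expired."""
        return self.ese and (not self.ede or self.counter == 0)

    def access(self, word: int) -> int:
        """Model one DCache access; flip the masked bits if injection fires."""
        if not self.armed():
            return word
        if self.pst:
            self.counter = self.eid   # restore the timer for the next injection
        else:
            self.ese = False          # one-shot: clear ese after success
        return word ^ self.mask
```

Because injection only happens on an access, the error can appear later than the moment the counter reaches 0, which matches the remark about `EID` timing above.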

### Error Inject Example
```
# set control bank base address
li x3, $(BASEADDR)

# set eid
li x5, 500 # delay 500 cycles
sd x5, 8(x3) # mmio store

# set mask
li x5, 0x1 # flip bit 0
sd x5, 16(x3) # mmio store

# set ctl
li x5, 0x7 # cmp = 0, ede = 1, pst = 1, ese = 1
sd x5, 0(x3) # mmio store
```


# 8ffb12e4 13-Dec-2024 Anzo <[email protected]>

fix(bank_conflict): Selecting the oldest Load causes a conflict (#4036)

This modification changes `load bank conflict` arbitration from the default
fixed priority [0 1 2] to a policy in which the oldest Load does not get a
`bank conflict`.

In the following, `Load 0` refers to `LoadUnit 0`.

For example, before:
Load 0 lqidx 5
Load 1 lqidx 3
Load 2 lqidx 8
Assuming that three Loads have `bank conflict`, then we will default to
making Load1 and Load2 have `bank conflict` so that they can be
replayed.

---

However, this may lead to deadlocks in some cases.
For example:
Load 0 robidx 7
Store 0 robidx 6
Load 1 robidx 5

`Store 0` depends on `Load 1` for data, while `Load 0` depends
on `Store 0` for data, and `Load 0` and `Load 1` have a `bank
conflict`.
In this case, `Load 1` will `Replay` because of the `bank conflict` and
`Load 0` will `Replay` because of a `forward fault` (caused by misalignment).

---

With the modification, the oldest Load never gets a `bank conflict`,
which avoids this deadlock.
**Note: This may introduce performance fluctuations (up or down).**
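
The two arbitration policies can be compared with a small model. It is a sketch under simplifying assumptions: all listed loads hit the same bank, age is a plain integer where smaller means older (real queue pointers wrap around and need a wrap flag), and both function names are hypothetical.

```python
from typing import List


def conflicts_fixed_priority(age_idx: List[int]) -> List[bool]:
    """Old policy: LoadUnit 0 always wins; every other load is replayed."""
    return [port != 0 for port in range(len(age_idx))]


def conflicts_oldest_first(age_idx: List[int]) -> List[bool]:
    """New policy: the oldest load wins; every other load is replayed.

    age_idx[p] is the age index (e.g. lqidx) of the load on LoadUnit p.
    Ties go to the lowest-numbered LoadUnit.
    """
    if not age_idx:
        return []
    oldest = min(range(len(age_idx)), key=lambda p: age_idx[p])
    return [port != oldest for port in range(len(age_idx))]


# The example from the commit message: lqidx 5, 3, 8 on Load 0, 1, 2.
before = conflicts_fixed_priority([5, 3, 8])
after = conflicts_oldest_first([5, 3, 8])
```

With the values from the example, the old policy replays Load 1 and Load 2, while the new policy lets Load 1 (the oldest) proceed and replays Load 0 and Load 2.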


# 6a539f6d 29-Nov-2024 Anzooooo <[email protected]>

feat(LoadPipe): let 128bitReq be accessed at 128-bit aligned granularity


# e718f875 27-Nov-2024 Anzo <[email protected]>

refactor(LoadPipe): remove the redundant logic of the `mq_nack` (#3936)

Remove the redundant miss queue nack logic, based on the problem raised in
issue https://github.com/OpenXiangShan/XiangShan/issues/3916.

When a tlb miss and a dcache miss occur at the same time and the miss
queue nacks, the `LoadUnit` generates both replay signals
`C_TM` and `C_DR`. We give priority to `C_TM`, which is why we need
to send a kill signal to dcache when a tlb miss occurs.

Although this caused no problem before, as the
issue (https://github.com/OpenXiangShan/XiangShan/issues/3916) says, it
is ambiguous, and the miss queue nack message is already
included in `s2_nack_no_mshr`, so we remove the
`s2_miss_req_fire` signal from the generation logic of the `s2_mq_nack`
signal.


# 92bcee1c 22-Nov-2024 cz4e <[email protected]>

timing(DCache): delay tag error to s3 instead of s2 (#3908)

* break hitVec -> ldu tag_error -> loadunit path


# a5f58fbc 29-Sep-2024 lixin <[email protected]>

timing(dataArray): separate bankedDataRead kill

Do not let banked_read_valid include kill to improve the timing of reading sram.
Later, use kill to determine bankConflict in load s2.

fix(BankedDataArray): remove kill logic when generate rr_bank_conflict

data_bank selects the read address based on the priority of the valid signal.
When multiple read requests cause a bank conflict and the high-priority
request needs to be killed, the data read by the low-priority LoadUnit will be overwritten.


# 4cc68b84 26-Sep-2024 sfencevma <[email protected]>

timing(LoadPipe): remove permission and coh check when generate hit logic


# 44f2941b 24-Sep-2024 Jiru Sun <[email protected]>

refactor(HPM): move HPMs from utils to utility repo (#3631)

Because HPMs will be used in CoupledL2 as well, delete
`PerfCounterUtils.scala` in XiangShan and create
`HardwarePerfMonitor.scala` in Utility.
See also [Pull Request in
CoupledL2](https://github.com/OpenXiangShan/CoupledL2/pull/251#discussion_r1770738535).


# 08b0bc30 03-Sep-2024 happy-lx <[email protected]>

timing(MemBlock): optimize MemBlock timing (#3467)

This PR optimizes the timing of MemBlock. Specific optimizations include
but are not limited to:
+ TLB use the redirect for the next cycle
+ Optimize VLSU feedback and redirect
+ Optimise ldCancel and writeback signal generation
+ Optimise TLB Query Vaddr/hlv/hlvx/valid etc
+ Delay MMIO Store writeback for 1 Cycle
+ Fix tlbNoQuery and pmp logic
+ Remove clock gating for s3_fast_rep
+ Remove wbq conflict check to LoadPipe/MainPipe
+ Remove Mux in dcache resp data
+ Optimise data generation logic of LoadUnit
+ Duplicate Register in LoadUnit for data writeback
+ Duplicate Register in loadPipe for missQueue enq
+ Add skid buffer in VLSU
+ Select data from metaArray at S1
+ Simplify the enqueuing logic of missQueue
+ Separately generate the ready logic of miss Queue
+ Relax the conditions valid for bankdataArray reads
+ Add Reg between Dcache Mainpipe with sms prefetcher
+ Optimise store exceptionBuffer pipeline

---------

Co-authored-by: weiding liu <[email protected]>
Co-authored-by: Charlie Liu <[email protected]>
Co-authored-by: good-circle <[email protected]>


# 6070f1e9 03-Sep-2024 happy-lx <[email protected]>

fix(L1PF): fix good_prefetch Counting logic (#3474)

Previous design:
When a demand load hits a cache block fetched by the prefetcher, the
`PrefetchSource` of this block is cleared. When the block is later
replaced from the cache, it is therefore mistaken for a block that was not
fetched by the prefetcher, and the `good_prefetch` counter is not increased.

Fix:
Add a new cache block status (`L1_HW_PREFETCH_CLEAR`) indicating that
this block was originally fetched by the prefetcher.
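
The effect of the extra status can be illustrated with a toy model. It rests on assumptions that the commit message does not spell out: `good_prefetch` is counted when a block is replaced, and a block counts as a good prefetch if it was prefetched and then hit by a demand load. The names `PfSource`, `Line` and `good_prefetch_on_evict` are hypothetical.

```python
from enum import Enum, auto


class PfSource(Enum):
    NONE = auto()               # not fetched by the prefetcher
    HW_PREFETCH = auto()        # prefetched, not yet hit by a demand load
    HW_PREFETCH_CLEAR = auto()  # prefetched, then hit by a demand load


class Line:
    def __init__(self, prefetched: bool) -> None:
        self.source = PfSource.HW_PREFETCH if prefetched else PfSource.NONE

    def demand_hit(self, fixed: bool = True) -> None:
        """A demand load hits this block.

        fixed=False models the previous design, which simply cleared
        the prefetch source and lost the block's origin.
        """
        if self.source is PfSource.HW_PREFETCH:
            self.source = PfSource.HW_PREFETCH_CLEAR if fixed else PfSource.NONE


def good_prefetch_on_evict(line: Line) -> int:
    """Return the counter increment when the block is replaced."""
    return 1 if line.source is PfSource.HW_PREFETCH_CLEAR else 0
```

In the previous design the block looks like an ordinary demand-fetched block at replacement time, so the increment is 0. With the new status the origin survives the demand hit and the increment is 1.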


# d4564868 17-Jul-2024 weiding liu <[email protected]>

Dcache: refactor dcache's read data delay for better port timing


# bb2f3f51 12-Jul-2024 Tang Haojin <[email protected]>

perf: use perfUtils in `Utility` (#3190)

Currently, log and perf utilities such as `XSPerfAccumulate` are
implemented in many repositories like XiangShan, CoupledL2 and HuanCun.
This PR unifies them and puts them in the Utility repository.


# 344cf5d5 27-Jun-2024 CharlieLiu <[email protected]>

DCache: Remove redundant nack_data from mq_nack (#3110)

Remove redundant s2_nack_data from s2_mq_nack


# 10deab87 28-May-2024 good-circle <[email protected]>

Dcache: data read valid should not rely on tag hit


# 31d5a9c4 09-Jan-2024 sfencevma <[email protected]>

ECC: add enable option for ecc


# 5adc4829 16-Jun-2024 Yanqin Li <[email protected]>

memblock: add rest clockgate of reg (#3017)

Co-authored-by: cai luoshan <[email protected]>
Co-authored-by: Cai Luoshan <[email protected]>
Co-authored-by: good-circle <[email protected]>
Co-authored-by: Ma-YX <[email protected]>
Co-authored-by: Ma-YX <[email protected]>
Co-authored-by: CharlieLiu <[email protected]>


# 0184a80e 15-Jun-2024 Yanqin Li <[email protected]>

L1CacheErrorInfo: code refactor for correct and convenient clockgate (#3044)


# c2bbba9f 21-May-2024 CharlieLiu <[email protected]>

DCache: Remove useless data_read when miss in LoadPipe (#2990)

- Remove useless data_read when DCache miss in LoadPipe
- Fix req priority in DCache MainPipe


# 20e09ab1 09-May-2024 happy-lx <[email protected]>

fix bug of stream (#2756)

Bug Description:
(1) Increase the number of DCache ways to 8 to mitigate the problem in the bwaves test caused by too many addresses mapping to the same set.
(2) Set ldu0 to a high-confidence prefetch request channel to increase the probability that the prefetch request will be accepted by the DCache's MSHR.
(3) Fix the issue where ldu sends an erroneous ready back to the prefetcher, so that prefetch requests are not dropped.
(4) Don't let prefetch requests access the DCache's DataArray.
(5) Add an extra port in the Multi-level prefetch Queue to accept more prefetch requests from stream & stride.
(6) Enlarge the Stream bit vector array from 16 to 32 to cover multi-stream access patterns in bwaves and GemsFDTD.

In addition, the decline in libquantum is a bit strange.


# ffd3154d 25-Apr-2024 CharlieLiu <[email protected]>

DCache: New feature evict on refill (#2919)

- Remove module RefillPipe, move DCache replacer access/update to
MainPipe.
- Using l2_hint as an early wake-up signal for MSHR.

---------

Co-authored-by: YukunXue <[email protected]>
Co-authored-by: Tang Haojin <[email protected]>
Co-authored-by: ssszwic <[email protected]>
Co-authored-by: Kunlin You <[email protected]>


# ec86549e 02-Jan-2024 sfencevma <[email protected]>

MemBlock: enable 3ld3st (#2524)

* enable 3ld3st

* assign enqLsq

* fix IssQueSize

* remove performance regression

* MMU: Fix ptwrepeater when 3ld + 3st

* fix minimal config params

* fix minimal config LoadQueueReplaySize

* add 3ld3st switch

* fix bank conflict valid logic

* fix strict memory ambiguous logic

* fix wakeup logic

* disable 3ld3st by default

* modify minimal config params

---------

Co-authored-by: Lyn <[email protected]>
Co-authored-by: good-circle <[email protected]>
