History log of /XiangShan/src/main/scala/xiangshan/cache/dcache/mainpipe/WritebackQueue.scala (Results 1 – 25 of 34)
Revision Date Author Comments
# 8b33cd30 13-Dec-2024 klin02 <[email protected]>

feat(XSLog): move all XSLog outside WhenContext for collection

As data in WhenContext is not acessible in another module. To support
XSLog collection, we move all XSLog and related signal outside
Wh

feat(XSLog): move all XSLog outside WhenContext for collection

As data in WhenContext is not acessible in another module. To support
XSLog collection, we move all XSLog and related signal outside
WhenContext. For example, when(cond1){XSDebug(cond2, pable)} to
XSDebug(cond1 && cond2, pable)

show more ...


# 72dab974 16-Dec-2024 cz4e <[email protected]>

feat(CtrlUnit, DCache): support L1 DCache RAS (#4009)

# L1 DCache RAS extension support

The L1 DCache supports the part of Reliability, Availability, and
Serviceability (RAS) Extension.
* L1 DCache

feat(CtrlUnit, DCache): support L1 DCache RAS (#4009)

# L1 DCache RAS extension support

The L1 DCache supports the part of Reliability, Availability, and
Serviceability (RAS) Extension.
* L1 DCache protection with Single Error Correct Double Error Detect
(SECDED) ECC on the RAMs. This includes the L1 DChace tag and data RAMs.
Not recovery error tag or data.
* Fault Handling Interrupt (Bus Error Unit Interrupt,BEU, 65)
* Error inject

## ECC Error Detect
An error might be triggered, when access L1 DCache.
* **Error Report**:
* Tag ECC Error: As long as an ECC error occurs on a certain path, it
is judged that an ECC error has occurred.
* Data ECC Error: If an ECC error occurs in the hit line, it is
considered
that an ECC error has occurred. If it does not hit, it will not be
processed.
* If an instruction access triggers an ECC error, a Hardware error is
considered and an exception is reported.
* Whenever there is an error in starting, an error message needs to
be sent to BEU.
* When the hardware detects an error, it reports it to the BEU and
triggers the NMI external interrupt(65).

* **Load instruction**:
* Only ECC errors of tags or data will be triggered during execution,
and the errors will be reported to the BEU and a `Hardware Error`
will be reported.

* **Probe/Snoop**:
* If a tag ecc error occurs, there is no need to change the cache
status,
and a `ProbeAck` with `corrupt=1` needs to be returned to l2.
* If a data ecc error occurs, change the cache status according to
the rules. If data needs to be returned, `ProbeAckData` with `corrupt=1`
needs to be returned to l2.

* **Replace/Evict**:
* `ReleaseData` with `corrupt=1` needs to be returned to l2.

* **Store to L1 DCache**:
* If a tag ecc error occurs, the cacheline is released according to the
`Repalce/Evict` process and the data is written to L1 DCache without
reporting errors to l2.
* If a data ecc error occurs, the data is written directly without
reporting
the error to l2.

* **Atomics**:
* report `Hardware Error`, do not report errors to l2.

## Error Inject
Each core's L1 DCache is configured with a memory map
register-controlled
controller, and each hardware unit that supports ECC is configured with
a
control bank. After the Bank register configuration is completed, L1
DCache
will trigger an ecc error for the first access L1 DCache.
<div style="text-align: center;">
<img
src="https://github.com/user-attachments/assets/8c4d23c5-0324-4e52-bcf4-29b47a282d72"
alt="err_inject" width="200" />
</div>

### Address Space
Address space `0x38022000`-`0x3802207F`, a total of 128 bytes of space,
this space is the local space of each hart.
<div style="text-align: center;">
<img width="292" alt="ctl_bank"
src="https://github.com/user-attachments/assets/89f88b24-37a4-4786-a192-401759eb95cf">
</div>

### L1 DCache Control Bank
Each Control Bank contains registers: `ECCCTL`, `ECCEID`, `ECCMASK`,
each register is 8 bytes.
<img width="414" alt="eccctl"
src="https://github.com/user-attachments/assets/b22ff437-d05d-4b3c-a353-dbea1afdc156">
* ECCCTL(ECC Control): ECC injection control register.
* `ese(error signaling enable)`: Indicates that the injection is valid
and is initialized to 0. When the injection is successful and `pst==0`,
ese will be clean.
* `pst(persist)`: Continuously inject signals. When `pst==1`,
the `ECCEID`
counter decreases to 0 and after successful injection, the
injection timer will be restored to the last set `ECCEID` and
re-injected;
when `pst==0`, it will be injected only once.
* `ede(error delay enable)`: Indicates that counter is valid and
initialized to 0. If
* `ese==1` and `ede==0`, error injection is effective immediately.
* `ese==1` and `ede==1`, you need to wait until `ECCEID`
decrements to 0 before the injection is effective.
* `cmp(component)`: Injection target, initialized to 0.
* 1'b0: The injection object is tag.
* 1'b1: The injection object is data.
* `bank`: The bank valid signal is initialized to 0. When the bit in
the `bank` is set, the corresponding mask is valid.

<img width="414" alt="ecceid"
src="https://github.com/user-attachments/assets/8cea0d8d-2540-44b1-b1f9-c1ed6ec5341e">

* ECCEID(ECC Error Inject Delay): ECC injection delay controller.
* When `ese==1` and `ede==1`, it
starts to decrease until it reaches 0. Currently, the same clock as
the core frequency is used, which can also be divided. Since ECC
injection relies on L1 DCache access, the time of the `EID` and the
time when the ECC error is triggered may not be consistent.

<img width="414" alt="eccmask"
src="https://github.com/user-attachments/assets/b1be83fd-17a6-4324-8aa6-45858249c476">

* ECCMASK(ECC Mask): ECC injection mask register.
* 0 means no inversion, 1 means flip.
Tag injection only uses the bits in `ECCMASK0` corresponding to
the tag length.

### Error Inject Example
```
1 # set control bank base address
2 mv x3, $(BASEADDR)
3
4 # set eid
5 mv x5, 500 # delay 500 cycles
6 sd x5, 8(x3) # mmio store
7
8 # set mask
9 mv x5, 0x1 # flip bit 0
10 sd x5, 16(x3) # mmio store
11
12 # set ctl
13 mv x5, 0x7 # comp = 0, ede = 1, pst = 1, ese = 1
14 sd x5, 0(x3) # mmio store
```

show more ...


# 44f2941b 24-Sep-2024 Jiru Sun <[email protected]>

refactor(HPM): move HPMs from utils to utility repo (#3631)

Because HPMs will be used in Coupled L2 as well, delete
`PerfCounterUtils.scala` in Xiangshan and create
`HardwarePerfMonitor.scala` in

refactor(HPM): move HPMs from utils to utility repo (#3631)

Because HPMs will be used in Coupled L2 as well, delete
`PerfCounterUtils.scala` in Xiangshan and create
`HardwarePerfMonitor.scala` in Utility.
See also [Pull Request in
CoupledL2](https://github.com/OpenXiangShan/CoupledL2/pull/251#discussion_r1770738535).

show more ...


# 08b0bc30 03-Sep-2024 happy-lx <[email protected]>

timing(MemBlock): optimize MemBlock timing (#3467)

This PR optimizes the timing of MemBlock. Specific optimizations include
but are not limited to:
+ TLB use the redirect for the next cycle
+ Opt

timing(MemBlock): optimize MemBlock timing (#3467)

This PR optimizes the timing of MemBlock. Specific optimizations include
but are not limited to:
+ TLB use the redirect for the next cycle
+ Optimize VLSU feedback and redirect
+ Optimise ldCancel and writeback signal generation
+ Optimise TLB Query Vaddr/hlv/hlvx/valid etc
+ Delay MMIO Store writeback for 1 Cycle
+ Fix tlbNoQuery and pmp logic
+ Remove clock gating for s3_fast_rep
+ Remove wbq conflict check to LoadPipe/MainPipe
+ Remove Mux in dcache resp data
+ Optimise data generation logic of LoadUnit
+ Duplicate Register in LoadUnit for data writeback
+ Duplicate Register in loadPipe for missQueue enq
+ Add skid buffer in VLSU
+ Select data from metaArray at S1
+ Simplify the enqueuing logic of missQueue
+ Separately generate the ready logic of miss Queue
+ Relax the conditions valid for bankdataArray reads
+ Add Reg between Dcache Mainpipe with sms prefetcher
+ Optimise store exceptionBuffer pipeline

---------

Co-authored-by: weiding liu <[email protected]>
Co-authored-by: Charlie Liu <[email protected]>
Co-authored-by: good-circle <[email protected]>

show more ...


# 7ecd6591 30-Jul-2024 Charlie Liu <[email protected]>

DCache: Replay refill_req when the evict_addr matching a valid mshr


# 1461d8f8 12-Jul-2024 CharlieLiu <[email protected]>

DCache: Block the writeback req when the addr matching a valid req in mshr (#3179)

Bug:
- When two req with different addr x and y enter MissQueue together, req
y is real miss req, while req x is

DCache: Block the writeback req when the addr matching a valid req in mshr (#3179)

Bug:
- When two req with different addr x and y enter MissQueue together, req
y is real miss req, while req x is **AcquireBlock BtoT**. Req y receive
the resp from L2 first and complete the refill operation by replacing
the data block with addr x (decided by plru algorithm). MainPipe will
release the data block with addr in writeback queue through req
**Release BtoN** to L2. Addr x receive GrantData with permission toT at
last.
- From the view of L2, the req sequence of addr x is: Acquire BtoT ->
GrantData toT -> Release BtoN, which is abnormal.

Fix: When a valid req reaches wbq, check if there is any valid mshr with
same block_addr. If a mshr is found, block current wbq_req.

show more ...


# bb2f3f51 12-Jul-2024 Tang Haojin <[email protected]>

perf: use perfUtils in `Utility` (#3190)

Currently, log and perf utilities such as `XSPerfAccumulate` are
implemented in many repositories like XiangShan, CoupledL2 and HuanCun.
This PR unifies th

perf: use perfUtils in `Utility` (#3190)

Currently, log and perf utilities such as `XSPerfAccumulate` are
implemented in many repositories like XiangShan, CoupledL2 and HuanCun.
This PR unifies them and put them in Utility repository.

show more ...


# 5adc4829 16-Jun-2024 Yanqin Li <[email protected]>

memblock: add rest clockgate of reg (#3017)

Co-authored-by: cai luoshan <[email protected]>
Co-authored-by: Cai Luoshan <[email protected]>
Co-authored-by: good-circle <

memblock: add rest clockgate of reg (#3017)

Co-authored-by: cai luoshan <[email protected]>
Co-authored-by: Cai Luoshan <[email protected]>
Co-authored-by: good-circle <[email protected]>
Co-authored-by: Ma-YX <[email protected]>
Co-authored-by: Ma-YX <[email protected]>
Co-authored-by: CharlieLiu <[email protected]>

show more ...


# ffd3154d 25-Apr-2024 CharlieLiu <[email protected]>

DCache: New feature evict on refill (#2919)

- Remove module RefillPipe, move DCache replacer access/update to
MainPipe.
- Using l2_hint as an early wake-up signal for MSHR.

---------

Co-auth

DCache: New feature evict on refill (#2919)

- Remove module RefillPipe, move DCache replacer access/update to
MainPipe.
- Using l2_hint as an early wake-up signal for MSHR.

---------

Co-authored-by: YukunXue <[email protected]>
Co-authored-by: Tang Haojin <[email protected]>
Co-authored-by: ssszwic <[email protected]>
Co-authored-by: Kunlin You <[email protected]>

show more ...


# 7f37d55f 09-Oct-2023 Tang Haojin <[email protected]>

chore: bump rocket, Scala 2.13.10, and Chisel 3.6.0 (#2326)

Co-authored-by: Yinan Xu <[email protected]>


# 8891a219 08-Oct-2023 Yinan Xu <[email protected]>

Bump rocket-chip (#2353)


# 935edac4 21-Sep-2023 Tang Haojin <[email protected]>

chore: remove deprecated brackets, APIs, etc. (#2321)


# 9cb34a20 29-Aug-2023 happy-lx <[email protected]>

wbq: fix wbq's FSM logic (#2259)

* All the remain_set are set to the corresponding value before entering the s_release_req state
* set remain_clr to 0 when state change from s_release_req(probe) to

wbq: fix wbq's FSM logic (#2259)

* All the remain_set are set to the corresponding value before entering the s_release_req state
* set remain_clr to 0 when state change from s_release_req(probe) to
s_release_req(release)

show more ...


# f275998a 09-Aug-2023 sfencevma <[email protected]>

MemBlock: fix timing (#2223)

* fix probe_ttob_check_resp timing

* move probe_ttb check to mainpipe s2, get resp in s3

* fix main_pipe_req timing

* remove fastarbiter

* fix prefetcher tim

MemBlock: fix timing (#2223)

* fix probe_ttob_check_resp timing

* move probe_ttb check to mainpipe s2, get resp in s3

* fix main_pipe_req timing

* remove fastarbiter

* fix prefetcher timing

* remove select invalid way first

* MemBlock: fix timing

* add redirectCancelCount

* correct canAccept

* fix loadQueueReplay select timing

* rename sleepIndex

* rename selectIndexOH

---------

Co-authored-by: lixin <[email protected]>

show more ...


# 15ee59e4 25-May-2023 wakafa <[email protected]>

Merge coupledL2 into master (#2064)

* icache: Acquire -> Get to L2

* gitmodules: add coupledL2 as submodule

* cpl2: merge coupledL2 into master

* Changes includes:
* coupledL2 integratio

Merge coupledL2 into master (#2064)

* icache: Acquire -> Get to L2

* gitmodules: add coupledL2 as submodule

* cpl2: merge coupledL2 into master

* Changes includes:
* coupledL2 integration
* modify user&echo fields in i$/d$/ptw
* set d$ never always-releasedata
* remove hw perfcnt connection for L2

* bump utility

* icache: remove unused releaseUnit

* config: minimalconfig includes l2

* Otherwise, dirty bits maintainence may be broken
* Known issue: L2 should have more than 1 bank to avoid compiling problem

* bump Utility

* bump coupledL2: fix bugs in dual-core

* bump coupledL2

* icache: set icache as non-coherent node

* bump coupledL2: fix dirty problem in L2 ProbeAckData

---------

Co-authored-by: guohongyu <[email protected]>
Co-authored-by: XiChen <[email protected]>

show more ...


# b8f6ff86 26-Sep-2022 William Wang <[email protected]>

dcache: fix replace & probeAck TtoB perm problem (#1791)

* chore: fix WBQEntryReleaseUpdate bundle naming

There is no real hardware change

* dcache: fix replace & probeAck TtoB perm problem

dcache: fix replace & probeAck TtoB perm problem (#1791)

* chore: fix WBQEntryReleaseUpdate bundle naming

There is no real hardware change

* dcache: fix replace & probeAck TtoB perm problem

When dcache replaces a cacheline, it will move that cacheline data to
writeback queue, and wait until refill data come. When refill data
comes, it writes dcache data array and update meta for that cacheline,
then wakes up cacheline release req and write data to l2 cache.

In previous design, if a probe request comes before real l1 to l2 release
req, it can be merged in the same writeback queue entry. Probe req will
update dcache meta in mainpipe s3, then be merged in writeback queue.
However, for a probe TtoB req, the following problem may happen:

1) a replace req waits for refill in writeback queue entry X
2) probe TtoB req enters mainpipe s3, set cacheline coh to B
3) probe TtoB req is merged to writeback queue entry X
4) writeback queue entry X is waken up, do probeack immediately (TtoN)
5) refill data for replace req comes from l2, a refill req enters mainpipe
and update dcache meta (set cacheline being replaced coh to N)

Between 4) and 5), l2 thinks that l1 coh is N, but l1 coh is actually B,
here comes the problem.

Temp patch for nanhu:

Now we let all probe req do extra check. If it is a TtoB probe req and the
coresponding cacheline release req is already in writeback queue, we set
dcache meta coh to N. As we do set block in dcache mainpipe, we can do
that check safely when probe req is in mainpipe.

show more ...


# 6c7e5e86 12-Aug-2022 zhanglinjuan <[email protected]>

MainPipe: fix fanout (#1735)


# 84026448 11-Aug-2022 William Wang <[email protected]>

dcache: only update wbq addr when allocate (#1731)

It will remove fanout from mem_release.valid releated logic


# 5c01cc3c 06-Aug-2022 zhanglinjuan <[email protected]>

WritebackQueue: fix bug when ProbeAck is merged with a ReleaseData (#1709)


# c3a5fe5f 04-Aug-2022 happy-lx <[email protected]>

dcache: duplicate registers for better fanout (#1700)


# 7a919e05 03-Aug-2022 William Wang <[email protected]>

dcache: delay wbq data update for 1 cycle (#1701)

This commit and an extra cycle for miss queue store data and mask write.
For now, there are 18 missqueue entries. Each entry has a 512 bit
data re

dcache: delay wbq data update for 1 cycle (#1701)

This commit and an extra cycle for miss queue store data and mask write.
For now, there are 18 missqueue entries. Each entry has a 512 bit
data reg and a 64 bit mask reg. If we update writeback queue data in 1
cycle, the fanout will be at least 18x(512+64) = 10368.

Now writeback queue req meta update is unchanged, however, data and mask
update will happen 1 cycle after req fire or release update fire (T0).
In T0, data and meta will be written to a buffer in missqueue.
In T1, s_data_merge or s_data_override in each missqueue entry will
be used as data and mask wen.

show more ...


# f94d088c 23-Jul-2022 Ziyue-Zhang <[email protected]>

dcache: fix fan-out in WritebackEntry (#1675)

Co-authored-by: Ziyue Zhang <[email protected]>


# b6d53cef 05-Jul-2022 William Wang <[email protected]>

mem,hpm: optimize memblock hpm timing


# 1ca0e4f3 10-Dec-2021 Yinan Xu <[email protected]>

core: refactor hardware performance counters (#1335)

This commit optimizes the coding style and timing for hardware
performance counters.

By default, performance counters are RegNext(RegNext(_)).


# 43a0c310 02-Dec-2021 zhanglinjuan <[email protected]>

WritebackQueue: fix bug when a ProbeAck follows a Release (#1295)


12