History log of /XiangShan/src/main/scala/xiangshan/cache/dcache/data/BankedDataArray.scala (Results 1 – 25 of 61)
Revision Date Author Comments
# 602aa9f1 02-Apr-2025 cz4e <[email protected]>

feat(Sram): add `SRAM_CTL` interface (#4474)

* add `SRAM_CTL` interface for SRAMTemplate
* use `SRAM_WITH_CTL` to enable,
  e.g. `make sim-verilog CONFIG=KunminghuV2Config RELEASE=1 SRAM_WITH_CTL=1`



# ebe07d61 20-Mar-2025 梁森 Liang Sen <[email protected]>

feat(dfx): reuse dcache data sram read data register as mbist pipeline (#4371)

Co-authored-by: sfencevma <[email protected]>


# 11269ca7 09-Mar-2025 Tang Haojin <[email protected]>

chore: fix several deprecation warning (#4352)


# 4b2c87ba 27-Feb-2025 梁森 Liang Sen <[email protected]>

feat(dfx): integrate dfx components (#4312)


# fa5e530d 21-Jan-2025 cz4e <[email protected]>

timing(VSegmentUnit): duplicate latchVAddr (#4209)

* `latchVAddr` has to index every DCache data SRAM from top to bottom,
which causes a large fanout, so `latchVAddr` is duplicated


# 0b9f4b2d 25-Dec-2024 cz4e <[email protected]>

area(CacheOpDecoder): remove CacheOpDecoder (#4050)

* CacheOpDecoder is no longer used


# 8b33cd30 13-Dec-2024 klin02 <[email protected]>

feat(XSLog): move all XSLog outside WhenContext for collection

Data inside a WhenContext is not accessible from another module. To
support XSLog collection, all XSLog calls and their related signals are
moved outside the WhenContext. For example,
`when(cond1){XSDebug(cond2, pable)}` becomes
`XSDebug(cond1 && cond2, pable)`.



# 452b5843 19-Dec-2024 Huijin Li <[email protected]>

power(MemBlock): power optimization in MemBlock (#4059)

power optimization:
(1) use “withClockGate” instead of ClockGate in DCache
(2) reduce LSQ entries


# 72dab974 16-Dec-2024 cz4e <[email protected]>

feat(CtrlUnit, DCache): support L1 DCache RAS (#4009)

# L1 DCache RAS extension support

The L1 DCache supports part of the Reliability, Availability, and
Serviceability (RAS) extension.
* L1 DCache protection with Single Error Correct Double Error Detect
  (SECDED) ECC on the RAMs. This includes the L1 DCache tag and data RAMs.
  Erroneous tags and data are not recovered.
* Fault handling interrupt (Bus Error Unit interrupt, BEU, 65)
* Error injection
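
To make the SECDED property concrete, here is a minimal Python sketch of a generic extended Hamming code. It is not the ECC layout used in the XiangShan RTL, which the commit message does not describe. The function names and the bit layout (overall parity at index 0) are assumptions for illustration. In line with the note that errors are not recovered, the checker only classifies a codeword and does not correct it.

```python
def secded_encode(data: int, width: int) -> list[int]:
    """Encode `width` data bits as an extended Hamming codeword.

    Assumed layout: index 0 holds the overall parity bit, indices that are
    powers of two hold Hamming parity bits, all other indices hold data.
    """
    r = 0
    while (1 << r) < width + r + 1:   # number of Hamming parity bits
        r += 1
    n = width + r
    code = [0] * (n + 1)
    d = 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):           # not a power of two: data position
            code[pos] = (data >> d) & 1
            d += 1
    for p in range(r):
        mask = 1 << p
        parity = 0
        for pos in range(1, n + 1):
            if pos & mask:
                parity ^= code[pos]
        code[mask] = parity           # makes each parity group even
    overall = 0
    for bit in code[1:]:
        overall ^= bit
    code[0] = overall                 # makes the whole codeword even
    return code


def secded_classify(code: list[int]) -> str:
    """Return 'ok', 'single' (correctable) or 'double' (detect only)."""
    syndrome = 0
    overall = 0
    for pos, bit in enumerate(code):
        overall ^= bit
        if bit:
            syndrome ^= pos
    if overall:
        return "single"
    return "ok" if syndrome == 0 else "double"
```

Flipping one bit of a valid codeword yields `single`, and flipping two yields `double`, which is the distinction a SECDED-protected RAM relies on.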

## ECC Error Detect
An error may be triggered when the L1 DCache is accessed.
* **Error Report**:
  * Tag ECC Error: if an ECC error occurs in any way, an ECC error is
    considered to have occurred.
  * Data ECC Error: if an ECC error occurs in the hit line, an ECC error
    is considered to have occurred. If the access does not hit, the
    error is not processed.
  * If an instruction access triggers an ECC error, it is treated as a
    hardware error and an exception is reported.
  * Whenever an error is triggered, an error message must be sent to
    the BEU.
  * When the hardware detects an error, it reports the error to the BEU
    and triggers the NMI external interrupt (65).

* **Load instruction**:
  * Only tag or data ECC errors can be triggered during execution. The
    errors are reported to the BEU and a `Hardware Error` is reported.

* **Probe/Snoop**:
  * If a tag ECC error occurs, the cache state does not need to change,
    and a `ProbeAck` with `corrupt=1` must be returned to L2.
  * If a data ECC error occurs, the cache state changes according to
    the rules. If data must be returned, a `ProbeAckData` with
    `corrupt=1` must be returned to L2.

* **Replace/Evict**:
  * A `ReleaseData` with `corrupt=1` must be returned to L2.

* **Store to L1 DCache**:
  * If a tag ECC error occurs, the cacheline is released according to the
    `Replace/Evict` process and the data is written to the L1 DCache
    without reporting the error to L2.
  * If a data ECC error occurs, the data is written directly without
    reporting the error to L2.

* **Atomics**:
  * Report a `Hardware Error`; do not report the error to L2.

## Error Inject
Each core's L1 DCache has a controller driven by memory-mapped
registers, and each hardware unit that supports ECC has its own control
bank. Once the bank registers are configured, the L1 DCache triggers an
ECC error on the first access to it.
<div style="text-align: center;">
<img
src="https://github.com/user-attachments/assets/8c4d23c5-0324-4e52-bcf4-29b47a282d72"
alt="err_inject" width="200" />
</div>

### Address Space
The address space is `0x38022000`-`0x3802207F`, 128 bytes in total. This
space is local to each hart.
<div style="text-align: center;">
<img width="292" alt="ctl_bank"
src="https://github.com/user-attachments/assets/89f88b24-37a4-4786-a192-401759eb95cf">
</div>

### L1 DCache Control Bank
Each control bank contains the registers `ECCCTL`, `ECCEID`, and
`ECCMASK`; each register is 8 bytes wide.
<img width="414" alt="eccctl"
src="https://github.com/user-attachments/assets/b22ff437-d05d-4b3c-a353-dbea1afdc156">
* ECCCTL (ECC Control): ECC injection control register.
  * `ese` (error signaling enable): indicates that injection is enabled;
    initialized to 0. When the injection succeeds and `pst==0`, `ese` is
    cleared.
  * `pst` (persist): inject repeatedly. When `pst==1`, once the `ECCEID`
    counter has decreased to 0 and the injection has succeeded, the
    injection timer is restored to the last value set in `ECCEID` and
    the error is injected again; when `pst==0`, the error is injected
    only once.
  * `ede` (error delay enable): indicates that the counter is enabled;
    initialized to 0. If
    * `ese==1` and `ede==0`, error injection takes effect immediately.
    * `ese==1` and `ede==1`, the injection takes effect only after
      `ECCEID` has decremented to 0.
  * `cmp` (component): injection target, initialized to 0.
    * 1'b0: the injection target is the tag.
    * 1'b1: the injection target is the data.
  * `bank`: bank valid bits, initialized to 0. When a bit in `bank` is
    set, the corresponding mask is valid.

<img width="414" alt="ecceid"
src="https://github.com/user-attachments/assets/8cea0d8d-2540-44b1-b1f9-c1ed6ec5341e">

* ECCEID (ECC Error Inject Delay): ECC injection delay controller.
  * When `ese==1` and `ede==1`, it counts down until it reaches 0.
    Currently it runs on the same clock as the core, which can also be
    divided. Since ECC injection relies on an L1 DCache access, the
    moment `EID` expires and the moment the ECC error is triggered may
    not coincide.
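
The interplay of `ese`, `pst`, `ede`, and the `ECCEID` counter can be read as a small state machine. The Python sketch below is one interpretation of the description above, not the RTL. The class name `EccInjector`, the one-decrement-per-cycle timing, and the rule that an injection needs a DCache access in a cycle where the counter is already 0 are assumptions.

```python
from dataclasses import dataclass


@dataclass
class EccInjector:
    """Behavioural model of one control bank (hypothetical, for illustration)."""
    ese: bool = False   # error signaling enable
    pst: bool = False   # persist
    ede: bool = False   # error delay enable
    eid: int = 0        # last value written to ECCEID
    counter: int = 0    # running delay counter

    def program(self, ese: bool, pst: bool, ede: bool, eid: int) -> None:
        self.ese, self.pst, self.ede = ese, pst, ede
        self.eid = self.counter = eid

    def step(self, dcache_access: bool) -> bool:
        """Advance one cycle; return True if an error is injected."""
        if not self.ese:
            return False
        if self.ede and self.counter > 0:
            self.counter -= 1          # still counting down
            return False
        if not dcache_access:
            return False               # armed, waiting for an access
        if self.pst:
            self.counter = self.eid    # re-arm with the last ECCEID value
        else:
            self.ese = False           # one-shot: clear ese after success
        return True
```

With `pst=False` the model injects once and then stays idle; with `pst=True` it injects again after every `eid` cycles, provided an access arrives.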

<img width="414" alt="eccmask"
src="https://github.com/user-attachments/assets/b1be83fd-17a6-4324-8aa6-45858249c476">

* ECCMASK (ECC Mask): ECC injection mask register.
  * A 0 bit means no inversion; a 1 bit means the corresponding bit is
    flipped. Tag injection only uses the bits in `ECCMASK0`
    corresponding to the tag length.
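
Applying the mask amounts to an XOR restricted to the width of the target. The helper below is a minimal Python sketch of that reading. The name `apply_ecc_mask` and the widths in the usage note are hypothetical, since the commit message does not give the tag length.

```python
def apply_ecc_mask(word: int, mask: int, width: int) -> int:
    """Flip the bits of `word` selected by `mask`.

    Only the low `width` bits of the mask are used, mirroring the note
    that tag injection ignores mask bits beyond the tag length.
    """
    usable = mask & ((1 << width) - 1)
    return word ^ usable
```

For example, `apply_ecc_mask(0b1010, 0x1, 4)` flips bit 0 only, while mask bits at or above `width` leave the word unchanged.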

### Error Inject Example
```
# set control bank base address
li x3, $(BASEADDR)

# set eid
li x5, 500 # delay 500 cycles
sd x5, 8(x3) # mmio store

# set mask
li x5, 0x1 # flip bit 0
sd x5, 16(x3) # mmio store

# set ctl
li x5, 0x7 # cmp = 0, ede = 1, pst = 1, ese = 1
sd x5, 0(x3) # mmio store
```



# c5a867ff 16-Dec-2024 Anzo <[email protected]>

fix(BankedDataArray): fix `oldest` selection logic (#4039)

The changes in commit 8ffb12e45361b854daf46d200530e9b2b01e4a9c cause
redundant replays. For example, with ldu0, ldu1, and ldu2 holding
lqptr = [5, 7, 6] and a `bank_conflict` only between ldu1 and ldu2,
ideally only ldu1 replays, but both ldu1 and ldu2 replay.

This change fixes the issue, and in theory performance should improve
again.
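
The intended behaviour in the example can be sketched in Python as "the oldest load in each conflicting group wins". This is a behavioural model, not the RTL. It assumes that a smaller `lqptr` means an older load (queue wrap-around is ignored), that a conflict simply means two loads target the same bank, and the bank numbers are made up.

```python
def select_replays(loads):
    """loads[i] is (lq_ptr, bank) for load unit i, or None if idle.

    Returns the set of load-unit indices that must replay.
    """
    by_bank = {}
    for unit, req in enumerate(loads):
        if req is not None:
            by_bank.setdefault(req[1], []).append(unit)

    replay = set()
    for units in by_bank.values():
        # Within one bank, only the oldest load keeps the read port.
        oldest = min(units, key=lambda u: loads[u][0])
        replay.update(u for u in units if u != oldest)
    return replay


# ldu0/1/2 with lqptr = [5, 7, 6]; only ldu1 and ldu2 share a bank.
print(select_replays([(5, 0), (7, 3), (6, 3)]))
```

In this model only ldu1 replays, because ldu2 is the older of the two loads that actually collide and ldu0 has no conflict at all.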



# 8ffb12e4 13-Dec-2024 Anzo <[email protected]>

fix(bank_conflict): Selecting the oldest Load causes a conflict (#4036)

This modification changes `load bank conflict` arbitration from a fixed
priority order (0, 1, 2) to one in which the oldest Load never gets a
`bank conflict`.

In the following, `Load 0` refers to `LoadUnit 0`.

For example, before:
Load 0 lqidx 5
Load 1 lqidx 3
Load 2 lqidx 8
Assuming that all three Loads have a `bank conflict`, the default is to
mark Load 1 and Load 2 as having a `bank conflict` so that they are
replayed.

---

However, this may lead to deadlocks in some cases.
For example:
Load 0 robidx 7
Store 0 robidx 6
Load 1 robidx 5

`Store 0` depends on `Load 1` for data, while `Load 0` depends on
`Store 0` for data, and `Load 0` and `Load 1` have a `bank conflict`.
In this case, `Load 1` will `Replay` because of the `bank conflict` and
`Load 0` will `Replay` because of a `forward fault` (caused by a
misaligned access).

---

With the modification, the oldest Load is chosen not to generate a
`bank conflict`, which avoids the deadlock.
**Note !!! This may introduce performance fluctuations (up or down)**
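
The two arbitration policies can be compared with a short Python sketch. It assumes every listed load conflicts on the same bank and that a smaller `lqidx` means an older load (wrap-around is ignored). The function names are illustrative and do not come from the source.

```python
def replay_fixed_priority(lq_idx):
    """Old policy: load unit 0 always wins, every other unit replays."""
    return [unit != 0 for unit in range(len(lq_idx))]


def replay_oldest_first(lq_idx):
    """New policy: the oldest load wins, every other unit replays."""
    winner = min(range(len(lq_idx)), key=lambda unit: lq_idx[unit])
    return [unit != winner for unit in range(len(lq_idx))]


lq_idx = [5, 3, 8]                      # Load 0, Load 1, Load 2
print(replay_fixed_priority(lq_idx))    # Load 1 and Load 2 replay
print(replay_oldest_first(lq_idx))      # Load 0 and Load 2 replay
```

Under the old policy the oldest load (Load 1) can be replayed indefinitely behind a younger one, which is the situation the deadlock example describes; under the new policy it always makes progress.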



# 98d2aaa1 12-Dec-2024 cz4e <[email protected]>

fix(BankedDataArray): fix readline error_delayed selection (#4018)

Bug description:
The **s2** index was used to select the **s3** readline **error_delayed**.

Fix:
Use the **s3** index to select the **s3** readline **error_delayed**.



# b34797bc 25-Nov-2024 cz4e <[email protected]>

area(DCache ECC): combine ecc with tag/data (#3902)


# c49ebec8 18-Nov-2024 Haoyuan Feng <[email protected]>

docs: add acknowledgements (#3861)


# a5f58fbc 29-Sep-2024 lixin <[email protected]>

timing(dataArray): separate bankedDataRead kill

Do not include kill in banked_read_valid, to improve the timing of the SRAM read.
Later, use kill to determine bankConflict in load s2.

fix(BankedDataArray): remove kill logic when generating rr_bank_conflict

data_bank selects the read address based on the priority of the valid signals.
When multiple read requests cause a bank conflict and the high-priority
request has to be killed, the data read by the low-priority load unit is overwritten.



# b32e9518 08-Nov-2024 Huijin Li <[email protected]>

power(MemBlock): add ClockGate for DCache SRAM (#3824)

Using ClockGate for the DCache SRAM reduces memory power by 64% and
total MemBlock power by 23.38%.


# 7bd3dbdd 06-Sep-2024 happy-lx <[email protected]>

fix(dcache): fix perf bug of BankedDataArray (#3509)

If the addresses (for example, 0x88000000 and 0x90000000) of two read
requests fall in the same dcache set (0) and the same bank (0) but in
different ways, the previous design reports a bank conflict.

In fact, in the design of BankedDataArray, each read request reads
all the ways of an entire bank, so this situation does not need to
produce a bank conflict.

Code example:
    li x31,10
a:
    li x30,1024
    li x21,0x88000000
    li x22,0x90000000
b:
    ld x3,0(x21)
    ld x4,0(x22)
    addi x21,x21,8
    addi x22,x22,8
    addi x30,x30,-1
    bnez x30,b

    addi x31,x31,-1
    bnez x31,a
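
To see why the two example addresses collide, the Python sketch below derives the set and bank index from an address and applies one reading of the relaxed rule: two reads conflict only if they hit the same bank in different sets. The geometry (64-byte lines, eight 8-byte banks, 256 sets) is assumed for illustration and is not taken from the commit message, and the rule is an interpretation of the text rather than a copy of the RTL.

```python
LINE_BYTES = 64    # assumed cache line size
BANK_BYTES = 8     # assumed bank width, giving 8 banks per line
N_SETS = 256       # assumed number of sets


def bank_of(addr: int) -> int:
    return (addr // BANK_BYTES) % (LINE_BYTES // BANK_BYTES)


def set_of(addr: int) -> int:
    return (addr // LINE_BYTES) % N_SETS


def bank_conflict(a: int, b: int) -> bool:
    """Relaxed rule: a bank read returns all ways of one set, so two reads
    of the same bank only collide when they need different sets."""
    return bank_of(a) == bank_of(b) and set_of(a) != set_of(b)


a, b = 0x88000000, 0x90000000
print(set_of(a), bank_of(a), set_of(b), bank_of(b))
print(bank_conflict(a, b))
```

Both addresses map to set 0 and bank 0 under these assumptions, so a single bank read can serve both requests and no conflict is reported. Two reads of the same bank in different sets, such as `0x88000000` and `0x88000040`, still conflict.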



# 08b0bc30 03-Sep-2024 happy-lx <[email protected]>

timing(MemBlock): optimize MemBlock timing (#3467)

This PR optimizes the timing of MemBlock. Specific optimizations include
but are not limited to:
+ TLB uses the redirect in the next cycle
+ Optimize VLSU feedback and redirect
+ Optimize ldCancel and writeback signal generation
+ Optimize TLB query Vaddr/hlv/hlvx/valid etc.
+ Delay MMIO Store writeback by 1 cycle
+ Fix tlbNoQuery and pmp logic
+ Remove clock gating for s3_fast_rep
+ Remove wbq conflict check to LoadPipe/MainPipe
+ Remove Mux in dcache resp data
+ Optimize data generation logic of LoadUnit
+ Duplicate Register in LoadUnit for data writeback
+ Duplicate Register in loadPipe for missQueue enq
+ Add skid buffer in VLSU
+ Select data from metaArray at S1
+ Simplify the enqueuing logic of missQueue
+ Generate the ready logic of missQueue separately
+ Relax the valid conditions for BankedDataArray reads
+ Add Reg between DCache MainPipe and sms prefetcher
+ Optimize store exceptionBuffer pipeline

---------

Co-authored-by: weiding liu <[email protected]>
Co-authored-by: Charlie Liu <[email protected]>
Co-authored-by: good-circle <[email protected]>



# d4564868 17-Jul-2024 weiding liu <[email protected]>

Dcache: refactor dcache's read data delay for better port timing


# 4a0e27ec 31-Jul-2024 Yanqin Li <[email protected]>

wpu: fix the issue of abnormal power (#2976)

fix points:
1. parameter bug in DCacheWrapper
2. add clock gate to avoid frequent flip in BankedDataArray
3. remove redundant designs in WPU

power comparison:
![image](https://github.com/user-attachments/assets/8605098c-30a9-4b4e-a34b-69fd87a816df)



# e3da8bad 22-Jul-2024 Tang Haojin <[email protected]>

build: purge chisel 3 and add deprecation check (#3250)


# 31d5a9c4 09-Jan-2024 sfencevma <[email protected]>

ECC: add enable option for ecc


# 5adc4829 16-Jun-2024 Yanqin Li <[email protected]>

memblock: add rest clockgate of reg (#3017)

Co-authored-by: cai luoshan <[email protected]>
Co-authored-by: Cai Luoshan <[email protected]>
Co-authored-by: good-circle <[email protected]>
Co-authored-by: Ma-YX <[email protected]>
Co-authored-by: Ma-YX <[email protected]>
Co-authored-by: CharlieLiu <[email protected]>



# 0184a80e 15-Jun-2024 Yanqin Li <[email protected]>

L1CacheErrorInfo: code refactor for correct and convenient clockgate (#3044)


# c686adcd 10-May-2024 Yinan Xu <[email protected]>

Bump utility and disable ConstantIn by default (#2955)

* use BigInt for initValue of Constantin.createRecord
* use WITH_CONSTANTIN=1 to enable the ConstantIn plugin

