#
dd3d70ba |
| 08-Apr-2025 |
Yanqin Li <[email protected]> |
fix(Uncache): uncache mm store needs difftest to update goldenmem (#4470)
|
#
12931efe |
| 14-Mar-2025 |
Yanqin Li <[email protected]> |
fix(uncache): if can merge, it can enter even if buffer is full (#4408)
|
#
11269ca7 |
| 09-Mar-2025 |
Tang Haojin <[email protected]> |
chore: fix several deprecation warning (#4352)
|
#
759834f0 |
| 24-Feb-2025 |
Yanqin Li <[email protected]> |
fix(Uncache): correct the indexes (#4304)
1. Wrong forward index 2. Duplicate judgment `waitSame` index
|
#
1eb8dd22 |
| 24-Feb-2025 |
Kunlin You <[email protected]> |
submodule(utility), XSDebug: support collecting missing XSDebug (#4251)
Previously, in PR#3982, we added support for collecting XSLogs to LogPerfEndpoint.
However, with --enable-log, we should also collect some missing XSDebug.
This change moves these missing XSDebug calls outside WhenContext and adds
WireInit to LogUtils' apply, to enable probing some subaccessed data,
such as a vec elem with a dynamic index.
|
#
ccd7d228 |
| 18-Feb-2025 |
Yanqin Li <[email protected]> |
fix(Uncache): handle flush (#4230)
|
#
d74a7897 |
| 17-Feb-2025 |
Yanqin Li <[email protected]> |
fix(uncache): avoid merging the corner cases (#4268)
When `lsq.req` enters `uncachebuffer` and the entry that meets its merge conditions is in one of the following cases, the request cannot be merged into that entry:
1. The entry is receiving the uncache response.
2. The entry is waiting to return `lsq.resp`.
Why? The status of the former entry changes to `waitReturn`. There is no signal to wake up the latter entries whose state is `waitSame`, because the former entry, which would act as the trigger, will not be sent to the bus, get a response, and wake up the latter entry.
|
#
9e12e8ed |
| 08-Feb-2025 |
cz4e <[email protected]> |
style(Bundles): move bundles to Bundles.scala (#4247)
|
#
74050fc0 |
| 26-Jan-2025 |
Yanqin Li <[email protected]> |
perf(Uncache): add merge policy when entering (#4154)
# Background
## Problem
How to design a more efficient entry rule for a new load/store request when a load/store with the same address already exists in the `ubuffer`?
* **Old Design**: Always **reject** the new request.
* **New Design**: Consider **merging** requests.
## Merge Scenarios
‼️ For the new request to be merged into the existing one, both must be `NC`.
1. **New Store Request:**
   1. **Existing Store:** Merge (the new store is younger).
   2. **Existing Load:** Reject.
2. **New Load Request:**
   1. **Existing Load:** Merge (the new load may be younger or older; both are OK to merge).
   2. **Existing Store:** Reject.
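The merge scenarios above can be sketched as a small decision function. This is only an illustrative Python model, not the Chisel code from the PR: the `Req` type and `merge_decision` name are hypothetical, and the sketch covers only the load/store and `NC` rules, not the issue-state or attribute checks described later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Req:
    """Hypothetical request descriptor used only for this sketch."""
    is_store: bool  # True for a store, False for a load
    nc: bool        # True if the request is NC


def merge_decision(new: Req, existing: Req) -> str:
    """Return "merge" or "reject" for a new request whose address matches an existing entry."""
    # Both sides must be NC before merging is considered at all.
    if not (new.nc and existing.nc):
        return "reject"
    # Store into store and load into load may merge; mixed kinds are rejected.
    return "merge" if new.is_store == existing.is_store else "reject"
```

For example, `merge_decision(Req(is_store=True, nc=True), Req(is_store=False, nc=True))` yields `"reject"`, matching the "New Store Request / Existing Load" case.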
# What does this PR do?
## 1. Entry Actions
1. **Allocate** a new entry and mark it as `valid`:
   1. When there is no matching address.
2. **Allocate** a new entry and mark it as `valid` and `waitSame`:
   1. When there is a matching address, and:
      * The virtual addresses and attributes are the same.
      * The older entry is either selected to issue or issued.
3. **Merge** into an existing entry:
   1. When there is a matching address, and:
      * The virtual addresses and attributes are the same.
      * The older entry is **not** selected to issue or issued.
4. **Reject** the new request:
   1. When the ubuffer is full.
   2. When there is a matching address, but:
      * The virtual addresses or attributes are **different**.
**NOTE:** According to the definition in the TL-UL SPEC, the `mask` must be continuous and naturally aligned, and the `addr` must correspond to the mask. Therefore, the "**same attributes**" here introduces a new condition: the merged `mask` must meet the requirements of being continuous and naturally aligned (function `continueAndAlign`). During merging, the block offset of addr must be synchronously updated in `UncacheEntry.update`.
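The "continuous and naturally aligned" condition on a merged `mask` can be illustrated with a short Python check. This is a sketch of the property only, under the assumption of an 8-byte mask; the function below borrows the name `continueAndAlign` from the note but is not the actual Chisel implementation, and the `width` parameter is an assumption.

```python
def continue_and_align(mask: int, width: int = 8) -> bool:
    """Check that the set bits of a byte mask form one run that is naturally aligned.

    "Naturally aligned" is taken here to mean: the run length is a power of two
    and the run starts at a multiple of its own length.
    """
    if mask <= 0 or mask >= (1 << width):
        return False
    low = (mask & -mask).bit_length() - 1       # index of the lowest set bit
    run = mask >> low                           # mask shifted down to bit 0
    size = run.bit_length()                     # span from lowest to highest set bit
    contiguous = (run & (run + 1)) == 0         # true only if run is all ones
    power_of_two = (size & (size - 1)) == 0
    aligned = low % size == 0
    return contiguous and power_of_two and aligned


# Merging two 2-byte accesses into one aligned 4-byte access is acceptable:
merged_ok = 0b0011 | 0b1100
# Merging bytes 0 and 2 leaves a hole, so the merged mask is not acceptable:
merged_bad = 0b0001 | 0b0100
```

Under these assumptions `continue_and_align(merged_ok)` holds and `continue_and_align(merged_bad)` does not, so only the first pair would satisfy the "same attributes" condition.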
## 2. Handshake Mechanism Between `LoadQueueUncache (M)` and `Uncache (S)`
> `mid`: master id
>
> `sid`: slave id
**Old Design:**
- `M` sends a `req` with a **`mid`**.
- `S` receives the `req` and records the **`mid`**.
- `S` sends a `resp` with the **`mid`**.
- `M` receives the `resp` and matches it with the recorded **`mid`**.
**New Design:**
- `M` sends a `req` with a **`mid`**.
- `S` receives the `req` and responds with `{mid, sid}`.
- `M` matches it with the **`mid`** and updates its record with the received **`sid`**.
- `S` sends a `resp` with its **`sid`**.
- `M` receives the `resp` and matches it with the recorded **`sid`**.
**Benefit:** The new design allows `S` to merge requests when a new request enters.
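A toy Python model may help show why returning a `sid` makes merging possible. Everything here is hypothetical (class names, the address-only merge rule, one response completing every `mid` that shares a `sid`); it models the message flow described above, not the actual `LoadQueueUncache`/`Uncache` hardware.

```python
class Slave:
    """Toy model of S: allocates a sid per entry, or reuses one when merging."""

    def __init__(self):
        self.entries = {}   # sid -> address
        self.next_sid = 0

    def accept(self, mid, addr):
        # Assumption for this sketch: a request merges whenever the address matches.
        for sid, entry_addr in self.entries.items():
            if entry_addr == addr:
                return mid, sid
        sid = self.next_sid
        self.next_sid += 1
        self.entries[sid] = addr
        return mid, sid

    def respond(self, sid):
        # The resp carries only the sid of the finished entry.
        self.entries.pop(sid)
        return sid


class Master:
    """Toy model of M: records the sid returned for each mid."""

    def __init__(self):
        self.sid_of = {}    # mid -> sid

    def on_accept(self, mid, sid):
        self.sid_of[mid] = sid

    def on_resp(self, sid):
        # Every request recorded with this sid is completed by the resp.
        done = [m for m, s in self.sid_of.items() if s == sid]
        for m in done:
            del self.sid_of[m]
        return done
```

With the old `mid`-only scheme, `S` would need one entry per `mid`; here two requests to the same address can share one `sid`, and `M` still finds both when the `resp` arrives.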
## 3. Forwarding Mechanism
**Old Design:** Each address in the `ubuffer` is **unique**, so forwarding is straightforward based on a match.
**New Design:**
* A single address may match up to two entries in the `ubuffer`.
* If it matches two entries, one entry must be marked `inflight` and the other `waitSame`. In this case, the forwarded data comes from the merged data of the two entries, with the `inflight` entry being the older one.
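The two-entry forwarding case can be sketched byte by byte. This Python model assumes that, because the `inflight` entry is the older one, bytes from the `waitSame` entry take priority wherever both masks overlap; that priority rule, the 8-byte width, and the `forward` function itself are assumptions made for illustration.

```python
def forward(inflight, wait_same, width=8):
    """Merge forwarded data from an older `inflight` and a younger `waitSame` entry.

    Each entry is a (mask, data) pair, where mask is a byte mask and data is a
    bytes object of length `width`. Returns the merged (mask, data) pair.
    """
    mask_old, data_old = inflight
    mask_new, data_new = wait_same
    out = bytearray(width)
    for i in range(width):
        if (mask_new >> i) & 1:
            out[i] = data_new[i]      # younger entry wins on overlap
        elif (mask_old >> i) & 1:
            out[i] = data_old[i]      # otherwise fall back to the older entry
    return mask_old | mask_new, bytes(out)


older = (0b0011, bytes([1, 2, 0, 0, 0, 0, 0, 0]))
younger = (0b0110, bytes([0, 9, 8, 0, 0, 0, 0, 0]))
merged_mask, merged_data = forward(older, younger)
```

In this example byte 0 comes from the older entry, bytes 1 and 2 from the younger one, and the merged mask is `0b0111`.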
## 4. Bug Fixes
1. In the `loadUnit`, `!tlbMiss` cannot be used directly as `tlbHit`, because `!tlbMiss` can still be true when `tlbValid` is false.
2. `Uncache` state machine transition: the state indicating "**able to send requests**" (previously `s_refill_req`, now `s_inflight`) should not be triggered by `reqFire` but rather by `acquireFire`.
<img width="747" alt="image" src="https://github.com/user-attachments/assets/75fbc761-1da8-43d9-a0e6-615cc58cefef" />
# Evaluation
- ✅ timing
- ✅ performance
| Type           | 4B*1000 | Speedup1-IO | 1B*4096 | Speedup2-IO |
| -------------- | ------- | ----------- | ------- | ----------- |
| IO             | 51026   | 1.00        | 208149  | 1.00        |
| NC             | 42343   | 1.21        | 169248  | 1.23        |
| NC+OT          | 20379   | 2.50        | 160101  | 1.30        |
| NC+OT+mergeOpt | 16308   | 3.13        | 126369  | 1.65        |
| cache          | 1298    | 39.31       | 4410    | 47.20       |
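The speedup columns appear to be each row's raw count divided into the `IO` baseline of the same test. A short Python check of that reading (the `speedup` helper and dictionary layout are just for this illustration; the numbers are copied from the table):

```python
# Raw counts from the table: (4B*1000, 1B*4096)
results = {
    "IO": (51026, 208149),
    "NC": (42343, 169248),
    "NC+OT": (20379, 160101),
    "NC+OT+mergeOpt": (16308, 126369),
    "cache": (1298, 4410),
}


def speedup(baseline: int, value: int) -> float:
    """Speedup relative to the IO baseline, rounded to two decimals."""
    return round(baseline / value, 2)


base_4b, base_1b = results["IO"]
speedups = {
    name: (speedup(base_4b, v4), speedup(base_1b, v1))
    for name, (v4, v1) in results.items()
}
```

Under this reading, `speedups["NC+OT+mergeOpt"]` is `(3.13, 1.65)`, which agrees with the reported columns.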
|
#
db81ab70 |
| 09-Jan-2025 |
Yanqin Li <[email protected]> |
fix(uncache): consider both corrupt and denied when granting (#4150)
From TileLink SPEC 1.9.3 Chapter7 "TileLink Uncached Lightweight (TL-UL)":
* AccessAck: `d_corrupt` is reserved and must be 0.
* AccessAckData: `d_corrupt` being HIGH indicates that masked data in this beat is corrupt.
So both `d_denied` and `d_corrupt` need to be considered when getting the data.
For uncache, a transfer currently completes in one beat, so `d_denied || d_corrupt` can be evaluated directly.
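As a minimal sketch of the single-beat check, assuming the response is modeled as an opcode string plus two booleans (the `grant_error` function and its arguments are hypothetical, not part of the actual design):

```python
def grant_error(opcode: str, d_denied: bool, d_corrupt: bool) -> bool:
    """Return True if a single-beat TL-UL response should be treated as an error."""
    # Per the quoted spec text, d_corrupt is reserved and must be 0 for AccessAck.
    if opcode == "AccessAck" and d_corrupt:
        raise ValueError("d_corrupt must be 0 for AccessAck")
    # With one beat per response, the two flags can simply be OR-ed together.
    return d_denied or d_corrupt
```

A multi-beat response would need to accumulate the flags across beats, which this sketch does not model.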
|
#
519244c7 |
| 25-Dec-2024 |
Yanqin Li <[email protected]> |
submodule(CoupledL2, OpenLLC): support pbmt in CHI scene (#4071)
* L1: deliver the NC and PMA signals of uncacheReq to L2
* L2: [support Svpbmt on CHI MemAttr](https://github.com/OpenXiangShan/CoupledL2/pull/273)
* LLC: [Non-cache requests are forwarded directly downstream without entering the slice](https://github.com/OpenXiangShan/OpenLLC/pull/28)
|
#
8b33cd30 |
| 13-Dec-2024 |
klin02 <[email protected]> |
feat(XSLog): move all XSLog outside WhenContext for collection
Data in WhenContext is not accessible from another module. To support XSLog collection, we move all XSLog calls and related signals outside WhenContext. For example, `when(cond1){XSDebug(cond2, pable)}` becomes `XSDebug(cond1 && cond2, pable)`.
|
#
e10e20c6 |
| 27-Nov-2024 |
Yanqin Li <[email protected]> |
style(pbmt): remove the useless and standardize code
* style(pbmt): remove outstanding constant which is just for self-test
* fix(uncache): added mask comparison for `addrMatch`
* style(mem): code normalization
* fix(pbmt): handle cases where the load unit is byte, word, etc
* style(uncache): fix an import
* fix(uncache): address match should use non-offset address when forwarding
In this case, to ensure correct forwarding, stores with the same address but overlapping masks cannot be entered at the same time.
* style(RAR): remove redundant design of `nc` reg
|
#
043d3da4 |
| 27-Nov-2024 |
Yanqin Li <[email protected]> |
timing(uncache): match paddr in f1 during ubuffer forwarding
|
#
46236761 |
| 22-Nov-2024 |
Yanqin Li <[email protected]> |
fix(uncache): fix a list of bugs of outstanding
* fix(uncache): delay flush until receiving uncache resp when state is not idle
* style(pbmt): remove some useless code and comments
* fix(uncache): not alloc and ready when existing same address entry
|
#
e04c5f64 |
| 19-Nov-2024 |
Yanqin Li <[email protected]> |
feat(outstanding): support nc outstanding and remove mmio st outstanding
|
#
cfdd605f |
| 14-Nov-2024 |
Yanqin Li <[email protected]> |
feat(uncache): change queue to buffer to prepare for outstanding
|
#
c7353d05 |
| 03-Sep-2024 |
Yanqin Li <[email protected]> |
feat(NCld): support WMO access for NC ld
* feat(LDU): add support for NC in LoadUnit
* feat(LQ,UB): add support for NC in load queue and uncache buffer
* chore(pbmt): add xsperf for nc ld statistic
|
#
58cb1b0b |
| 06-Jun-2024 |
zhanglinjuan <[email protected]> |
CoupledL2, Uncache, LSQ: support non-data error handling (#3042)
According to CHI specification, a non-data error should be reported when
an error is detected that is not related to data corruption. Typically
this error is reported for:
* An attempt to access a location that does not exist.
* An illegal access, such as a write to a read only location.
* An attempt to use a transaction type that is not supported.
While the second kind of errors can be resolved by PMA, the first and
the third kind of errors were not supported yet.
This commit implements non-data error handling path. MMIOBridge in
CoupledL2 transfers CHI `RespErr` field downwards into TileLink `denied`
field upwards. Uncache in DCache passes the error to LSQ to generate
access fault exception:
* For MMIO loads, UncacheBuffer writes back `exceptionVec` to LoadUnit
s0 and informs exception address to ExceptionBuffer at the same time.
* For MMIO stores, SQ writes back `exceptionVec` to Backend directly.
BTW, data error is still not supported.
|
#
06999a30 |
| 26-Nov-2023 |
sfencevma <[email protected]> |
Uncache: fix flush.empty logic (#2504)
Co-authored-by: Lyn <[email protected]>
|
#
8891a219 |
| 08-Oct-2023 |
Yinan Xu <[email protected]> |
Bump rocket-chip (#2353)
|
#
935edac4 |
| 21-Sep-2023 |
Tang Haojin <[email protected]> |
chore: remove deprecated brackets, APIs, etc. (#2321)
|
#
95e60e55 |
| 18-Sep-2023 |
Tang Haojin <[email protected]> |
LazyModule: do not inline lazy modules in XS (#2311)
|
#
3c02ee8f |
| 25-Dec-2022 |
wakafa <[email protected]> |
Separate Utility submodule from XiangShan (#1861)
* misc: add utility submodule
* misc: adjust to new utility framework
* bump utility: revert resetgen
* bump huancun
|
#
37225120 |
| 07-Dec-2022 |
sfencevma <[email protected]> |
Uncache: optimize write operation (#1844)
This commit adds an uncache write buffer to accelerate uncache writes.
For the uncacheable address range, we now use the atomic bit in PMA to indicate
that uncache writes in this range should not use the uncache write buffer.
Note that XiangShan does not support atomic insts in the uncacheable address range.
* uncache: optimize write operation
* pma: add atomic config
* uncache: assign hartId
* remove some pma atomic
* extend peripheral id width
Co-authored-by: Lyn <[email protected]>
|