Uncache.scala - OpenGrok history log for /XiangShan/src/main/scala/xiangshan/cache/dcache/Uncache.scala

Revision	Date	Author	Comments
# dd3d70ba	08-Apr-2025	Yanqin Li <[email protected]>	fix(Uncache): uncache mm store needs difftest to update goldenmem (#4470)
# 12931efe	14-Mar-2025	Yanqin Li <[email protected]>	fix(uncache): if can merge, it can enter even if buffer is full (#4408)
# 11269ca7	09-Mar-2025	Tang Haojin <[email protected]>	chore: fix several deprecation warning (#4352)
# 759834f0	24-Feb-2025	Yanqin Li <[email protected]>	fix(Uncache): correct the indexes (#4304) 1. Wrong forward index 2. Duplicate judgment `waitSame` index
# 1eb8dd22	24-Feb-2025	Kunlin You <[email protected]>	submodule(utility), XSDebug: support collecting missing XSDebug (#4251) Previously, in PR#3982, we added support for collecting XSLogs to LogPerfEndpoint. However, with --enable-log, we should also collect some missing XSDebug. This change moves these missing XSDebug calls outside WhenContext and adds WireInit to LogUtils' apply, to enable probing some sub-accessed data, such as a Vec element with a dynamic index.
# ccd7d228	18-Feb-2025	Yanqin Li <[email protected]>	fix(Uncache): handle flush (#4230)
# d74a7897	17-Feb-2025	Yanqin Li <[email protected]>	fix(uncache): avoid merging the corner cases (#4268) When `lsq.req` enters `uncachebuffer`, an entry that meets the merge conditions of `lsq.req` cannot be merged into if it is in either of the following states: 1. receiving the uncache response 2. waiting to return `lsq.resp` Why? The state of the former entry changes to `waitReturn`. There is then no signal to wake up the latter entries whose state is `waitSame`, because the former entry, which would trigger the wake-up, will not be sent to the bus, get a response, and wake up the latter entry.
# 9e12e8ed	08-Feb-2025	cz4e <[email protected]>	style(Bundles): move bundles to Bundles.scala (#4247)
# 74050fc0	26-Jan-2025	Yanqin Li <[email protected]>	perf(Uncache): add merge policy when entering (#4154)

# Background

## Problem

How to design a more efficient entry rule for a new load/store request when a load/store with the same address already exists in the `ubuffer`?

* Old Design: Always reject the new request.
* New Design: Consider merging requests.

## Merge Scenarios

‼️ If the new request can be merged into the existing one, both need to be `NC`.

1. New Store Request:
   1. Existing Store: Merge (the new store is younger).
   2. Existing Load: Reject.
2. New Load Request:
   1. Existing Load: Merge (the new load may be younger or older; both are OK to merge).
   2. Existing Store: Reject.

# What does this PR do?

## 1. Entry Actions

1. Allocate a new entry and mark it as `valid`:
   1. When there is no matching address.
2. Allocate a new entry and mark it as `valid` and `waitSame`:
   1. When there is a matching address, and:
      * The virtual addresses and attributes are the same.
      * The older entry is either selected to issue or issued.
3. Merge into an existing entry:
   1. When there is a matching address, and:
      * The virtual addresses and attributes are the same.
      * The older entry is not selected to issue or issued.
4. Reject the new request:
   1. When the ubuffer is full.
   2. When there is a matching address, but:
      * The virtual addresses or attributes are different.

NOTE: According to the definition in the TL-UL SPEC, the `mask` must be continuous and naturally aligned, and the `addr` must correspond to the mask. Therefore, the "same attributes" here introduces a new condition: the merged `mask` must meet the requirements of being continuous and naturally aligned (function `continueAndAlign`). During merging, the block offset of `addr` must be synchronously updated in `UncacheEntry.update`.

## 2. Handshake Mechanism Between `LoadQueueUncache (M)` and `Uncache (S)`

> `mid`: master id
>
> `sid`: slave id

Old Design:
- `M` sends a `req` with a `mid`.
- `S` receives the `req` and records the `mid`.
- `S` sends a `resp` with the `mid`.
- `M` receives the `resp` and matches it with the recorded `mid`.

New Design:
- `M` sends a `req` with a `mid`.
- `S` receives the `req` and responds with `{mid, sid}`.
- `M` matches it with the `mid` and updates its record with the received `sid`.
- `S` sends a `resp` with its `sid`.
- `M` receives the `resp` and matches it with the recorded `sid`.

Benefit: The new design allows `S` to merge requests when a new request enters.

## 3. Forwarding Mechanism

Old Design: Each address in the `ubuffer` is unique, so forwarding is straightforward based on a match.

New Design:
* A single address may have up to two matching entries in the `ubuffer`.
* If it has two matching entries, one entry must be marked `inflight` and the other `waitSame`. In this case, the forwarded data comes from the merged data of the two entries, with the `inflight` entry being the older one.

## 4. Bug Fixes

1. In the `loadUnit`, `!tlbMiss` cannot be directly used as `tlbHit`, because when `tlbValid` is false, `!tlbMiss` can still be true.
2. `Uncache` state machine transition: The state indicating "able to send requests" (previously `s_refill_req`, now `s_inflight`) should not be triggered by `reqFire` but rather by `acquireFire`.

<img width="747" alt="image" src="https://github.com/user-attachments/assets/75fbc761-1da8-43d9-a0e6-615cc58cefef" />

# Evaluation

- ✅ timing
- ✅ performance

| Type           | 4B1000 | Speedup1-IO | 1B4096 | Speedup2-IO |
| -------------- | ------- | ----------- | ------- | ----------- |
| IO             | 51026  | 1           | 208149 | 1.00        |
| NC             | 42343  | 1.21        | 169248 | 1.23        |
| NC+OT          | 20379  | 2.50        | 160101 | 1.30        |
| NC+OT+mergeOpt | 16308  | 3.13        | 126369 | 1.65        |
| cache          | 1298   | 39.31       | 4410   | 47.20       |
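The entry rules listed under "Entry Actions" in #4154 can be read as a small decision function. The following is a minimal plain-Scala sketch, not the Chisel code in `Uncache.scala`: the object name `UbufferEntryPolicy`, the `Matching` summary type, the action names, the 8-bit mask width, and the choice to test "full" first are all assumptions made for illustration. Only the name `continueAndAlign` comes from the commit message, and its body here is a guess at the check that message describes (set bits form one contiguous, naturally aligned, power-of-two run).

```scala
object UbufferEntryPolicy {
  sealed trait Action
  case object AllocNew extends Action      // new entry, marked valid
  case object AllocWaitSame extends Action // new entry, marked valid and waitSame
  case object Merge extends Action         // merge into the existing entry
  case object Reject extends Action        // refuse the request

  // Hypothetical summary of an existing entry whose address matches the request.
  final case class Matching(sameVaddrAndAttr: Boolean, selectedOrIssued: Boolean)

  // Assumed check: the set bits of an 8-bit byte mask form a single contiguous
  // run whose length is a power of two and whose start is a multiple of it.
  def continueAndAlign(mask: Int): Boolean = {
    require(mask >= 0 && mask < 256, "mask is modelled as 8 bits")
    if (mask == 0) false
    else {
      val low = Integer.numberOfTrailingZeros(mask) // index of the first set bit
      val run = mask >>> low                        // run shifted down to bit 0
      val size = Integer.bitCount(mask)             // number of enabled bytes
      val contiguous = (run & (run + 1)) == 0       // run looks like 0b0..01..1
      val powerOfTwo = (size & (size - 1)) == 0
      contiguous && powerOfTwo && (low % size == 0)
    }
  }

  // Assumption: a full buffer rejects everything, as listed in #4154.
  def decide(full: Boolean, matching: Option[Matching]): Action =
    if (full) Reject
    else matching match {
      case None                              => AllocNew
      case Some(m) if !m.sameVaddrAndAttr    => Reject
      case Some(m) if m.selectedOrIssued     => AllocWaitSame
      case Some(_)                           => Merge
    }
}
```

In this sketch a caller would fold the mask condition into `sameVaddrAndAttr`, for example by requiring `continueAndAlign(existingMask | newMask)`. Note that the later commit 12931efe (#4408) states that a mergeable request can enter even when the buffer is full, which this sketch of #4154 does not model.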
# db81ab70	09-Jan-2025	Yanqin Li <[email protected]>	fix(uncache): consider both corrupt and denied when granting (#4150) From TileLink SPEC 1.9.3 Chapter 7 "TileLink Uncached Lightweight (TL-UL)": * AccessAck: `d_corrupt` is reserved and must be 0. * AccessAckData: `d_corrupt` being HIGH indicates that masked data in this beat is corrupt. So both `d_denied` and `d_corrupt` need to be considered when getting the data. Uncache currently completes in one beat, so it can evaluate `d_denied || d_corrupt` directly.
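As an illustration of the single-beat check described in #4150, here is a minimal plain-Scala sketch. The `DBeat` case class and the `GrantError` object are hypothetical names introduced for this example; the real code operates on TileLink D-channel signals in Chisel.

```scala
object GrantError {
  // Hypothetical model of one D-channel beat: `hasData` distinguishes
  // AccessAckData (true) from AccessAck (false).
  final case class DBeat(hasData: Boolean, denied: Boolean, corrupt: Boolean)

  // With a single-beat response, the error flag is the OR of both fields.
  def isError(d: DBeat): Boolean = {
    // Per the TL-UL rule quoted above, d_corrupt must be 0 for AccessAck.
    require(d.hasData || !d.corrupt, "d_corrupt is reserved for AccessAck")
    d.denied || d.corrupt
  }
}
```

A multi-beat response would instead need to accumulate the flag across beats, which is why the commit message points out that uncache completes in one beat.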
# 519244c7	25-Dec-2024	Yanqin Li <[email protected]>	submodule(CoupledL2, OpenLLC): support pbmt in CHI scene (#4071) * L1: deliver the NC and PMA signals of uncacheReq to L2 * L2: [support Svpbmt on CHI MemAttr](https://github.com/OpenXiangShan/CoupledL2/pull/273) * LLC: [Non-cache requests are forwarded directly downstream without entering the slice](https://github.com/OpenXiangShan/OpenLLC/pull/28)
# 8b33cd30	13-Dec-2024	klin02 <[email protected]>	feat(XSLog): move all XSLog outside WhenContext for collection Data in a WhenContext is not accessible from another module. To support XSLog collection, we move all XSLog calls and related signals outside WhenContext. For example, `when(cond1){XSDebug(cond2, pable)}` becomes `XSDebug(cond1 && cond2, pable)`.
# e10e20c6	27-Nov-2024	Yanqin Li <[email protected]>	style(pbmt): remove the useless and standardize code * style(pbmt): remove outstanding constant which is just for self-test * fix(uncache): added mask comparison for `addrMatch` * style(mem): code normalization * fix(pbmt): handle cases where the load unit is byte, word, etc * style(uncache): fix an import * fix(uncache): address match should use non-offset address when forwarding In this case, to ensure correct forwarding, stores with the same address but overlapping masks cannot be entered at the same time. * style(RAR): remove redundant design of `nc` reg
# 043d3da4	27-Nov-2024	Yanqin Li <[email protected]>	timing(uncache): match paddr in f1 during ubuffer forwarding
# 46236761	22-Nov-2024	Yanqin Li <[email protected]>	fix(uncache): fix a list of bugs of outstanding * fix(uncache): delay flush until receiving uncache resp when state is not idle * style(pbmt): remove some useless code and comments * fix(uncache): not alloc and ready when existing same address entry
# e04c5f64	19-Nov-2024	Yanqin Li <[email protected]>	feat(outstanding): support nc outstanding and remove mmio st outstanding
# cfdd605f	14-Nov-2024	Yanqin Li <[email protected]>	feat(uncache): change queue to buffer to prepare for outstanding
# c7353d05	03-Sep-2024	Yanqin Li <[email protected]>	feat(NCld): support WMO access for NC ld * feat(LDU): add support for NC in LoadUnit * feat(LQ,UB): add support for NC in load queue and uncache buffer * chore(pbmt): add xsperf for nc ld statistic
# 58cb1b0b	06-Jun-2024	zhanglinjuan <[email protected]>	CoupledL2, Uncache, LSQ: support non-data error handling (#3042) According to the CHI specification, a non-data error should be reported when an error is detected that is not related to data corruption. Typically this error is reported for: * An attempt to access a location that does not exist. * An illegal access, such as a write to a read-only location. * An attempt to use a transaction type that is not supported. While the second kind of error can be resolved by PMA, the first and third kinds were not yet supported. This commit implements the non-data error handling path. MMIOBridge in CoupledL2 transfers the CHI `RespErr` field from downstream into the TileLink `denied` field sent upstream. Uncache in DCache passes the error to LSQ to generate an access fault exception: * For MMIO loads, UncacheBuffer writes back `exceptionVec` to LoadUnit s0 and reports the exception address to ExceptionBuffer at the same time. * For MMIO stores, SQ writes back `exceptionVec` to Backend directly. BTW, data error is still not supported.
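The error path described in #3042 can be sketched in plain Scala as two small steps: a CHI `RespErr` value becomes a TileLink `denied` flag, and `denied` becomes an access fault whose kind depends on whether the access was a load or a store. Everything named below (`NonDataErrorPath`, `deniedFromRespErr`, `exceptionFor`) is hypothetical. The sketch assumes that only the non-data error encoding sets `denied`, which matches the statement that data errors are not yet supported but is not confirmed by the log. The `RespErr` encodings follow the CHI specification and the cause codes follow the RISC-V privileged specification.

```scala
object NonDataErrorPath {
  // CHI RespErr encodings (2-bit field).
  val OK = 0    // Normal Okay
  val EXOK = 1  // Exclusive Okay
  val DERR = 2  // Data Error
  val NDERR = 3 // Non-data Error

  // Assumption: only NDERR is translated into TileLink `denied`;
  // DERR is ignored here because data errors are not handled yet.
  def deniedFromRespErr(respErr: Int): Boolean = {
    require(respErr >= 0 && respErr <= 3, "RespErr is a 2-bit field")
    respErr == NDERR
  }

  // RISC-V exception cause codes for access faults.
  val LoadAccessFault = 5
  val StoreAccessFault = 7

  // A denied uncache response raises the access fault matching the access type.
  def exceptionFor(isStore: Boolean, denied: Boolean): Option[Int] =
    if (!denied) None
    else Some(if (isStore) StoreAccessFault else LoadAccessFault)
}
```

For example, an MMIO load whose response carries `NDERR` would yield `Some(5)` in this model, while the same response to an MMIO store would yield `Some(7)`.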
# 06999a30	26-Nov-2023	sfencevma <[email protected]>	Uncache: fix flush.empty logic (#2504) Co-authored-by: Lyn <[email protected]>
# 8891a219	08-Oct-2023	Yinan Xu <[email protected]>	Bump rocket-chip (#2353)
# 935edac4	21-Sep-2023	Tang Haojin <[email protected]>	chore: remove deprecated brackets, APIs, etc. (#2321)
# 95e60e55	18-Sep-2023	Tang Haojin <[email protected]>	LazyModule: do not inline lazy modules in XS (#2311)
# 3c02ee8f	25-Dec-2022	wakafa <[email protected]>	Separate Utility submodule from XiangShan (#1861) * misc: add utility submodule * misc: adjust to new utility framework * bump utility: revert resetgen * bump huancun
# 37225120	07-Dec-2022	sfencevma <[email protected]>	Uncache: optimize write operation (#1844) This commit adds an uncache write buffer to accelerate uncache writes. For an uncacheable address range, we now use the atomic bit in PMA to indicate that uncache writes in this range should not use the uncache write buffer. Note that XiangShan does not support atomic insts in uncacheable address ranges. * uncache: optimize write operation * pma: add atomic config * uncache: assign hartId * remove some pma atomic * extend peripheral id width Co-authored-by: Lyn <[email protected]>