#
9e12e8ed |
| 08-Feb-2025 |
cz4e <[email protected]> |
style(Bundles): move bundles to Bundles.scala (#4247)
|
#
519244c7 |
| 25-Dec-2024 |
Yanqin Li <[email protected]> |
submodule(CoupledL2, OpenLLC): support pbmt in CHI scene (#4071)
* L1: deliver the NC and PMA signals of uncacheReq to L2 * L2: [support Svpbmt on CHI MemAttr](https://github.com/OpenXiangShan/Coupl
submodule(CoupledL2, OpenLLC): support pbmt in CHI scene (#4071)
* L1: deliver the NC and PMA signals of uncacheReq to L2 * L2: [support Svpbmt on CHI MemAttr](https://github.com/OpenXiangShan/CoupledL2/pull/273) * LLC: [Non-cache requests are forwarded directly downstream without entering the slice](https://github.com/OpenXiangShan/OpenLLC/pull/28)
show more ...
|
#
562eaa0c |
| 15-Dec-2024 |
Anzooooo <[email protected]> |
fix(MemBlock): fix misaligned exception and remove redundant reg from `SQ`
|
#
b240e1c0 |
| 07-Nov-2024 |
Anzooooo <[email protected]> |
feat(Zicclsm): refactoring misalign and support vector misalign
|
#
38c29594 |
| 26-Nov-2024 |
zhanglinjuan <[email protected]> |
feat(MemBlock): add support for Zacas extension
fix(AtomicsUnit, MemBlock): fix loss of multiple stds
In the previous design, AtomicsUnit receives stds from StdExeUnit and arbitrate at most one std
feat(MemBlock): add support for Zacas extension
fix(AtomicsUnit, MemBlock): fix loss of multiple stds
In the previous design, AtomicsUnit receives stds from StdExeUnit and arbitrate at most one std uop for one cycle. This works fine on most of the AMOs and LR/SC because they require only one std uop. However AMOCAS requires at least two std uops, which may be issued from two separate issue queues at the same time, leading to the loss of std uops.
This commit fixes this by taking all the outputs of the StdExeUnits into account with arbitration logics.
fix(AtomicsUnit): DCache req can only be sent at `s_cache_req`
fix(AtomicsUnit, difftest): fix difftest io for atomic events
fix(MainPipe): fix precedence of `&` and `=/=` operator
fix(MainPipe): AMOCAS should not wait for AMOALU
fix(MemBlock): remove unnecessary assertion
fix(MainPipe): only CAS instruction can assert `s3_cas_fail`
fix(AtomicsUnit): fix bug in data select logic
submodule(difftest): bump difftest
show more ...
|
#
e10e20c6 |
| 27-Nov-2024 |
Yanqin Li <[email protected]> |
style(pbmt): remove the useless and standardize code
* style(pbmt): remove outstanding constant which is just for self-test
* fix(uncache): added mask comparison for `addrMatch`
* style(mem): code
style(pbmt): remove the useless and standardize code
* style(pbmt): remove outstanding constant which is just for self-test
* fix(uncache): added mask comparison for `addrMatch`
* style(mem): code normalization
* fix(pbmt): handle cases where the load unit is byte, word, etc
* style(uncache): fix an import
* fix(uncahce): address match should use non-offset address when forwading
In this case, to ensure correct forwarding, stores with the same address but overlapping masks cannot be entered at the same time.
* style(RAR): remove redundant design of `nc` reg
show more ...
|
#
e04c5f64 |
| 19-Nov-2024 |
Yanqin Li <[email protected]> |
feat(outstanding): support nc outstanding and remove mmio st outstanding
|
#
c7353d05 |
| 03-Sep-2024 |
Yanqin Li <[email protected]> |
feat(NCld): support WMO access for NC ld
* feat(LDU): add support for NC in LoadUnit
* feat(LQ,UB): add support for NC in load queue and uncache buffer
* chore(pbmt): add xsperf for nc ld statistic
|
#
d0d2c22d |
| 15-Sep-2024 |
Anzooooo <[email protected]> |
feat(VLSU): element in which the exception occurs needs to retain its old value
|
#
c0355297 |
| 11-Sep-2024 |
Anzooooo <[email protected]> |
feat(VLSU): set vstart when the support vector accesses anomalies
|
#
46e9ee74 |
| 27-Sep-2024 |
Haoyuan Feng <[email protected]> |
fix(exception): fix exception vaddr generate logic (#3639)
In LSU, for exceptions that can be detected before address
translation(`preaf`, `prepf` or `pregpf`), the original vaddr should be
retain
fix(exception): fix exception vaddr generate logic (#3639)
In LSU, for exceptions that can be detected before address
translation(`preaf`, `prepf` or `pregpf`), the original vaddr should be
retained. And for exceptions detected after address translation, the
48-bit vaddr needs to be zero-extended or sign-extended according to
different modes(`GenExceptionVa`), and then write to *tval.
Also fix some connection bugs.
show more ...
|
#
ad415ae0 |
| 21-Sep-2024 |
Xiaokun-Pei <[email protected]> |
feat(trap): support m/htinst for specific G-stage translation (#3604)
According to RISC-V priv spec, mtinst/htinst could be always written
zero on trap into M/HS-mode, except for Guest-Page-Fault t
feat(trap): support m/htinst for specific G-stage translation (#3604)
According to RISC-V priv spec, mtinst/htinst could be always written
zero on trap into M/HS-mode, except for Guest-Page-Fault traps that meet
both of the following conditions:
- the trap is caused by a G-stage translation which supports VS-stage
translation
- a nonzero value is written to mtval2/htval
"isForVSnonLeafPTE" is used only in exceptional circumstances that gpf
happens in the G-stage translation which supports VS-stage translation,
such as searching the non-leaf pte of VS-stage.
This patch adds support for writing proper value to mtinst/htinst when
specific trap occurs. And bump the nemu.
show more ...
|
#
db6cfb5a |
| 19-Sep-2024 |
Haoyuan Feng <[email protected]> |
fix(exception): check high address bits of lsu (#3596)
In previous implementation, we simply truncated the higher bits of jump
target or load & store address, which made it impossible to raise
exc
fix(exception): check high address bits of lsu (#3596)
In previous implementation, we simply truncated the higher bits of jump
target or load & store address, which made it impossible to raise
exceptions in such cases.
Commit
https://github.com/OpenXiangShan/XiangShan/commit/c1b28b66879239a5b3a44741376f3b002e8ac834
has already fixed high address bits checking of jump target. This commit
fixes lsu part, checking full address in tlb and passing full address
directly to csr.
show more ...
|
#
f4221883 |
| 06-Sep-2024 |
happy-lx <[email protected]> |
perf(L1PF): Stream only pf at miss/pfHit (#3508)
Perf Bug Description:
<img
src="https://github.com/user-attachments/assets/3d1a7105-088b-467a-9c93-833f534bb4e6"
width="300"/>
Stream Prefetcher
perf(L1PF): Stream only pf at miss/pfHit (#3508)
Perf Bug Description:
<img
src="https://github.com/user-attachments/assets/3d1a7105-088b-467a-9c93-833f534bb4e6"
width="300"/>
Stream Prefetcher is **trained and triggered in all memory access
traces**. If the program(As shown above) repeatedly accesses an 8K space
in a loop, the first loop can be prefetched normally, but in the
subsequent loop the data has been fetched back to Dcache already. In
theory, there is no need to prefetch again, since the Stream Prefetcher
is triggered in all memory access traces, which will cause subsequent
prefetching requests to be triggered and preempt the pipeline which may
cause performance loss.
FIX:
Let the Stream prefetcher only trigger prefetching when **miss and
Prefetch hit** (training still uses all memory access traces).
show more ...
|
#
08b0bc30 |
| 03-Sep-2024 |
happy-lx <[email protected]> |
timing(MemBlock): optimize MemBlock timing (#3467)
This PR optimizes the timing of MemBlock. Specific optimizations include
but are not limited to:
+ TLB use the redirect for the next cycle
+ Opt
timing(MemBlock): optimize MemBlock timing (#3467)
This PR optimizes the timing of MemBlock. Specific optimizations include
but are not limited to:
+ TLB use the redirect for the next cycle
+ Optimize VLSU feedback and redirect
+ Optimise ldCancel and writeback signal generation
+ Optimise TLB Query Vaddr/hlv/hlvx/valid etc
+ Delay MMIO Store writeback for 1 Cycle
+ Fix tlbNoQuery and pmp logic
+ Remove clock gating for s3_fast_rep
+ Remove wbq conflict check to LoadPipe/MainPipe
+ Remove Mux in dcache resp data
+ Optimise data generation logic of LoadUnit
+ Duplicate Register in LoadUnit for data writeback
+ Duplicate Register in loadPipe for missQueue enq
+ Add skid buffer in VLSU
+ Select data from metaArray at S1
+ Simplify the enqueuing logic of missQueue
+ Separately generate the ready logic of miss Queue
+ Relax the conditions valid for bankdataArray reads
+ Add Reg between Dcache Mainpipe with sms prefetcher
+ Optimise store exceptionBuffer pipeline
---------
Co-authored-by: weiding liu <[email protected]>
Co-authored-by: Charlie Liu <[email protected]>
Co-authored-by: good-circle <[email protected]>
show more ...
|
#
41d8d239 |
| 21-Aug-2024 |
happy-lx <[email protected]> |
RVA23: Support Zicclsm & Zama16b (Handling Unaligned Load Store by Hardware) (#3320)
This PR supports handling load store unaligned exceptions by hardware
and provides CSR-controlled switches
--
RVA23: Support Zicclsm & Zama16b (Handling Unaligned Load Store by Hardware) (#3320)
This PR supports handling load store unaligned exceptions by hardware
and provides CSR-controlled switches
---------
Co-authored-by: xiaofeibao <[email protected]>
show more ...
|
#
e3da8bad |
| 22-Jul-2024 |
Tang Haojin <[email protected]> |
build: purge chisel 3 and add deprecation check (#3250)
|
#
38f78b5d |
| 10-Jul-2024 |
xiaofeibao-xjtu <[email protected]> |
Backend&MemBlock: feedback use sqidx instead of robidx and uopidx for fix timing (#3172)
|
#
5adc4829 |
| 16-Jun-2024 |
Yanqin Li <[email protected]> |
memblock: add rest clockgate of reg (#3017)
Co-authored-by: cai luoshan <[email protected]> Co-authored-by: Cai Luoshan <[email protected]> Co-authored-by: good-circle <
memblock: add rest clockgate of reg (#3017)
Co-authored-by: cai luoshan <[email protected]> Co-authored-by: Cai Luoshan <[email protected]> Co-authored-by: good-circle <[email protected]> Co-authored-by: Ma-YX <[email protected]> Co-authored-by: Ma-YX <[email protected]> Co-authored-by: CharlieLiu <[email protected]>
show more ...
|
#
2643bd71 |
| 14-May-2024 |
good-circle <[email protected]> |
StoreQueue: re-enter exceptionbuffer when store_s2
storeunit will resp some exception (misaligned or pagefault) when store_s1, however, pmp will raise access fault when store_s2, which should be wri
StoreQueue: re-enter exceptionbuffer when store_s2
storeunit will resp some exception (misaligned or pagefault) when store_s1, however, pmp will raise access fault when store_s2, which should be written into exceptionbuffer
show more ...
|
#
dde74b27 |
| 09-May-2024 |
Anzooooo <[email protected]> |
VLSU: fix st-ld violation checks
when store pipeline is 128-bits vector store, st-ld checker need compare vaddr(paddrBits - 1, 4), instead of vaddr(paddrBits - 1, 3).
|
#
a4d1b2d1 |
| 13-May-2024 |
good-circle <[email protected]> |
Merge branch 'master' into vlsu-merge-master-0504
|
#
bad60841 |
| 10-May-2024 |
Xiaokun-Pei <[email protected]> |
IFU & GPAMem, RVH: fix the bug about getting gpa (#2960)
1. Delete some useless codes about gpaddr.
2. fix the bugs about wrong gpa was writen in mtval2 or htval when guest
page fault occured
|
#
20e09ab1 |
| 09-May-2024 |
happy-lx <[email protected]> |
fix bug of stream (#2756)
Bug Description:
(1) Increase the way of Dcache to 8 to reduce the problem of running on the bwaves test caused by too many addresses mapped to the same set.
(2) Set ldu0
fix bug of stream (#2756)
Bug Description:
(1) Increase the way of Dcache to 8 to reduce the problem of running on the bwaves test caused by too many addresses mapped to the same set.
(2) Set ldu0 to a high-confidence prefetch request channel to increase the probability that the prefetch request will be accepted by Dcache's MSHR.
(3) Fix the issue that ldu sends an error ready back to the prefetcher to prevent the prefetch request from being dropped.
(4) Dont let the prefetch request access Dcache's DataArray.
(5) Add a extra port in Muti-level prefetch Queue to accept more pf req from stream&stride
(6) Larger Stream bit vector Array 16 -> 32 to cover muti Stream access pattern in Bwaves and GemsFDTD.
In addition, the decline in libquantum is a bit strange.
show more ...
|
#
25df626e |
| 04-May-2024 |
good-circle <[email protected]> |
Merge branch 'master' into vlsu-tmp-master
|