#
f06ca0bf |
| 13-Jul-2021 |
Lingrui98 <[email protected]> |
[WIP] finish ftq logic and fix syntax errors
* Now can pass compiling.
[WIP] comment out-of-date code in frontend
[WIP] move NewFtq to xiangshan.frontend and rename class to Ftq
Ibuffer: update s
[WIP] finish ftq logic and fix syntax errors
* Now can pass compiling.
[WIP] comment out-of-date code in frontend
[WIP] move NewFtq to xiangshan.frontend and rename class to Ftq
Ibuffer: update sigal names for new IFU
[WIP] remove redundant NewFrontend
[WIP] set entry_fetch_status to f_sent once send req to buf
Fix syntax error in IFU
Fix syntax error in IFU/ICache/Ibuffer
[WIP] indent fix in ftq
BPU: Move GlobalHistory define from IFU.scala to BPU.scala
[WIP] fix some compilation errors
BPU: Remove HasIFUConst and move some bundles from BPU.scala to frontendBundle.scala
[WIP] fix some compilation errors
[WIP] rename ftq-bpu ios
[WIP] recover some const definitions
[WIP] fix some compilation errors
[WIP]connect some IOs in frontend
BPU: fix syntax error
[WIP] fix compilation errors in predecode
BPU: fix RAS syntax error
[WIP] add some simulation perf counters back
BPU: Remove numBr redefine in ubtb and bim
show more ...
|
#
16a1cc4b |
| 14-Jul-2021 |
zoujr <[email protected]> |
[WIP] BPU: Modify interface name add handshake between pipeline stage
|
#
4ee28b02 |
| 10-Jul-2021 |
zoujr <[email protected]> |
Merge branch 'decoupled-frontend' of github.com:OpenXiangShan/XiangShan into decoupled-frontend
|
#
3c02c6c7 |
| 08-Jul-2021 |
zoujr <[email protected]> |
[WIP]BPU: Decoupled frontend BPU design
|
#
e0d9a9f0 |
| 05-Jul-2021 |
Lingrui98 <[email protected]> |
core: move ftq to frontend
|
#
c6d43980 |
| 04-Jun-2021 |
Lemover <[email protected]> |
Add MulanPSL-2.0 License (#824)
In this commit, we add License for XiangShan project.
|
#
5c7674fe |
| 15-May-2021 |
Yinan Xu <[email protected]> |
backend,RS: rewrite RS to optimize timing (#812)
* test,vcs: call $finish when difftest fails
* backend,RS: refactor with more submodules
This commit rewrites the reservation station in a more
backend,RS: rewrite RS to optimize timing (#812)
* test,vcs: call $finish when difftest fails
* backend,RS: refactor with more submodules
This commit rewrites the reservation station in a more configurable style.
The new RS has not finished.
- Support only integer instructions
- Feedback from load/store instructions is not supported
- Fast wakeup for multi-cycle instructions is not supported
- Submodules are refined later
* RS: use wakeup signals from arbiter.out
* RS: support feedback and re-schedule when needed
For load and store reservation stations, the instructions that left RS before may be
replayed later.
* test,vcs: check difftest_state and return on nemu trap instructions
* backend,RS: support floating-point operands and delayed regfile read for store RS
This commit adds support for floating-point instructions in reservation stations.
Beside, currently fp data for store operands come a cycle later than int data. This
feature is also supported.
Currently the RS should be ready for any circumstances.
* rs,status: don't trigger assertions when !status.valid
* test,vcs: add +workload option to specify the ram init file
* backend,rs: don't enqueue when redirect.valid or flush.valid
* backend,rs: support wait bit that instruction waits until store issues
This commit adds support for wait bit, which is mainly used in load and
store reservation stations to delay instruction issue until the corresponding
store instruction issued.
* backend,RS: optimize timing
This commit optimizes BypassNetwork and PayloadArray timing.
- duplicate bypass mask to avoid too many FO4
- use one-hot vec to get read data
show more ...
|
#
de169c67 |
| 11-May-2021 |
William Wang <[email protected]> |
backend,mem: add Store Sets memory dependence predictor (#796)
* LoadQueue: send stFtqIdx via rollback request
* It will make it possible for setore set to update its SSIT
* StoreSet: setup st
backend,mem: add Store Sets memory dependence predictor (#796)
* LoadQueue: send stFtqIdx via rollback request
* It will make it possible for setore set to update its SSIT
* StoreSet: setup store set update req
* StoreSet: add store set identifier table (SSIT)
* StoreSet: add last fetched store table (LFST)
* StoreSet: put SSIT into decode stage
* StoreSet: put LFST into dispatch1
* Future work: optimize timing
* RS: store rs now supports delayed issue
* StoreSet: add perf counter
* StoreSet: fix SSIT update logic
* StoreSet: delay LFST update input for 1 cycle
* StoreSet: fix LFST update logic
* StoreSet: fix LFST raddr width
* StoreSet: do not force store in ss issue in order
Classic store set requires store in the same store set issue in seq.
However, in current micro-architecture, such restrict will lead to
severe perf lost. We choose to disable it until we find another way
to fix it.
* StoreSet: support ooo store in the same store set
* StoreSet: fix store set merge logic
* StoreSet: check earlier store when read LFST
* If store-load pair is in the same dispatch bundle, loadWaitBit should
also be set for load
* StoreSet: increase default SSIT flush period
* StoreSet: fix LFST read logic
* Fix commit c0e541d14
* StoreSet: add StoreSetEnable parameter
* RSFeedback: add source type
* StoreQueue: split store addr and store data
* StoreQueue: update ls forward logic
* Now it supports splited addr and data
* Chore: force assign name for load/store unit
* RS: add rs'support for store a-d split
* StoreQueue: fix stlf logic
* StoreQueue: fix addr wb sq update logic
* AtomicsUnit: support splited a/d
* Parameters: disable store set by default
* WaitTable: wait table will not cause store delay
* WaitTable: recover default reset period to 2^17
* Fix dev-stad merge conflict
* StoreSet: enable storeset
* RS: disable store rs delay logic
CI perf shows that current delay logic will cause perf loss. Disable
unnecessary delay logic will help.
To be more specific, `io.readyVec` caused the problem. It will be
updated in future commits.
* RS: opt select logic with load delay (ldWait)
* StoreSet: disable 2-bit lwt
Co-authored-by: ZhangZifei <[email protected]>
show more ...
|
#
2bd5334d |
| 09-May-2021 |
Yinan Xu <[email protected]> |
bundle: use Vec for src in ExuInput (#805)
This commit replaces src1, src2, src3 in Bundle ExuInput with Vec(3, UInt).
Should be easier for RS.
|
#
20e31bd1 |
| 01-May-2021 |
Yinan Xu <[email protected]> |
bundle,uop: use Vec for lsrc, psrc, srcState and srcType (#797)
This commit uses Vec for lsrc, psrc, srcState and srcType in MicroOp bundle.
This makes uop easier to access.
|
#
1b7adedc |
| 30-Apr-2021 |
William Wang <[email protected]> |
MemBlock: split store addr and store data (#781)
* RSFeedback: add source type
* StoreQueue: split store addr and store data
* StoreQueue: update ls forward logic
* Now it supports splited
MemBlock: split store addr and store data (#781)
* RSFeedback: add source type
* StoreQueue: split store addr and store data
* StoreQueue: update ls forward logic
* Now it supports splited addr and data
* Chore: force assign name for load/store unit
* RS: add rs'support for store a-d split
* StoreQueue: fix stlf logic
* StoreQueue: fix addr wb sq update logic
* AtomicsUnit: support splited a/d
* StoreQueue: add sbuffer enq condition assertion
Store data op (std) may still be invalid after store addr op's (sta)
commitment, so datavalid needs to be checked before commiting
store data to sbuffer
Note that at current commit a non-completed std op for a
commited store may exist. We should make sure that uop
will not be cancelled by a latter branch mispredict. More work
to be done!
* Roq: add std/sta split writeback logic
Now store will commit only if both sta & std have been writebacked
Co-authored-by: ZhangZifei <[email protected]>
show more ...
|
#
83596a03 |
| 26-Apr-2021 |
Yinan Xu <[email protected]> |
uop,needRfRPort: ignore srcState by default (#784)
|
#
a338f247 |
| 22-Apr-2021 |
Yinan Xu <[email protected]> |
Add dispatch and issue performance counters (#770)
In this commit, we add performance counters for dispatch and issue stages
to track the number of instructions dispatched and issued. Active regfil
Add dispatch and issue performance counters (#770)
In this commit, we add performance counters for dispatch and issue stages
to track the number of instructions dispatched and issued. Active regfile
read ports are counted as ready instruction source registers.
show more ...
|
#
2225d46e |
| 19-Apr-2021 |
Jiawei Lin <[email protected]> |
Refactor parameters, SimTop and difftest (#753)
* difftest: use DPI-C to refactor difftest
In this commit, difftest is refactored with DPI-C calls.
There're a few reasons:
(1) From Verilator's
Refactor parameters, SimTop and difftest (#753)
* difftest: use DPI-C to refactor difftest
In this commit, difftest is refactored with DPI-C calls.
There're a few reasons:
(1) From Verilator's manual, DPI-C calls should be more efficient than accessing from dut_ptr.
(2) DPI-C is cross-platform (Verilator, VCS, ...)
(3) difftest APIs are splited from emu.cpp to possibly support more backend platforms
(NEMU, Spike, ...)
The performance at this commit is quite slower than the original emu.
Performance issues will be fixed later.
* [WIP] SimTop: try to use 'XSTop' as soc
* CircularQueuePtr: ues F-bounded polymorphis instead implict helper
* Refactor parameters & Clean up code
* difftest: support basic difftest
* Support diffetst in new sim top
* Difftest; convert recode fmt to ieee754 when comparing fp regs
* Difftest: pass sign-ext pc to dpic functions && fix exception pc
* Debug: add int/exc inst wb to debug queue
* Difftest: pass sign-ext pc to dpic functions && fix exception pc
* Difftest: fix naive commit num limit
Co-authored-by: Yinan Xu <[email protected]>
Co-authored-by: William Wang <[email protected]>
show more ...
|
#
8f6a1237 |
| 14-Mar-2021 |
Steve Gou <[email protected]> |
btb: use single port sram to meet timing constraints (#692)
* add perf counters for btb and ubtb
* update btb only on not hit or jalr mispredicts to reduce write stalls
|
#
aac4464e |
| 11-Mar-2021 |
Yinan Xu <[email protected]> |
Add support for a simple version of move elimination (#682)
In this commit, we add support for a simpler version of move elimination.
The original instruction sequences are:
move r1, r0
add r2,
Add support for a simple version of move elimination (#682)
In this commit, we add support for a simpler version of move elimination.
The original instruction sequences are:
move r1, r0
add r2, r1, r3
The optimized sequnces are:
move pr1, pr0
add pr2, pr0, pr3 # instead of add pr2, pr1, pr3
In this way, add can be issued once r0 is ready and move seems to be eliminated.
show more ...
|
#
62f57a35 |
| 05-Mar-2021 |
Lemover <[email protected]> |
TLB&RS: when ptw back, wake up all the replay-state rs entries (#643)
|
#
f3f22d72 |
| 04-Mar-2021 |
Yinan Xu <[email protected]> |
csr: add smblockctl for customized control of memory block (#634)
|
#
6762815c |
| 03-Mar-2021 |
Steve Gou <[email protected]> |
update sc implementation, with wrbypass fixed in tage, bim and sc (#624)
* core: enable sc
* sc: calculate sum again on update
* sc: clean ups
* sc: add some debug info
* sc, tage, bim:
update sc implementation, with wrbypass fixed in tage, bim and sc (#624)
* core: enable sc
* sc: calculate sum again on update
* sc: clean ups
* sc: add some debug info
* sc, tage, bim: fix wrbypass logic, add wrbypass for SC
* core: disable sc by default
Co-authored-by: jinyue110 <[email protected]>
show more ...
|
#
16470009 |
| 28-Feb-2021 |
Lingrui98 <[email protected]> |
Merge remote-tracking branch 'origin/master' into ubtb-alloc-on-write
|
#
72da94f4 |
| 28-Feb-2021 |
Lingrui98 <[email protected]> |
ubtb: alloc ways on write
|
#
2b8b2e7a |
| 28-Feb-2021 |
William Wang <[email protected]> |
Add a naive memory violation predictor (#591)
* WaitTable: add waittable framework
* WaitTable: get replay info from RedirectGenerator
* StoreQueue: maintain issuePtr for load rs
* RS: add
Add a naive memory violation predictor (#591)
* WaitTable: add waittable framework
* WaitTable: get replay info from RedirectGenerator
* StoreQueue: maintain issuePtr for load rs
* RS: add loadWait to rs (only for load Unit's rs)
* WaitTable: fix update logic
* StoreQueue: fix issuePtr update logic
* chore: set loadWaitBit in ibuffer
* StoreQueue: fix issuePtrExt update logic
Former logic does not work well with mmio logic
We may also make sure that issuePtrExt is not before cmtPtrExt
* WaitTable: write with priority
* StoreQueue: fix issuePtrExt update logic for mmio
* chore: fix typos
* CSR: add slvpredctrl
* slvpredctrl will control load violation predict micro architecture
* WaitTable: use xor folded pc to index waittable
Co-authored-by: ZhangZifei <[email protected]>
show more ...
|
#
79901335 |
| 25-Feb-2021 |
zoujr <[email protected]> |
Merge branch 'master' into bpu-perf
|
#
bbfca13a |
| 25-Feb-2021 |
zoujr <[email protected]> |
perf: Add FPGAPlatform switch for perf counters
|
#
b31c62ab |
| 25-Feb-2021 |
wangkaifan <[email protected]> |
perf: support external intervened pf-cnt clean & dump
|