#
bb762f60 |
| 18-May-2021 |
Steve Gou <[email protected]> |
ifu: when mispredicted inst is br, we should also shift ghr (#771)
|
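A minimal sketch of what "also shift ghr" could mean on a redirect, assuming the global history register (GHR) is a fixed-width bit vector that shifts in one outcome bit per conditional branch. The function name, its arguments, and the 64-bit default are illustrative assumptions, not the IFU's actual signals.

```python
def recover_ghr(hist_before, is_br, taken, length=64):
    """Return the history to restore after a misprediction.

    hist_before: history seen by the mispredicted instruction (hypothetical name)
    is_br:       True if that instruction is a conditional branch
    taken:       its resolved direction
    """
    if not is_br:
        # Non-branch redirects (e.g. jumps) leave the history untouched here.
        return hist_before
    mask = (1 << length) - 1
    # A mispredicted branch must contribute its own resolved outcome.
    return ((hist_before << 1) | int(taken)) & mask
```

The point of the sketch is only that the restored history includes the mispredicted branch's own outcome rather than stopping just before it.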
#
de169c67 |
| 11-May-2021 |
William Wang <[email protected]> |
backend,mem: add Store Sets memory dependence predictor (#796)
* LoadQueue: send stFtqIdx via rollback request
* This makes it possible for the store set to update its SSIT
* StoreSet: setup store set update req
* StoreSet: add store set identifier table (SSIT)
* StoreSet: add last fetched store table (LFST)
* StoreSet: put SSIT into decode stage
* StoreSet: put LFST into dispatch1
* Future work: optimize timing
* RS: store rs now supports delayed issue
* StoreSet: add perf counter
* StoreSet: fix SSIT update logic
* StoreSet: delay LFST update input for 1 cycle
* StoreSet: fix LFST update logic
* StoreSet: fix LFST raddr width
* StoreSet: do not force stores in the same store set to issue in order
Classic store sets require stores in the same store set to issue in order.
However, in the current microarchitecture, such a restriction leads to
severe performance loss. We choose to disable it until we find another way
to fix it.
* StoreSet: support ooo store in the same store set
* StoreSet: fix store set merge logic
* StoreSet: check earlier store when read LFST
* If store-load pair is in the same dispatch bundle, loadWaitBit should
also be set for load
* StoreSet: increase default SSIT flush period
* StoreSet: fix LFST read logic
* Fix commit c0e541d14
* StoreSet: add StoreSetEnable parameter
* RSFeedback: add source type
* StoreQueue: split store addr and store data
* StoreQueue: update ls forward logic
* Now it supports split addr and data
* Chore: force assign name for load/store unit
* RS: add RS support for store a-d split
* StoreQueue: fix stlf logic
* StoreQueue: fix addr wb sq update logic
* AtomicsUnit: support split a/d
* Parameters: disable store set by default
* WaitTable: wait table will not cause store delay
* WaitTable: recover default reset period to 2^17
* Fix dev-stad merge conflict
* StoreSet: enable storeset
* RS: disable store rs delay logic
CI perf shows that the current delay logic causes perf loss. Disabling
unnecessary delay logic will help.
To be more specific, `io.readyVec` caused the problem. It will be
updated in future commits.
* RS: opt select logic with load delay (ldWait)
* StoreSet: disable 2-bit lwt
Co-authored-by: ZhangZifei <[email protected]>
|
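The SSIT/LFST interplay named in the commit above can be sketched as a toy model of the classic Store Sets scheme. This is not XiangShan's RTL: the class, method names, table size, and PC indexing are assumptions made for illustration, and store-to-store ordering inside a set is deliberately left out (the commit notes it was disabled).

```python
class StoreSetPredictor:
    """Toy software model of a Store Sets memory dependence predictor."""

    def __init__(self, ssit_size=1024):
        self.ssit_size = ssit_size
        self.ssit = {}      # PC index -> store set ID (SSID)
        self.lfst = {}      # SSID -> tag of the last fetched, unissued store
        self.next_ssid = 0

    def _index(self, pc):
        # Hypothetical indexing: drop the low bit, then wrap to the table size.
        return (pc >> 1) % self.ssit_size

    def dispatch_store(self, pc, store_tag):
        """A store with a known set becomes that set's last fetched store."""
        ssid = self.ssit.get(self._index(pc))
        if ssid is not None:
            self.lfst[ssid] = store_tag

    def dispatch_load(self, pc):
        """Return the store tag this load should wait for, or None."""
        ssid = self.ssit.get(self._index(pc))
        return None if ssid is None else self.lfst.get(ssid)

    def issue_store(self, pc, store_tag):
        """Clear the LFST entry if this store is still the last fetched one."""
        ssid = self.ssit.get(self._index(pc))
        if ssid is not None and self.lfst.get(ssid) == store_tag:
            del self.lfst[ssid]

    def report_violation(self, load_pc, store_pc):
        """On a memory-order violation, put the load and store in one set."""
        li, si = self._index(load_pc), self._index(store_pc)
        l_ssid, s_ssid = self.ssit.get(li), self.ssit.get(si)
        if l_ssid is None and s_ssid is None:
            ssid = self.next_ssid
            self.next_ssid += 1
        elif l_ssid is None:
            ssid = s_ssid
        elif s_ssid is None:
            ssid = l_ssid
        else:
            ssid = min(l_ssid, s_ssid)  # merge: the smaller SSID wins
        self.ssit[li] = ssid
        self.ssit[si] = ssid
```

A load only waits once a violation has trained the SSIT, and it stops waiting as soon as the matching store issues.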
#
2225d46e |
| 19-Apr-2021 |
Jiawei Lin <[email protected]> |
Refactor parameters, SimTop and difftest (#753)
* difftest: use DPI-C to refactor difftest
In this commit, difftest is refactored with DPI-C calls.
There're a few reasons:
(1) From Verilator's manual, DPI-C calls should be more efficient than accessing from dut_ptr.
(2) DPI-C is cross-platform (Verilator, VCS, ...)
(3) difftest APIs are split from emu.cpp to possibly support more backend platforms
(NEMU, Spike, ...)
The performance at this commit is considerably slower than the original emu.
Performance issues will be fixed later.
* [WIP] SimTop: try to use 'XSTop' as soc
* CircularQueuePtr: use F-bounded polymorphism instead of implicit helper
* Refactor parameters & Clean up code
* difftest: support basic difftest
* Support difftest in new sim top
* Difftest: convert recode fmt to ieee754 when comparing fp regs
* Difftest: pass sign-ext pc to dpic functions && fix exception pc
* Debug: add int/exc inst wb to debug queue
* Difftest: pass sign-ext pc to dpic functions && fix exception pc
* Difftest: fix naive commit num limit
Co-authored-by: Yinan Xu <[email protected]>
Co-authored-by: William Wang <[email protected]>
|
#
b7b0d6c1 |
| 05-Apr-2021 |
ljw <[email protected]> |
CircularQueuePtr: use F-bounded polymorphism instead of implicit helper (#750)
|
#
408a32b7 |
| 25-Mar-2021 |
Allen <[email protected]> |
Refactor XSPerf; there are now three XSPerf functions:
* XSPerfAccumulate: sum up performance values.
* XSPerfHistogram: count the occurrences of performance values and split them into bins, so that we can estimate their distribution.
* XSPerfMax: get the max of performance values.
|
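The three counter kinds described in the commit above can be illustrated with a small software model. The class names, the `sample` method, and the `start`/`stop`/`step` binning parameters are hypothetical and do not mirror the actual XSPerf signatures.

```python
class PerfAccumulate:
    """Sum up sampled performance values."""

    def __init__(self):
        self.total = 0

    def sample(self, value):
        self.total += value


class PerfHistogram:
    """Count how often sampled values fall into each fixed-width bin."""

    def __init__(self, start, stop, step):
        self.start, self.stop, self.step = start, stop, step
        self.bins = [0] * ((stop - start + step - 1) // step)

    def sample(self, value):
        # Values outside [start, stop) are ignored in this sketch.
        if self.start <= value < self.stop:
            self.bins[(value - self.start) // self.step] += 1


class PerfMax:
    """Track the largest sampled value."""

    def __init__(self):
        self.value = None

    def sample(self, value):
        if self.value is None or value > self.value:
            self.value = value
```

Feeding the same per-cycle value (for example, a queue occupancy) to all three would give its total, its distribution, and its peak.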
#
bc72443c |
| 19-Mar-2021 |
jinyue110 <[email protected]> |
L1plusCache: add error io.
|
#
8f6a1237 |
| 14-Mar-2021 |
Steve Gou <[email protected]> |
btb: use single port sram to meet timing constraints (#692)
* add perf counters for btb and ubtb
* update btb only on not hit or jalr mispredicts to reduce write stalls
|
#
56695d82 |
| 06-Mar-2021 |
Steve Gou <[email protected]> |
IFU: add performance counters (#649)
* core: enable sc
* sc: calculate sum again on update
* sc: clean ups
* sc: add some debug info
* sc, tage, bim: fix wrbypass logic, add wrbypass for SC
* sc: restrict threshold update conditions and prevent overflow problem
* sc: use separate thresholds for each bank
* sc: update debug info
* sc: use adaptive threshold algorithm from the original O-GEHL
* tage, bim, sc: optimize wrbypass logic
* sc: initialize threshold to 60
* loop: remove unnecessary RegNext on redirect
* ifu: add perf counters
* Perf: Add loopPredictor perf counters
* sc: fix perf logic
Co-authored-by: jinyue110 <[email protected]>
Co-authored-by: zoujr <[email protected]>
|
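The adaptive threshold mentioned in the commit above can be sketched along the lines of the dynamic threshold fitting described for O-GEHL: a small saturating counter tracks whether updates are driven mostly by mispredictions or by correct but low-confidence predictions, and nudges the threshold accordingly. The class name, the counter width, and the exact saturation handling are assumptions; only the initial value of 60 comes from the commit message.

```python
class AdaptiveThreshold:
    """Software sketch of an O-GEHL-style adaptive update threshold."""

    def __init__(self, initial=60, counter_bits=7):
        self.theta = initial
        self.tc = 0  # signed saturating counter
        self.tc_max = (1 << (counter_bits - 1)) - 1
        self.tc_min = -(1 << (counter_bits - 1))

    def update(self, total, mispredicted):
        """total is the signed sum of the selected counters."""
        if mispredicted:
            self.tc += 1
            if self.tc >= self.tc_max:
                # Too many mispredictions: raise the threshold.
                self.theta += 1
                self.tc = 0
        elif abs(total) <= self.theta:
            self.tc -= 1
            if self.tc <= self.tc_min:
                # Too many low-confidence correct predictions: lower it.
                self.theta -= 1
                self.tc = 0
```

Correct predictions whose sum already exceeds the threshold leave both the counter and the threshold unchanged.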
#
0be3bec3 |
| 04-Mar-2021 |
Steve Gou <[email protected]> |
Merge pull request #628 from RISCVERS/redirect-gh-opt-timing
ifu: opt timing of redirect ghist
|
#
cbca794f |
| 02-Mar-2021 |
Lingrui98 <[email protected]> |
ifu: remove redirect_gh and bpu_req_gh
|
#
79e9a2ef |
| 01-Mar-2021 |
Lingrui98 <[email protected]> |
ifu: remove if1_can_go
|
#
6273bc45 |
| 02-Mar-2021 |
Lingrui98 <[email protected]> |
ifu: opt timing of redirect ghist
We pass the redirect ghist directly to a mux whose output is connected to bpu.s1_hist, which saves the delay of three cascaded 64-bit-wide 2-to-1 muxes
|
#
bbd22639 |
| 28-Feb-2021 |
zoujr <[email protected]> |
perf: Remove unused code
|
#
47c2accd |
| 28-Feb-2021 |
zoujr <[email protected]> |
perf: Fix compile error
|
#
b68cf2ef |
| 28-Feb-2021 |
zoujr <[email protected]> |
Merge branch 'master' into bpu-perf
|
#
17e43f8e |
| 28-Feb-2021 |
zoujr <[email protected]> |
Merge branch 'master' into bpu-perf
|
#
2b8b2e7a |
| 28-Feb-2021 |
William Wang <[email protected]> |
Add a naive memory violation predictor (#591)
* WaitTable: add waittable framework
* WaitTable: get replay info from RedirectGenerator
* StoreQueue: maintain issuePtr for load rs
* RS: add loadWait to rs (only for load Unit's rs)
* WaitTable: fix update logic
* StoreQueue: fix issuePtr update logic
* chore: set loadWaitBit in ibuffer
* StoreQueue: fix issuePtrExt update logic
The former logic does not work well with mmio logic.
We may also need to make sure that issuePtrExt is not before cmtPtrExt
* WaitTable: write with priority
* StoreQueue: fix issuePtrExt update logic for mmio
* chore: fix typos
* CSR: add slvpredctrl
* slvpredctrl will control the load violation prediction microarchitecture
* WaitTable: use xor folded pc to index waittable
Co-authored-by: ZhangZifei <[email protected]>
|
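Indexing a small table with an XOR-folded PC, as the last item of the commit above describes, can be sketched as follows. The function name and the widths are illustrative assumptions; the commit does not state which PC bits or table size the WaitTable uses.

```python
def xor_fold(value, width, out_bits):
    """Fold a width-bit value into out_bits bits by XORing out_bits-wide chunks."""
    mask = (1 << out_bits) - 1
    value &= (1 << width) - 1  # keep only the bits being folded
    folded = 0
    while value:
        folded ^= value & mask
        value >>= out_bits
    return folded
```

Compared with simply truncating the PC, folding lets the upper bits influence the index, which tends to reduce aliasing between instructions that share low-order PC bits.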
#
fd9b3cac |
| 28-Feb-2021 |
Steve Gou <[email protected]> |
ifu: fix predTakenRedirect logic for if3 and if4 (#605)
|
#
b06fe9d0 |
| 27-Feb-2021 |
zoujr <[email protected]> |
perf: Add perf counters for predictors
|
#
eedc2e58 |
| 26-Feb-2021 |
Steve Gou <[email protected]> |
csr,bpu: support enabling and disabling branch predictors via sbpctl (#593)
* csr: add sbpctrl to control branch predictors
* bpu: add dynamic switch to each predictor
* csr: change spfctl and sbpctl address
* bpu: fix s3 connections
Co-authored-by: Yinan Xu <[email protected]>
|
#
0ca50dbb |
| 24-Feb-2021 |
zoujr <[email protected]> |
ftq: add bpu perf counters
|
#
bacba42a |
| 03-Feb-2021 |
ZhangZifei <[email protected]> |
Merge branch 'master' into ptw-refactor
|
#
19272be7 |
| 02-Feb-2021 |
jinyue110 <[email protected]> |
IFU/icacheMissQueue: move io.flush from refill.valid
|
#
b02cb8f3 |
| 02-Feb-2021 |
ZhangZifei <[email protected]> |
Merge branch 'master' into ptw-refactor
|
#
86a8633a |
| 01-Feb-2021 |
ZhangZifei <[email protected]> |
TLB-test: disable tlb unit test
|