9658ce50 | 25-Mar-2022 |
LinJiawei <[email protected]> |
Bump chisel to 3.5.0 |
3a6db8a3 | 29-Dec-2021 |
Yinan Xu <[email protected]> |
dispatch: block enq when previous instructions have exception (#1400)
This commit adds blocking logic for instructions when they enter
dispatch queues. If previous instructions have exceptions, any
following instructions should not enter the dispatch queue.
Consider the following case: uop(0) is a load with an exception, and
uop(1) is a load without an exception. The allocation logic in the
dispatch queue allocates an entry for both uop(0) and uop(1). However,
uop(0) does not set enq.valid, so its entry in the dispatch queue stays
empty, and uop(1) is written to the second entry. The dispatch queue
updates its pointers by the real number of enqueued instructions, which
is one, while the entry actually occupied is the second one. This
causes errors.
|
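A minimal behavioral sketch in Python of the blocking rule described in the commit above. It is not the Chisel implementation: the function name `enqueue_mask` and the list-based bundle representation are hypothetical, and the sketch assumes that a uop with an exception does not itself enter the dispatch queue (as the message says for uop(0)).

```python
def enqueue_mask(valid, has_exception):
    """Return, per uop in a bundle, whether it may enter the dispatch queue.

    Assumed rule (from the commit message): a uop is blocked if it has an
    exception itself, or if any older valid uop in the bundle has one.
    """
    allowed = []
    older_exception = False
    for v, exc in zip(valid, has_exception):
        allowed.append(v and not exc and not older_exception)
        older_exception = older_exception or (v and exc)
    return allowed


# The case from the commit message: uop(0) has an exception, uop(1) does not.
print(enqueue_mask([True, True], [True, False]))
```

With the blocking rule, neither uop is enqueued in that case, so the number of allocated entries and the pointer update agree.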
a4e57ea3 | 20-Dec-2021 |
Li Qianruo <[email protected]> |
Merge branch 'master' into trigger |
ddb65c47 | 16-Dec-2021 |
Li Qianruo <[email protected]> |
Trigger: hardwire timing to 1
We have singlestep already so triggers do not need to hit after inst commits |
068bf978 | 10-Dec-2021 |
Li Qianruo <[email protected]> |
Merge branch 'master' into trigger |
84e47f35 | 09-Dec-2021 |
Li Qianruo <[email protected]> |
Refactor trigger |
1ca0e4f3 | 10-Dec-2021 |
Yinan Xu <[email protected]> |
core: refactor hardware performance counters (#1335)
This commit optimizes the coding style and timing for hardware
performance counters.
By default, performance counters are RegNext(RegNext(_)). |
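As a rough illustration of what RegNext(RegNext(_)) means for a counter event, the Python sketch below models each register stage as a one-cycle delay over a list of per-cycle samples. The helper names `reg_next` and `perf_event_delayed` are hypothetical, and the reset value of 0 is an assumption.

```python
def reg_next(samples, init=0):
    """Model one register stage: output at cycle t is the input at cycle t-1."""
    if not samples:
        return []
    return [init] + list(samples[:-1])


def perf_event_delayed(samples):
    """Model RegNext(RegNext(_)): the event is seen two cycles late."""
    return reg_next(reg_next(samples))


print(perf_event_delayed([1, 0, 1, 1, 0]))
```

The two register stages only shift when an event is observed; events that fall in the last two cycles of the sampled window are not yet visible at the output.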
6ab6918f | 09-Dec-2021 |
Yinan Xu <[email protected]> |
core: refactor writeback parameters (#1327)
This commit adds WritebackSink and WritebackSource parameters for
multiple modules. These traits hide implementation details from
other modules by defining IO-related functions in modules.
By using WritebackSink, ROB is able to choose the writeback sources.
Now fflags and exceptions are connected from exe units to reduce write
ports and optimize timing.
Further optimizations on write-back to RS and better coding style to
be added later.
|
d6477c69 | 05-Dec-2021 |
Yinan Xu <[email protected]> |
wb,load: delay load fp for one cycle (#1296) |
980c1bc3 | 23-Nov-2021 |
William Wang <[email protected]> |
mem,mdp: use robIdx instead of sqIdx (#1242)
* mdp: implement SSIT with sram
* mdp: use robIdx instead of sqIdx
Dispatch refactor moves lsq enq to dispatch2, as a result, mdp can not
get correct sqIdx in dispatch. Unlike robIdx, it is hard to maintain a
"speculatively assigned" sqIdx, as it is hard to track store insts in
dispatch queue. Yet we can still use "speculatively assigned" robIdx
for memory dependency predictor.
For now, memory dependency predictor uses "speculatively assigned"
robIdx to track inflight store.
However, sqIdx is still used to track stores whose address is valid
but whose data is not. When a load tries to forward data from such a
store, it gets that store's sqIdx and waits in RS. It is not woken up
until the store data with that sqIdx is issued.
* mdp: add track robIdx recover logic
|
5668a921 | 16-Nov-2021 |
Jiawei Lin <[email protected]> |
Fix multi-core dedup bug (#1235)
* FDivSqrt: use hierarchy API to avoid dedup bug
* Dedup: use hartId from io port instead of core parameters
* Bump fudian |
1545277a | 11-Nov-2021 |
Yinan Xu <[email protected]> |
top: enable fpga option for simulation emu (#1213)
* disable log as default
* code clean up |
7057cff8 | 24-Oct-2021 |
Yinan Xu <[email protected]> |
lsq: enqueue at dispatch2 stage (#1167)
This commit changes when instructions enter load/store queue.
Now, at dispatch2, load/store instructions enter load/store queue. |
cd365d4c | 23-Oct-2021 |
rvcoresjw <[email protected]> |
add performance counters at core and huancun (#1156)
* Add perf counters
* add reg from hpm counter source
* add print perfcounter enable |
c3abb8b6 | 22-Oct-2021 |
Yinan Xu <[email protected]> |
rob: optimize bits width in storage (#1155)
This PR optimizes out isFused and crossPageIPFFix usages in Rob's DispatchData. They will not be stored in ROB. Now DispatchData has only 38 bits.
* isFused is merged with commitType (2 bits reduced)
* crossPageIPFFix is used only in ExceptionGen (1 bit reduced)
* rename: reduce ldest usages
* decode: set isMove to false if ldest is zero
|
a020ce37 | 17-Oct-2021 |
Yinan Xu <[email protected]> |
backend: remove lsrc usages after rename (#1124)
This commit removes lsrc usages in the fence unit and lsrc is no longer
needed after an instruction is renamed. It helps timing and area.
lsrc is placed in imm at rename stage (the last stage we need lsrc).
They are extracted in the fence unit. Imm needs to go through the
pipelines because Jump needs it (and we re-use it for lsrc).
|
70224bf6 | 16-Oct-2021 |
Yinan Xu <[email protected]> |
rename: support full-featured move elimination (#1123)
This commit optimizes the move elimination implementation.
A reference counter is kept for every physical register. Initially,
registers 0-31 have a count of one. Every time a physical register
is allocated or deallocated, its counter is increased or decreased by
one. When the counter drops to zero from a non-zero value, the register
is freed and released to the freelist.
|
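The reference-counting scheme in the commit above can be sketched in Python as follows. This is a simplified model, not the Chisel code: the class name `RefCounter`, its method names, and the default of 64 physical registers are hypothetical. Only the initial count of one for registers 0-31 and the free-on-zero rule come from the commit message.

```python
class RefCounter:
    """Toy model of per-physical-register reference counting."""

    def __init__(self, num_phys_regs=64, num_arch_regs=32):
        # Registers 0..31 start with a count of one; the rest start free.
        self.count = [1 if i < num_arch_regs else 0 for i in range(num_phys_regs)]
        self.freelist = list(range(num_arch_regs, num_phys_regs))

    def allocate_new(self):
        """Take a free register for a normal destination write."""
        preg = self.freelist.pop(0)
        self.count[preg] += 1
        return preg

    def share(self, preg):
        """Move elimination: map another logical register to the same preg."""
        self.count[preg] += 1
        return preg

    def deallocate(self, preg):
        """Drop one reference; free the register when the count hits zero."""
        assert self.count[preg] > 0, "deallocating a register with no references"
        self.count[preg] -= 1
        if self.count[preg] == 0:
            self.freelist.append(preg)


rc = RefCounter(num_phys_regs=34)
p = rc.allocate_new()   # first reference
rc.share(p)             # an eliminated move adds a second reference
rc.deallocate(p)        # still referenced, not freed
rc.deallocate(p)        # last reference gone, returned to the freelist
```

Sharing a register on an eliminated move is what makes a simple "free on overwrite" policy unsafe, which is why a count per physical register is needed.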
f4b2089a | 16-Oct-2021 |
Yinan Xu <[email protected]> |
core: use redirect ports for flush (#1121)
This commit removes flush IO for every module. Flush now re-uses
redirect ports to flush the instructions. |
d1fe0262 | 16-Oct-2021 |
William Wang <[email protected]> |
Add strict mode to reduce mdp mispredict (#1113)
* storeset: fix waitForSqIdx generate logic
Now the correct waitForSqIdx is generated for an earlier store in the
same dispatch bundle.
* mdp: add strict wait mode
When loadWaitStrict && loadWaitBit, the load waits in RS until the
address calculations of all older stores are finished.
* chore: add storeset_load_strict_wait counter
|
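A small Python sketch of the strict wait condition from the commit above. The function `load_can_issue` and its arguments are hypothetical. The strict branch follows the commit message; the non-strict branch (wait only for the store predicted by Store Sets) is an assumption based on the related block-load commit, not something this commit states.

```python
def load_can_issue(load_wait_bit, load_wait_strict,
                   predicted_store_ready, older_store_addrs_ready):
    """Decide whether a load may leave the reservation station.

    older_store_addrs_ready: one bool per older store, True once its
    address calculation has finished.
    """
    if not load_wait_bit:
        return True                            # no predicted dependence
    if load_wait_strict:
        return all(older_store_addrs_ready)    # strict mode: wait for all
    return predicted_store_ready               # assumed non-strict behavior


print(load_can_issue(True, True, True, [True, False]))
```

In strict mode the load is held even though its predicted store is ready, because one older store address is still outstanding.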
f973ab00 | 13-Oct-2021 |
Yinan Xu <[email protected]> |
dispatch2Rs: load balance between two ports (#1110)
This commit adds load balance support for two dispatch ports, between 0
and 2, 1 and 3, etc. |
c7160cd3 | 12-Oct-2021 |
William Wang <[email protected]> |
mem: update block load logic (#1035)
* mem: update block load logic
Now load will be selected as soon as the store it depends on is ready,
which is predicted by Store Sets
* mem: opt block load logic
A load blocked by an invalid std waits for that std to issue.
A load blocked by a load violation waits for that sta to issue.
* csr: add 2 extra storeset config bits
Following bits were added to slvpredctl:
- storeset_wait_store
- storeset_no_fast_wakeup
* storeset: fix waitForSqIdx generate logic
Now the correct waitForSqIdx is generated for an earlier store in the
same dispatch bundle.
|
33177a7c | 12-Oct-2021 |
Yinan Xu <[email protected]> |
core: update dispatch port parameters (#1103)
This commit changes how dispatch ports (regfile ports) are connected to
reservation station ports:
INT regfile:
* INT(0-1) --> ALU0, MUL0, JUMP
* INT(2-3) --> ALU1, MUL0
* INT(4-5) --> ALU2, MUL1
* INT(6-7) --> ALU3, MUL1
* INT(8) --> LOAD0
* INT(9) --> LOAD1
* INT(10) --> STA0
* INT(11) --> STA1
* INT(12) --> STD0
* INT(13) --> STD1
FP regfile:
* FP(0-2) --> FMA0, FMISC0
* FP(3-5) --> FMA1, FMISC0
* FP(6-8) --> FMA2, FMISC1
* FP(9-11) --> FMA3, FMISC1
* FP(12) --> STD0
* FP(13) --> STD1
|
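For reference, the port mapping listed in the commit above can be written as a lookup table. The Python sketch below copies the port ranges and unit names from the message; the table layout and the helper `ports_for` are hypothetical and only illustrate how to query which regfile ports feed a given unit.

```python
# Inclusive (first, last) dispatch port ranges, copied from the commit message.
INT_PORTS = [
    ((0, 1), ["ALU0", "MUL0", "JUMP"]),
    ((2, 3), ["ALU1", "MUL0"]),
    ((4, 5), ["ALU2", "MUL1"]),
    ((6, 7), ["ALU3", "MUL1"]),
    ((8, 8), ["LOAD0"]),
    ((9, 9), ["LOAD1"]),
    ((10, 10), ["STA0"]),
    ((11, 11), ["STA1"]),
    ((12, 12), ["STD0"]),
    ((13, 13), ["STD1"]),
]

FP_PORTS = [
    ((0, 2), ["FMA0", "FMISC0"]),
    ((3, 5), ["FMA1", "FMISC0"]),
    ((6, 8), ["FMA2", "FMISC1"]),
    ((9, 11), ["FMA3", "FMISC1"]),
    ((12, 12), ["STD0"]),
    ((13, 13), ["STD1"]),
]


def ports_for(unit, table):
    """Return every dispatch port index that can feed the given unit."""
    ports = []
    for (first, last), units in table:
        if unit in units:
            ports.extend(range(first, last + 1))
    return ports


print(ports_for("MUL0", INT_PORTS))
print(ports_for("FMISC1", FP_PORTS))
```

The table makes the sharing visible: each MUL and FMISC unit is reachable from two port groups, while load and store units each have a dedicated port.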
7fa2c198 | 10-Oct-2021 |
Yinan Xu <[email protected]> |
renameTable: optimize read and write timing (#1101)
This commit optimizes RenameTable's timing.
Read addresses come directly from the instruction buffer and have the
best timing, so we read data at the decode stage and bypass write data
from this clock cycle to the read data at the next cycle.
For write, we latch the write request and process it at the next cycle.
|
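The two-stage read with write bypass described in the commit above can be modeled cycle by cycle. The Python sketch below is a simplified illustration, not the Chisel implementation: the class `RenameTableModel`, the `step` interface, and the identity initial mapping are assumptions, and redirect handling is omitted.

```python
class RenameTableModel:
    """Toy cycle model: reads return one cycle later, writes are latched."""

    def __init__(self, num_arch_regs=32):
        self.table = list(range(num_arch_regs))  # assumed initial mapping
        self.pending_writes = []  # write requests latched in the last cycle
        self.read_data = []       # registered data for last cycle's reads

    def step(self, read_addrs, writes):
        """Advance one cycle.

        read_addrs: addresses read this cycle (decode stage).
        writes: (addr, data) pairs requested this cycle.
        Returns the data for the reads issued in the previous cycle.
        """
        out = self.read_data
        # Process the write requests latched in the previous cycle.
        for addr, data in self.pending_writes:
            self.table[addr] = data
        # Read now, bypassing this cycle's write data into next cycle's result.
        bypass = dict(writes)
        self.read_data = [bypass.get(a, self.table[a]) for a in read_addrs]
        # Latch this cycle's writes so they are processed in the next cycle.
        self.pending_writes = list(writes)
        return out


m = RenameTableModel(4)
m.step([1], [(1, 9)])   # read reg 1 while it is being written
print(m.step([1, 2], []))
```

The read issued in the same cycle as the write already returns the new mapping one cycle later, even though the table itself is only updated in that later cycle.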
20edb3f7 | 09-Oct-2021 |
William Wang <[email protected]> |
Add runahead debug signals (#1082)
* runahead: add runahead support (WIP)
* runahead: fix redirect event
* difftest: bump difftest
* runahead: bump version
Note: current runahead does not support instruction fusion, disable that
in XiangShan if runahead is needed
* runahead: bump version
* difftest: bump version to support runahead
* chore: bump huancun to make ci happy
* chore: fix wrong submodule url
* difftest: bump version
BREAKING CHANGE: nemu update_config api has changed
|
2b4e8253 | 01-Oct-2021 |
Yinan Xu <[email protected]> |
core: update parameters and module organizations (#1080)
This commit moves load/store reservation stations into the first
ExuBlock (also called IntegerBlock). The unnecessary dispatch module
is also removed from CtrlBlock.
Now the module organization becomes:
* ExuBlock: Int RS, Load/Store RS, Int RF, Int FUs
* ExuBlock_1: Fp RS, Fp RF, Fp FUs
* MemBlock: Load/Store FUs
Besides, load queue has 80 entries and store queue has 64 entries now.
|