#
1592abd1 |
| 08-Apr-2025 |
Yan Xu <[email protected]> |
feat: support inst lifetime trace (#4007)
PerfCCT(performance counter commit trace) is a Instruction-level granularity perfCounter like GEM5 How to use this: 1. Make with "WITH_CHISELDB=1" argument
feat: support inst lifetime trace (#4007)
PerfCCT(performance counter commit trace) is a Instruction-level granularity perfCounter like GEM5 How to use this: 1. Make with "WITH_CHISELDB=1" argument 2. Run with "--dump-db --dump-select-db lifetime", then get the database 3. Instruction lifetime visualize run "python3 scripts/perfcct.py "the-db-file-path" -p 1 -v | less" 4. Analysis script now is in XS-GEM5 repo, see https://github.com/OpenXiangShan/GEM5/blob/xs-dev/util/ClockAnalysis.py
How it works: 1. Allocate one unique tag "seqNum" like GEM5 for each instruction at fetch stage 2. Passing the "seqNum" in each pipeline 3. Recording perf data through the DPIC interface
show more ...
|
#
0ed0e482 |
| 20-Dec-2024 |
Guanghui Cheng <[email protected]> |
area(EXU): add parameter `needCopySrc` in FuConfig (#4063)
|
#
41eedc8d |
| 16-Dec-2024 |
linzhida <[email protected]> |
timing(zacas): move isDropAmocasSta logic gen from Scheduler to NewDispatch
|
#
a2fa0ad9 |
| 02-Dec-2024 |
xiaofeibao <[email protected]> |
area(backend): only use startAddr in pcMem
|
#
f57d73d6 |
| 16-Dec-2024 |
sinsanction <[email protected]> |
area(IssueQueue): encode exuOH as UInt to reduce storage (#4033)
|
#
4907ec88 |
| 19-Sep-2024 |
chengguanghui <[email protected]> |
feat(trace): add trace buffer.
|
#
38c29594 |
| 26-Nov-2024 |
zhanglinjuan <[email protected]> |
feat(MemBlock): add support for Zacas extension
fix(AtomicsUnit, MemBlock): fix loss of multiple stds
In the previous design, AtomicsUnit receives stds from StdExeUnit and arbitrate at most one std
feat(MemBlock): add support for Zacas extension
fix(AtomicsUnit, MemBlock): fix loss of multiple stds
In the previous design, AtomicsUnit receives stds from StdExeUnit and arbitrate at most one std uop for one cycle. This works fine on most of the AMOs and LR/SC because they require only one std uop. However AMOCAS requires at least two std uops, which may be issued from two separate issue queues at the same time, leading to the loss of std uops.
This commit fixes this by taking all the outputs of the StdExeUnits into account with arbitration logics.
fix(AtomicsUnit): DCache req can only be sent at `s_cache_req`
fix(AtomicsUnit, difftest): fix difftest io for atomic events
fix(MainPipe): fix precedence of `&` and `=/=` operator
fix(MainPipe): AMOCAS should not wait for AMOALU
fix(MemBlock): remove unnecessary assertion
fix(MainPipe): only CAS instruction can assert `s3_cas_fail`
fix(AtomicsUnit): fix bug in data select logic
submodule(difftest): bump difftest
show more ...
|
#
12861ac7 |
| 21-Nov-2024 |
linzhida <[email protected]> |
feat(Backend): add support for Zacas extension
misc: remove assert temporarily
|
#
4376b525 |
| 08-Nov-2024 |
Ziyue Zhang <[email protected]> |
busytable: support eliminate old vd when read vl's state
|
#
fbdb359d |
| 08-Nov-2024 |
Muzi <[email protected]> |
fix(ICache): cancel prefetch when there is exception from backend (#3787)
|
#
074ad6aa |
| 06-Nov-2024 |
zhanglinjuan <[email protected]> |
style(AtomicsUnit): remove unnecessary logics (#3836)
Atomics memory operations only work on word, double word and quad word
in the future. Therefore any code concerning byte and half word is
redu
style(AtomicsUnit): remove unnecessary logics (#3836)
Atomics memory operations only work on word, double word and quad word
in the future. Therefore any code concerning byte and half word is
redundant and only contributes to worse timing and area.
show more ...
|
#
25742929 |
| 21-Oct-2024 |
Xuan Hu <[email protected]> |
fix(Ebreak): use isPcBkpt to hold exception raised by ebreak (#3769)
* This signal is only used to distinguish EX_BP store pc or load/store
address in {m|s|vs}tval.
|
#
fe52823c |
| 18-Oct-2024 |
Xuan Hu <[email protected]> |
fix(Breakpoint): memory trigger set {m|s|vs}tval with faulting address (#3762)
* This commit fix the value of {m|s|vs}tval when load/store/atomic
trigger fire. The faulting address should be writte
fix(Breakpoint): memory trigger set {m|s|vs}tval with faulting address (#3762)
* This commit fix the value of {m|s|vs}tval when load/store/atomic
trigger fire. The faulting address should be written to tval.
show more ...
|
#
bd3e32c1 |
| 15-Oct-2024 |
sinsanction <[email protected]> |
fix(Backend, Mem): add `isFromLoadUnit` to avoid other units polluting RegCache (#3731)
|
#
2a4ac712 |
| 19-Sep-2024 |
Easton Man <[email protected]> |
feat(decode): no rob compress when is last in ftq
set canRobCompress to false when a instruction is the last one in its Ftq entry.
|
#
e43bb916 |
| 20-Sep-2024 |
Xuan Hu <[email protected]> |
feat(VecLoad): add VecLoadExcp module to handle merging old/new data
* When NF not 0, the register indices are arranged group by group. But in exception handle progress, all registers needed to merg
feat(VecLoad): add VecLoadExcp module to handle merging old/new data
* When NF not 0, the register indices are arranged group by group. But in exception handle progress, all registers needed to merge will be handled first, and then the registers needed to move will be handled later. * The need merge vdIdx can be until 8, so 4 bits reg is needed. * If the instruction is indexed, the eew of vd is sew from vtype. Otherwise, the eew of vd is encoded in instruction. * Use ivemulNoLessThanM1 and dvemulNoLessThanM1 to produce vemul_i_d to avoid either demul or iemul is less than M1. * For whole register load, need handle NF(nf + 1) dest regs. * Use data EMUL to calculate number of dest reg. * GetE8OffsetInVreg will return the n-th 8bit which idx mapped to. * Since xs will flush pipe, when vstart is not 0 and execute vector mem inst, the value of vstart in CSR is the first element of this vector instruction. When exception occurs, the vstart in writeback bundle is the new one, So writebacked vstart should never be used as the beginning of vector mem operation. * Non-seg indexed load use non-sequential vd. * When "index emul" / "data emul" equals 2, the old vd is located in vuopidx 0, 2, 4, 6, the new vd is located in vuopidx 1, 3, 5, 7. * Make rename's input not ready until VecExcpMod not busy. * Delay trap passed to difftest until VecExcpMod not busy. * Rab commit to VecExcpMod as it commit to Rat, and select real load reg maps in VecExcpMod. * Use isDstMask to distinguish vlm and other vle. * When isWhole, vd regs are sequential.
show more ...
|
#
b0480352 |
| 30-Aug-2024 |
Ziyue Zhang <[email protected]> |
feat(rv64v): support vleff instruction in backend
* use the last uop to update vl * the vleff instructions are run inorder
|
#
ad415ae0 |
| 21-Sep-2024 |
Xiaokun-Pei <[email protected]> |
feat(trap): support m/htinst for specific G-stage translation (#3604)
According to RISC-V priv spec, mtinst/htinst could be always written
zero on trap into M/HS-mode, except for Guest-Page-Fault t
feat(trap): support m/htinst for specific G-stage translation (#3604)
According to RISC-V priv spec, mtinst/htinst could be always written
zero on trap into M/HS-mode, except for Guest-Page-Fault traps that meet
both of the following conditions:
- the trap is caused by a G-stage translation which supports VS-stage
translation
- a nonzero value is written to mtval2/htval
"isForVSnonLeafPTE" is used only in exceptional circumstances that gpf
happens in the G-stage translation which supports VS-stage translation,
such as searching the non-leaf pte of VS-stage.
This patch adds support for writing proper value to mtinst/htinst when
specific trap occurs. And bump the nemu.
show more ...
|
#
db6cfb5a |
| 19-Sep-2024 |
Haoyuan Feng <[email protected]> |
fix(exception): check high address bits of lsu (#3596)
In previous implementation, we simply truncated the higher bits of jump
target or load & store address, which made it impossible to raise
exc
fix(exception): check high address bits of lsu (#3596)
In previous implementation, we simply truncated the higher bits of jump
target or load & store address, which made it impossible to raise
exceptions in such cases.
Commit
https://github.com/OpenXiangShan/XiangShan/commit/c1b28b66879239a5b3a44741376f3b002e8ac834
has already fixed high address bits checking of jump target. This commit
fixes lsu part, checking full address in tlb and passing full address
directly to csr.
show more ...
|
#
6112d994 |
| 09-Sep-2024 |
xiaofeibao <[email protected]> |
timing(Backend): remove useless ldest=/=0.U logic because rfWen will be false
|
#
c1b28b66 |
| 09-Sep-2024 |
Tang Haojin <[email protected]> |
fix(exception): check high address bits of jump target (#3003)
This commit contains high address bits checking of jump target. In previous implementation, we simply truncated the higher bits of jump
fix(exception): check high address bits of jump target (#3003)
This commit contains high address bits checking of jump target. In previous implementation, we simply truncated the higher bits of jump target address, which made it impossible to raise exceptions in such cases.
To resolve this problem, we detect the invalid jump target in jump/branch/CSR and, this information to frontend and store the complete invalid target in a single register in backend. The frontend will then raise an exception to backend and backend will also use the invalid target in the register to write xtval and mepc.
---------
Co-authored-by: Muzi <[email protected]> Co-authored-by: ngc7331 <[email protected]>
show more ...
|
#
e1e27da7 |
| 06-Sep-2024 |
Xuan Hu <[email protected]> |
fix(ROB): hinval should also do the same check as sinval (#3505)
|
#
92c61038 |
| 16-Aug-2024 |
Xuan Hu <[email protected]> |
Frontend,Backend: add xxtvala support
* utils * Add checkInputWidth function in NamedUInt to check if the UInt arg passed in has the same width as it defined. * Frontend * Pass the unexpanded in
Frontend,Backend: add xxtvala support
* utils * Add checkInputWidth function in NamedUInt to check if the UInt arg passed in has the same width as it defined. * Frontend * Pass the unexpanded instruciton to IBuffer if the C extension 16 bits instruction is illegal. * No need to use bypass illBuf, since the origin 16 bits instruction will be passed in the ctrlflow bundle. * IBuffer * Merge exceptionType and crossPageIPFFix into 3bit field, which type is IBufferExceptionType. * IBufferExceptionType can hold illegal instruction exception. * Backend * CSROpType.ro is removed, since we can use rs1 and rd passed in imm field to distinguish CSRR and CSRW in CSR module. * Create TrapInstMod to store the trap instruction and handle its update.
show more ...
|
#
fa16cf81 |
| 15-Aug-2024 |
lewislzh <[email protected]> |
Backend: support Shvstvala and Sstvala extension
|
#
49162c9a |
| 24-Aug-2024 |
Guanghui Cheng <[email protected]> |
Rob: fix bug of rob commit. (#3418)
In this PR, the main goal is to fix the bug encountered during ROB
commit. However, resolving this issue requires information about
`iretire` and `ilastsize`, w
Rob: fix bug of rob commit. (#3418)
In this PR, the main goal is to fix the bug encountered during ROB
commit. However, resolving this issue requires information about
`iretire` and `ilastsize`, which need be collected by the trace.
Therefore, I have also included the trace interface in this PR.
The specific changes are as follows:
* When rob commit, update the ftqIdx and ftqOffset to correctly notify
the frontend which instructions have been committed.
* In each robentry, the ftqIdx and ftqOffset belong to the first
instruction that was compressed, that is Necessary when exceptions
happen.
* Add trace Interface in hart.
* Add trace parameter in parameter.scala.
* Collect trace infomation in backend pipeline.
show more ...
|