History log of /XiangShan/src/main/scala/xiangshan/mem/MemCommon.scala (Results 51 – 75 of 88)
Revision Date Author Comments
# c61abc0c 06-Aug-2023 Xuan Hu <[email protected]>

merge master into new-backend

Todo: fix error


# 04665835 28-Jul-2023 Maxpicca-Li <[email protected]>

DCacheWPU: update the latest version (#2095)

Co-authored-by: bugGenerator <[email protected]>
Co-authored-by: William Wang <[email protected]>
Co-authored-by: Haoyuan Feng <fenghaoyuan19@mails

DCacheWPU: update the latest version (#2095)

Co-authored-by: bugGenerator <[email protected]>
Co-authored-by: William Wang <[email protected]>
Co-authored-by: Haoyuan Feng <[email protected]>

show more ...


# cdbff57c 24-Jul-2023 Haoyuan Feng <[email protected]>

Memblock: Add load/store 128 bits datapath (#2180)

* Memblock: Add load/store 128 bits datapath

---------

Co-authored-by: lulu0521 <[email protected]>

* Memblock: fix bug of raw addr ma

Memblock: Add load/store 128 bits datapath (#2180)

* Memblock: Add load/store 128 bits datapath

---------

Co-authored-by: lulu0521 <[email protected]>

* Memblock: fix bug of raw addr match

* Memblock, LoadUnit: Fix Vector RAW paddr match

---------

Co-authored-by: lulu0521 <[email protected]>

show more ...


# 14a67055 12-Jul-2023 sfencevma <[email protected]>

ldu, stu: Refactoring the code for ldu/stu (#2171)

* add new ldu and stu

* add fast replay kill at s1

* fix pointer chasing cancel

* pick flushpipe_rvc

* merge flushpipe_rvc

* fix s3_

ldu, stu: Refactoring the code for ldu/stu (#2171)

* add new ldu and stu

* add fast replay kill at s1

* fix pointer chasing cancel

* pick flushpipe_rvc

* merge flushpipe_rvc

* fix s3_cache_rep and s3_feedbacked

* fix fast replay condition

---------

Co-authored-by: Lyn <[email protected]>

show more ...


# d2b20d1a 02-Jun-2023 Tang Haojin <[email protected]>

top-down: align top-down with Gem5 (#2085)

* topdown: add defines of topdown counters enum

* redirect: add redirect type for perf

* top-down: add stallReason IOs

frontend -> ctrlBlock -> de

top-down: align top-down with Gem5 (#2085)

* topdown: add defines of topdown counters enum

* redirect: add redirect type for perf

* top-down: add stallReason IOs

frontend -> ctrlBlock -> decode -> rename -> dispatch

* top-down: add dummy connections

* top-down: update TopdownCounters

* top-down: imp backend analysis and counter dump

* top-down: add HartId in `addSource`

* top-down: broadcast lqIdx of ROB head

* top-down: frontend signal done

* top-down: add memblock topdown interface

* Bump HuanCun: add TopDownMonitor

* top-down: receive and handle reasons in dispatch

* top-down: remove previous top-down code

* TopDown: add MemReqSource enum

* TopDown: extend mshr_latency range

* TopDown: add basic Req Source

TODO: distinguish prefetch

* dcache: distinguish L1DataPrefetch and CPUData

* top-down: comment out debugging perf counters in ibuffer

* TopDown: add path to pass MemReqSource to HuanCun

* TopDown: use simpler logic to count reqSource and update Probe count

* frontend: update topdown counters

* Update HuanCun Topdown for MemReqSource

* top-down: fix load stalls

* top-down: Change the priority of different stall reasons

* top-down: breakdown OtherCoreStall

* sbuffer: fix eviction

* when valid count reaches StoreBufferSize, do eviction

* sbuffer: fix replaceIdx

* If the way selected by the replacement algorithm cannot be written into dcache, its result is not used.

* dcache, ldu: fix vaddr in missqueue

This commit prevents the high bits of the virtual address from being truncated

* fix-ldst_pri-230506

* mainpipe: fix loadsAreComing

* top-down: disable dedup

* top-down: remove old top-down config

* top-down: split lq addr from ls_debug

* top-down: purge previous top-down code

* top-down: add debug_vaddr in LoadQueueReplay

* add source rob_head_other_repay

* remove load_l1_cache_stall_with/wihtou_bank_conflict

* dcache: split CPUData & refill latency

* split CPUData to CPUStoreData & CPULoadData & CPUAtomicData
* monitor refill latency for all type of req

* dcache: fix perfcounter in mq

* io.req.bits.cancel should be applied when counting req.fire

* TopDown: add TopDown for CPL2 in XiangShan

* top-down: add hartid params to L2Cache

* top-down: fix dispatch queue bound

* top-down: no DqStall when robFull

* topdown: buspmu support latency statistic (#2106)

* perf: add buspmu between L2 and L3, support name argument

* bump difftest

* perf: busmonitor supports latency stat

* config: fix cpl2 compatible problem

* bump utility

* bump coupledL2

* bump huancun

* misc: adapt to utility key&field

* config: fix key&field source, remove deprecated argument

* buspmu: remove debug print

* bump coupledl2&huancun

* top-down: fix sq full condition

* top-down: classify "lq full" load bound

* top-down: bump submodules

* bump coupledL2: fix reqSource in data path

* bump coupledL2

---------

Co-authored-by: tastynoob <[email protected]>
Co-authored-by: Guokai Chen <[email protected]>
Co-authored-by: lixin <[email protected]>
Co-authored-by: XiChen <[email protected]>
Co-authored-by: Zhou Yaoyang <[email protected]>
Co-authored-by: Lyn <[email protected]>
Co-authored-by: wakafa <[email protected]>

show more ...


# b9e121df 02-Jun-2023 happy-lx <[email protected]>

hint: add CustomHint interface (#2111)

* hint: add CustomHint interface

* dcache: fix replacement & mshrId update

* access replacement only once per load
* update mshrId in replayqueue only w

hint: add CustomHint interface (#2111)

* hint: add CustomHint interface

* dcache: fix replacement & mshrId update

* access replacement only once per load
* update mshrId in replayqueue only when this load enters mshr

* replay: block cache miss load

* block cache miss load until hint or dcache refill appears

* buffer: fix hint buffer depth to 1

* ldu: add dcache miss l2hint fast replay path

* bump coupledL2

* bump utility

---------

Co-authored-by: Lyn <[email protected]>
Co-authored-by: wangkaifan <[email protected]>

show more ...


# 68d13085 25-May-2023 Xuan Hu <[email protected]>

Merge remote-tracking branch 'upstream/master' into tmp-new-backend-merge-vlsu

# Conflicts:
# .gitmodules
# build.sc
# src/main/scala/top/Configs.scala
# src/main/scala/xiangshan/Bundle.scala
# src/

Merge remote-tracking branch 'upstream/master' into tmp-new-backend-merge-vlsu

# Conflicts:
# .gitmodules
# build.sc
# src/main/scala/top/Configs.scala
# src/main/scala/xiangshan/Bundle.scala
# src/main/scala/xiangshan/Parameters.scala
# src/main/scala/xiangshan/XSCore.scala
# src/main/scala/xiangshan/backend/CtrlBlock.scala
# src/main/scala/xiangshan/backend/MemBlock.scala
# src/main/scala/xiangshan/backend/Scheduler.scala
# src/main/scala/xiangshan/backend/issue/ReservationStation.scala
# src/main/scala/xiangshan/backend/issue/StatusArray.scala
# src/main/scala/xiangshan/backend/rob/Rob.scala
# src/main/scala/xiangshan/mem/MemCommon.scala
# src/main/scala/xiangshan/mem/lsqueue/LSQWrapper.scala
# src/main/scala/xiangshan/mem/lsqueue/LoadQueue.scala
# src/main/scala/xiangshan/mem/lsqueue/StoreQueue.scala
# src/main/scala/xiangshan/mem/pipeline/LoadUnit.scala
# src/main/scala/xiangshan/mem/pipeline/StoreUnit.scala

show more ...


# e4f69d78 21-May-2023 sfencevma <[email protected]>

lsu: split lq for larger ooo load window (#2077)

BREAKING CHANGE: new LSU/LQ architecture introduced in this PR

In this commit, we replace unified LQ with:
* virtual load queue
* load replay qu

lsu: split lq for larger ooo load window (#2077)

BREAKING CHANGE: new LSU/LQ architecture introduced in this PR

In this commit, we replace unified LQ with:
* virtual load queue
* load replay queue
* load rar queue
* load raw queue
* uncache buffer

It will provide larger ooo load window.

NOTE: IPC loss in this commit is caused by MDP problems, for previous MDP
does not fit new LSU architecture.
MDP update is not included in this commit, IPC loss will be fixed by MDP update later.

---------

Co-authored-by: Lyn <[email protected]>

show more ...


# 67fcf090 18-Apr-2023 Xuan Hu <[email protected]>

Merge remote-tracking branch 'upstream/master' into new-backend


# 730cfbc0 16-Apr-2023 Xuan Hu <[email protected]>

backend: merge v2backend into backend


# 141a6449 27-Mar-2023 Xuan Hu <[email protected]>

backend: add load inst support


# 3b739f49 06-Mar-2023 Xuan Hu <[email protected]>

v2backend: huge tmp commit


# 1350347a 01-Feb-2023 William Wang <[email protected]>

ldu: software prefetch issue will always succeed


# 7f111a00 30-Jan-2023 William Wang <[email protected]>

chore: update prefetch interface


# 3af6aa6e 22-Oct-2022 William Wang <[email protected]>

dcache: add optional meta prefetch and access bit

Added meta_prefetch and meta_access related sim perf counter

For now, optional dcache meta prefetch and access can be removed safely


# b52348ae 13-Oct-2022 William Wang <[email protected]>

dcache: add hardware prefetch interface


# 144422dc 04-Jan-2023 Maxpicca-Li <[email protected]>

dcache: setup way predictor framework (#1857)

This commit sets up a basic dcache way predictor framework and a dummy predictor.
A Way Predictor Unit (WPU) module has been added to dcache. Dcache da

dcache: setup way predictor framework (#1857)

This commit sets up a basic dcache way predictor framework and a dummy predictor.
A Way Predictor Unit (WPU) module has been added to dcache. Dcache data SRAMs
have been reorganized for that.

The dummy predictor is disabled by default.

Besides, dcache bank conflict check has been optimized. It may cause timing problems,
to be fixed in the future.

* ideal wpu

* BankedDataArray: change architecture to reduce bank_conflict

* BankedDataArray: add db analysis

* Merge: the rest

* BankedDataArray: change the logic of rrl_bank_conflict, but let the number of rw_bank_conflict up

* Load Logic: changed to be as expected

reading data will be delayed by one cycle to make selection
writing data will be also delayed by one cycle to do write operation

* fix: ecc check error

* update the gitignore

* WPU: add regular wpu and change the replay mechanism

* WPU: fix refill fail bug, but a new addiw fail bug appears

* WPU: temporarily turn off to PR

* WPU: tfix all bug

* loadqueue: fix the initialization of replayCarry

* bankeddataarray: fix the bug

* DCacheWrapper: fix bug

* ready-to-run: correct the version

* WayPredictor: comments clean

* BankedDataArray: fix ecc_bank bug

* Parameter: set the enable signal of wpu

show more ...


# 683c1411 28-Dec-2022 happy-lx <[email protected]>

lq: Remove LQ data (#1862)

This PR remove data in lq.

All cache miss load instructions will be replayed by lq, and the forward path to the D channel
and mshr is added to the pipeline.
Special t

lq: Remove LQ data (#1862)

This PR remove data in lq.

All cache miss load instructions will be replayed by lq, and the forward path to the D channel
and mshr is added to the pipeline.
Special treatment is made for uncache load. The data is no longer stored in the datamodule
but stored in a separate register. ldout is only used as uncache writeback, and only ldout0
will be used. Adjust the priority so that the replayed instruction has the highest priority in S0.

Future work:
1. fix `milc` perf loss
2. remove data from MSHRs

* difftest: monitor cache miss latency

* lq, ldu, dcache: remove lq's data

* lq's data is no longer used
* replay cache miss load from lq (use counter to delay)
* if dcache's mshr gets refill data, wake up lq's missed load
* uncache load will writeback to ldu using ldout_0
* ldout_1 is no longer used

* lq, ldu: add forward port

* forward D and mshr in load S1, get result in S2
* remove useless code logic in loadQueueData

* misc: revert monitor

show more ...


# 3c02ee8f 25-Dec-2022 wakafa <[email protected]>

Separate Utility submodule from XiangShan (#1861)

* misc: add utility submodule

* misc: adjust to new utility framework

* bump utility: revert resetgen

* bump huancun


# 16c3b0b7 08-Dec-2022 sfencevma <[email protected]>

ldu: add st-ld violation re-execute (#1849)

* lsu: add st-ld violation re-execute

* misc: update vio check comments in LQ

Co-authored-by: Lyn <[email protected]>
Co-authored-by: Will

ldu: add st-ld violation re-execute (#1849)

* lsu: add st-ld violation re-execute

* misc: update vio check comments in LQ

Co-authored-by: Lyn <[email protected]>
Co-authored-by: William Wang <[email protected]>

show more ...


# 37225120 07-Dec-2022 sfencevma <[email protected]>

Uncache: optimize write operation (#1844)

This commit adds an uncache write buffer to accelerate uncache write

For uncacheable address range, now we use atomic bit in PMA to indicate
uncache wri

Uncache: optimize write operation (#1844)

This commit adds an uncache write buffer to accelerate uncache write

For uncacheable address range, now we use atomic bit in PMA to indicate
uncache write in this range should not use uncache write buffer.

Note that XiangShan does not support atomic insts in uncacheable address range.

* uncache: optimize write operation

* pma: add atomic config

* uncache: assign hartId

* remove some pma atomic

* extend peripheral id width

Co-authored-by: Lyn <[email protected]>

show more ...


# a760aeb0 02-Dec-2022 happy-lx <[email protected]>

Replay all load instructions from LQ (#1838)

This intermediate architecture replays all load instructions from LQ.
An independent load replay queue will be added later.

Performance loss caused b

Replay all load instructions from LQ (#1838)

This intermediate architecture replays all load instructions from LQ.
An independent load replay queue will be added later.

Performance loss caused by changing of load replay sequences will be
analyzed in the future.

* memblock: load queue based replay

* replay load from load queue rather than RS
* use counters to delay replay logic

* memblock: refactor priority

* lsq-replay has higher priority than try pointchasing

* RS: remove load store rs's feedback port

* ld-replay: a new path for fast replay

* when fast replay needed, wire it to loadqueue and it will be selected
this cycle and replay to load pipline s0 in next cycle

* memblock: refactor load S0

* move all the select logic from lsq to load S0
* split a tlbReplayDelayCycleCtrl out of loadqueue to speed up
generating emu

* loadqueue: parameterize replay

show more ...


# a19ae480 22-Sep-2022 William Wang <[email protected]>

dcache: optimize data sram read fanout (#1784)


# cb9c18dc 24-Aug-2022 William Wang <[email protected]>

ldu: select data in load_s3 (#1743)

rdataVec (i.e. sram read result merge forward result) is still
generated in load_s2. It will be write to load queue in load_s2


# 0a992150 06-Aug-2022 William Wang <[email protected]>

std: add an extra pipe stage for std (#1704)


1234