NVIDIA Tegra
============

-  .. rubric:: T194
      :name: t194

T194 has eight NVIDIA Carmel CPU cores in a coherent multi-processor
configuration. The Carmel cores implement the Armv8.2-A architecture and
execute both 64-bit AArch64 and 32-bit AArch32 code. The Carmel processors
are organized as four dual-core clusters, where each cluster has a dedicated
2 MiB Level 2 unified cache. A high-speed coherency fabric connects these
processor complexes and allows heterogeneous multi-processing with all
eight cores if required.

-  .. rubric:: T186
      :name: t186

The NVIDIA® Parker (T186) series system-on-chip (SoC) delivers a heterogeneous
multi-processing (HMP) solution designed to optimize performance and
efficiency.

T186 has two NVIDIA Denver 2 CPU cores plus four Arm® Cortex®-A57 cores in a
coherent multiprocessor configuration. The Denver 2 and Cortex-A57 cores
support Armv8-A, executing both 64-bit AArch64 and 32-bit AArch32 code,
including legacy Armv7-A applications. Each Denver 2 core has a 128 KB
instruction and a 64 KB data Level 1 cache; the two cores share a 2 MB
unified Level 2 cache. Each Cortex-A57 core has a 48 KB instruction and a
32 KB data Level 1 cache; the four cores also share a 2 MB unified Level 2
cache. A high-speed coherency fabric connects these two processor complexes
and allows heterogeneous multi-processing with all six cores if required.

Denver is NVIDIA's own custom-designed, 64-bit, dual-core CPU which is
fully Armv8-A architecture compatible. Each of the two Denver cores
implements a 7-way superscalar microarchitecture (up to 7 concurrent
micro-ops can be executed per clock), and includes a 128 KB 4-way L1
instruction cache and a 64 KB 4-way L1 data cache. A 2 MB 16-way L2
cache services both cores.

Denver implements a process called Dynamic Code Optimization, which
optimizes frequently used software routines at runtime into dense,
highly tuned microcode-equivalent routines. These are stored in a
dedicated, 128 MB main-memory-based optimization cache. Once read into
the instruction cache, the optimized micro-ops are executed, then
re-fetched and re-executed from the instruction cache for as long as they
are needed and capacity allows.

Effectively, this reduces the need to re-optimize the software routines.
Instead of using hardware to extract the instruction-level parallelism
(ILP) inherent in the code, Denver extracts the ILP once via software
techniques, and then executes those routines repeatedly, thus amortizing
the cost of ILP extraction over the many execution instances.

Denver also features low-latency power-state transitions, in addition
to extensive power-gating and dynamic voltage and clock scaling based on
workloads.

-  .. rubric:: T210
      :name: t210

T210 has four Arm® Cortex®-A57 cores in a switched configuration with a
companion set of four Arm Cortex-A53 cores. The Cortex-A57 and Cortex-A53
cores support Armv8-A, executing both 64-bit AArch64 and 32-bit AArch32 code,
including legacy Armv7-A applications. Each Cortex-A57 core has a 48 KB
instruction and a 32 KB data Level 1 cache; the four cores share a 2 MB
unified Level 2 cache. Each Cortex-A53 core has a 32 KB instruction and a
32 KB data Level 1 cache; the four cores share a 512 KB unified Level 2 cache.

Directory structure
-------------------

-  plat/nvidia/tegra/common - Common code for all Tegra SoCs
-  plat/nvidia/tegra/soc/txxx - Chip specific code

Trusted OS dispatcher
---------------------

Tegra supports multiple Trusted OSes.

- Trusted Little Kernel (TLK): To include the 'tlkd' dispatcher in the
  image, pass 'SPD=tlkd' on the command line while preparing a bl31 image.
- Trusty: To include the 'trusty' dispatcher in the image, pass
  'SPD=trusty' on the command line while preparing a bl31 image.

Because the dispatcher is selected on the command line, other Trusted OS
vendors can use the upstream code and include their dispatchers in the image
without changing any makefiles.

The Trusted OSes supported by each Tegra platform are:

- Tegra210: TLK and Trusty
- Tegra186: Trusty
- Tegra194: Trusty

Scatter files
-------------

Tegra platforms currently support scatter files and ld.S scripts. The scatter
files allow the ARMLINK linker to generate BL31 binaries. For now, there is a
single common scatter file, plat/nvidia/tegra/scat/bl31.scat, for all Tegra
SoCs. The ``LINKER`` build variable needs to point to the ARMLINK binary for
the scatter file to be used. BL31 image generation with ARMCLANG (compilation)
and ARMLINK (linking) has been verified on Tegra186 platforms.

Preparing the BL31 image to run on Tegra SoCs
---------------------------------------------

.. code:: shell

    CROSS_COMPILE=<path-to-aarch64-gcc>/bin/aarch64-none-elf- make PLAT=tegra \
    TARGET_SOC=<target-soc e.g. t194|t186|t210> SPD=<dispatcher e.g. trusty|tlkd> \
    bl31

Platforms that need a different ``TZDRAM_BASE`` can add ``TZDRAM_BASE=<value>``
to the build command line.

The Tegra platform code expects the BL2 layer to pass a pointer to the
following platform-specific structure in the ``x1`` register. The
``bl31_early_platform_setup()`` handler uses it to extract the TZDRAM carveout
base and size for loading the Trusted OS, and the UART port ID to be used. The
Tegra memory controller driver programs this base and size in order to
restrict non-secure (NS) accesses.

.. code:: c

    typedef struct plat_params_from_bl2 {
        /* TZ memory size */
        uint64_t tzdram_size;
        /* TZ memory base */
        uint64_t tzdram_base;
        /* UART port ID */
        int uart_id;
        /* L2 ECC parity protection disable flag */
        int l2_ecc_parity_prot_dis;
        /* SHMEM base address for storing the boot logs */
        uint64_t boot_profiler_shmem_base;
    } plat_params_from_bl2_t;

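As a rough illustration of the hand-off described above, the sketch below
shows how an early-setup routine might validate and copy the structure
received from BL2. It is not the upstream implementation:
``example_copy_bl2_params()``, the 1 MiB ``EXAMPLE_TZDRAM_ALIGN`` requirement
and the return-code convention are assumptions made for this example only.

.. code:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Structure layout as documented in this section. */
    typedef struct plat_params_from_bl2 {
        uint64_t tzdram_size;
        uint64_t tzdram_base;
        int uart_id;
        int l2_ecc_parity_prot_dis;
        uint64_t boot_profiler_shmem_base;
    } plat_params_from_bl2_t;

    /* Hypothetical alignment requirement (1 MiB), assumed for this sketch. */
    #define EXAMPLE_TZDRAM_ALIGN UINT64_C(0x100000)

    /*
     * Hypothetical helper: validate the structure whose address BL2 passed
     * in x1 and copy it into BL31-owned storage.
     * Returns 0 on success and -1 if the parameters look unusable.
     */
    static int example_copy_bl2_params(uint64_t x1, plat_params_from_bl2_t *out)
    {
        const plat_params_from_bl2_t *from_bl2 =
            (const plat_params_from_bl2_t *)(uintptr_t)x1;

        assert(out != NULL);

        /* BL2 must provide a structure. */
        if (from_bl2 == NULL)
            return -1;

        /* An empty carveout cannot hold a Trusted OS. */
        if (from_bl2->tzdram_size == 0U)
            return -1;

        /* Assumed rule: the carveout base is 1 MiB aligned. */
        if ((from_bl2->tzdram_base % EXAMPLE_TZDRAM_ALIGN) != 0U)
            return -1;

        /* Reject a base/size pair that wraps around the address space. */
        if (from_bl2->tzdram_base + from_bl2->tzdram_size < from_bl2->tzdram_base)
            return -1;

        /* Keep a private copy so later stages do not depend on BL2 memory. */
        *out = *from_bl2;
        return 0;
    }

A caller would pass the raw ``x1`` value and use the copied fields afterwards,
for example to select the console UART from ``uart_id``.
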
Power Management
----------------

The PSCI implementation expects each platform to expose the 'power state'
parameter to be used during the ``SYSTEM_SUSPEND`` call. The state-id field
is implementation defined on Tegra SoCs and is preferably defined by
``tegra_def.h``.

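To make the 'power state' parameter concrete, the sketch below packs the three
fields of the original-format PSCI ``power_state`` argument: the state-id in
bits [15:0], the state type in bit 16 and the power level in bits [25:24].
The bit positions follow the PSCI specification; the ``EXAMPLE_`` names and
the state-id value in the usage note are placeholders for this example and
are not taken from ``tegra_def.h``.

.. code:: c

    #include <assert.h>
    #include <stdint.h>

    /* Field layout of the original-format PSCI power_state parameter. */
    #define EXAMPLE_PSTATE_ID_SHIFT       0U
    #define EXAMPLE_PSTATE_ID_MASK        0xFFFFU
    #define EXAMPLE_PSTATE_TYPE_SHIFT     16U
    #define EXAMPLE_PSTATE_TYPE_MASK      0x1U
    #define EXAMPLE_PSTATE_PWR_LVL_SHIFT  24U
    #define EXAMPLE_PSTATE_PWR_LVL_MASK   0x3U

    #define EXAMPLE_PSTATE_TYPE_STANDBY    0U
    #define EXAMPLE_PSTATE_TYPE_POWERDOWN  1U

    /* Hypothetical helper: build a power_state value from its fields. */
    static uint32_t example_make_power_state(uint32_t state_id, uint32_t type,
                                             uint32_t pwr_lvl)
    {
        /* Each field must fit in its bit range. */
        assert(state_id <= EXAMPLE_PSTATE_ID_MASK);
        assert(type <= EXAMPLE_PSTATE_TYPE_MASK);
        assert(pwr_lvl <= EXAMPLE_PSTATE_PWR_LVL_MASK);

        return (state_id << EXAMPLE_PSTATE_ID_SHIFT) |
               (type << EXAMPLE_PSTATE_TYPE_SHIFT) |
               (pwr_lvl << EXAMPLE_PSTATE_PWR_LVL_SHIFT);
    }

    /* Hypothetical helper: extract the implementation-defined state-id. */
    static uint32_t example_power_state_id(uint32_t power_state)
    {
        return (power_state >> EXAMPLE_PSTATE_ID_SHIFT) & EXAMPLE_PSTATE_ID_MASK;
    }

For instance, ``example_make_power_state(0x7U, EXAMPLE_PSTATE_TYPE_POWERDOWN, 2U)``
evaluates to ``0x02010007``; the state-id ``0x7`` is an arbitrary placeholder,
since the real values are platform defined.
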
Tegra configs
-------------

-  ``tegra_enable_l2_ecc_parity_prot``: This flag enables the L2 ECC and Parity
   Protection bit, for Arm Cortex-A57 CPUs, during CPU boot. This flag will
   be enabled by Tegra SoCs during 'Cluster power up' or 'System Suspend' exit.