# Import mangling in `torch.package`

## Mangling rules
These are the core invariants; if you are changing mangling code please preserve them.

1. For every module imported by `PackageImporter`, two attributes are mangled:
    - `__module__`
    - `__file__`
2. Any `__module__` and `__file__` attribute accessed inside
   `Package{Ex|Im}porter` should be demangled immediately.
3. No mangled names should be serialized by `PackageExporter`.

## Why do we mangle imported names?
To avoid accidental name collisions with modules in `sys.modules`. Consider the following:

    from torch.package import PackageImporter
    from torchvision.models import resnet18
    local_resnet18 = resnet18()

    # a loaded resnet18, potentially with a different implementation than the local one!
    i = PackageImporter('my_resnet_18.pt')
    loaded_resnet18 = i.load_pickle('model', 'model.pkl')

    # Without mangling, both types would report the same module name:
    print(type(local_resnet18).__module__)  # 'torchvision.models.resnet'
    print(type(loaded_resnet18).__module__)  # ALSO 'torchvision.models.resnet'

Without mangling, these two model types would have the same `__module__` name.
While this isn't incorrect on its face, there are a number of places in
CPython and elsewhere that assume you can take any module name, look it
up in `sys.modules`, and get the right module back, including:
- [`import_from`](https://github.com/python/cpython/blob/5977a7989d49c3e095c7659a58267d87a17b12b1/Python/ceval.c#L5500)
- `inspect`: used in TorchScript to retrieve source code to compile
- …probably more that we don't know about.

In these cases, we may silently pick up the wrong module for `loaded_resnet18`
and e.g. TorchScript the wrong source code for our model.

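The hazard can be reproduced without `torch` at all. The following is a minimal sketch using only the standard library; the module name `demo_pkg_models` and the class `Net` are hypothetical stand-ins, not anything from torchvision or `torch.package`. It builds two modules that share a name, registers only one of them in `sys.modules`, and shows that a lookup by `__module__` returns the registered one.

```python
import sys
import types

# Hypothetical module name, used only for this demo.
NAME = "demo_pkg_models"

# The "local" module, registered in sys.modules as a normal import would do.
local_mod = types.ModuleType(NAME)
exec("class Net:\n    impl = 'local'", local_mod.__dict__)
sys.modules[NAME] = local_mod

# A "packaged" module with the same name but a different implementation.
# It is deliberately NOT registered in sys.modules.
packaged_mod = types.ModuleType(NAME)
exec("class Net:\n    impl = 'packaged'", packaged_mod.__dict__)

loaded = packaged_mod.Net()

# Code that trusts __module__ resolves the name through sys.modules...
resolved = sys.modules[type(loaded).__module__]

# ...and silently gets the local module, not the one the object came from.
found_impl = resolved.Net.impl
actual_impl = type(loaded).impl
print(found_impl, actual_impl)
```

Here `found_impl` is `'local'` while the object was actually built from the `'packaged'` class, which is the same kind of mismatch described above.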
## How names are mangled
On import, all modules produced by a given `PackageImporter` are given a
new top-level module as their parent. This is called the *mangle parent*. For example:

    torchvision.models.resnet

becomes

    <torch_package_0>.torchvision.models.resnet

The mangle parent is made unique to a given `PackageImporter` instance by
bumping a process-global `mangle_index`, i.e. `<torch_package_{mangle_index}>`.

The mangle parent intentionally uses angle brackets (`<` and `>`) to make it
very unlikely that mangled names will collide with any "real" user module.

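As a rough illustration of this scheme, the sketch below gives each mangler the next value of a process-global counter. `SketchMangler` and `_mangle_index` are names made up for this example; this is an assumed model of the behavior described here, not the code `PackageImporter` actually runs.

```python
import itertools

# Process-global counter; every new mangler takes the next index.
_mangle_index = itertools.count()


class SketchMangler:
    """Illustrative stand-in for the per-importer mangling state."""

    def __init__(self) -> None:
        # The angle brackets make the parent an unlikely real module name.
        self.parent = f"<torch_package_{next(_mangle_index)}>"

    def mangle(self, name: str) -> str:
        # Prepend the mangle parent as a new top-level "module".
        return f"{self.parent}.{name}"


first = SketchMangler()
second = SketchMangler()
print(first.mangle("torchvision.models.resnet"))
print(second.mangle("torchvision.models.resnet"))
```

Because the counter lives at module level rather than on the instance, two manglers created in the same process can never share a parent, which is the uniqueness property the text relies on.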
An imported module's `__file__` attribute is mangled in the same way, so:

    torchvision/models/resnet.py

becomes

    <torch_package_0>.torchvision/models/resnet.py

Similarly, the use of angle brackets makes it very unlikely that such a name
will exist in the user's file system.

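Rule 2 in the mangling rules asks for `__module__` and `__file__` values to be demangled as soon as they are read. One way to picture that step is the sketch below. It assumes the mangle parent always has the form `<torch_package_N>` followed by a dot, and the helper names `sketch_is_mangled` and `sketch_demangle` are hypothetical, chosen for this example only.

```python
import re

# Matches a leading mangle parent such as "<torch_package_12>." (assumed format).
_MANGLE_PREFIX = re.compile(r"^<torch_package_\d+>\.")


def sketch_is_mangled(name: str) -> bool:
    """Return True if the name starts with a mangle parent."""
    return _MANGLE_PREFIX.match(name) is not None


def sketch_demangle(name: str) -> str:
    """Strip the mangle parent if present; leave other names untouched."""
    return _MANGLE_PREFIX.sub("", name, count=1)


module_name = "<torch_package_0>.torchvision.models.resnet"
file_name = "<torch_package_0>.torchvision/models/resnet.py"

print(sketch_demangle(module_name))
print(sketch_demangle(file_name))
print(sketch_demangle("torchvision.models.resnet"))
```

Since module names and file names are mangled with the same prefix, a single helper can handle both, and calling it on an already clean name is a no-op.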
## Don't serialize mangled names
Mangling happens *on import*, and the results are never saved into a package.
Assigning mangle parents on import means that we can enforce that mangle
parents are unique within the environment doing the importing.

It also allows us to avoid serializing (and maintaining backward
compatibility for) this detail.
