/aosp_15_r20/external/pytorch/docs/source/ |
distributed.rst
    4: Distributed communication package - torch.distributed
    8: …Please refer to `PyTorch Distributed Overview <https://pytorch.org/tutorials/beginner/dist_overvie…
    9: for a brief introduction to all features related to distributed training.
    11: .. automodule:: torch.distributed
    12: .. currentmodule:: torch.distributed
    17: ``torch.distributed`` supports three built-in backends, each with
    55: PyTorch distributed package supports Linux (stable), MacOS (stable), and Windows (prototype).
    57: distributed (NCCL only when building with CUDA). MPI is an optional backend that can only be
    79: - Use the NCCL backend for distributed **GPU** training
    80: - Use the Gloo backend for distributed **CPU** training.
    [all …]
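The backend guidance excerpted above (NCCL for GPU, Gloo for CPU) comes down to a few calls; a minimal sketch, assuming the process is launched with the usual RANK/WORLD_SIZE/MASTER_ADDR/MASTER_PORT environment variables (e.g. via torchrun):

    # Backend selection as recommended in the excerpt: NCCL for GPU, Gloo for CPU.
    # Assumes the launcher (e.g. torchrun) has set RANK/WORLD_SIZE/MASTER_ADDR/MASTER_PORT.
    import torch
    import torch.distributed as dist

    backend = "nccl" if torch.cuda.is_available() else "gloo"
    dist.init_process_group(backend=backend)

    rank = dist.get_rank()
    if backend == "nccl":
        torch.cuda.set_device(rank % torch.cuda.device_count())
        device = torch.device("cuda")
    else:
        device = torch.device("cpu")

    # Every rank contributes a tensor; all_reduce sums them in place on all ranks.
    t = torch.ones(4, device=device) * rank
    dist.all_reduce(t, op=dist.ReduceOp.SUM)

    dist.destroy_process_group()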
|
distributed.checkpoint.rst
    4: Distributed Checkpoint - torch.distributed.checkpoint
    8: Distributed Checkpoint (DCP) supports loading and saving models from multiple ranks in parallel.
    19: .. automodule:: torch.distributed.checkpoint
    21: .. currentmodule:: torch.distributed.checkpoint.state_dict_saver
    27: .. currentmodule:: torch.distributed.checkpoint.state_dict_loader
    32: … of the staging mechanisms used for asynchronous checkpointing (`torch.distributed.checkpoint.asyn…
    34: .. automodule:: torch.distributed.checkpoint.staging
    36: .. autoclass:: torch.distributed.checkpoint.staging.AsyncStager
    39: .. autoclass:: torch.distributed.checkpoint.staging.BlockingAsyncStager
    43: .. automodule:: torch.distributed.checkpoint.stateful
    [all …]
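A minimal save/load sketch for DCP, assuming a recent PyTorch build where torch.distributed.checkpoint exposes save()/load() plus FileSystemWriter/FileSystemReader, and that every participating rank runs the same code (the /tmp/ckpt path is illustrative):

    import torch
    import torch.distributed.checkpoint as dcp
    from torch import nn

    model = nn.Linear(8, 8)
    state_dict = {"model": model.state_dict()}

    # Each rank passes its (possibly sharded) state_dict; DCP writes the shards in parallel.
    dcp.save(state_dict, storage_writer=dcp.FileSystemWriter("/tmp/ckpt"))

    # Loading is in place: load() reads into the tensors already present in state_dict.
    dcp.load(state_dict, storage_reader=dcp.FileSystemReader("/tmp/ckpt"))
    model.load_state_dict(state_dict["model"])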
|
rpc.rst
    3: Distributed RPC Framework
    6: The distributed RPC framework provides mechanisms for multi-machine model
    23: …Please refer to `PyTorch Distributed Overview <https://pytorch.org/tutorials/beginner/dist_overvie…
    24: for a brief introduction to all features related to distributed training.
    29: The distributed RPC framework makes it easy to run functions remotely, supports
    37: :meth:`~torch.distributed.rpc.rpc_sync` (synchronous),
    38: :meth:`~torch.distributed.rpc.rpc_async` (asynchronous), and
    39: :meth:`~torch.distributed.rpc.remote` (asynchronous and returns a reference
    43: caller. The :meth:`~torch.distributed.rpc.remote` API is useful when the
    49: :meth:`~torch.distributed.rpc.rpc_sync` and
    [all …]
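The three call styles listed in the excerpt differ only in how the result comes back; a two-worker sketch (the worker names and world size are assumptions of this sketch, with one process per worker calling run(rank)):

    import torch
    import torch.distributed.rpc as rpc

    def run(rank):
        rpc.init_rpc(f"worker{rank}", rank=rank, world_size=2)
        if rank == 0:
            x, y = torch.ones(2), torch.ones(2)
            # rpc_sync: blocks until worker1 returns the result.
            out = rpc.rpc_sync("worker1", torch.add, args=(x, y))
            # rpc_async: returns a Future immediately; wait() fetches the value.
            fut = rpc.rpc_async("worker1", torch.add, args=(x, y))
            out_async = fut.wait()
            # remote: the result stays on worker1; only an RRef comes back.
            rref = rpc.remote("worker1", torch.add, args=(x, y))
            out_remote = rref.to_here()
        rpc.shutdown()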
|
conf.py
    102: # torch.distributed.autograd
    104: # torch.distributed.checkpoint.state_dict
    107: # torch.distributed.elastic.events
    110: # torch.distributed.elastic.metrics
    112: # torch.distributed.elastic.rendezvous.registry
    114: # torch.distributed.launch
    118: # torch.distributed.rpc
    120: # torch.distributed.run
    146: # torch.distributed.algorithms.ddp_comm_hooks
    455: # torch.distributed.algorithms.ddp_comm_hooks.ddp_zero_hook
    [all …]
|
distributed.tensor.rst
    1: .. currentmodule:: torch.distributed.tensor
    3: torch.distributed.tensor
    7: ``torch.distributed.tensor`` is currently in alpha state and under
    12: PyTorch DTensor (Distributed Tensor)
    15: …Tensor offers simple and flexible tensor sharding primitives that transparently handles distributed
    22: * `Tensor Parallel <https://pytorch.org/docs/main/distributed.tensor.parallel.html>`__
    25: .. automodule:: torch.distributed.tensor
    28: write distributed program as if it's a **single-device program with the same convergence property**…
    41: .. currentmodule:: torch.distributed.tensor
    45: running them in a single device, allowing proper distributed computation for PyTorch operators.
    [all …]
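A minimal sharding sketch against the public torch.distributed.tensor namespace documented here, assuming four ranks launched with torchrun and a recent build that exports distribute_tensor/Shard from this module:

    import torch
    from torch.distributed.device_mesh import init_device_mesh
    from torch.distributed.tensor import Shard, distribute_tensor

    mesh = init_device_mesh("cuda", (4,))        # 1-D mesh over 4 ranks (assumed world size)

    big = torch.randn(8192, 8192)
    # Shard dim 0 across the mesh: each rank materializes a 2048 x 8192 local shard.
    dtensor = distribute_tensor(big, mesh, placements=[Shard(0)])

    # Ordinary operators dispatch to sharded computation; the result stays distributed.
    out = torch.nn.functional.relu(dtensor)
    print(out.to_local().shape)                  # torch.Size([2048, 8192]) on every rank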
|
fsdp.rst
    4: .. automodule:: torch.distributed.fsdp
    6: .. autoclass:: torch.distributed.fsdp.FullyShardedDataParallel
    9: .. autoclass:: torch.distributed.fsdp.BackwardPrefetch
    12: .. autoclass:: torch.distributed.fsdp.ShardingStrategy
    15: .. autoclass:: torch.distributed.fsdp.MixedPrecision
    18: .. autoclass:: torch.distributed.fsdp.CPUOffload
    21: .. autoclass:: torch.distributed.fsdp.StateDictConfig
    24: .. autoclass:: torch.distributed.fsdp.FullStateDictConfig
    27: .. autoclass:: torch.distributed.fsdp.ShardedStateDictConfig
    30: .. autoclass:: torch.distributed.fsdp.LocalStateDictConfig
    [all …]
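A sketch that wires several of the classes documented above into one FSDP wrapper; it assumes one CUDA device per rank and a launcher that has already set the usual rendezvous environment variables:

    import torch
    import torch.distributed as dist
    from torch import nn
    from torch.distributed.fsdp import (
        CPUOffload,
        FullyShardedDataParallel as FSDP,
        MixedPrecision,
        ShardingStrategy,
    )

    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10)).cuda()
    fsdp_model = FSDP(
        model,
        sharding_strategy=ShardingStrategy.FULL_SHARD,        # shard params, grads, optimizer state
        mixed_precision=MixedPrecision(param_dtype=torch.bfloat16),
        cpu_offload=CPUOffload(offload_params=False),
    )

    optim = torch.optim.AdamW(fsdp_model.parameters(), lr=1e-3)
    loss = fsdp_model(torch.randn(8, 1024, device="cuda")).sum()
    loss.backward()
    optim.step()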
|
index.rst
    76: torch.distributed <distributed>
    77: torch.distributed.tensor <distributed.tensor>
    78: torch.distributed.algorithms.join <distributed.algorithms.join>
    79: torch.distributed.elastic <distributed.elastic>
    80: torch.distributed.fsdp <fsdp>
    81: torch.distributed.tensor.parallel <distributed.tensor.parallel>
    82: torch.distributed.optim <distributed.optim>
    83: torch.distributed.pipelining <distributed.pipelining>
    84: torch.distributed.checkpoint <distributed.checkpoint>
|
/aosp_15_r20/external/pytorch/test/ |
test_public_bindings.py
    267: "Inductor/Distributed modules hard fail on windows and macos",
    335: "torch.testing._internal.distributed.common_state_dict",
    336: "torch.testing._internal.distributed._shard.sharded_tensor",
    337: "torch.testing._internal.distributed._shard.test_common",
    338: "torch.testing._internal.distributed._tensor.common_dtensor",
    339: "torch.testing._internal.distributed.ddp_under_dist_autograd_test",
    340: "torch.testing._internal.distributed.distributed_test",
    341: "torch.testing._internal.distributed.distributed_utils",
    342: "torch.testing._internal.distributed.fake_pg",
    343: "torch.testing._internal.distributed.multi_threaded_pg",
    [all …]
|
run_test.py
    24: import torch.distributed as dist
    83: DISTRIBUTED_TEST_PREFIX = "distributed"
    120: FSDP_TEST = [test for test in TESTS if test.startswith("distributed/fsdp")]
    123: "distributed/nn/jit/test_instantiator",
    124: "distributed/rpc/test_faulty_agent",
    125: "distributed/rpc/test_tensorpipe_agent",
    126: "distributed/rpc/test_share_memory",
    127: "distributed/rpc/cuda/test_tensorpipe_agent",
    128: "distributed/pipeline/sync/skip/test_api",
    129: "distributed/pipeline/sync/skip/test_gpipe",
    [all …]
|
allowlist_for_publicAPI.json
    37: "torch.distributed.tensor.device_mesh": "torch.distributed.device_mesh"
    59: "torch.distributed": [
    96: "torch.distributed.checkpoint.state_dict": [
    130: "torch.distributed.autograd": [
    135: "torch.distributed.elastic.events": [
    141: "torch.distributed.elastic.events.handlers": [
    147: "torch.distributed.elastic.metrics": [
    152: "torch.distributed.elastic.multiprocessing": [
    159: "torch.distributed.elastic.multiprocessing.redirects": [
    165: "torch.distributed.elastic.rendezvous": [
    [all …]
|
/aosp_15_r20/external/pytorch/torch/distributed/ |
CONTRIBUTING.md
    1: # Contributing to PyTorch Distributed
    5: …Distributed Overview](https://pytorch.org/tutorials//beginner/dist_overview.html) is a great start…
    7: In this document, we mostly focus on some of the code structure for PyTorch distributed and impleme…
    11: …distributed%22+label%3A%22topic%3A+bootcamp%22) and [here](https://github.com/pytorch/pytorch/issu…
    22: - API layer: [torch/distributed/distributed_c10d.py](https://github.com/pytorch/pytorch/blob/main/t…
    23: …Python Bindings: [torch/csrc/distributed/c10d/init.cpp](https://github.com/pytorch/pytorch/blob/ma…
    24: …entations: [torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp](https://github.com/pytorch/pytorch/b…
    28: - API layer: ([torch/distributed/_tensor/api.py](https://github.com/pytorch/pytorch/blob/main/torch…
    31: #### Distributed Data Parallel (DDP)
    33: … API layer: [torch/nn/parallel/distributed.py](https://github.com/pytorch/pytorch/blob/main/torch/…
    [all …]
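The DDP entry above points at the Python API layer in torch/nn/parallel/distributed.py; a minimal usage sketch of that layer, assuming one process per GPU launched with torchrun so that LOCAL_RANK is set:

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(32, 4).cuda()
    ddp_model = DDP(model, device_ids=[local_rank])   # gradients are all-reduced during backward()

    out = ddp_model(torch.randn(16, 32, device="cuda"))
    out.sum().backward()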
|
launch.py
    3: Module ``torch.distributed.launch``.
    5: ``torch.distributed.launch`` is a module that spawns up multiple distributed
    12: The utility can be used for single-node distributed training, in which one or
    15: each distributed process will be operating on a single GPU. This can achieve
    17: multi-node distributed training, by spawning up multiple processes on each node
    18: for well-improved multi-node distributed training performance as well.
    23: In both cases of single-node distributed training or multi-node distributed
    32: 1. Single-Node multi-process distributed training
    36: python -m torch.distributed.launch --nproc-per-node=NUM_GPUS_YOU_HAVE
    40: 2. Multi-Node multi-process distributed training: (e.g. two nodes)
    [all …]
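On the script side, each spawned process has to pick up its local rank; a sketch under the assumption that the legacy launcher passes --local-rank (or, like torchrun, exports LOCAL_RANK) and sets MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE for rendezvous:

    import argparse
    import os
    import torch
    import torch.distributed as dist

    parser = argparse.ArgumentParser()
    parser.add_argument("--local-rank", "--local_rank", type=int,
                        default=int(os.environ.get("LOCAL_RANK", 0)))
    args = parser.parse_args()

    dist.init_process_group(backend="nccl")   # rendezvous via the env vars set by the launcher
    torch.cuda.set_device(args.local_rank)    # one GPU per spawned process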
|
/aosp_15_r20/external/pytorch/docs/source/rpc/ |
distributed_autograd.rst
    5: Distributed Autograd Design
    8: This note will present the detailed design for distributed autograd and walk
    10: :ref:`autograd-mechanics` and the :ref:`distributed-rpc-framework` before
    17: nodes. This can be implemented using :mod:`torch.distributed.rpc` as follows:
    22: import torch.distributed.rpc as rpc
    41: The main motivation behind distributed autograd is to enable running a backward
    42: pass on such distributed models with the ``loss`` that we've computed and
    54: For distributed autograd, we need to keep track of all RPCs during the forward
    70: - For :ref:`rref`, whenever we call :meth:`torch.distributed.rpc.RRef.to_here`
    80: Distributed Autograd Context
    [all …]
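A minimal sketch of the backward pass this design note motivates, assuming RPC is already initialized and a peer named "worker1" exists; gradients are accumulated per autograd context rather than in .grad fields:

    import torch
    import torch.distributed.autograd as dist_autograd
    import torch.distributed.rpc as rpc

    with dist_autograd.context() as context_id:
        x = torch.randn(4, requires_grad=True)
        # The forward pass crosses a worker boundary; send/recv autograd functions
        # are recorded inside this distributed autograd context.
        y = rpc.rpc_sync("worker1", torch.relu, args=(x,))
        loss = y.sum()

        dist_autograd.backward(context_id, [loss])          # distributed backward pass
        grads = dist_autograd.get_gradients(context_id)     # {tensor: grad} for this context
        print(grads[x])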
|
rref.rst
    10: :ref:`distributed-rpc-framework` before proceeding.
    17: counting under the hood. Conceptually, it can be considered as a distributed
    19: :meth:`~torch.distributed.rpc.remote`. Each RRef is owned by the callee worker
    20: of the :meth:`~torch.distributed.rpc.remote` call (i.e., owner) and can be used
    24: :meth:`~torch.distributed.rpc.remote` call.
    31: :meth:`~torch.distributed.rpc.rpc_sync`,
    32: :meth:`~torch.distributed.rpc.rpc_async` or
    33: :meth:`~torch.distributed.rpc.remote` invocation, and the owner will be notified
    50: :meth:`~torch.distributed.rpc.rpc_sync`,
    51: :meth:`~torch.distributed.rpc.rpc_async` or
    [all …]
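A short sketch of the owner/user relationship described here, assuming rpc.init_rpc() has already run on workers named "worker0" and "worker1":

    import torch
    import torch.distributed.rpc as rpc

    # The result of torch.add is created and owned by worker1 (the callee);
    # the caller only holds a reference to it.
    rref = rpc.remote("worker1", torch.add, args=(torch.ones(2), torch.ones(2)))

    print(rref.owner())       # WorkerInfo of worker1
    result = rref.to_here()   # copy the owned value back to the caller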
|
/aosp_15_r20/external/pytorch/.ci/pytorch/ |
multigpu-test.sh
    16: time python test/run_test.py --verbose -i distributed/test_c10d_common
    17: time python test/run_test.py --verbose -i distributed/test_c10d_gloo
    18: time python test/run_test.py --verbose -i distributed/test_c10d_nccl
    19: time python test/run_test.py --verbose -i distributed/test_c10d_spawn_gloo
    20: time python test/run_test.py --verbose -i distributed/test_c10d_spawn_nccl
    21: time python test/run_test.py --verbose -i distributed/test_compute_comm_reordering
    22: time python test/run_test.py --verbose -i distributed/test_store
    23: time python test/run_test.py --verbose -i distributed/test_symmetric_memory
    24: time python test/run_test.py --verbose -i distributed/test_pg_wrapper
    25: time python test/run_test.py --verbose -i distributed/rpc/cuda/test_tensorpipe_agent
    [all …]
|
/aosp_15_r20/external/pytorch/ |
build_variables.bzl
    491: "torch/csrc/distributed/c10d/Backend.cpp",
    492: "torch/csrc/distributed/c10d/Backoff.cpp",
    493: "torch/csrc/distributed/c10d/DMAConnectivity.cpp",
    494: "torch/csrc/distributed/c10d/control_collectives/StoreCollectives.cpp",
    495: "torch/csrc/distributed/c10d/FileStore.cpp",
    496: "torch/csrc/distributed/c10d/Functional.cpp",
    497: "torch/csrc/distributed/c10d/GlooDeviceFactory.cpp",
    498: "torch/csrc/distributed/c10d/GroupRegistry.cpp",
    499: "torch/csrc/distributed/c10d/Ops.cpp",
    500: "torch/csrc/distributed/c10d/ParamCommsUtils.cpp",
    [all …]
|
.lintrunner.toml
    72: 'distributed/c10d/*DMAConnectivity.*',
    73: 'distributed/c10d/*SymmetricMemory.*',
    244: 'torch/csrc/distributed/**/*',
    552: 'torch/csrc/distributed/c10d/init.cpp',
    718: 'torch/distributed/run.py',
    872: 'test/distributed/argparse_util_test.py',
    873: 'test/distributed/bin/test_script.py',
    874: 'test/distributed/elastic/agent/server/test/local_elastic_agent_test.py',
    875: 'test/distributed/elastic/multiprocessing/bin/test_script.py',
    876: 'test/distributed/elastic/multiprocessing/bin/zombie_test.py',
    [all …]
|
/aosp_15_r20/external/pytorch/torch/csrc/distributed/rpc/ |
request_callback_impl.cpp
    1: #include <torch/csrc/distributed/rpc/request_callback_impl.h>
    4: #include <torch/csrc/distributed/autograd/context/container.h>
    5: #include <torch/csrc/distributed/autograd/context/context.h>
    6: #include <torch/csrc/distributed/autograd/engine/dist_engine.h>
    7: #include <torch/csrc/distributed/autograd/rpc_messages/cleanup_autograd_context_req.h>
    8: #include <torch/csrc/distributed/autograd/rpc_messages/cleanup_autograd_context_resp.h>
    9: #include <torch/csrc/distributed/autograd/rpc_messages/propagate_gradients_req.h>
    10: #include <torch/csrc/distributed/autograd/rpc_messages/propagate_gradients_resp.h>
    11: #include <torch/csrc/distributed/autograd/rpc_messages/rpc_with_autograd.h>
    12: #include <torch/csrc/distributed/autograd/rpc_messages/rpc_with_profiling_req.h>
    [all …]
|
init.cpp
    3: #include <torch/csrc/distributed/rpc/profiler/remote_profiler_manager.h>
    4: #include <torch/csrc/distributed/rpc/profiler/server_process_global_profiler.h>
    5: #include <torch/csrc/distributed/rpc/py_rref.h>
    6: #include <torch/csrc/distributed/rpc/python_functions.h>
    7: #include <torch/csrc/distributed/rpc/python_rpc_handler.h>
    8: #include <torch/csrc/distributed/rpc/request_callback_impl.h>
    9: #include <torch/csrc/distributed/rpc/rpc_agent.h>
    10: #include <torch/csrc/distributed/rpc/rref_context.h>
    11: #include <torch/csrc/distributed/rpc/tensorpipe_agent.h>
    12: #include <torch/csrc/distributed/rpc/torchscript_functions.h>
    [all …]
|
/aosp_15_r20/external/pytorch/torch/distributed/tensor/ |
README.md
    6: We propose distributed tensor primitives to allow easier distributed computation authoring in SPMD(…
    13: from torch.distributed._tensor import init_device_mesh, Shard, distribute_tensor
    18: # i.e. torch.distributed.init_process_group(backend="nccl", world_size=world_size)
    28: Today there are mainly three ways to scale up distributed training: Data Parallel, Tensor Parallel …
    30: …distributed program just like authoring in a single node/device, without worrying about how to do …
    32: …s one of the basic building blocks for distributed program translations and describes the layout o…
    39: …SPMD programming model and the foundational building block for compiler-based distributed training.
    45: …DistributedTensor API and a module level API to create a `nn.Module` with “distributed” parameters.
    57: from torch.distributed._tensor import DTensor, Shard, Replicate, distribute_tensor, distribute_modu…
    67: # distributed tensor returned will be sharded across the dimension specified in placements
    [all …]
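The README excerpt mentions a module-level API alongside the tensor-level one; a sketch of distribute_module using the same torch.distributed._tensor imports it quotes (the partition function and the 4-rank mesh are illustrative assumptions):

    import torch
    from torch import nn
    from torch.distributed._tensor import Shard, distribute_module, distribute_tensor, init_device_mesh

    mesh = init_device_mesh("cuda", (4,))    # assumed world size of 4

    def shard_params(name, module, device_mesh):
        # Replace every parameter of this submodule with a DTensor sharded on dim 0.
        for param_name, param in module.named_parameters(recurse=False):
            dist_param = nn.Parameter(distribute_tensor(param, device_mesh, [Shard(0)]))
            module.register_parameter(param_name, dist_param)

    sharded_linear = distribute_module(nn.Linear(128, 128), mesh, partition_fn=shard_params)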
|
/aosp_15_r20/external/pytorch/torch/distributed/rpc/ |
api.py
    92: "torch.distributed.rpc.init_rpc first."
    175: >>> # xdoctest: +SKIP("distributed")
    178: >>> import torch.distributed.rpc as rpc
    198: This is similar to torch.distributed.all_gather(), but is using RPC. It
    334: :meth:`~torch.distributed.rpc.rpc_async`, ``future.wait()`` should not
    347: on both workers. Refer to :meth:`~torch.distributed.init_process_group`
    358: >>> import torch.distributed.rpc as rpc
    366: >>> import torch.distributed.rpc as rpc
    423: Get :class:`~torch.distributed.rpc.WorkerInfo` of a given worker name.
    424: Use this :class:`~torch.distributed.rpc.WorkerInfo` to avoid passing an
    [all …]
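The WorkerInfo docstring excerpted above (lines 423-424) suggests resolving a worker name once and reusing the handle; a tiny sketch, assuming RPC is initialized and a peer named "worker1" exists:

    import torch
    import torch.distributed.rpc as rpc

    info = rpc.get_worker_info("worker1")        # WorkerInfo(name, id), resolved once
    out = rpc.rpc_sync(info, torch.add, args=(torch.ones(2), torch.ones(2)))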
|
/aosp_15_r20/prebuilts/vndk/v34/common/NOTICE_FILES/external/freetype/ |
LICENSE
    303: modified, and distributed under the terms of the FreeType project
    314: modified, and distributed under the terms of the FreeType project
    325: modified, and distributed under the terms of the FreeType project
    336: modified, and distributed under the terms of the FreeType project
    347: modified, and distributed under the terms of the FreeType project
    358: modified and distributed under the terms of the FreeType project
    369: modified, and distributed under the terms of the FreeType project
    380: and distributed under the terms of the FreeType project license,
    393: modified, and distributed under the terms of the FreeType project
    405: modified, and distributed under the terms of the FreeType project
    [all …]
|
/aosp_15_r20/external/freetype/ |
LICENSE
    303: modified, and distributed under the terms of the FreeType project
    314: modified, and distributed under the terms of the FreeType project
    325: modified, and distributed under the terms of the FreeType project
    336: modified, and distributed under the terms of the FreeType project
    347: modified, and distributed under the terms of the FreeType project
    358: modified and distributed under the terms of the FreeType project
    369: modified, and distributed under the terms of the FreeType project
    380: and distributed under the terms of the FreeType project license,
    393: modified, and distributed under the terms of the FreeType project
    405: modified, and distributed under the terms of the FreeType project
    [all …]
|
/aosp_15_r20/external/pytorch/torch/csrc/distributed/autograd/ |
utils.cpp
    5: #include <torch/csrc/distributed/autograd/context/container.h>
    6: #include <torch/csrc/distributed/autograd/functions/recvrpc_backward.h>
    7: #include <torch/csrc/distributed/autograd/functions/sendrpc_backward.h>
    8: #include <torch/csrc/distributed/autograd/utils.h>
    9: #include <torch/csrc/distributed/rpc/profiler/remote_profiler_manager.h>
    10: #include <torch/csrc/distributed/rpc/rpc_agent.h>
    11: #include <torch/csrc/distributed/rpc/types.h>
    14: namespace distributed { namespace
    17: using torch::distributed::autograd::AutogradMetadata;
    18: using torch::distributed::autograd::RpcWithAutograd;
    [all …]
|
/aosp_15_r20/libcore/ojluni/src/main/ |
LICENSE
    69: placed by the copyright holder saying it may be distributed under the terms of
    141: code, which must be distributed under the terms of Sections 1 and 2 above
    147: corresponding source code, to be distributed under the terms of Sections 1
    161: distributed need not include anything that is normally distributed (in either
    214: distributed through that system in reliance on consistent application of that
    292: This program is distributed in the hope that it will be useful, but WITHOUT
    336: Certain source files distributed by Oracle America and/or its affiliates are
    373: This code is distributed in the hope that it will be useful, but WITHOUT
    399: This code is distributed in the hope that it will be useful, but WITHOUT
    425: This code is distributed in the hope that it will be useful, but WITHOUT
    [all …]
|