# Using TensorFlow Securely

This document discusses how to safely deal with untrusted programs (models or
model parameters), and input data. Below, we also provide guidelines on how to
report vulnerabilities in TensorFlow.

## TensorFlow models are programs

TensorFlow's runtime system interprets and executes programs. What machine
learning practitioners term
[**models**](https://developers.google.com/machine-learning/glossary/#model) are
expressed as programs that TensorFlow executes. TensorFlow programs are encoded
as computation
[**graphs**](https://developers.google.com/machine-learning/glossary/#graph).
The model's parameters are often stored separately in **checkpoints**.

At runtime, TensorFlow executes the computation graph using the parameters
provided. Note that the behavior of the computation graph may change depending
on the parameters provided. TensorFlow itself is not a sandbox. When executing
the computation graph, TensorFlow may read and write files, send and receive
data over the network, and even spawn additional processes. All these tasks are
performed with the permissions of the TensorFlow process. Allowing for this
flexibility makes for a powerful machine learning platform, but it has security
implications.

The computation graph may also accept **inputs**. Those inputs are the
data you supply to TensorFlow to train a model, or to use a model to run
inference on the data.

**TensorFlow models are programs, and need to be treated as such from a security
perspective.**

## Running untrusted models

As a general rule: **Always** execute untrusted models inside a sandbox (e.g.,
[nsjail](https://github.com/google/nsjail)).

There are several ways in which a model could become untrusted. Obviously, if an
untrusted party supplies TensorFlow kernels, arbitrary code may be executed.
The same is true if the untrusted party provides Python code, such as the
Python code that generates TensorFlow graphs.

Even if the untrusted party only supplies the serialized computation
graph (in the form of a `GraphDef`, `SavedModel`, or equivalent on-disk format),
the set of computation primitives available to TensorFlow is powerful enough
that you should assume that the TensorFlow process effectively executes
arbitrary code. One common solution is to allow only a few safe Ops. While this
is possible in theory, we still recommend you sandbox the execution.
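
One way to picture the allow-list approach is a pre-flight check that rejects any graph containing an op type outside a fixed set. The sketch below is a minimal illustration and not a TensorFlow API: `ALLOWED_OPS` and `find_disallowed_ops` are hypothetical names, the allow-list contents are an assumption, and the nodes are stand-in objects carrying the `name` and `op` attributes that `NodeDef` entries in a `GraphDef` also have. It assumes the caller additionally checks nodes inside any functions the graph defines, and it complements sandboxing rather than replacing it.

```python
from types import SimpleNamespace

# Hypothetical allow-list; a real deployment would derive it from an audit.
ALLOWED_OPS = frozenset({"Placeholder", "Const", "MatMul", "Add", "Relu"})


def find_disallowed_ops(nodes, allowed=ALLOWED_OPS):
    """Return (node name, op type) pairs whose op type is not on the allow-list."""
    return [(node.name, node.op) for node in nodes if node.op not in allowed]


# Stand-in nodes that mimic the `name` and `op` fields of a NodeDef.
example_nodes = [
    SimpleNamespace(name="x", op="Placeholder"),
    SimpleNamespace(name="w", op="Const"),
    SimpleNamespace(name="y", op="MatMul"),
    SimpleNamespace(name="dump", op="WriteFile"),
]

violations = find_disallowed_ops(example_nodes)
if violations:
    # Refuse to load the graph instead of silently dropping the offending nodes.
    print("refusing to load graph:", violations)
```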

Whether a user-provided checkpoint is safe depends on the computation graph.
It is easy to create computation graphs in which malicious
checkpoints can trigger unsafe behavior. For example, consider a graph that
contains a `tf.cond` depending on the value of a `tf.Variable`. One branch of
the `tf.cond` is harmless, but the other is unsafe. Since the `tf.Variable` is
stored in the checkpoint, whoever provides the checkpoint now has the ability to
trigger unsafe behavior, even though the graph is not under their control.
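
The pattern can be shown without TensorFlow. The following toy stand-in is a sketch under stated assumptions: `run_graph`, the `debug_mode` parameter, and the dictionary-based "checkpoint" are all hypothetical, and the unsafe branch only records what it would have done. The code (the "graph") is identical in both calls, yet the value loaded from the checkpoint selects which branch runs.

```python
def run_graph(checkpoint, inputs):
    """Toy 'graph': fixed code whose behavior depends on a stored parameter."""
    side_effects = []
    # Analogous to a tf.cond whose predicate reads a tf.Variable.
    if checkpoint.get("debug_mode", 0.0) > 0.5:
        # Unsafe branch: a real graph might write a file or open a socket here.
        side_effects.append("would write inputs to disk")
    result = sum(w * x for w, x in zip(checkpoint["weights"], inputs))
    return result, side_effects


# Same weights, same code; only the checkpoint-controlled flag differs.
trusted = {"weights": [0.5, 2.0], "debug_mode": 0.0}
malicious = {"weights": [0.5, 2.0], "debug_mode": 1.0}

print(run_graph(trusted, [2.0, 3.0]))
print(run_graph(malicious, [2.0, 3.0]))
```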

In other words, graphs can contain vulnerabilities of their own. To allow users
to provide checkpoints to a model you run on their behalf (e.g., in order to
compare model quality for a fixed model architecture), you must carefully audit
your model, and we recommend you run the TensorFlow process in a sandbox.

## Accepting untrusted Inputs

It is possible to write models that are secure in the sense that they can safely
process untrusted inputs assuming there are no bugs. There are two main reasons
to not rely on this: First, it is easy to write models which must not be exposed
to untrusted inputs, and second, there are bugs in any software system of
sufficient complexity. Letting users control inputs could allow them to trigger
bugs either in TensorFlow or in dependencies.

In general, it is good practice to isolate, in a sandbox, any part of a system
that is exposed to untrusted (e.g., user-provided) inputs.

A useful analogy to how any TensorFlow graph is executed is any interpreted
programming language, such as Python. While it is possible to write secure
Python code which can be exposed to user-supplied inputs (by, e.g., carefully
quoting and sanitizing input strings, size-checking input blobs, etc.), it is
very easy to write Python programs which are insecure. Even secure Python code
could be rendered insecure by a bug in the Python interpreter, or by a bug in a
Python library it uses (e.g.,
[this one](https://www.cvedetails.com/cve/CVE-2017-12852/)).
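
As an illustration of the size-checking and sanitizing mentioned above, the sketch below validates a request before it would be handed to a model. Everything in it is an assumption made for the example: the limits `MAX_BYTES` and `EXPECTED_LENGTH`, the `RejectedInput` exception, and the `validate_request` helper are hypothetical, and a real service would pick checks that match its own input format. Checks like these reduce exposure but do not remove the need for a sandbox.

```python
import math

MAX_BYTES = 1_000_000  # hypothetical upper bound on the raw payload size
EXPECTED_LENGTH = 4    # hypothetical feature-vector length


class RejectedInput(ValueError):
    """Raised when a request fails validation."""


def validate_request(raw, features):
    """Return the features as finite floats, or raise RejectedInput."""
    if len(raw) > MAX_BYTES:
        raise RejectedInput("payload too large")
    if len(features) != EXPECTED_LENGTH:
        raise RejectedInput("unexpected number of features")
    cleaned = []
    for value in features:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise RejectedInput("non-numeric feature")
        try:
            as_float = float(value)
        except OverflowError:
            raise RejectedInput("feature out of range") from None
        if not math.isfinite(as_float):
            raise RejectedInput("non-finite feature")
        cleaned.append(as_float)
    return cleaned


print(validate_request(b"\x00" * 16, [0.1, 0.2, 0.3, 0.4]))
```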

## Running a TensorFlow server

TensorFlow is a platform for distributed computing, and as such there is a
TensorFlow server (`tf.train.Server`). **The TensorFlow server is meant for
internal communication only. It is not built for use in an untrusted network.**

For performance reasons, the default TensorFlow server does not include any
authorization protocol and sends messages unencrypted. It accepts connections
from anywhere, and executes the graphs it is sent without performing any checks.
Therefore, if you run a `tf.train.Server` in your network, anybody with
access to the network can execute what you should consider arbitrary code with
the privileges of the process running the `tf.train.Server`.

When running distributed TensorFlow, you must isolate the network in which the
cluster lives. Cloud providers provide instructions for setting up isolated
networks, which are sometimes branded as "virtual private cloud." Refer to the
instructions for
[GCP](https://cloud.google.com/compute/docs/networks-and-firewalls) and
[AWS](https://aws.amazon.com/vpc/) for details.
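
A configuration sanity check can catch obvious mistakes before a cluster starts. The sketch below is only an illustration under stated assumptions: `non_private_addresses` is a hypothetical helper, the addresses are made up, and the dictionary mirrors the job-name-to-address-list mapping that a cluster specification uses. It flags entries whose host is not a private IP literal. Passing this check does not mean the network is isolated; that still has to be enforced by the network itself.

```python
import ipaddress


def non_private_addresses(cluster):
    """Return 'host:port' entries whose host is not a private IP literal."""
    flagged = []
    for tasks in cluster.values():
        for address in tasks:
            host = address.rsplit(":", 1)[0]
            try:
                if not ipaddress.ip_address(host).is_private:
                    flagged.append(address)
            except ValueError:
                # Hostnames (and bracketed IPv6 literals) cannot be judged
                # here, so flag them for manual review.
                flagged.append(address)
    return flagged


# Made-up cluster layout used only for this example.
cluster = {
    "ps": ["10.0.0.1:2222"],
    "worker": ["10.0.0.2:2222", "8.8.8.8:2222", "worker3.example.com:2222"],
}
print(non_private_addresses(cluster))
```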

Note that `tf.train.Server` is different from the server created by
`tensorflow/serving` (the default binary for which is called `ModelServer`).
By default, `ModelServer` also has no built-in mechanism for authentication.
Connecting it to an untrusted network allows anyone on this network to run the
graphs known to the `ModelServer`. This means that an attacker may run
graphs using untrusted inputs as described above, but they would not be able to
execute arbitrary graphs. It is possible to safely expose a `ModelServer`
directly to an untrusted network, **but only if the graphs it is configured to
use have been carefully audited to be safe**.

Similar to best practices for other servers, we recommend running any
`ModelServer` with appropriate privileges (i.e., using a separate user with
reduced permissions). In the spirit of defense in depth, we recommend
authenticating requests to any TensorFlow server connected to an untrusted
network, as well as sandboxing the server to minimize the adverse effects of
any breach.

## Vulnerabilities in TensorFlow

TensorFlow is a large and complex system. It also depends on a large set of
third-party libraries (e.g., `numpy`, `libjpeg-turbo`, PNG parsers, `protobuf`).
It is possible that TensorFlow or its dependencies may contain vulnerabilities
that would allow triggering unexpected or dangerous behavior with specially
crafted inputs.

### What is a vulnerability?

Given TensorFlow's flexibility, it is possible to specify computation graphs
which exhibit unexpected or unwanted behavior. The fact that TensorFlow models
can perform arbitrary computations means that they may read and write files,
communicate via the network, produce deadlocks and infinite loops, or run out
of memory. It is only when these behaviors are outside the specifications of the
operations involved that such behavior is a vulnerability.

A `FileWriter` writing a file is not unexpected behavior and therefore is not a
vulnerability in TensorFlow. A `MatMul` allowing arbitrary binary code execution
**is** a vulnerability.

This is more subtle from a system perspective. For example, it is easy to cause
a TensorFlow process to try to allocate more memory than available by specifying
a computation graph containing an ill-considered `tf.tile` operation. TensorFlow
should exit cleanly in this case (it would raise an exception in Python, or
return an error `Status` in C++). However, if the surrounding system is not
expecting the possibility, such behavior could be used in a denial of service
attack (or worse). Because TensorFlow behaves correctly, this is not a
vulnerability in TensorFlow (although it would be a vulnerability of this
hypothetical system).
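
One way a surrounding system can anticipate such failures is to run the work in a child process with a time limit and treat a failed or hung child as an ordinary, handled outcome. The sketch below is a generic Python illustration and not a TensorFlow facility: `run_isolated` is a hypothetical helper, and the one-line programs stand in for real model execution. A separate process keeps a crash from taking down the caller, but it is not a sandbox and restricts nothing the child may do.

```python
import subprocess
import sys


def run_isolated(code, timeout_s=10.0):
    """Run Python source in a child process and report failure as data."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The child was killed after exceeding the time limit.
        return {"ok": False, "reason": "timeout", "output": ""}
    if completed.returncode != 0:
        reason = f"exit code {completed.returncode}"
        return {"ok": False, "reason": reason, "output": completed.stdout}
    return {"ok": True, "reason": "", "output": completed.stdout}


# A well-behaved job, then one that fails the way an out-of-memory job might.
print(run_isolated("print(6 * 7)"))
print(run_isolated("raise MemoryError('simulated allocation failure')"))
```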

As a general rule, it is incorrect behavior for TensorFlow to access memory it
does not own, or to terminate in an unclean way. Bugs in TensorFlow that lead to
such behaviors constitute a vulnerability.

One of the most critical parts of any system is input handling. If malicious
input can trigger side effects or incorrect behavior, this is a bug, and likely
a vulnerability.

### Reporting vulnerabilities

Please email reports about any security-related issues you find to
`security@tensorflow.org`. This mail is delivered to a small security team. For
critical problems, you may encrypt your report (see below).

Please use a descriptive subject line for your report email. After the initial
reply to your report, the security team will endeavor to keep you informed of
the progress being made towards a fix and announcement.

In addition, please include the following information along with your report:

*   Your name and affiliation (if any).
*   A description of the technical details of the vulnerabilities. It is very
    important to let us know how we can reproduce your findings.
*   An explanation of who can exploit this vulnerability, and what they gain
    when doing so -- write an attack scenario. This will help us evaluate your
    report quickly, especially if the issue is complex.
*   Whether this vulnerability is public or known to third parties. If it is,
    please provide details.

If you believe that an existing (public) issue is security-related, please send
an email to `security@tensorflow.org`. The email should include the issue ID and
a short description of why it should be handled according to this security
policy.

For each vulnerability, we try to ingress it as soon as possible, given the size
of the team and the number of reports. If the vulnerability is not high impact,
we will delay ingress during the period before a branch cut and the final
release. For these cases, vulnerabilities will always be batched to be fixed at
the same time as a quarterly release.

If a vulnerability is high impact, we will acknowledge reception and issue
patches within an accelerated timeline and not wait for the patch release.

Once an issue is reported, TensorFlow uses the following disclosure process:

* When a report is received, we confirm the issue and determine its severity,
  according to the timeline listed above.
* If we know of specific third-party services or software based on TensorFlow
  that require mitigation before publication, those projects will be notified.
* An advisory is prepared (but not published) which details the problem and
  steps for mitigation.
* The vulnerability is fixed and potential workarounds are identified.
* Wherever possible, the fix is also prepared for the branches corresponding to
  all releases of TensorFlow at most one year old. We will attempt to commit
  these fixes as soon as possible, and as close together as possible.
* Patch releases are published for all fixed released versions, a
  notification is sent to [email protected], and the advisory is published.

Note that we mostly do patch releases for security reasons and each version of
TensorFlow is supported for only one year after the release.

Past security advisories are listed below. We credit reporters for identifying
security issues, although we keep your name confidential if you request it.

#### Encryption key for `security@tensorflow.org`

If your disclosure is extremely sensitive, you may choose to encrypt your
report using the key below. Please only use this for critical security
reports.

```
-----BEGIN PGP PUBLIC KEY BLOCK-----

mQENBFpqdzwBCADTeAHLNEe9Vm77AxhmGP+CdjlY84O6DouOCDSq00zFYdIU/7aI
LjYwhEmDEvLnRCYeFGdIHVtW9YrVktqYE9HXVQC7nULU6U6cvkQbwHCdrjaDaylP
aJUXkNrrxibhx9YYdy465CfusAaZ0aM+T9DpcZg98SmsSml/HAiiY4mbg/yNVdPs
SEp/Ui4zdIBNNs6at2gGZrd4qWhdM0MqGJlehqdeUKRICE/mdedXwsWLM8AfEA0e
OeTVhZ+EtYCypiF4fVl/NsqJ/zhBJpCx/1FBI1Uf/lu2TE4eOS1FgmIqb2j4T+jY
e+4C8kGB405PAC0n50YpOrOs6k7fiQDjYmbNABEBAAG0LVRlbnNvckZsb3cgU2Vj
dXJpdHkgPHNlY3VyaXR5QHRlbnNvcmZsb3cub3JnPokBTgQTAQgAOBYhBEkvXzHm
gOJBnwP4Wxnef3wVoM2yBQJaanc8AhsDBQsJCAcCBhUKCQgLAgQWAgMBAh4BAheA
AAoJEBnef3wVoM2yNlkIAICqetv33MD9W6mPAXH3eon+KJoeHQHYOuwWfYkUF6CC
o+X2dlPqBSqMG3bFuTrrcwjr9w1V8HkNuzzOJvCm1CJVKaxMzPuXhBq5+DeT67+a
T/wK1L2R1bF0gs7Pp40W3np8iAFEh8sgqtxXvLGJLGDZ1Lnfdprg3HciqaVAiTum
HBFwszszZZ1wAnKJs5KVteFN7GSSng3qBcj0E0ql2nPGEqCVh+6RG/TU5C8gEsEf
3DX768M4okmFDKTzLNBm+l08kkBFt+P43rNK8dyC4PXk7yJa93SmS/dlK6DZ16Yw
2FS1StiZSVqygTW59rM5XNwdhKVXy2mf/RtNSr84gSi5AQ0EWmp3PAEIALInfBLR
N6fAUGPFj+K3za3PeD0fWDijlC9f4Ety/icwWPkOBdYVBn0atzI21thPRbfuUxfe
zr76xNNrtRRlbDSAChA1J5T86EflowcQor8dNC6fS+oHFCGeUjfEAm16P6mGTo0p
osdG2XnnTHOOEFbEUeWOwR/zT0QRaGGknoy2pc4doWcJptqJIdTl1K8xyBieik/b
nSoClqQdZJa4XA3H9G+F4NmoZGEguC5GGb2P9NHYAJ3MLHBHywZip8g9oojIwda+
OCLL4UPEZ89cl0EyhXM0nIAmGn3Chdjfu3ebF0SeuToGN8E1goUs3qSE77ZdzIsR
BzZSDFrgmZH+uP0AEQEAAYkBNgQYAQgAIBYhBEkvXzHmgOJBnwP4Wxnef3wVoM2y
BQJaanc8AhsMAAoJEBnef3wVoM2yX4wIALcYZbQhSEzCsTl56UHofze6C3QuFQIH
J4MIKrkTfwiHlCujv7GASGU2Vtis5YEyOoMidUVLlwnebE388MmaJYRm0fhYq6lP
A3vnOCcczy1tbo846bRdv012zdUA+wY+mOITdOoUjAhYulUR0kiA2UdLSfYzbWwy
7Obq96Jb/cPRxk8jKUu2rqC/KDrkFDtAtjdIHh6nbbQhFuaRuWntISZgpIJxd8Bt
Gwi0imUVd9m9wZGuTbDGi6YTNk0GPpX5OMF5hjtM/objzTihSw9UN+65Y/oSQM81
v//Fw6ZeY+HmRDFdirjD7wXtIuER4vqCryIqR6Xe9X8oJXz9L/Jhslc=
=CDME
-----END PGP PUBLIC KEY BLOCK-----
```

### Known Vulnerabilities

For a list of known vulnerabilities and security advisories for TensorFlow,
[click here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/security/README.md).