:mod:`cgi` --- Common Gateway Interface support
===============================================

.. module:: cgi
   :synopsis: Helpers for running Python scripts via the Common Gateway Interface.
   :deprecated:

**Source code:** :source:`Lib/cgi.py`

.. index::
   pair: WWW; server
   pair: CGI; protocol
   pair: HTTP; protocol
   pair: MIME; headers
   single: URL
   single: Common Gateway Interface

.. deprecated-removed:: 3.11 3.13
   The :mod:`cgi` module is deprecated
   (see :pep:`PEP 594 <594#cgi>` for details and alternatives).

   The :class:`FieldStorage` class can typically be replaced with
   :func:`urllib.parse.parse_qsl` for ``GET`` and ``HEAD`` requests,
   and the :mod:`email.message` module or
   `multipart <https://pypi.org/project/multipart/>`_ for ``POST`` and ``PUT``.
   Most :ref:`utility functions <functions-in-cgi-module>` have replacements.

--------------

Support module for Common Gateway Interface (CGI) scripts.

This module defines a number of utilities for use by CGI scripts written in
Python.

The global variable ``maxlen`` can be set to an integer indicating the maximum
size of a POST request. POST requests larger than this size will result in a
:exc:`ValueError` being raised during parsing. The default value of this
variable is ``0``, meaning the request size is unlimited.

.. include:: ../includes/wasm-notavail.rst

Introduction
------------

.. _cgi-intro:

A CGI script is invoked by an HTTP server, usually to process user input
submitted through an HTML ``<FORM>`` or ``<ISINDEX>`` element.

Most often, CGI scripts live in the server's special :file:`cgi-bin` directory.
The HTTP server places all sorts of information about the request (such as the
client's hostname, the requested URL, the query string, and lots of other
goodies) in the script's shell environment, executes the script, and sends the
script's output back to the client.
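As a rough illustration of the request information described above, the
following sketch reads a few variables from a mapping shaped like
:data:`os.environ`.  The variable names ``REQUEST_METHOD``, ``QUERY_STRING``
and ``REMOTE_HOST`` come from the CGI standard; ``describe_request`` is a
hypothetical helper name used only for this example, and a real server may set
more or fewer variables:

.. code-block:: python

   import os


   def describe_request(environ=os.environ):
       """Return a small dict of CGI request metadata (hypothetical helper)."""
       return {
           "method": environ.get("REQUEST_METHOD", "GET"),
           "query_string": environ.get("QUERY_STRING", ""),
           "remote_host": environ.get("REMOTE_HOST", ""),
       }


   # Simulate the environment a server might provide for one request.
   fake_environ = {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Joe+Blow"}
   info = describe_request(fake_environ)
   print(info["method"], info["query_string"])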

The script's input is connected to the client too, and sometimes the form data
is read this way; at other times the form data is passed via the "query string"
part of the URL.  This module is intended to take care of the different cases
and provide a simpler interface to the Python script.  It also provides a number
of utilities that help in debugging scripts, and it supports file uploads from a
form.

The output of a CGI script should consist of two sections, separated by a blank
line.  The first section contains a number of headers, telling the client what
kind of data is following.  Python code to generate a minimal header section
looks like this::

   print("Content-Type: text/html")    # HTML is following
   print()                             # blank line, end of headers

The second section is usually HTML, which allows the client software to display
nicely formatted text with headers, inline images, and so on.  Here's Python
code that prints a simple piece of HTML::

   print("<TITLE>CGI script output</TITLE>")
   print("<H1>This is my first CGI script</H1>")
   print("Hello, world!")

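Putting the two sections together, a complete response might be assembled as in
the following sketch.  The helper name ``build_response`` is hypothetical and is
not part of this module; it only illustrates the header block, the blank
separator line, and the body:

.. code-block:: python

   def build_response(body, content_type="text/html"):
       """Return headers, a blank line, and the body as one string."""
       headers = [f"Content-Type: {content_type}"]
       # The empty string between headers and body yields the blank line.
       return "\n".join(headers + ["", body])


   response = build_response("<H1>This is my first CGI script</H1>")
   print(response)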

.. _using-the-cgi-module:

Using the cgi module
--------------------

Begin by writing ``import cgi``.

When you write a new script, consider adding these lines::

   import cgitb
   cgitb.enable()

This activates a special exception handler that will display detailed reports in
the web browser if any errors occur.  If you'd rather not show the guts of your
program to users of your script, you can have the reports saved to files
instead, with code like this::

   import cgitb
   cgitb.enable(display=0, logdir="/path/to/logdir")

It's very helpful to use this feature during script development. The reports
produced by :mod:`cgitb` provide information that can save you a lot of time in
tracking down bugs.  You can always remove the ``cgitb`` line later when you
have tested your script and are confident that it works correctly.

To get at submitted form data, use the :class:`FieldStorage` class. If the form
contains non-ASCII characters, use the *encoding* keyword parameter set to the
value of the encoding defined for the document. The encoding is usually declared
in the META tag in the HEAD section of the HTML document or in the
:mailheader:`Content-Type` header.  Instantiating :class:`FieldStorage` reads
the form contents from the standard input or the environment (depending on the
value of various environment variables set according to the CGI standard).
Since it may consume standard input, it should be instantiated only once.

The :class:`FieldStorage` instance can be indexed like a Python dictionary.
It allows membership testing with the :keyword:`in` operator, and also supports
the standard dictionary method :meth:`~dict.keys` and the built-in function
:func:`len`.  Form fields containing empty strings are ignored and do not appear
in the dictionary; to keep such values, provide a true value for the optional
*keep_blank_values* keyword parameter when creating the :class:`FieldStorage`
instance.

For instance, the following code (which assumes that the
:mailheader:`Content-Type` header and blank line have already been printed)
checks that the fields ``name`` and ``addr`` are both set to a non-empty
string::

   import cgi

   def main():
       form = cgi.FieldStorage()
       if "name" not in form or "addr" not in form:
           print("<H1>Error</H1>")
           print("Please fill in the name and addr fields.")
           return
       print("<p>name:", form["name"].value)
       print("<p>addr:", form["addr"].value)
       # ...further form processing here...

   main()

Here the fields, accessed through ``form[key]``, are themselves instances of
:class:`FieldStorage` (or :class:`MiniFieldStorage`, depending on the form
encoding). The :attr:`~FieldStorage.value` attribute of the instance yields
the string value of the field.  The :meth:`~FieldStorage.getvalue` method
returns this string value directly; it also accepts an optional second argument
as a default to return if the requested key is not present.

If the submitted form data contains more than one field with the same name, the
object retrieved by ``form[key]`` is not a :class:`FieldStorage` or
:class:`MiniFieldStorage` instance but a list of such instances.  Similarly, in
this situation, ``form.getvalue(key)`` would return a list of strings. If you
expect this possibility (when your HTML form contains multiple fields with the
same name), use the :meth:`~FieldStorage.getlist` method, which always returns
a list of values (so that you do not need to special-case the single item
case).  For example, this code concatenates any number of username fields,
separated by commas::

   value = form.getlist("username")
   usernames = ",".join(value)

If a field represents an uploaded file, accessing the value via the
:attr:`~FieldStorage.value` attribute or the :meth:`~FieldStorage.getvalue`
method reads the entire file into memory as bytes.  This may not be what you
want.  You can test for an uploaded file by testing either the
:attr:`~FieldStorage.filename` attribute or the :attr:`~FieldStorage.file`
attribute.  You can then read the data from the :attr:`!file`
attribute before it is automatically closed as part of the garbage collection of
the :class:`FieldStorage` instance
(the :meth:`~io.RawIOBase.read` and :meth:`~io.IOBase.readline` methods will
return bytes)::

   fileitem = form["userfile"]
   if fileitem.file:
       # It's an uploaded file; count lines
       linecount = 0
       while True:
           line = fileitem.file.readline()
           if not line:
               break
           linecount = linecount + 1

:class:`FieldStorage` objects also support being used in a :keyword:`with`
statement, which will automatically close them when done.

If an error is encountered when obtaining the contents of an uploaded file
(for example, when the user interrupts the form submission by clicking on
a Back or Cancel button) the :attr:`~FieldStorage.done` attribute of the
object for the field will be set to the value ``-1``.

The file upload draft standard entertains the possibility of uploading multiple
files from one field (using a recursive :mimetype:`multipart/\*` encoding).
When this occurs, the item will be a dictionary-like :class:`FieldStorage` item.
This can be determined by testing its :attr:`!type` attribute, which should be
:mimetype:`multipart/form-data` (or perhaps another MIME type matching
:mimetype:`multipart/\*`).  In this case, it can be iterated over recursively
just like the top-level form object.

When a form is submitted in the "old" format (as the query string or as a single
data part of type :mimetype:`application/x-www-form-urlencoded`), the items will
actually be instances of the class :class:`MiniFieldStorage`.  In this case, the
:attr:`!list`, :attr:`!file`, and :attr:`filename` attributes are always ``None``.

A form submitted via POST that also has a query string will contain both
:class:`FieldStorage` and :class:`MiniFieldStorage` items.

.. versionchanged:: 3.4
   The :attr:`~FieldStorage.file` attribute is automatically closed upon the
   garbage collection of the creating :class:`FieldStorage` instance.

.. versionchanged:: 3.5
   Added support for the context management protocol to the
   :class:`FieldStorage` class.


Higher Level Interface
----------------------

The previous section explains how to read CGI form data using the
:class:`FieldStorage` class.  This section describes a higher level interface
which was added to this class to allow one to do it in a more readable and
intuitive way.  The interface doesn't make the techniques described in previous
sections obsolete --- they are still useful to process file uploads efficiently,
for example.

.. XXX: Is this true ?

The interface consists of two simple methods. Using the methods you can process
form data in a generic way, without the need to worry whether only one or more
values were posted under one name.

In the previous section, you learned to write the following code any time you
expected a user to post more than one value under one name::

   item = form.getvalue("item")
   if isinstance(item, list):
       # The user is requesting more than one item.
       pass
   else:
       # The user is requesting only one item.
       pass

This situation is common, for example, when a form contains a group of multiple
checkboxes with the same name::

   <input type="checkbox" name="item" value="1" />
   <input type="checkbox" name="item" value="2" />

In most situations, however, there's only one form control with a particular
name in a form and then you expect and need only one value associated with this
name.  So you write a script containing for example this code::

   user = form.getvalue("user").upper()

The problem with the code is that you should never expect that a client will
provide valid input to your scripts.  For example, if a curious user appends
another ``user=foo`` pair to the query string, then the script would crash,
because in this situation the ``getvalue("user")`` method call returns a list
instead of a string.  Calling the :meth:`~str.upper` method on a list is not valid
(since lists do not have a method of this name) and results in an
:exc:`AttributeError` exception.

Therefore, the appropriate way to read form data values was to always use the
code which checks whether the obtained value is a single value or a list of
values.  That's annoying and leads to less readable scripts.

A more convenient approach is to use the methods :meth:`~FieldStorage.getfirst`
and :meth:`~FieldStorage.getlist` provided by this higher level interface.


.. method:: FieldStorage.getfirst(name, default=None)

   This method always returns only one value associated with form field *name*.
   If more than one value was posted under that name, the method returns only
   the first one.  Please note that the order in which the values are received
   may vary from browser to browser and should not be counted on. [#]_  If no such
   form field or value exists then the method returns the value specified by the
   optional parameter *default*.  This parameter defaults to ``None`` if not
   specified.


.. method:: FieldStorage.getlist(name)

   This method always returns a list of values associated with form field *name*.
   The method returns an empty list if no such form field or value exists for
   *name*.  It returns a list consisting of one item if only one such value exists.

Using these methods you can write nice compact code::

   import cgi
   form = cgi.FieldStorage()
   user = form.getfirst("user", "").upper()    # This way it's safe.
   for item in form.getlist("item"):
       do_something(item)

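Because the :mod:`cgi` module is deprecated, the same two access patterns can be
approximated on top of :func:`urllib.parse.parse_qs`, which always maps each
name to a list of values.  This is a minimal sketch, not this module's
implementation; ``get_first`` and ``get_list`` are hypothetical helper names and
the query string is made up:

.. code-block:: python

   from urllib.parse import parse_qs


   def get_list(fields, name):
       """Always return a list of values, empty if the field is missing."""
       return list(fields.get(name, []))


   def get_first(fields, name, default=None):
       """Return the first value for *name*, or *default* if there is none."""
       values = fields.get(name)
       return values[0] if values else default


   # A query string in which a curious user supplied "user" twice.
   fields = parse_qs("user=joe&user=foo&item=1&item=2")
   user = get_first(fields, "user", "").upper()
   items = get_list(fields, "item")
   print(user, items)

As with :meth:`~FieldStorage.getfirst`, calling ``upper()`` is safe here because
a single string (or the default) is always returned.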

.. _functions-in-cgi-module:

Functions
---------

These are useful if you want more control, or if you want to employ some of the
algorithms implemented in this module in other circumstances.


.. function:: parse(fp=None, environ=os.environ, keep_blank_values=False, strict_parsing=False, separator="&")

   Parse a query in the environment or from a file (the file defaults to
   ``sys.stdin``).  The *keep_blank_values*, *strict_parsing* and *separator* parameters are
   passed to :func:`urllib.parse.parse_qs` unchanged.

   .. deprecated-removed:: 3.11 3.13
      This function, like the rest of the :mod:`cgi` module, is deprecated.
      It can be replaced by calling :func:`urllib.parse.parse_qs` directly
      on the desired query string (except for ``multipart/form-data`` input,
      which can be handled as described for :func:`parse_multipart`).

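   As a rough sketch of the replacement suggested above,
   :func:`urllib.parse.parse_qs` can be called directly on a query string.
   The query string below is made up for illustration:

   .. code-block:: python

      from urllib.parse import parse_qs

      query = "name=Joe+Blow&addr=&item=1&item=2"

      # Blank values are dropped by default ...
      fields = parse_qs(query)
      # ... and kept when keep_blank_values is true.
      fields_with_blanks = parse_qs(query, keep_blank_values=True)

      print(fields)
      print(fields_with_blanks["addr"])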

.. function:: parse_multipart(fp, pdict, encoding="utf-8", errors="replace", separator="&")

   Parse input of type :mimetype:`multipart/form-data` (for file uploads).
   Arguments are *fp* for the input file, *pdict* for a dictionary containing
   other parameters in the :mailheader:`Content-Type` header, and *encoding*,
   the request encoding.

   Returns a dictionary just like :func:`urllib.parse.parse_qs`: keys are the
   field names, each value is a list of values for that field. For non-file
   fields, the value is a list of strings.

   This is easy to use but not much good if you are expecting megabytes to be
   uploaded --- in that case, use the :class:`FieldStorage` class instead,
   which is much more flexible.

   .. versionchanged:: 3.7
      Added the *encoding* and *errors* parameters.  For non-file fields, the
      value is now a list of strings, not bytes.

   .. versionchanged:: 3.10
      Added the *separator* parameter.

   .. deprecated-removed:: 3.11 3.13
      This function, like the rest of the :mod:`cgi` module, is deprecated.
      It can be replaced with the functionality in the :mod:`email` package
      (e.g. :class:`email.message.EmailMessage`/:class:`email.message.Message`)
      which implements the same MIME RFCs, or with the
      `multipart <https://pypi.org/project/multipart/>`__ PyPI project.

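   The :mod:`email`-based replacement mentioned above can be sketched as
   follows.  This is a minimal illustration, not a drop-in equivalent: the
   helper name ``parse_form_data``, the boundary ``XYZ`` and the sample body are
   assumptions made for the example, and every part is decoded as UTF-8 text,
   which would not suit binary file uploads:

   .. code-block:: python

      from email import message_from_bytes, policy


      def parse_form_data(content_type, body):
          """Return {field name: [values]} for a multipart/form-data body."""
          # Prepend the header so the email parser can find the boundary.
          raw = (b"Content-Type: " + content_type.encode("ascii")
                 + b"\r\n\r\n" + body)
          message = message_from_bytes(raw, policy=policy.default)
          fields = {}
          for part in message.iter_parts():
              name = part.get_param("name", header="content-disposition")
              value = part.get_payload(decode=True).decode("utf-8")
              fields.setdefault(name, []).append(value)
          return fields


      body = (
          b"--XYZ\r\n"
          b'Content-Disposition: form-data; name="name"\r\n'
          b"\r\n"
          b"Joe Blow\r\n"
          b"--XYZ\r\n"
          b'Content-Disposition: form-data; name="addr"\r\n'
          b"\r\n"
          b"At Home\r\n"
          b"--XYZ--\r\n"
      )
      fields = parse_form_data("multipart/form-data; boundary=XYZ", body)
      print(fields)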

.. function:: parse_header(string)

   Parse a MIME header (such as :mailheader:`Content-Type`) into a main value and a
   dictionary of parameters.

   .. deprecated-removed:: 3.11 3.13
      This function, like the rest of the :mod:`cgi` module, is deprecated.
      It can be replaced with the functionality in the :mod:`email` package,
      which implements the same MIME RFCs.

      For example, with :class:`email.message.EmailMessage`::

          from email.message import EmailMessage
          msg = EmailMessage()
          msg['content-type'] = 'application/json; charset="utf8"'
          main, params = msg.get_content_type(), msg['content-type'].params


.. function:: test()

   Robust test CGI script, usable as main program. Writes minimal HTTP headers and
   formats all information provided to the script in HTML format.


.. function:: print_environ()

   Format the shell environment in HTML.


.. function:: print_form(form)

   Format a form in HTML.


.. function:: print_directory()

   Format the current directory in HTML.


.. function:: print_environ_usage()

   Print a list of useful (used by CGI) environment variables in HTML.


.. _cgi-security:

Caring about security
---------------------

.. index:: pair: CGI; security

There's one important rule: if you invoke an external program (via
:func:`os.system`, :func:`os.popen` or other functions with similar
functionality), make very sure you don't pass arbitrary strings received from
the client to the shell.  This is a well-known security hole whereby clever
hackers anywhere on the web can exploit a gullible CGI script to invoke
arbitrary shell commands.  Even parts of the URL or field names cannot be
trusted, since the request doesn't have to come from your form!

To be on the safe side, if you must pass a string obtained from a form to a
shell command, you should make sure the string contains only alphanumeric
characters, dashes, underscores, and periods.

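One way to apply this rule is an allowlist check before the value is used.  The
following is a sketch under assumptions: ``is_safe_token`` is a hypothetical
helper name, only ASCII letters and digits are treated as alphanumeric, and the
exact pattern should be adapted to what your script really needs to accept:

.. code-block:: python

   import re

   # ASCII letters, digits, dashes, underscores, and periods only.
   SAFE_TOKEN = re.compile(r"[A-Za-z0-9._-]+")


   def is_safe_token(value):
       """Return True if *value* contains only allowlisted characters."""
       return SAFE_TOKEN.fullmatch(value) is not None


   print(is_safe_token("report-2024.txt"))
   print(is_safe_token("x; rm -rf /"))

Passing the validated value as a separate list element to
:func:`subprocess.run`, rather than interpolating it into a command string,
avoids involving the shell at all.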

Installing your CGI script on a Unix system
-------------------------------------------

Read the documentation for your HTTP server and check with your local system
administrator to find the directory where CGI scripts should be installed;
usually this is in a directory :file:`cgi-bin` in the server tree.

Make sure that your script is readable and executable by "others"; the Unix file
mode should be ``0o755`` octal (use ``chmod 0755 filename``).  Make sure that the
first line of the script contains ``#!`` starting in column 1 followed by the
pathname of the Python interpreter, for instance::

   #!/usr/local/bin/python

Make sure the Python interpreter exists and is executable by "others".

Make sure that any files your script needs to read or write are readable or
writable, respectively, by "others" --- their mode should be ``0o644`` for
readable and ``0o666`` for writable.  This is because, for security reasons, the
HTTP server executes your script as user "nobody", without any special
privileges.  It can only read (write, execute) files that everybody can read
(write, execute).  The current directory at execution time is also different (it
is usually the server's cgi-bin directory) and the set of environment variables
is also different from what you get when you log in.  In particular, don't count
on the shell's search path for executables (:envvar:`PATH`) or the Python module
search path (:envvar:`PYTHONPATH`) to be set to anything interesting.
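The permission bits mentioned above can also be set and inspected from Python.
This sketch uses a temporary file as a stand-in for a real script; on non-Unix
systems :func:`os.chmod` only honours the read-only flag, so the printed mode may
differ there:

.. code-block:: python

   import os
   import stat
   import tempfile

   # A temporary file stands in for a CGI script in this example.
   with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as script:
       script.write(b"#!/usr/local/bin/python\n")

   os.chmod(script.name, 0o755)                       # rwxr-xr-x
   mode = stat.S_IMODE(os.stat(script.name).st_mode)  # permission bits only
   print(oct(mode))
   os.remove(script.name)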

If you need to load modules from a directory which is not on Python's default
module search path, you can change the path in your script, before importing
other modules.  For example::

   import sys
   sys.path.insert(0, "/usr/home/joe/lib/python")
   sys.path.insert(0, "/usr/local/lib/python")

(This way, the directory inserted last will be searched first!)

Instructions for non-Unix systems will vary; check your HTTP server's
documentation (it will usually have a section on CGI scripts).


Testing your CGI script
-----------------------

Unfortunately, a CGI script will generally not run when you try it from the
command line, and a script that works perfectly from the command line may fail
mysteriously when run from the server.  There's one reason why you should still
test your script from the command line: if it contains a syntax error, the
Python interpreter won't execute it at all, and the HTTP server will most likely
send a cryptic error to the client.

Assuming your script has no syntax errors, yet it does not work, you have no
choice but to read the next section.

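The syntax check described in this section can also be done without executing
the script, using the standard :mod:`py_compile` module.  In this sketch the
script source is a deliberately broken, made-up example written to a temporary
file:

.. code-block:: python

   import os
   import py_compile
   import tempfile

   # A made-up script with a syntax error (missing closing parenthesis).
   source = 'print("Content-Type: text/html"\n'

   with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
       f.write(source)

   try:
       py_compile.compile(f.name, doraise=True)
       has_syntax_error = False
   except py_compile.PyCompileError:
       has_syntax_error = True
   finally:
       os.remove(f.name)

   print("syntax error found:", has_syntax_error)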

Debugging CGI scripts
---------------------

.. index:: pair: CGI; debugging

First of all, check for trivial installation errors --- reading the section
above on installing your CGI script carefully can save you a lot of time.  If
you wonder whether you have understood the installation procedure correctly, try
installing a copy of this module file (:file:`cgi.py`) as a CGI script.  When
invoked as a script, the file will dump its environment and the contents of the
form in HTML format. Give it the right mode etc., and send it a request.  If it's
installed in the standard :file:`cgi-bin` directory, it should be possible to
send it a request by entering a URL into your browser of the form:

.. code-block:: none

   http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home

If this gives an error of type 404, the server cannot find the script -- perhaps
you need to install it in a different directory.  If it gives another error,
there's an installation problem that you should fix before trying to go any
further.  If you get a nicely formatted listing of the environment and form
content (in this example, the fields should be listed as "addr" with value "At
Home" and "name" with value "Joe Blow"), the :file:`cgi.py` script has been
installed correctly.  If you follow the same procedure for your own script, you
should now be able to debug it.

The next step could be to call the :mod:`cgi` module's :func:`test` function
from your script: replace its main code with the single statement ::

   cgi.test()

This should produce the same results as those obtained from installing the
:file:`cgi.py` file itself.

When an ordinary Python script raises an unhandled exception (for whatever
reason: a typo in a module name, a file that can't be opened, etc.), the
Python interpreter prints a nice traceback and exits.  While the Python
interpreter will still do this when your CGI script raises an exception, most
likely the traceback will end up in one of the HTTP server's log files, or be
discarded altogether.

Fortunately, once you have managed to get your script to execute *some* code,
you can easily send tracebacks to the web browser using the :mod:`cgitb` module.
If you haven't done so already, just add the lines::

   import cgitb
   cgitb.enable()

to the top of your script.  Then try running it again; when a problem occurs,
you should see a detailed report that will likely make apparent the cause of the
crash.

If you suspect that there may be a problem in importing the :mod:`cgitb` module,
you can use an even more robust approach (which only uses built-in modules)::

   import sys
   sys.stderr = sys.stdout
   print("Content-Type: text/plain")
   print()
   # ...your code here...

This relies on the Python interpreter to print the traceback.  The content type
of the output is set to plain text, which disables all HTML processing.  If your
script works, the raw HTML will be displayed by your client.  If it raises an
exception, most likely after the first two lines have been printed, a traceback
will be displayed. Because no HTML interpretation is going on, the traceback
will be readable.

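If you would rather keep control over the output, a similar effect can be had by
catching the exception yourself and printing the traceback as plain text.  This
is only a sketch: ``run_cgi`` and ``broken_main`` are hypothetical names, and
:mod:`traceback` is a standard library module rather than a built-in one, so the
caveat about import problems applies to it as well:

.. code-block:: python

   import io
   import sys
   import traceback


   def run_cgi(main, out=sys.stdout):
       """Call *main*; on failure, write a plain-text traceback to *out*."""
       try:
           main()
       except Exception:
           print("Content-Type: text/plain", file=out)
           print(file=out)
           traceback.print_exc(file=out)


   def broken_main():
       raise ValueError("something went wrong")


   # Capture the output in memory so the example can be run anywhere.
   buffer = io.StringIO()
   run_cgi(broken_main, out=buffer)
   print(buffer.getvalue())

Note that if ``main`` has already written its own headers before failing, the
extra header lines will simply appear as part of the body.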

Common problems and solutions
-----------------------------

* Most HTTP servers buffer the output from CGI scripts until the script is
  completed.  This means that it is not possible to display a progress report on
  the client's display while the script is running.

* Check the installation instructions above.

* Check the HTTP server's log files.  (``tail -f logfile`` in a separate window
  may be useful!)

* Always check a script for syntax errors first, by doing something like
  ``python script.py``.

* If your script does not have any syntax errors, try adding ``import cgitb;
  cgitb.enable()`` to the top of the script.

* When invoking external programs, make sure they can be found. Usually, this
  means using absolute path names --- :envvar:`PATH` is usually not set to a very
  useful value in a CGI script.

* When reading or writing external files, make sure they can be read or written
  by the userid under which your CGI script will be running: this is typically the
  userid under which the web server is running, or some explicitly specified
  userid for a web server's ``suexec`` feature.

* Don't try to give a CGI script a set-uid mode.  This doesn't work on most
  systems, and is a security liability as well.

.. rubric:: Footnotes

.. [#] Note that some recent versions of the HTML specification do state what
   order the field values should be supplied in, but knowing whether a request
   was received from a conforming browser, or even from a browser at all, is
   tedious and error-prone.
