# Python Gazelle plugin

[Gazelle](https://github.com/bazelbuild/bazel-gazelle)
is a build file generator for Bazel projects. It can create new BUILD.bazel files for a project that follows language conventions, and it can update existing build files to include new sources, dependencies, and options.

Gazelle may be run by Bazel using the gazelle rule, or it may be installed and run as a command line tool.

This directory contains a plugin for
[Gazelle](https://github.com/bazelbuild/bazel-gazelle)
that generates BUILD file content for Python code. When Gazelle is run as a command line tool with this plugin, it embeds a Python interpreter resolved during the plugin build.
The behavior of the plugin differs slightly between interpreter versions, because the Python `stdlib` changes with every minor version release.
Distributors of Gazelle binaries should, therefore, build a Gazelle binary for each OS+CPU architecture+Minor Python version combination they are targeting.

The following instructions are for when you use [bzlmod](https://docs.bazel.build/versions/5.0.0/bzlmod.html).
Please refer to older documentation that includes instructions on how to use Gazelle
without using bzlmod as your dependency manager.

## Example

We have an example of using Gazelle with Python located [here](https://github.com/bazelbuild/rules_python/tree/main/examples/bzlmod).
A fully-working example without using bzlmod is in [`examples/build_file_generation`](../examples/build_file_generation).

The following documentation covers using bzlmod.

## Adding Gazelle to your project

First, you'll need to add Gazelle to your `MODULE.bazel` file.
Get the current version of Gazelle from the releases page: https://github.com/bazelbuild/bazel-gazelle/releases/.

See the installation `MODULE.bazel` snippet on the Releases page:
https://github.com/bazelbuild/rules_python/releases in order to configure rules_python.

You will also need to add a `bazel_dep` for `rules_python_gazelle_plugin`.

Here is a snippet of a `MODULE.bazel` file.

```starlark
# The following stanza defines the dependency rules_python.
bazel_dep(name = "rules_python", version = "0.22.0")

# The following stanza defines the dependency rules_python_gazelle_plugin.
# For typical setups you set the version.
bazel_dep(name = "rules_python_gazelle_plugin", version = "0.22.0")

# The following stanza defines the dependency gazelle.
bazel_dep(name = "gazelle", version = "0.31.0", repo_name = "bazel_gazelle")

# Import the python repositories generated by the given module extension into the scope of the current module.
# Note: `python` refers to the rules_python `python` module extension, which
# is declared elsewhere in MODULE.bazel and is not shown in this snippet.
use_repo(python, "python3_9")
use_repo(python, "python3_9_toolchains")

# Register an already-defined toolchain so that Bazel can use it during toolchain resolution.
register_toolchains(
    "@python3_9_toolchains//:all",
)

# Use the pip extension
pip = use_extension("@rules_python//python:extensions.bzl", "pip")

# Use the extension to call the `pip_repository` rule that invokes `pip`, with `incremental` set.
# Accepts a locked/compiled requirements file and installs the dependencies listed within.
# Those dependencies become available in a generated `requirements.bzl` file.
# You can instead check this `requirements.bzl` file into your repo.
# Because this project has different requirements for windows vs other
# operating systems, we have requirements for each.
pip.parse(
    name = "pip",
    requirements_lock = "//:requirements_lock.txt",
    requirements_windows = "//:requirements_windows.txt",
)

# Imports the pip toolchain generated by the given module extension into the scope of the current module.
use_repo(pip, "pip")
```

Next, we'll fetch metadata about your Python dependencies, so that gazelle can
determine which package a given import statement comes from.
This is provided
by the `modules_mapping` rule. We'll make a target for consuming this
`modules_mapping`, and writing it as a manifest file for Gazelle to read.
This is checked into the repo for speed, as it takes some time to calculate
in a large monorepo.

Gazelle will walk up the filesystem from a Python file to find this metadata,
looking for a file called `gazelle_python.yaml` in an ancestor folder of the Python code.
Create an empty file with this name. It might be next to your `requirements.txt` file.
(You can just use `touch` at this point; the file only needs to exist.)

To keep the metadata updated, put this in your `BUILD.bazel` file next to `gazelle_python.yaml`:

```starlark
load("@pip//:requirements.bzl", "all_whl_requirements")
load("@rules_python_gazelle_plugin//manifest:defs.bzl", "gazelle_python_manifest")
load("@rules_python_gazelle_plugin//modules_mapping:def.bzl", "modules_mapping")

# This rule fetches the metadata for python packages we depend on. That data is
# required for the gazelle_python_manifest rule to update our manifest file.
modules_mapping(
    name = "modules_map",
    wheels = all_whl_requirements,
)

# Gazelle python extension needs a manifest file mapping from
# an import to the installed package that provides it.
# This macro produces two targets:
# - //:gazelle_python_manifest.update can be used with `bazel run`
#   to recalculate the manifest
# - //:gazelle_python_manifest.test is a test target ensuring that
#   the manifest doesn't need to be updated
gazelle_python_manifest(
    name = "gazelle_python_manifest",
    modules_mapping = ":modules_map",
    # This is what we called our `pip_parse` rule, where third-party
    # python libraries are loaded in BUILD files.
    pip_repository_name = "pip",
    # This should point to wherever we declare our python dependencies
    # (the same as what we passed to the modules_mapping rule in WORKSPACE)
    # This argument is optional. If provided, the `.test` target is very
    # fast because it just has to check an integrity field. If not provided,
    # the integrity field is not added to the manifest which can help avoid
    # merge conflicts in large repos.
    requirements = "//:requirements_lock.txt",
)
```

Finally, you create a target that you'll invoke to run the Gazelle tool
with the rules_python extension included. This typically goes in your root
`/BUILD.bazel` file:

```starlark
load("@bazel_gazelle//:def.bzl", "gazelle")

# Our gazelle target points to the python gazelle binary.
# This is the simple case where we only need one language supported.
# If you also had proto, go, or other gazelle-supported languages,
# you would also need a gazelle_binary rule.
# See https://github.com/bazelbuild/bazel-gazelle/blob/master/extend.rst#example
gazelle(
    name = "gazelle",
    gazelle = "@rules_python_gazelle_plugin//python:gazelle_binary",
)
```

That's it. You can now run `bazel run //:gazelle` whenever
you edit Python code, and it should update your `BUILD` files correctly.

## Usage

Gazelle is non-destructive.
It will try to leave your edits to BUILD files alone, only making updates to `py_*` targets.
However, it will remove dependencies that appear to be unused, so it's a
good idea to check in your work before running Gazelle so you can easily
revert any changes it made.

The rules_python extension assumes some conventions about your Python code.
These are noted below, and might require changes to your existing code.

Note that the `gazelle` program has multiple commands. At present, only the `update` command (the default) does anything for Python code.
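
The manifest lookup described in the setup steps (walking up from a Python file until a `gazelle_python.yaml` is found in an ancestor folder) can be pictured with the short Python sketch below. It is only an illustration of the idea, not the plugin's actual implementation; the function `find_manifest` and its `workspace_root` parameter are hypothetical names introduced for this example.

```python
from pathlib import Path
from typing import Optional

# Default manifest file name; it can be overridden with the
# `python_manifest_file_name` directive.
MANIFEST_NAME = "gazelle_python.yaml"


def find_manifest(
    py_file: Path,
    workspace_root: Path,
    name: str = MANIFEST_NAME,
) -> Optional[Path]:
    """Return the nearest manifest in an ancestor folder of py_file.

    Hypothetical helper: searches from the file's own folder up to
    workspace_root (inclusive) and returns None if nothing is found.
    """
    root = workspace_root.resolve()
    current = py_file.resolve().parent
    while True:
        candidate = current / name
        if candidate.is_file():
            return candidate
        # Stop at the workspace root, or at the filesystem root as a safety net.
        if current == root or current == current.parent:
            return None
        current = current.parent
```

Because the search starts at the folder containing the Python file, a manifest placed deeper in the tree takes precedence over one at the workspace root in this sketch.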

### Directives

You can configure the extension using directives, just like for other
languages. These are just comments in the `BUILD.bazel` file which
govern behavior of the extension when processing files under that
folder.

See https://github.com/bazelbuild/bazel-gazelle#directives
for some general directives that may be useful.
In particular, the `resolve` directive is language-specific
and can be used with Python.
Examples of these directives in use can be found in the
/gazelle/testdata folder in the rules_python repo.

Python-specific directives are as follows:

| **Directive** | **Default value** |
|---------------|-------------------|
| `# gazelle:python_extension` | `enabled` |
| Controls whether the Python extension is enabled or not. Sub-packages inherit this value. Can be either "enabled" or "disabled". | |
| [`# gazelle:python_root`](#directive-python_root) | n/a |
| Sets a Bazel package as a Python root. This is used on monorepos with multiple Python projects that don't share the top-level of the workspace as the root. See [Directive: `python_root`](#directive-python_root) below. | |
| `# gazelle:python_manifest_file_name` | `gazelle_python.yaml` |
| Overrides the default manifest file name. | |
| `# gazelle:python_ignore_files` | n/a |
| Controls the files which are ignored from the generated targets. | |
| `# gazelle:python_ignore_dependencies` | n/a |
| Controls the ignored dependencies from the generated targets. | |
| `# gazelle:python_validate_import_statements` | `true` |
| Controls whether the Python import statements should be validated. Can be "true" or "false". | |
| `# gazelle:python_generation_mode` | `package` |
| Controls the target generation mode. Can be "file", "package", or "project". | |
| `# gazelle:python_generation_mode_per_file_include_init` | `false` |
| Controls whether `__init__.py` files are included as srcs in each generated target when target generation mode is "file". Can be "true" or "false". | |
| [`# gazelle:python_generation_mode_per_package_require_test_entry_point`](#directive-python_generation_mode_per_package_require_test_entry_point) | `true` |
| Controls whether a file called `__test__.py` or a target called `__test__` is required to generate one test target per package in package mode. | |
| `# gazelle:python_library_naming_convention` | `$package_name$` |
| Controls the `py_library` naming convention. It interpolates `$package_name$` with the Bazel package name. E.g. if the Bazel package name is `foo`, setting this to `$package_name$_my_lib` would result in a generated target named `foo_my_lib`. | |
| `# gazelle:python_binary_naming_convention` | `$package_name$_bin` |
| Controls the `py_binary` naming convention. Follows the same interpolation rules as `python_library_naming_convention`. | |
| `# gazelle:python_test_naming_convention` | `$package_name$_test` |
| Controls the `py_test` naming convention. Follows the same interpolation rules as `python_library_naming_convention`. | |
| `# gazelle:resolve py ...` | n/a |
| Instructs the plugin what target to add as a dependency to satisfy a given import statement. The syntax is `# gazelle:resolve py import-string label` where `import-string` is the symbol in the python `import` statement, and `label` is the Bazel label that Gazelle should write in `deps`. | |
| [`# gazelle:python_default_visibility labels`](#directive-python_default_visibility) | `//$python_root$:__subpackages__` |
| Instructs gazelle to use these visibility labels on all python targets. `labels` is a comma-separated list of labels (without spaces). | |
| [`# gazelle:python_visibility label`](#directive-python_visibility) | |
| Appends additional visibility labels to each generated target. This directive can be set multiple times. | |
| [`# gazelle:python_test_file_pattern`](#directive-python_test_file_pattern) | `*_test.py,test_*.py` |
| Filenames matching these comma-separated `glob`s will be mapped to `py_test` targets. | |
| `# gazelle:python_label_convention` | `$distribution_name$` |
| Defines the format of the distribution name in labels to third-party deps. Useful for using Gazelle plugin with other rules with different repository conventions (e.g. `rules_pycross`). Full label is always prepended with (pip) repository name, e.g. `@pip//numpy`. | |
| `# gazelle:python_label_normalization` | `snake_case` |
| Controls how distribution names in labels to third-party deps are normalized. Useful for using Gazelle plugin with other rules with different label conventions (e.g. `rules_pycross` uses PEP-503). Can be "snake_case", "none", or "pep503". | |

#### Directive: `python_root`:

Set this directive within the Bazel package that you want to use as the Python root.
For example, if using a `src` dir (as recommended by the [Python Packaging User
Guide][python-packaging-user-guide]), then set this directive in `src/BUILD.bazel`:

```starlark
# ./src/BUILD.bazel
# Tell gazelle that our python root is the same dir as this Bazel package.
# gazelle:python_root
```

Note that the directive does not have any arguments.

Gazelle will then add the necessary `imports` attribute to all targets that it
generates:

```starlark
# in ./src/foo/BUILD.bazel
py_library(
    ...
    imports = [".."],  # Gazelle adds this
    ...
)

# in ./src/foo/bar/BUILD.bazel
py_library(
    ...
    imports = ["../.."],  # Gazelle adds this
    ...
)
```

[python-packaging-user-guide]: https://github.com/pypa/packaging.python.org/blob/4c86169a/source/tutorials/packaging-projects.rst


#### Directive: `python_default_visibility`:

Instructs gazelle to use these visibility labels on all _python_ targets
(typically `py_*`, but can be modified via the `map_kind` directive). The arg
to this directive is a comma-separated list (without spaces) of labels.

For example:

```starlark
# gazelle:python_default_visibility //:__subpackages__,//tests:__subpackages__
```

produces the following visibility attribute:

```starlark
py_library(
    ...,
    visibility = [
        "//:__subpackages__",
        "//tests:__subpackages__",
    ],
    ...,
)
```

You can also inject the `python_root` value by using the exact string
`$python_root$`. All instances of this string will be replaced by the `python_root`
value.

```starlark
# gazelle:python_default_visibility //$python_root$:__pkg__,//foo/$python_root$/tests:__subpackages__

# Assuming the "# gazelle:python_root" directive is set in ./py/src/BUILD.bazel,
# the results will be:
py_library(
    ...,
    visibility = [
        "//foo/py/src/tests:__subpackages__",  # sorted alphabetically
        "//py/src:__pkg__",
    ],
    ...,
)
```

Two special values are also accepted as an argument to the directive:

+ `NONE`: This removes all default visibility. Labels added by the
  `python_visibility` directive are still included.
+ `DEFAULT`: This resets the default visibility.

For example:

```starlark
# gazelle:python_default_visibility NONE

py_library(
    name = "...",
    srcs = [...],
)
```

```starlark
# gazelle:python_default_visibility //foo:bar
# gazelle:python_default_visibility DEFAULT

py_library(
    ...,
    visibility = ["//:__subpackages__"],
    ...,
)
```

These special values can be useful for sub-packages.


#### Directive: `python_visibility`:

Appends additional `visibility` labels to each generated target.

This directive can be set multiple times. The generated `visibility` attribute
will include the default visibility and all labels defined by this directive.
All labels will be ordered alphabetically.

```starlark
# ./BUILD.bazel
# gazelle:python_visibility //tests:__pkg__
# gazelle:python_visibility //bar:baz

py_library(
    ...
    visibility = [
        "//:__subpackages__",  # default visibility
        "//bar:baz",
        "//tests:__pkg__",
    ],
    ...
)
```

Child Bazel packages inherit values from parents:

```starlark
# ./bar/BUILD.bazel
# gazelle:python_visibility //tests:__subpackages__

py_library(
    ...
    visibility = [
        "//:__subpackages__",       # default visibility
        "//bar:baz",                # defined in ../BUILD.bazel
        "//tests:__pkg__",          # defined in ../BUILD.bazel
        "//tests:__subpackages__",  # defined in this ./BUILD.bazel
    ],
    ...
)
```

This directive also supports the `$python_root$` placeholder that
`# gazelle:python_default_visibility` supports.

```starlark
# gazelle:python_visibility //$python_root$/foo:bar

py_library(
    ...
    visibility = ["//this_is_my_python_root/foo:bar"],
    ...
)
```


#### Directive: `python_test_file_pattern`:

This directive adjusts which python files will be mapped to the `py_test` rule.

+ The default is `*_test.py,test_*.py`: both `test_*.py` and `*_test.py` files
  will generate `py_test` targets.
+ This directive must have a value. If no value is given, an error will be raised.
+ It is recommended, though not necessary, to include the `.py` extension in
  the `glob`s: `foo*.py,?at.py`.
+ Like most directives, it applies to the current Bazel package and all subpackages
  until the directive is set again.
+ This directive accepts multiple `glob` patterns, separated by commas without spaces:

```starlark
# gazelle:python_test_file_pattern foo*.py,?at

py_library(
    name = "mylib",
    srcs = ["mylib.py"],
)

py_test(
    name = "foo_bar",
    srcs = ["foo_bar.py"],
)

py_test(
    name = "cat",
    srcs = ["cat.py"],
)

py_test(
    name = "hat",
    srcs = ["hat.py"],
)
```


##### Notes

Resetting to the default value (such as in a subpackage) is manual. Set:

```starlark
# gazelle:python_test_file_pattern *_test.py,test_*.py
```

There currently is no way to tell gazelle that _no_ files in a package should
be mapped to `py_test` targets (see [Issue #1826][issue-1826]). The workaround
is to set this directive to a pattern that will never match a `.py` file, such
as `foo.bar`:

```starlark
# No files in this package should be mapped to py_test targets.
# gazelle:python_test_file_pattern foo.bar

py_library(
    name = "my_test",
    srcs = ["my_test.py"],
)
```

[issue-1826]: https://github.com/bazelbuild/rules_python/issues/1826

#### Directive: `python_generation_mode_per_package_require_test_entry_point`:

When `# gazelle:python_generation_mode package` is set, this directive controls whether a file called `__test__.py` or a target called `__test__` (the entry point) is required to generate one test target per package.
If this is set to true but no entry point is found, Gazelle will fall back to file mode and generate one test target per file. Setting this directive to false forces Gazelle to generate one test target per package even without an entry point. However, this means the `main` attribute of the `py_test` will not be set and the target will not be runnable unless either:

1. there happens to be a file in the `srcs` with the same name as the `py_test` target, or
2. a macro populating the `main` attribute of `py_test` is configured with `gazelle:map_kind` to replace `py_test` when Gazelle is generating Python test targets. For example, a user can provide such a macro to Gazelle:

```starlark
load("@rules_python//python:defs.bzl", _py_test="py_test")
load("@aspect_rules_py//py:defs.bzl", "py_pytest_main")

def py_test(name, main=None, **kwargs):
    deps = kwargs.pop("deps", [])
    if not main:
        py_pytest_main(
            name = "__test__",
            deps = ["@pip_pytest//:pkg"],  # change this to the pytest target in your repo.
        )

        deps.append(":__test__")
        main = ":__test__.py"

    _py_test(
        name = name,
        main = main,
        deps = deps,
        **kwargs,
    )
```

### Annotations

*Annotations* refer to comments found _within Python files_ that configure how
Gazelle acts for that particular file.

Annotations have the form:

```python
# gazelle:annotation_name value
```

and can reside anywhere within a Python file where comments are valid. For example:

```python
import foo
# gazelle:annotation_name value

def bar():  # gazelle:annotation_name value
    pass
```

The annotations are:

| **Annotation** | **Default value** |
|----------------|-------------------|
| [`# gazelle:ignore imports`](#annotation-ignore) | N/A |
| Tells Gazelle to ignore import statements. `imports` is a comma-separated list of imports to ignore. | |
| [`# gazelle:include_dep targets`](#annotation-include_dep) | N/A |
| Tells Gazelle to include a set of dependencies, even if they are not imported in a Python module. `targets` is a comma-separated list of target names to include as dependencies. | |


#### Annotation: `ignore`

This annotation accepts a comma-separated string of values. Values are names of Python
imports that Gazelle should _not_ include in target dependencies.

The annotation can be added multiple times, and all values are combined and
de-duplicated.

For `python_generation_mode = "package"`, the `ignore` annotations
found across all files included in the generated target are removed from `deps`.

Example:

```python
import numpy  # a pypi package

# gazelle:ignore bar.baz.hello,foo
import bar.baz.hello
import foo

# Ignore this import because _reasons_
import baz  # gazelle:ignore baz
```

will cause Gazelle to generate:

```starlark
deps = ["@pypi//numpy"],
```


#### Annotation: `include_dep`

This annotation accepts a comma-separated string of values. Values _must_
be Python targets, but _no validation is done_. If a value is not a Python
target, building will result in an error saying:

```
<target> does not have mandatory providers: 'PyInfo' or 'CcInfo' or 'PyInfo'.
```

Adding non-Python targets to the generated target is a feature request being
tracked in [Issue #1865](https://github.com/bazelbuild/rules_python/issues/1865).

The annotation can be added multiple times, and all values are combined
and de-duplicated.

For `python_generation_mode = "package"`, the `include_dep` annotations
found across all files included in the generated target are included in `deps`.
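
To make "combined and de-duplicated" concrete, here is a rough Python sketch of collecting `include_dep` values from several source files. It is an illustration only and does not reflect how the plugin parses files: the function `collect_include_deps` is a hypothetical name, the regular expression is a simplification that would also match the annotation text inside string literals, and the ordering Gazelle applies to the final `deps` list is not modeled.

```python
import re
from typing import Dict, Iterable, List

# Simplified matcher for "# gazelle:include_dep a,b,c" comments.
_INCLUDE_DEP = re.compile(r"#\s*gazelle:include_dep\s+(\S+)")


def collect_include_deps(sources: Iterable[str]) -> List[str]:
    """Combine include_dep values from all sources, dropping duplicates.

    Hypothetical helper: targets are returned in first-seen order.
    """
    seen: Dict[str, None] = {}
    for source in sources:
        for match in _INCLUDE_DEP.finditer(source):
            # Each annotation carries a comma-separated list of targets.
            for target in match.group(1).split(","):
                if target:
                    seen.setdefault(target, None)
    return list(seen)


if __name__ == "__main__":
    file_a = "# gazelle:include_dep //foo:bar,:hello_world\nimport numpy\n"
    file_b = "x = 1  # gazelle:include_dep //:abc,//foo:bar\n"
    print(collect_include_deps([file_a, file_b]))
```

Passing the contents of every file in a package mirrors the package-mode behavior described above, where annotations from all files in the generated target contribute to one `deps` list.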

Example:

```python
# gazelle:include_dep //foo:bar,:hello_world,//:abc
# gazelle:include_dep //:def,//foo:bar
import numpy  # a pypi package
```

will cause Gazelle to generate:

```starlark
deps = [
    ":hello_world",
    "//:abc",
    "//:def",
    "//foo:bar",
    "@pypi//numpy",
]
```


### Libraries

Python source files are those ending in `.py` but not ending in `_test.py`.

First, we look for the nearest ancestor BUILD file starting from the folder
containing the Python source file.

In package generation mode, if there is no `py_library` in this BUILD file, one
is created using the package name as the target's name. This makes it the
default target in the package. Next, all source files are collected into the
`srcs` of the `py_library`.

In project generation mode, all source files in subdirectories (that don't have
BUILD files) are also collected.

In file generation mode, each file is given its own target.

Finally, the `import` statements in the source files are parsed, and
dependencies are added to the `deps` attribute.

### Unit Tests

A `py_test` target is added to the BUILD file when gazelle encounters
a file named `__test__.py`.
Often, Python unit test files are named with the suffix `_test`.
For example, if we had a folder that is a package named "foo" we could have a Python file named `foo_test.py`
and gazelle would create a `py_test` block for the file.

The following is an example of a `py_test` target that gazelle would add when
it encounters a file named `__test__.py`.

```starlark
py_test(
    name = "build_file_generation_test",
    srcs = ["__test__.py"],
    main = "__test__.py",
    deps = [":build_file_generation"],
)
```

You can control the naming convention for test targets by adding a gazelle directive named
`# gazelle:python_test_naming_convention`.
See the instructions in the section above that
covers directives.

### Binaries

When a `__main__.py` file is encountered, this indicates the entry point
of a Python program. A `py_binary` target will be created, named `[package]_bin`.

When no such entry point exists, Gazelle will look for a line like this at the top level of every module:

```python
if __name__ == "__main__":
```

Gazelle will create a `py_binary` target for every module with such a line, with
the target name the same as the module name.

If `python_generation_mode` is set to `file`, then instead of one `py_binary`
target per module, Gazelle will create one `py_binary` target for each file with
such a line, and the name of the target will match the name of the script.

Note that it's possible for another script to depend on a `py_binary` target and
import from the `py_binary`'s scripts. This can negatively affect
Bazel analysis time and runfiles size compared to depending on a `py_library`
target. The simplest way to avoid these negative effects is to extract library
code into a separate script without a `main` line. Gazelle will then create a
`py_library` target for that library code, and other scripts can depend on that
`py_library` target.

## Developer Notes

Gazelle extensions are written in Go. This gazelle plugin is a hybrid, as it uses Go to execute a
Python interpreter as a subprocess to parse Python source files.
See the gazelle documentation https://github.com/bazelbuild/bazel-gazelle/blob/master/extend.md
for more information on extending Gazelle.

If you add new Go dependencies to the plugin source code, you need to "tidy" the go.mod file.
After changing that file, run `go mod tidy` or `bazel run @go_sdk//:bin/go -- mod tidy`
to update the go.mod and go.sum files.
Then run `bazel run //:gazelle_update_repos` to have gazelle
add the new dependencies to the deps.bzl file. The deps.bzl file is used as defined in our /WORKSPACE
to include the external repos Bazel loads Go dependencies from.

Then after editing Go code, run `bazel run //:gazelle` to generate/update the rules in the
BUILD.bazel files in our repo.