Don't let code smells get merged! Use pre-commit!

Feb 02, 2022

In my last post, I talked about Code Governance and recommended some tools that can help you get started with it.

TLDR? Git hooks for your Python project. Check the gist here.

Today I'll take a quick look at a pre-commit setup for a Python project and show you how it works.

Git supports hooks. Hooks are scripts that Git runs automatically at specific points of actions such as commit, push or merge. If a hook like pre-commit exits with a non-zero status, Git aborts the action. Check the official Git documentation to learn more about hooks.
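To make the idea concrete, here is a minimal sketch of what a hand-written hook could look like before pre-commit enters the picture. It assumes the script is saved as the executable .git/hooks/pre-commit; the helper names (staged_python_files, files_with_syntax_errors) are my own, and for simplicity it reads the files from the working tree rather than the exact staged content.

```python
#!/usr/bin/env python3
"""Sketch of a hand-written .git/hooks/pre-commit script (hypothetical helper names)."""
import ast
import subprocess
import sys


def staged_python_files() -> list[str]:
    """Return the staged (added/copied/modified) .py paths reported by git."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [path for path in out.splitlines() if path.endswith(".py")]


def files_with_syntax_errors(sources: dict[str, str]) -> list[str]:
    """Return the names whose source text does not parse as Python."""
    broken = []
    for name, source in sources.items():
        try:
            ast.parse(source, filename=name)
        except SyntaxError:
            broken.append(name)
    return broken


def main() -> int:
    sources = {}
    for path in staged_python_files():
        with open(path, encoding="utf-8") as handle:
            sources[path] = handle.read()
    broken = files_with_syntax_errors(sources)
    for name in broken:
        print(f"syntax error in {name}", file=sys.stderr)
    # A non-zero exit status makes git abort the commit.
    return 1 if broken else 0


# In a real hook file, finish the script with: sys.exit(main())
```

Every repository would need its own copy of a script like this, which is exactly the copy-and-paste problem that pre-commit solves.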

Pre-commit is a tool that adds a layer of abstraction on top of Git hooks and helps you share hooks between repositories without copying and pasting scripts around.

Like most configuration-automation tools, pre-commit is configured through a YAML file, named .pre-commit-config.yaml. But first you need to install pre-commit on your machine and enable it in your repository; otherwise the hooks that you add to the config file will not run.

pip install --user pre-commit
cd your/repository
pre-commit install

After running those commands, you should be good to go.

The pre-commit docs have a huge list of supported hooks that you can use out of the box. Those are the ones I'll be using in this tutorial.

A pre-commit config file is defined by a list of repos. All the hooks live in Git repositories, so it makes sense to import them from their origin repositories. Each repo is pinned to a rev, which keeps runs reproducible, and running pre-commit autoupdate moves the pins to the latest releases of the hooks. Basically, a hook is defined as follows:

repos:
  - repo: repo_url
    rev: version
    hooks:
      - id: name_of_the_hook

So let's put it into practice. The configuration we'll use is a group of security and validation hooks. They check that your Python code parses as valid code, that you are not committing AWS credentials or private keys, that no code is placed before a docstring, that TOML and YAML files are valid, that no merge-conflict markers are left behind, that you use type annotations instead of type comments, and that you are not silencing linter rules with a blanket noqa that names no specific code. You can check them all in the pre-commit hooks link mentioned earlier.

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.1.0
    hooks:
      - id: check-toml
      - id: check-yaml
      - id: check-docstring-first
      - id: detect-aws-credentials
      - id: detect-private-key
      - id: check-merge-conflict
      - id: check-ast

  - repo: https://github.com/pre-commit/pygrep-hooks
    rev: v1.9.0
    hooks:
      - id: python-check-mock-methods
      - id: python-use-type-annotations
      - id: python-check-blanket-noqa
      - id: python-no-eval
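
The pygrep hooks are regex-based checks that scan your files line by line. As a rough illustration of the idea behind python-check-blanket-noqa, here is a small sketch. The pattern and the function name are my own approximation, not the regex that the real hook ships.

```python
import re

# Illustrative pattern only (an assumption): a "noqa" comment that is not
# followed by a specific code such as ": F401". The real hook has its own regex.
BLANKET_NOQA = re.compile(r"#\s*noqa(?!:\s*[A-Z]+[0-9]+)", re.IGNORECASE)


def blanket_noqa_lines(source: str) -> list[int]:
    """Return the 1-based line numbers that contain a noqa without a code."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if BLANKET_NOQA.search(line)
    ]
```

A line such as `import os  # noqa` would be reported, while `import sys  # noqa: F401` would pass because it names the rule being ignored.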

That configuration should be a good start for you. But as always, if we are good developers we should write tests. I remember as if it were yesterday an acronym called TAFT: Test all the fucking time! And I try my best to do it.

For testing, I use a bit more configuration: I need to add Tox and a Makefile to my project to make everything work smoothly. Tox is an amazing tool that tests your project against several Python versions and isolates each one in its own virtualenv.

Reminder to my future self: if you are using Poetry, don't, don't ever call poetry inside tox. You will lose hours tracking down the issue and Google will not be much help.

In my workflow, I only use tox to make sure that flake8 runs without problems and to keep my Poetry dev dependencies from growing huge because of lint/compliance tooling. So let's go to our tox.ini:

[tox]
envlist = py{36,38,39}
skipsdist = true
skip_missing_interpreters = true

[testenv]
description = Setup environment

[testenv:lint]
# autoflake only prints a diff unless --in-place is given
deps =
    isort
    black
    autoflake
commands =
    autoflake --in-place --recursive .
    isort .
    black .

[testenv:compliance]
description = Run compliance tests
deps =
    flake8
    flake8-bandit
    flake8-bugbear
    flake8-builtins
    flake8-comprehensions
    flake8-string-format
    flake8-black
    flake8-logging-format
    flake8-isort
commands =
    flake8

[flake8]
exclude =
    .venv,
    .git,
    .tox,
    dist,
    doc,
    *lib/python*,
    *egg,
    build,
max-complexity = 10
max-line-length = 88
statistics = True
count = True
format = pylint
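
If the envlist syntax is new to you: tox expands the braces in py{36,38,39} into the environments py36, py38 and py39. The sketch below mimics that expansion for a single entry. The function is hypothetical and much simpler than what tox really does (it does not split a comma-separated envlist, for instance).

```python
import itertools
import re


def expand_envlist(spec: str) -> list[str]:
    """Expand brace groups such as 'py{36,38,39}' into concrete env names."""
    # Split the spec into literal text and {a,b,c} groups.
    parts = re.split(r"(\{[^{}]*\})", spec)
    choices = [
        part[1:-1].split(",") if part.startswith("{") else [part]
        for part in parts
    ]
    # Every combination of one choice per part is one environment name.
    return ["".join(combo) for combo in itertools.product(*choices)]
```

Note that lint and compliance are not part of the envlist, so a plain tox call does not run them; they only run when you ask for them with tox -e lint or tox -e compliance.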

That configuration runs flake8 and most of its plugins against our code and reports whether everything is OK. To run tox with pre-commit, I use a Makefile. "But why use a Makefile if we can call tox directly from pre-commit, Lays?" Because you may want to run some commands without typing the full command line, or (most likely) outside pre-commit.

The Makefile helps you abstract a bunch of commands, so you just run make followed by the target name for the magic to happen. In the example below, you can see that I can use poetry to call tox and pytest, or just call tox directly.

lint:
	tox -e lint

run_tests:
	poetry run pytest

run_compliance:
	poetry run tox -e compliance

setup_pre_commit:
	pre-commit install --hook-type pre-commit --hook-type pre-push --hook-type commit-msg

The next step is to call the make commands from pre-commit. Let's add one more block to our config:

  - repo: local
    hooks:
      - id: lint and format
        name: Lint and format files
        entry: make lint
        language: system
        types: [python]
        # make takes targets, not file names, so don't append the staged files
        pass_filenames: false
      - id: compliance and quality
        name: Run for compliance and quality checks
        entry: make run_compliance
        language: system
        types: [python]
        pass_filenames: false
      - id: tests
        name: Run tests
        entry: make run_tests
        language: system
        types: [python]
        pass_filenames: false

The first thing to notice, compared with the ready-made hooks, is that the repo is local. That means pre-commit just runs the commands on your machine, without importing the hook from a remote repository.

The block above means that if any Python file is modified, the hooks will run in the declared order. If any hook fails, pre-commit doesn't allow you to commit that change until you fix all the errors and try to commit again.
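
Here is a rough sketch of that behaviour, just to make the rule explicit. It is a toy model with hypothetical names, not pre-commit's implementation: hooks are plain functions that return an exit code, and I'm assuming the default setting where the remaining hooks still run after one of them fails.

```python
from typing import Callable

# A hook is modelled as (name, function that receives files and returns an exit code).
Hook = tuple[str, Callable[[list[str]], int]]


def run_hooks(hooks: list[Hook], changed_files: list[str]) -> list[str]:
    """Run the hooks in declared order and return the names of the ones that failed."""
    python_files = [path for path in changed_files if path.endswith(".py")]
    if not python_files:
        # Nothing matches types: [python], so the hooks are skipped.
        return []
    failed = []
    for name, hook in hooks:
        if hook(python_files) != 0:
            failed.append(name)
    return failed


def commit_allowed(hooks: list[Hook], changed_files: list[str]) -> bool:
    """The commit goes through only when no hook failed."""
    return not run_hooks(hooks, changed_files)
```

With three hooks where only the second one fails, run_hooks still calls all three and returns just the name of the failing one, and commit_allowed is False until it is fixed.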

You may want to avoid running the tests on every commit (it's a bit annoying if the test suite is huge), so it's possible to configure the stage at which each hook should run, for example only on push.

Personally, I like the commits of every project I work on to follow the Conventional Commits pattern, so I use the commitizen hook to validate that the commit message respects the rules. Let's add one more hook:

  - repo: https://github.com/commitizen-tools/commitizen
    rev: v2.20.3
    hooks:
      - id: commitizen
        stages: ["commit-msg"]

The defined stage says that this hook will only run at the commit-msg step, when the commit message is checked. If you want pre-commit to run in stages other than the default commit one, you need to install the matching hook types. That's why, in line 10 of the Makefile, I have the setup_pre_commit target to set up the commit-msg and pre-push hooks. This makes commitizen run to validate the commit message, and only allows the push to the remote repository if all the rules are passing.
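
To give an idea of what validating a commit message means, here is a simplified sketch of a Conventional Commits check on the first line of the message. The pattern and the list of types are my own assumption for illustration; commitizen's real rules are configurable and richer than this.

```python
import re

# Simplified header pattern (an assumption, not commitizen's actual regex):
# a type, an optional (scope), an optional "!", then ": " and a description.
HEADER = re.compile(
    r"^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)"
    r"(\([\w\-.]+\))?!?: \S.*$"
)


def is_conventional(message: str) -> bool:
    """Check only the first line of the commit message."""
    first_line = message.splitlines()[0] if message.strip() else ""
    return bool(HEADER.match(first_line))
```

A message like "feat(hooks): add commitizen check" passes, while "updated stuff" is rejected, which is the kind of feedback you get at the commit-msg stage.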

I know that by now you must be thinking that this whole process is annoying and, like I said in the last post, it can add bureaucracy. But I really believe that this kind of configuration, and governance in general, will make our code more trustworthy and easier to maintain.

Check this gist to find all the examples. If you have any comments, please leave them in the gist's comments section!

That's all folks.

Lays’s on the Clouds
