Skip to content

unraw-re-pattern (RUF039)

Fix is sometimes available.

This rule is unstable and in preview. The --preview flag is required for use.

What it does

Reports the following re and regex calls when their first arguments are not raw strings:

  • For regex and re: compile, findall, finditer, fullmatch, match, search, split, sub, subn.
  • regex-specific: splititer, subf, subfn, template.

Why is this bad?

Regular expressions should be written using raw strings to avoid double escaping.

Fix safety

The fix is unsafe if the string/bytes literal contains an escape sequence because the fix alters the runtime value of the literal while retaining the regex semantics.

For example

# Literal is `1\n2`.
re.compile("1\n2")

# Literal is `1\\n2`, but the regex library will interpret `\\n` and will still match a newline
# character as before.
re.compile(r"1\n2")

Fix availability

A fix is not available if either

  • the argument is a string with a (no-op) u prefix (e.g., u"foo") as the prefix is incompatible with the raw prefix r
  • the argument is a string or bytes literal with an escape sequence that has a different meaning in the context of a regular expression such as \b, which is word boundary or backspace in a regex, depending on the context, but always a backspace in string and bytes literals.

Example

re.compile("foo\\bar")

Use instead:

re.compile(r"foo\bar")