Best practices when using regular expressions

If you need to use a regex match, consider following these best practices:

  • Don’t use regular expressions to specify alternatives. For example, if you want to match on a limited number of paths, like http://www.example.com/(choice1|choice2|choice3)/, create separate rules for each option instead:
    • http://www.example.com/choice1/
    • http://www.example.com/choice2/
    • http://www.example.com/choice3/

      In this case, while using the regular expression reduces the number of rules you have, it increases the cost significantly. Remember, evaluating one regular expression is often 100 times more expensive than evaluating the corresponding set of rules rewritten without regular expressions.

  • Review the list of rules for the entire policy version and sort them in the following order of precedence:
    • Protocol (HTTP/HTTPS)
    • Hostname
    • Path
    • Query String
  • If you have to use regular expressions in a rule, include a combination of hostname, path, and query string matches whenever possible to reduce the cost. For example:
    Example: You want to extract the product ID from a query string parameter, and redirect using the ID as a path parameter.
      Match structure:
        1. Query String match: prod_id=*
        2. Regex match: ^https?://host1.example.com/path1(?:.*)[?&]prod_id(?:=([^&]*))?
      Substitution pattern for redirect: https://host2.example.com/products/\1

    Example: You want to capture everything after /path1/* on host1.example.com and re-route to /path2/ on host2.example.com.
      Match structure:
        1. Path match: /path1/*
        2. Regex match: ^https?://host1.example.com/path1/?(.*)?
      Substitution pattern for redirect: https://host2.example.com/path2/\1
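
To see how the capture groups in the two redirect examples above feed the \1 in each substitution pattern, the following minimal sketch replays them with Python's re module. It assumes Python's regex engine behaves like the product's for these particular patterns, which may not hold in every case. The redirect helper, the constant names, and the sample URLs are hypothetical and exist only for illustration. The sketch covers the regex step only, not the query string or path match that would narrow the rule first.

```python
import re
from typing import Optional

# Patterns and substitutions copied from the examples above.
# Note: the dots in the hostnames are unescaped, so each one matches any character.
PROD_PATTERN = r"^https?://host1.example.com/path1(?:.*)[?&]prod_id(?:=([^&]*))?"
PROD_SUB = r"https://host2.example.com/products/\1"

PATH_PATTERN = r"^https?://host1.example.com/path1/?(.*)?"
PATH_SUB = r"https://host2.example.com/path2/\1"


def redirect(pattern: str, substitution: str, url: str) -> Optional[str]:
    """Hypothetical helper: return the redirect target, or None if the URL does not match."""
    match = re.match(pattern, url)
    if match is None:
        return None
    # expand() replaces \1 with the text captured by the first group.
    return match.expand(substitution)


if __name__ == "__main__":
    # Sample URLs are made up for this sketch.
    print(redirect(PROD_PATTERN, PROD_SUB,
                   "https://host1.example.com/path1/item?prod_id=123&color=red"))
    # prints https://host2.example.com/products/123
    print(redirect(PATH_PATTERN, PATH_SUB,
                   "http://host1.example.com/path1/a/b"))
    # prints https://host2.example.com/path2/a/b
```

Testing a pattern against a handful of sample URLs like this before adding it to a rule can help confirm that the capture group holds exactly the text you expect in the redirect.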