Rogue regex trips Cloudflare worldwide

"Very painful for our customers."

Reverse proxy and content delivery network Cloudflare has revealed that a coding fumble was behind yesterday's half-hour global outage, not an attack as some had speculated.

Cloudflare chief technology officer John Graham-Cumming posted a brief initial post-mortem on the outage, saying the company is "incredibly sorry that this incident occurred".

Graham-Cumming said the outage, which started just before midnight Australian time, was caused by a single misconfigured rule within the company's web application firewall (WAF).

The rule contained a regular expression, used to catch and act on certain conditions, that caused processor usage to spike to 100 per cent on Cloudflare's machines worldwide, a problem the company's engineers had not experienced before.
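The post-mortem as reported here does not give the expression itself. As an illustration only, one well-known way a regular expression can consume a whole CPU core is catastrophic backtracking, where nested quantifiers force a backtracking engine to try exponentially many ways of splitting the input before giving up. The Python sketch below uses a hypothetical pattern and input, not Cloudflare's rule, and assumes a backtracking engine such as Python's `re` module.

```python
import re
import time

# Hypothetical pattern, NOT Cloudflare's rule. The nested quantifiers let a
# backtracking engine split a run of "a" characters in exponentially many ways.
PATTERN = re.compile(r"^(a+)+$")


def time_failed_match(n: int) -> float:
    """Return the seconds spent failing to match n 'a's followed by a 'b'."""
    text = "a" * n + "b"  # the trailing "b" guarantees the match fails
    start = time.perf_counter()
    result = PATTERN.match(text)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed


if __name__ == "__main__":
    # Each extra character roughly doubles the work for this pattern.
    for n in (10, 14, 18, 20):
        print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

On input that matches, the same pattern returns almost immediately; the cost only appears on near-miss input, which is one reason such a rule can slip through testing.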

Thanks to the excessive CPU utilisation, customers received 502 Bad Gateway errors when they tried to access sites proxied by Cloudflare.

Cloudflare had deployed new WAF rules to block inline JavaScript used in attacks.

The rules were deployed in a simulated mode, in which issues are identified and logged but no customer traffic is blocked, so that false positives can be measured. Even so, the rogue regex meant traffic dropped 82 per cent worldwide until Cloudflare issued a global kill to remove the WAF rulesets.
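A simulated mode of this kind limits what a rule can do to traffic, but not what it costs to evaluate. The minimal Python sketch below shows the idea under stated assumptions: the `Rule` and `Waf` classes, the `simulate` flag and the `kill_switch` field are all hypothetical names for illustration and do not describe Cloudflare's implementation.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    pattern: re.Pattern
    simulate: bool = True  # hypothetical flag: log matches, never block


@dataclass
class Waf:
    rules: list[Rule]
    kill_switch: bool = False  # hypothetical global kill: skip all rules
    log: list[str] = field(default_factory=list)

    def allow(self, body: str) -> bool:
        """Return True if the request body may pass through."""
        if self.kill_switch:
            return True  # no rule is evaluated at all
        for rule in self.rules:
            # The regex runs here in both modes, so its CPU cost is paid
            # even when the rule is only simulated.
            if rule.pattern.search(body):
                self.log.append(rule.name)
                if not rule.simulate:
                    return False
        return True
```

In this sketch a simulated rule never rejects a request, yet an expensive pattern would still be executed on every request until the kill switch is set.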

Graham-Cumming said incidents like this one are "very painful for our customers" and added that Cloudflare's testing procedures were insufficient in this case.

He promised a review and changes to Cloudflare's testing and deployment process to avoid future incidents of the same kind.
