Watch Your Logs! Stop the Bad Guys.

If you’re running a WordPress site on the Apache web server and haven’t looked at your server logs, it’s time you did. Unless you’ve installed some fairly heavyweight server and network protections, you’ll probably find that a lot of the traffic to your web site comes from hackers looking for vulnerabilities they can exploit. Expect log entries like these:

116.193.76.165 - - [17/Nov/2020:00:55:37 -0500] "GET //wp-login.php HTTP/1.1" 200 6032 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
116.193.76.165 - - [17/Nov/2020:00:55:40 -0500] "POST //wp-login.php HTTP/1.1" 200 6131 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
116.193.76.165 - - [17/Nov/2020:00:55:42 -0500] "POST //xmlrpc.php HTTP/1.1" 403 3523 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
107.180.88.41 - - [17/Nov/2020:00:55:46 -0500] "GET //wp-login.php HTTP/1.1" 200 6032 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
107.180.88.41 - - [17/Nov/2020:00:55:47 -0500] "POST //wp-login.php HTTP/1.1" 200 6131 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
107.180.88.41 - - [17/Nov/2020:00:55:48 -0500] "POST //xmlrpc.php HTTP/1.1" 403 3523 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
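
If scrolling through the raw log feels like too much, a quick tally helps. Here is a minimal sketch, assuming Apache's standard common or combined log format, where the client address is the first field and the requested path is the seventh. The function name login_probe_counts is my own invention for illustration.

```shell
# Tally requests for wp-login.php and xmlrpc.php per client IP.
# Reads Apache access-log lines (common or combined format) on stdin.
# Output: "COUNT IP", busiest address first.
login_probe_counts() {
    awk '$7 ~ /(wp-login|xmlrpc)\.php/ { hits[$1]++ }
         END { for (ip in hits) print hits[ip], ip }' |
        sort -k1,1nr -k2,2
}
```

On Ubuntu the access log usually lives at /var/log/apache2/access.log (check your own virtual host configuration), so a typical call would be: login_probe_counts < /var/log/apache2/access.log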

While the image of a dateless, greasy-haired incel with nothing better to do than break into your hobby site may pop into your head, most such probes are in fact run by automated scripts. “Script kiddies”, possibly with no real computer skills, download script kits from hacker web sites. These bots do the dirty work for them, and these wannabe terrorists believe a successful break-in earns them bragging rights. One script with enough bandwidth could hit thousands of web sites before lunch. Don’t feel “special”. They’re just probing for site vulnerabilities that smarter brains than theirs can exploit.

Securing a website isn’t a monolithic thing. There are too many potential attack vectors for an all-in-one solution, but there are dozens of things you can do, from securing directories writable by the web server user to sanitizing user input to carefully vetting any interactive software you host for anonymous users. I’ll get into some of them later, but what I like best is snuffing these assholes before they get to your content.

Blocking the IP addresses of known offenders with a blacklist isn’t technically challenging. Before I took notice of Ubuntu Linux’s UFW (Uncomplicated Firewall), I did this in Apache itself, using this set of directives in each domain’s configuration:

<Directory /your/physical/web/directory/filepath>
   Options -Indexes
   <RequireAll>
      Require all granted
      Include conf/your_banned_ip_list.txt
   </RequireAll>
</Directory>

An external text file under /etc/apache2/conf contained the list of IP addresses to keep off the site, one directive per line in the form

Require not ip IP_ADDRESS

It worked, sorta, but it had to be implemented in every virtual host. And sometimes it didn’t work at all, for reasons that never appeared in Apache’s error logs. Mainly, running a block file isn’t what Apache was built for. It added unnecessary overhead to every page load and slowed down delivery.
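
If you do stick with the Apache approach, the include file doesn't have to be typed by hand. This is a rough sketch that turns a plain list of addresses into the directive format shown above. The function name make_apache_banlist and the input file name banned.txt are hypothetical, and the comment handling is my own assumption about how you might keep notes in that list.

```shell
# Turn a plain list of addresses (one per line; blank lines and "#" comments
# are allowed) into "Require not ip" directives for the include file.
# Reads the list on stdin, writes the directives to stdout.
make_apache_banlist() {
    awk '{ sub(/#.*/, "") }                  # drop comments, whole-line or trailing
         NF { print "Require not ip " $1 }   # skip lines that are now empty'
}
```

You could pipe the result into place with something like: make_apache_banlist < banned.txt | sudo tee /etc/apache2/conf/your_banned_ip_list.txt. Apache only reads the include at startup or reload, so follow it with a reload.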

This job is more in the wheelhouse of a firewall. Hardware-based firewalls are the fastest and most efficient solution, and you may even have one in your commodity router, although maintaining it means logging into a separate device and the tools available are generally terse and basic. Most will also block traffic to everything downstream of them, which may or may not be what you want. Software firewalls that run on the server itself have long been part of Linux, most familiarly in the form of iptables, which unfortunately has a fairly steep learning curve. To encourage wider use of the software firewall, Ubuntu introduced UFW, which is basically a friendly wrapper around iptables.

Installing and configuring UFW is pretty simple. These instructions are based on Ubuntu 18.04 but should mostly apply to any Linux distribution that offers UFW. The first thing you want to do after you enable it…

sudo ufw enable

is to set up global defaults, which in most cases will be to allow all outbound traffic and deny all inbound traffic. You’ll be creating exceptions to these two rules in a minute.

sudo ufw default deny incoming
sudo ufw default allow outgoing

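You’ll probably want to accept incoming SSH connections. If SSH is how you reach the server, make sure this rule is in place before you end your current session, or you risk locking yourself out.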

sudo ufw allow ssh

UFW recognizes common service names like ssh, so it already knows which port they need. That will open outside SSH connections on the default port 22. If you run SSH on another port, like 2222, you would use this command instead:

sudo ufw allow 2222/tcp

That’s pretty straightforward, right?

Provided that UFW is enabled, you can see your rules with the command below. Leave off “verbose” for a shorter listing.

sudo ufw status verbose

Let’s go ahead and enable your web server for outside connections on the standard ports 80 (http) and 443 (https).

sudo ufw allow 'Apache Full'

You can also enable just port 80 (‘Apache’) or just port 443 (‘Apache Secure’).
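
Here are the steps so far pulled together as a sketch. It differs from the walkthrough in one way, which is my own choice and not part of the original sequence: "enable" is moved to the end, so the SSH rule already exists when the firewall comes up. The function names run_ufw and ufw_baseline and the DRY_RUN switch are made up for this example.

```shell
# Run a ufw command, or just print it when DRY_RUN=1.
# Note: the printed form drops quoting, so 'Apache Full' shows without quotes.
run_ufw() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sudo ufw $*"
    else
        sudo ufw "$@"
    fi
}

# Baseline rules from the walkthrough, with "enable" moved to the end
# (an assumption of this sketch, so the SSH rule is added first).
ufw_baseline() {
    run_ufw default deny incoming
    run_ufw default allow outgoing
    run_ufw allow ssh
    run_ufw allow 'Apache Full'
    run_ufw enable
}
```

Running DRY_RUN=1 ufw_baseline prints the five commands without touching the firewall, which is a cheap way to review them before running ufw_baseline for real.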

To see a list of preconfigured applications:

sudo ufw app list

Does that mean you can add custom configurations? Yes. This is done in the files under /etc/ufw/applications.d. We’re going to add one for blocking selected IP addresses, in either single-address or CIDR format, from your web server. It goes in the existing /etc/ufw/applications.d/apache2-utils.ufw.profile. Add this to the bottom of that file and reload UFW (sudo ufw reload).

[Web User]
title=Web User domain controller for both port 80 and 443
description=This is normally used to block IPs from Apache
ports=80,443/tcp

With this app rule we can start banning the troublemakers.

sudo ufw reject from IP_ADDRESS to any app 'Web User' comment 'Wordpress hacker'

Wait! It didn’t work! There’s one caveat to keep in mind with UFW. It checks its rules in the order they were added, and the first rule that matches a connection wins. So if you’ve already opened up outside connections to your web server with the “Apache Full” app rule above, UFW never reaches the “deny/reject” rules that come after it.

What you want to do is insert that rule ahead of the “Apache Full” allow rule. In this case we want to insert it as the first rule UFW sees:

sudo ufw insert 1 reject from IP_ADDRESS to any app 'Web User' comment 'Wordpress hacker'

Use this command to see your rules with line numbers.

sudo ufw status numbered
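
Once the list of offenders grows, typing one insert command per address gets old. Below is a hedged sketch of a batch version. It assumes the 'Web User' app profile from above is in place, handles IPv4 addresses and CIDR ranges only, and uses a made-up function name (ban_web_ips) and a DRY_RUN switch that are not part of UFW.

```shell
# Print (DRY_RUN=1) or run one "ufw insert 1 reject" per address on stdin.
# One IPv4 address or CIDR range per line; blank lines and "#" comments skipped.
ban_web_ips() {
    while read -r ip rest; do
        case "$ip" in
            ''|'#'*) continue ;;                       # blank line or comment
            *[!0-9./]*)                                # not IPv4/CIDR-looking
                echo "skipping: $ip" >&2; continue ;;
        esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo ufw insert 1 reject from $ip to any app 'Web User' comment 'Wordpress hacker'"
        else
            # </dev/null keeps sudo from swallowing the rest of the list
            sudo ufw insert 1 reject from "$ip" to any app 'Web User' \
                comment 'Wordpress hacker' < /dev/null
        fi
    done
}
```

Try it with DRY_RUN=1 first and read what it prints. Since every rule goes in at position 1, the last address in the list ends up at the top, which makes no difference to the result.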

So you’re probably wondering which IP addresses to ban from your web site. That’s up to you. There are prefab online lists of IP addresses of the usual suspects, but I don’t recommend them because the most dangerous hackers rarely work from their own networks; they operate from victim servers they’ve already compromised. You could be blackholing an innocent party and only mildly inconveniencing a hacker.

What I do is look for patterns of hacker behavior in the logs, such as something repeatedly requesting /wp-login.php. If you see four or five of those in a row, it usually indicates that someone is trying to crack your account credentials (you did remember not to use “admin” as your admin username, right?). Another sign is lots of failed requests within a very short span of time, especially for files you don’t have on your site. You can be reasonably certain it’s a hacker script if you see something like this. No human types this fast.

92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /.git/config HTTP/1.1" 404 279
92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /.env HTTP/1.1" 404 279
92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /sftp-config.json HTTP/1.1" 404 279
92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /.ftpconfig HTTP/1.1" 404 279
92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /.remote-sync.json HTTP/1.1" 404 279
92.63.196.29 - - [07/May/2021:04:18:48 -0400] "GET /.vscode/ftp-sync.json HTTP/1.1" 404 279
92.63.196.29 - - [07/May/2021:04:18:49 -0400] "GET /.vscode/sftp.json HTTP/1.1" 404 279
92.63.196.29 - - [07/May/2021:04:18:49 -0400] "GET /deployment-config.json HTTP/1.1" 404 279
92.63.196.29 - - [07/May/2021:04:18:49 -0400] "GET /ftpsync.settings HTTP/1.1" 404 279
92.63.196.29 - - [07/May/2021:04:18:49 -0400] "GET /wp-includes/wlwmanifest.xml HTTP/1.1" 404 279
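
Bursts like that are easy to pick out mechanically. This is a minimal sketch, assuming the standard Apache log layout where the status code is the ninth field. It reports any address that drew several 404s within the same second. The function name burst_404s and the default threshold of 4 are arbitrary choices of mine, and counting per clock second is a crude measure that will miss slower scans.

```shell
# List client IPs that drew at least MIN 404 responses within one second.
# Reads Apache access-log lines on stdin; MIN is the first argument (default 4).
# Output: "COUNT IP TIMESTAMP" for each offending second, largest count first.
burst_404s() {
    awk -v min="${1:-4}" '
        $9 == 404 {
            stamp = substr($4, 2)        # strip the leading "[" from the date
            seen[$1 " " stamp]++
        }
        END {
            for (key in seen)
                if (seen[key] >= min) print seen[key], key
        }' | sort -k1,1nr -k2
}
```

Fed the ten sample lines above, it reports 92.63.196.29 twice: six 404s at 04:18:48 and four at 04:18:49. Treat the output as a list of candidates to look at, not an automatic ban list.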

Be careful though: legitimate search engine crawlers exhibit similar behavior. You can tell the difference because they typically identify themselves in your logs with names like “googlebot” and “bingbot”, and they don’t choke a server with a tommy gun of requests but space their requests several seconds apart [1]. Hacker scripts have no such manners.

Hackers also have favorite targets on WordPress sites, like /xmlrpc.php. For 95+% of WordPress sites this PHP file should be disabled and only re-enabled if WordPress complains during a software update. It’s a dangerous legacy utility, which is why hackers target it.

I’ll get into more security tips later. In the meantime, I can’t recommend WordPress security plugins like Wordfence and iThemes highly enough. The free versions are pretty awesome. Wordfence clued me in to a hacker who’d “owned” an Apache-writable directory buried deep in my uploads hierarchy and had created a mini-site for spamming.

[1] One newer search engine crawler, Huawei’s AspiegelBot a/k/a PetalBot, is an exception. It slammed my server with so many requests that I had to block it. It’s pissed off a lot of web admins.