Analysing nefarious ssh access attempts

Tags: linux, projects, security


Maintaining my own server has taught me a lot over the past years. Among other things, it has increased my respect for those brave system administrators who have to deal with more than one server, always making sure that nothing bad happens to their contents. In this article, I want to offer you a brief look behind the veil and show you the strange things my poor server has to deal with on a daily basis.

Analysing IP addresses

More precisely, we are looking at potentially nefarious ssh access attempts. I consider a connection attempt to be nefarious when an invalid password is used to access the server. Since all my users employ public keys to access the server most of the time, this is a pretty solid criterion and easy to evaluate. First, let’s gather some evidence. All authentication attempts are logged in /var/log/auth.log (or its archived versions). This file has a very simple structure:

Feb 25 00:07:21 myrddin sshd[22071]: Failed password for root from X.X.X.X port 493 ssh2
Feb 25 00:07:28 myrddin sshd[22075]: Invalid user ftpadmin from X.X.X.X port 567

As you can see, I redacted the IP addresses to protect the, well, probably innocent. Moreover, we observe that the error message is slightly different, depending on whether someone tried to access the server using an existing user, or an invalid one. Some Python magic (see the link at the end of the article) results in an enumeration of all connection requests, sorted by IP addresses. Out of 983892 total log entries (spanning roughly one month), 167911, i.e. roughly 17% of them, concern a potentially nefarious connection attempt. In this calculation, the subsequent messages of sshd complaining about an invalid password or an invalid user have been ignored, so the actual percentage is even higher. This is somewhat disconcerting already.
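The counting step can be sketched in a few lines of Python. This is not the script linked at the end of the article, only a minimal illustration under stated assumptions: the two regular expressions are derived from the two log lines shown above, the follow-up message "Failed password for invalid user ..." is assumed to be the kind of subsequent message that gets skipped, and the IP addresses in the sample come from the reserved documentation ranges rather than from my logs. The names count_attempts_by_ip and share are made up for this example.

```python
import re
from collections import Counter

# Patterns modelled on the two log lines shown above (an assumption; other
# sshd versions may word these messages differently).
# The negative lookahead skips "Failed password for invalid user ...", which
# would otherwise count the same attempt twice.
FAILED_PASSWORD = re.compile(
    r"sshd\[\d+\]: Failed password for (?!invalid user )(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)
INVALID_USER = re.compile(
    r"sshd\[\d+\]: Invalid user (?P<user>\S+) from (?P<ip>\S+)"
)


def count_attempts_by_ip(lines):
    """Count potentially nefarious connection attempts per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_PASSWORD.search(line) or INVALID_USER.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


def share(part, total):
    """Return part/total as a percentage."""
    return 100.0 * part / total


# Illustrative sample; the addresses are documentation placeholders.
SAMPLE = [
    "Feb 25 00:07:21 myrddin sshd[22071]: Failed password for root from 192.0.2.1 port 493 ssh2",
    "Feb 25 00:07:28 myrddin sshd[22075]: Invalid user ftpadmin from 198.51.100.7 port 567",
    "Feb 25 00:07:30 myrddin sshd[22075]: Failed password for invalid user ftpadmin from 198.51.100.7 port 567 ssh2",
    "Feb 25 00:07:31 myrddin sshd[22080]: Accepted publickey for alice from 203.0.113.5 port 1234 ssh2",
]

if __name__ == "__main__":
    for ip, count in count_attempts_by_ip(SAMPLE).most_common():
        print(ip, count)
    # The totals quoted in the text: 167911 out of 983892 entries.
    print(f"{share(167911, 983892):.1f}%")
```

For a real log, the list of sample lines would be replaced by the lines of /var/log/auth.log and its archived versions.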

My surprise was even larger when I checked which IP addresses were responsible for these requests. It turns out that three unique IP addresses are responsible for more than 25% of all invalid password requests. I will not list them here for privacy reasons, but all of them belong to ISPs from China. Two of them are under the auspices of China Telecom, while the third one is registered to China Unicom. While both ISPs list a contact e-mail address for reporting abuse, I doubt that contacting them will prove to be effective. If these access attempts do not stop, I will try writing them a nice letter. In the meantime, why not collect more data? nmap has some nice OS detection capabilities:

sudo nmap -O -Pn X.X.X.X

For the two China Telecom IP addresses, only a single port is open to me, namely 25. Despite the port number, it appears to host an ssh server, to which I probably will not have access. nmap guesses wildly that this might be a FreeBSD 6.2-RELEASE system, but it acknowledges that the OS results might be somewhat off. The China Unicom IP address, by contrast, proves to be more interesting:

Nmap scan report for X.X.X.X
Host is up (0.18s latency).
Not shown: 977 closed ports
PORT     STATE    SERVICE
22/tcp   open     ssh
25/tcp   open     smtp
80/tcp   filtered http
111/tcp  filtered rpcbind
135/tcp  filtered msrpc
139/tcp  filtered netbios-ssn
199/tcp  filtered smux
445/tcp  filtered microsoft-ds
593/tcp  filtered http-rpc-epmap
901/tcp  filtered samba-swat
1025/tcp filtered NFS-or-IIS
1034/tcp filtered zincite-a
1068/tcp filtered instl_bootc
1434/tcp filtered ms-sql-m
3128/tcp filtered squid-http
4444/tcp filtered krb524
5800/tcp filtered vnc-http
5900/tcp filtered vnc
6006/tcp open     X11:6
6129/tcp filtered unknown
6667/tcp filtered irc
6669/tcp filtered irc
8080/tcp filtered http-proxy
Device type: general purpose|WAP|storage-misc|broadband router
Running (JUST GUESSING): Linux 3.X|4.X|2.6.X|2.4.X (95%), Asus embedded (92%), HP embedded (91%)
OS CPE: cpe:/o:linux:linux_kernel:3 cpe:/o:linux:linux_kernel:4 cpe:/o:linux:linux_kernel cpe:/h:asus:rt-ac66u cpe:/h:hp:p2000_g3 cpe:/o:linux:linux_kernel:3.4 cpe:/o:linux:linux_kernel:2.6.22 cpe:/o:linux:linux_kernel:2.4
Aggressive OS guesses: Linux 3.10 - 4.11 (95%), Linux 3.13 (95%), Linux 3.13 or 4.2 (95%), Linux 4.2 (95%), Linux 4.4 (95%), Linux 3.16 (94%), Linux 3.16 - 4.6 (94%), Linux 3.12 (93%), Linux 3.2 - 4.9 (93%), Linux 3.8 - 3.11 (93%)
No exact OS matches for host (test conditions non-ideal).
Network Distance: 16 hops

OS detection performed. Please report any incorrect results at https://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 58.88 seconds

Again, no exact matches there, but the number of open and filtered ports is interesting. The SSH server responds with OpenSSH_6.9p1 Ubuntu-2 pat and apparently permits logins via public key and password. Hence, in theory, I could now play the same game and try out user and password combinations, but I do not want to stoop to that level.

Analysing user names

Let us rather do a more interesting analysis, namely tabulating user name combinations over all failed nefarious requests. The top five valid user name requests are:

  1. root (98.42%)
  2. backup (0.29%)
  3. www-data (0.14%)
  4. ghost (0.11%)
  5. nobody (0.08%)

It is pretty clear that most scripts target root specifically. Of course, none of these users is allowed to log in via ssh on my server anyway, but an attacker does not know that. My main take-away from this list is that automated hacking attempts have become relatively fine-tuned these days. I only experimented with ghost for about a month, so it was interesting to see that some scripts already include this platform.

As for the invalid user name requests, the distribution is quite strange. The top ten requests are:

  1. admin (4.57%)
  2. test (3.81%)
  3. user (3.07%)
  4. ubuntu (2.64%)
  5. ftpuser (2.40%)
  6. postgres (1.19%)
  7. oracle (1.10%)
  8. nagios (1.08%)
  9. git (0.92%)
  10. teamspeak (0.89%)

The distribution is much more uniform, in some sense—some access attempts really go through a lot of interesting two-character user names, maybe because they are targeting a specific market. Woe to those who install one of these services on their server and have it facing the outside world.
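Both rankings above boil down to counting user names and dividing by the total. The following is a minimal sketch of that tabulation, not the code I actually used. It assumes the two message formats from the log excerpt at the beginning of the article, and the names USER_PATTERNS, tabulate_users and top_shares are made up for this example.

```python
import re
from collections import Counter

# One pattern per category, based on the two message types in auth.log.
# "valid" means the account exists but the password was wrong; the lookahead
# excludes the follow-up message for invalid users.
USER_PATTERNS = {
    "valid": re.compile(r"Failed password for (?!invalid user )(\S+) from "),
    "invalid": re.compile(r"Invalid user (\S+) from "),
}


def tabulate_users(lines):
    """Return one Counter of user names per category."""
    tables = {kind: Counter() for kind in USER_PATTERNS}
    for line in lines:
        for kind, pattern in USER_PATTERNS.items():
            match = pattern.search(line)
            if match:
                tables[kind][match.group(1)] += 1
    return tables


def top_shares(counter, n):
    """Return the n most common names with their share in percent."""
    total = sum(counter.values())
    return [(name, 100.0 * count / total) for name, count in counter.most_common(n)]


if __name__ == "__main__":
    import sys

    # Usage: python tabulate_users.py < /var/log/auth.log
    tables = tabulate_users(sys.stdin)
    for kind, counter in tables.items():
        print(kind)
        for rank, (name, percent) in enumerate(top_shares(counter, 10), start=1):
            print(f"  {rank}. {name} ({percent:.2f}%)")
```

Keeping the two categories in separate counters means that each percentage is relative to its own category, which is how the two lists above are meant to be read.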

Analysing countries

Finally, I want to visualize where most of the nefarious connection attempts are coming from. This is a prime occasion to bring out a choropleth map! Numerous services and packages exist for accomplishing this task, but in the end, I went with plotly because their code interfaces nicely with Python. I only had to create a table in which I collected countries, their ISO 3166 codes, and the number of requests. The former two pieces of information can be easily obtained from a GeoIP database. To use this with plotly, I merely had to convert the two-character country code to a three-character one. This resulted in the following map, for which the shading corresponds to the number of failed access attempts.

A world map of all failed ssh access attempts

To make small-scale differences visible, I used logarithmic scaling. This means that there are roughly ten times more failed password attempts occurring from Chinese IP addresses than from U.S. ones, for example. This is pretty sobering to see.
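The preparation of the map data can be sketched as follows. This is a minimal illustration and not my actual plotting script: the code table ALPHA2_TO_ALPHA3 is a hypothetical excerpt (a real version needs every country), the counts in the usage example are invented, and the function names are made up. The plotting function assumes that plotly is installed and uses its Choropleth trace with three-letter country codes.

```python
import math

# Hypothetical excerpt of an ISO 3166-1 alpha-2 to alpha-3 table; a complete
# table (or a GeoIP library that reports alpha-3 codes) is needed in practice.
ALPHA2_TO_ALPHA3 = {"CN": "CHN", "US": "USA", "DE": "DEU", "FR": "FRA", "RU": "RUS"}


def build_map_rows(counts_by_alpha2):
    """Turn {alpha-2 code: count} into (alpha-3 code, count, log10(count)) rows.

    Countries missing from the table and non-positive counts are skipped.
    """
    rows = []
    for alpha2, count in sorted(counts_by_alpha2.items()):
        alpha3 = ALPHA2_TO_ALPHA3.get(alpha2.upper())
        if alpha3 is None or count <= 0:
            continue
        rows.append((alpha3, count, math.log10(count)))
    return rows


def make_figure(rows):
    """Build a choropleth figure shaded by log10 of the attempt count."""
    import plotly.graph_objects as go  # imported lazily; requires plotly

    return go.Figure(
        go.Choropleth(
            locations=[alpha3 for alpha3, _, _ in rows],
            z=[log_count for _, _, log_count in rows],
            locationmode="ISO-3",
            colorbar_title="log10(attempts)",
        )
    )


if __name__ == "__main__":
    # Invented counts, purely to show the effect of the logarithmic scale.
    rows = build_map_rows({"CN": 1000, "US": 100})
    for alpha3, count, log_count in rows:
        print(alpha3, count, log_count)
    make_figure(rows).show()
```

With a base-10 logarithm, a difference of one unit in shading corresponds to a factor of ten in the number of attempts.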

Code & coda

It is certainly a humbling experience to see what a typical server has to deal with every day. While probably none of these access attempts targets me directly, it still feels somewhat odd to be subjected to these things. I will have to think carefully about how to handle this; it is probably time for fail2ban, or for changing the ssh access port.

Alternatively, I could try to be nice and send these users some messages in which I express my desire not to be attacked. If you are one of these persons, please stop doing what you are doing. If you are interested in performing the same analysis for your own server, have a look at Auceps, the collection of scripts I wrote for analysing logs.

Stay safe, until next time!