Overview
This article describes how to set up Domovoi, a Python and Bash repository that was created to enhance the protection offered by Pi-Hole. Domovoi sends text alerts via Twilio whenever Pi-Hole detects that someone on the network has accessed a previously-unseen or seldom-seen domain, and each alert notes whether the request was permitted or blocked. Additionally, it can perform geolocation on the IP addresses returned by the downstream DNS server to decide whether a DNS query should be blocked or permitted based on which country corresponds to the IP address. The repository is named after Domovoi, the butler and bodyguard of the titular character and protagonist of the Artemis Fowl novels by Eoin Colfer, some of my favorite novels.
Background/Problem: Improving DNS Filtering
A few months ago, I started using Pi-Hole on my home network. It was very useful for blocking advertisements on my devices, particularly those that don’t have ad-blockers available. I noticed my streaming services’ quality improve just from blocking requests to several domains that were purely for advertising purposes. However, since Pi-Hole operates on a blacklist/whitelist system for domain names or patterns of domain names, it occasionally blocked websites that were necessary. When this happened, I had to log into the Pi-Hole admin console and iteratively filter by time to see which new sites were blocked only recently, which was time-consuming and led to some false positives that took longer than expected to resolve.
Additionally, in searching through these requests, I found that making a request to one website in a web browser or using an app on a smartphone would often result in requests to multiple publicly registered domains, such as content delivery networks’ domains, and some of these were clearly advertising-centric domains that Pi-Hole was not blocking. This is understandable since new domains are created every day and Pi-Hole can’t assess them all instantly. Still, since I did not consciously send requests to these websites, and some of them were only for advertising purposes, I wanted to be able to identify and block them. Others, however, were necessary for the websites I was interested in to function as intended, so I needed some way of identifying only the newly-seen websites and getting alerts for them so I could look into them (and only them) manually.
Solution: Text Alerts for Previously Unseen or Seldom-Seen Websites
The selected solution was to create a separate process that ran as a cron job on my Raspberry Pi. This job, on its first invocation, logged into the Pi-Hole admin interface, downloaded the last 30 days’ worth of DNS queries (both blocked and allowed) and cached this information (including the time the data was fetched) in two files, one for blocked fully qualified domain names (FQDNs) and one for permitted domains. The reason for tracking permitted domains instead of permitted FQDNs is that there were many more permitted FQDNs than blocked FQDNs, so to avoid sending out more text messages than I could reasonably inspect manually, only new, permitted domains were considered instead of new, permitted FQDNs. This didn’t usually impede the inspections, since the domain was often enough to judge whether a site was trustworthy or not.
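To make the domain-versus-FQDN distinction concrete, here is a minimal Python sketch of reducing an FQDN to its publicly registered domain. It is not taken from Domovoi itself: the function name and the tiny hard-coded suffix set are assumptions for illustration, and a real implementation would more likely consult the full Public Suffix List.

```python
# Hypothetical, deliberately tiny stand-in for the Public Suffix List.
# Only multi-label suffixes need listing; single-label ones (com, gov, ...)
# are handled by the default case below.
MULTI_LABEL_SUFFIXES = {"com.tr", "co.uk"}


def registered_domain(fqdn: str) -> str:
    """Reduce an FQDN to its registered domain (public suffix plus one label)."""
    labels = fqdn.lower().rstrip(".").split(".")
    # If the last two labels form a known multi-label suffix, keep three labels.
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES:
        return ".".join(labels[-3:])
    # Otherwise assume a single-label suffix and keep two labels.
    return ".".join(labels[-2:])
```

With this reduction, many distinct permitted FQDNs under the same site collapse to a single entry, which is what keeps the number of alerts manageable.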
Then, every five minutes, the cron job would spin back up, log into the Pi-Hole admin interface again, request all the DNS queries from the fetch timestamp to the current time, and compare this list with the list of cached DNS queries on disk. If there were any permitted DNS queries whose publicly registered domain names (e.g., tumblr.com or go.com) were not in the list of the last 30 days’ previously permitted queries, then a text would be sent to my phone with the list of previously unseen, publicly registered domain names. Additionally, any blocked queries whose fully qualified domain names (e.g., ladygaga.tumblr.com or abcnews.go.com) had not been seen in the cached blocked domains list would be included in a separate text message. The two files would then be updated with the new fetch time, the newly-seen domains would be added to their respective files, and any domains more than 30 days old would be removed from the files. This comprised half of what Domovoi did, but another challenge quickly arose from the information these messages uncovered.
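The compare-and-prune step can be sketched roughly as follows. This is an illustration rather than Domovoi's actual code: the function name, the in-memory dictionary standing in for the cache files, and the choice to refresh a name's timestamp every time it is seen again are all assumptions.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # window described in the article


def update_cache(cache, recent_names, now):
    """Return (unseen_names, updated_cache).

    cache maps a name (registered domain or FQDN) to the datetime it was
    last seen. recent_names holds the names queried since the last fetch.
    """
    # Drop entries that have aged out of the 30-day window.
    fresh = {name: seen for name, seen in cache.items() if now - seen <= RETENTION}
    # Anything not in the pruned cache counts as previously unseen.
    unseen = sorted(name for name in set(recent_names) if name not in fresh)
    # Assumption: every name seen in this run has its timestamp refreshed.
    for name in recent_names:
        fresh[name] = now
    return unseen, fresh
```

The returned list of unseen names is what would go into the text message, and the function would be called once for permitted registered domains and once for blocked FQDNs. Pruning before comparing means a domain that has not appeared for more than 30 days triggers an alert again, which is one way to read "seldom-seen".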
Problem: Blocking Connections to Untrusted Countries
I had been using Domovoi for several weeks when I got a text saying that Domovoi had recently seen a request to the previously-unseen domain adtarget.com.tr. Upon looking up the domain at duckduckgo.com, I found that adtarget.com.tr was a Turkish domain for an advertising service. This domain had been permitted by Pi-Hole, which was not ideal since Pi-Hole is supposed to block sites like this, but there is a good chance that the domain just hadn’t been added to the default Gravity blacklist yet. The more disturbing realization was that I could have been unintentionally making requests to more nefarious countries than Turkey all this time without even knowing it. I looked online for some resources about IP-based geolocation and blocking with Pi-Hole. All the resources I found said that’s more of a job for a firewall than Pi-Hole. I expected to see that, but given that most modern applications reach out to domains instead of single IP addresses, I wanted to see if I could “tack on” IP-based geolocation and blocking to Pi-Hole anyway.
Solution: DNS Resolver with IP Geolocation
Given this, the solution selected was to take the work from the Windows GeoIP Firewall project and port it over to my current project. However, that work only covered Europe and Asia, and I also wanted to block connections from Communist Cuba. So to save time, I got enough data to do country-level geolocation from IP2Location and built a small Python DNS resolver using this helpful GitHub Gist as a baseline. I then changed my Pi-Hole to point to my custom DNS resolver, which in turn pointed to my Unbound DNS resolver, and tried visiting some blocked sites in Vietnam and Communist China (vietnam.gov.vn and weibo.com to be specific). Both requests were blocked successfully. I then tried visiting whitehouse.gov and www.gov.uk and got through without issue. After a few more days of using this setup without issue, I was satisfied that I had found a solution to my problem.
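The core decision the custom resolver makes can be sketched as a range lookup followed by a country check. This is a minimal sketch under stated assumptions, not the resolver's actual code: the function names are hypothetical, the sample ranges use reserved documentation addresses with made-up country codes, the blocked set is illustrative, and treating an unknown country as permitted is a policy choice made only for this example. Real data would be loaded from the IP2Location dataset instead of a hard-coded list.

```python
import bisect
import ipaddress


def _ip(text):
    """Convert a dotted IPv4 string to its integer value."""
    return int(ipaddress.ip_address(text))


# Hypothetical sample data: (first IP, last IP, ISO country code), sorted by
# first IP. The country codes attached to these documentation ranges are
# invented for illustration.
RANGES = [
    (_ip("192.0.2.0"), _ip("192.0.2.255"), "US"),
    (_ip("198.51.100.0"), _ip("198.51.100.255"), "VN"),
    (_ip("203.0.113.0"), _ip("203.0.113.255"), "CN"),
]
STARTS = [first for first, _, _ in RANGES]

# Illustrative block list; the real list is a matter of personal policy.
BLOCKED_COUNTRIES = {"CN", "CU", "VN"}


def country_for(ip):
    """Return the country code for an IP, or None if no range covers it."""
    value = _ip(ip)
    # Find the last range whose first address is <= value.
    index = bisect.bisect_right(STARTS, value) - 1
    if index >= 0 and value <= RANGES[index][1]:
        return RANGES[index][2]
    return None


def should_block(answer_ips):
    """Block the query if any returned address maps to a blocked country."""
    return any(country_for(ip) in BLOCKED_COUNTRIES for ip in answer_ips)
```

In this sketch, the resolver would forward the query downstream, pass the IP addresses in the answer to should_block, and return the answer to Pi-Hole only when the result is False.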
Caveats and Future Work
It would be helpful to integrate Twilio alerts with the GeoIP blocking logic. However, Twisted (the Python library the GeoIP blocker uses) appears to have issues making DNS requests on the same thread that processes a given DNS request, so this will take more work to implement. Additionally, I want to create a third function for blocking malicious websites based on their ratings from virustotal.com or other websites, as Pi-Hole’s Gravity blacklist is geared more toward advertising sites than toward hosts for malware.