How I misused a German server to collect and publish the only available ENF data for the European grid since 2019.
Every audio sample contains noise. Most of it is the kind we can actually hear. But every recording made in the real world also picks up another, incredibly useful noise at around 50 or 60 Hz: the hum of the power grid. Grids in Europe run at a nominal 50 Hz, while those in North America run at about 60 Hz. The actual frequency drifts slightly every second, in no particular pattern, as generation is balanced against demand. Since every grid's frequency wanders in its own unique way, we can, in theory, match any audio sample to an exact time (and general region) of recording. We can even verify the authenticity of any audio sample! This approach, electrical network frequency (ENF) analysis, is currently (allegedly) used by only a few government-level actors. Open-source intelligence tooling developers at Bellingcat are working on a public implementation of the technique. However, the available ENF datasets, which are crucial for the approach to work, are very limited and out of date.
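To make the idea concrete, here's a rough sketch of how the hum can be pulled out of a recording and turned into a per-second frequency trace that could later be compared against a reference dataset (my own illustration, not Bellingcat's implementation; extract_enf and its parameters are made up for this example):

import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt

def extract_enf(path, nominal=50.0, band=1.0):
    rate, audio = wavfile.read(path)
    if audio.ndim > 1:
        audio = audio.mean(axis=1)  # mix stereo down to mono
    # isolate the mains hum with a narrow band-pass around the nominal grid frequency
    sos = butter(4, [nominal - band, nominal + band], btype="bandpass", fs=rate, output="sos")
    hum = sosfiltfilt(sos, audio)
    trace = []
    for start in range(0, len(hum) - rate, rate):  # one-second windows
        window = hum[start:start + rate] * np.hanning(rate)
        # zero-pad the FFT so the spectral peak can be located to roughly 0.02 Hz
        spectrum = np.abs(np.fft.rfft(window, n=rate * 64))
        freqs = np.fft.rfftfreq(rate * 64, d=1 / rate)
        in_band = (freqs > nominal - band) & (freqs < nominal + band)
        trace.append(freqs[in_band][np.argmax(spectrum[in_band])])  # ~grid frequency that second
    return trace  # e.g. [49.98, 50.01, ...]

Matching then boils down to sliding a trace like this along a reference recording of the grid and looking for the best correlation, which is exactly why an up-to-date reference dataset matters.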
This is where a high schooler with too much free time can help. At mainsfrequency.com, there is a widget that shows the current ENF of the European grid:
If this website has this data updating constantly, surely I can get it too, right? Needless to say, an up-to-date European ENF dataset would have quite an impact on the usefulness of ENF analysis. Checking the network tab in dev tools, we see second-by-second requests to a server at netzfrequenzmessung.de:
curl 'https://netzfrequenzmessung.de:9081/frequenz02c.xml?c=1279246' -H 'User-Agent: {redacted}' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' -H 'Content-Type: text/plain'
What's this c query parameter? It was pretty simple to find in the source code:
function AjaxAufruf() {
    if (req) {
        if ("withCredentials" in req) { // nur wenn Browser CORS unterstützt gibt es Credentials (credentials only if the browser supports CORS)
            req.open("GET", url0 + "?c=" + Math.round(Math.random()*100000)*31, true); // verschiedene Namen da IE sonst cacht und nix erneuert (different names, otherwise IE caches and never refreshes)
            req.onreadystatechange = CallbackFkt;
            req.setRequestHeader('Content-Type', "text/plain");
            try { req.send(null); } catch (e) { console.log('Fehler: ' + e); }
        }
    }
}
Code as found originally, with modified indentation and English translations of the German comments added.
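For non-JavaScript readers, the page's c generation translates to Python roughly as follows (my translation, not code from the site; page_style_c is a name I made up):

import random

def page_style_c():
    # what the widget itself sends: a random positive multiple of 31
    return round(random.random() * 100000) * 31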
Now, I don't speak German, but I do know how to use Google Translate! It seems that c is used to work around IE request caching. After playing with it for a little, I found that c is actually also used to verify the "authenticity" of the request: reusing the same number too often returns a 429 Too Many Requests (but it seems to reset every so often). From here, I started trying random c-values. Only positive integers are generated by the website's code, so I tried negative numbers (random multiples of -31) and spoofed Forwarded headers to avoid rate limits. In Python, the code is basically:
import requests

def get_enf_data():
    # get_c(), get_ip(), and getUA() are helpers: a random negative multiple of 31,
    # a spoofed client IP, and a rotating browser user agent.
    url = "https://netzfrequenzmessung.de:9081/frequenz02c.xml?c=" + str(get_c())
    ip = get_ip()
    headers = {
        "Accept": "*/*",
        "Accept-Language": "en-US,en;q=0.5",
        "Connection": "keep-alive",
        "Host": "www.mainsfrequency.com",
        "Referer": "https://www.mainsfrequency.com/",  # trust me bro
        "Sec-Fetch-Dest": "empty",
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Site": "same-origin",
        "User-Agent": getUA(),
        "Forwarded": "for=" + ip,
        "X-Forwarded-For": ip,
    }
    response = requests.get(url, headers=headers)
    # parse XML, write to CSV...
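The helper functions aren't shown in the snippet; here's a rough guess at what they could look like (hypothetical sketches, not the actual code from europe.py):

import random

def get_c():
    # a random *negative* multiple of 31, since the site itself only generates positive ones
    return -round(random.random() * 100000) * 31

def get_ip():
    # a random, public-looking IPv4 address for the Forwarded / X-Forwarded-For headers
    return ".".join(str(random.randint(1, 254)) for _ in range(4))

def getUA():
    # rotate through a small pool of common browser user-agent strings
    return random.choice([
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:130.0) Gecko/20100101 Firefox/130.0",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36",
    ])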
Running this function about once a second on Hack Club's Nest (a free service for high schoolers) gives fewer than ten seconds of data loss per day!
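The loop itself is nothing fancy. Something like this keeps the requests aligned to whole seconds (a sketch; the script actually running on Nest may differ):

import time

while True:
    try:
        get_enf_data()
    except Exception as e:
        print("request failed:", e)  # a missed second just shows up as a gap in the CSV
    time.sleep(1 - (time.time() % 1))  # sleep until the top of the next second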
The following graph looks very "this is the ENF signature of a power grid over a timeframe of one hour, plotted with matplotlib"-y:
* — For rate limits that last less than three seconds, the plan was to average over the values that were skipped, but I found that averaging a gap of more than one second could lead to unpredictable results, so only single-second gaps are filled. The fewer-than-ten-seconds-of-data-loss figure includes both filled and unfilled gaps.
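The gap filling can be as simple as this (a sketch of the averaging described above, assuming a list that uses None for the seconds that were missed):

def fill_single_gaps(values):
    # average a missing second from its neighbours, but only when the gap is exactly one second wide
    filled = list(values)
    for i in range(1, len(filled) - 1):
        if filled[i] is None and filled[i - 1] is not None and filled[i + 1] is not None:
            filled[i] = (filled[i - 1] + filled[i + 1]) / 2
    return filled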
Rendering a full day (it was an accident...) took a pretty long time, but looks very believable:
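For reference, a plot like these takes only a few lines of matplotlib (the filename and column names here are my assumptions, not necessarily the repo's exact CSV schema):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("2024-09-24.csv", names=["timestamp", "frequency_hz"])
plt.plot(pd.to_datetime(df["timestamp"]), df["frequency_hz"])
plt.axhline(50.0, linestyle="--")  # nominal European grid frequency
plt.ylabel("Frequency (Hz)")
plt.show()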
Cool! We now have day-by-day CSV data of the European grid's ENF signature. Every month, I also run a script to compile all of the CSVs into a single Parquet file to share with the world (but mostly the people at Bellingcat).
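The monthly compile step is basically just concatenate-and-write; something along these lines works (a sketch with an assumed file layout; to_parquet needs pyarrow or fastparquet installed):

import glob
import pandas as pd

frames = [pd.read_csv(path) for path in sorted(glob.glob("data/2024-09-*.csv"))]
pd.concat(frames, ignore_index=True).to_parquet("enf-2024-09.parquet", index=False)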
tl;dr: open-source is great and breaking things is fun.
p.s. the data I collected is on my GitHub. If you want to run the code, use the Sep 24, 2024 version of europe.py or the most up-to-date version of swissgrid.py.
Update 9/20/2024: so they found out that I was abusing their servers and banned me (and also changed the data format...). I reached out to ask if I could just buy the data, but the price was steep ($500) and the data itself was outdated (most of what they offered to sell was already online). Seeing as it's not a large company that can afford to keep servers up 24/7, I stopped the script... but if you need German ENF data and don't really care about ethics, the code is easy enough to run and can be found in the repo.
Update 10/13/2024: I was bored at school and modified the code to basically do the same thing but from swissgrid.ch; I got rate-limited in 2 minutes. If you're serious about running this code, use Tor or a proxy service to change your IP frequently.
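If you go the Tor route, pointing requests at a local SOCKS proxy is enough (a sketch; it needs the requests[socks] extra installed and a Tor daemon listening on port 9050):

import requests

# socks5h (with the trailing h) so DNS resolution also happens over Tor
proxies = {
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}
# then, inside get_enf_data():
# response = requests.get(url, headers=headers, proxies=proxies, timeout=10)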