BSc thesis
Behavioural Anomaly Detection in Network Traffic via Suricata IDS and ELK Stack: A Sandbox-Based Approach to Identifying APT Lateral Movement Patterns
Sofia Dubchak, Igor Sikorsky Kyiv Polytechnic Institute, Institute of Physics and Technology (ФТІ), Department of Mathematical Methods of Information Protection. Defended June 2025. Grade: 92/100 (A).
Abstract
Modern advanced persistent threat (APT) campaigns increasingly rely on “living-off-the-land” techniques and patient lateral movement that produce few high-confidence indicators in isolation. This thesis investigates a behavioural-detection approach in which the cumulative pattern of low-severity events — rather than any single event — surfaces the attack.
A sandbox laboratory is constructed using the Suricata intrusion detection system, the Elastic Stack (Elasticsearch, Logstash, Kibana), and Filebeat, deployed via Docker Compose for reproducibility. Approximately thirty detection rules are authored, mapped to MITRE ATT&CK techniques, and tuned against a curated set of replay pcaps drawn from the public Malware-Traffic-Analysis dataset and from synthetic captures generated within the lab.
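To give a flavour of what a MITRE-tagged rule in this style looks like, here is an illustrative Suricata rule in the spirit of the rule-set — not one quoted from the thesis; the `sid`, the length/rate thresholds, and the metadata values are made up for this sketch:

```
alert dns $HOME_NET any -> any any (msg:"LAB Possible DNS tunnelling - unusually long query"; dns.query; bsize:>60; threshold:type threshold, track by_src, count 20, seconds 60; metadata:mitre_tactic_id TA0010, mitre_technique_id T1048; sid:1000101; rev:1;)
```

Individually this fires on plenty of benign traffic (CDNs, some antivirus telemetry), which is exactly why it is treated as a low-severity behavioural signal rather than an alert on its own.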
The empirical chapter evaluates the rule-set against three replayed APT scenarios — PowerShell-based credential harvesting, DNS-tunnelling exfiltration, and Active Directory enumeration — and discusses the trade-off between false-positive volume and detection sensitivity. The thesis concludes that behavioural cumulative-scoring approaches outperform single-rule detection in the studied scenarios but require continuous tuning against the defended environment to remain useful, a finding consistent with the broader literature on detection engineering.
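The cumulative-scoring idea behind the behavioural approach can be sketched as a small aggregation over per-host alert events pulled from Suricata's eve.json output. This is a simplified illustration, not the thesis code: the signature names, weights, and threshold are hypothetical parameters.

```python
from collections import defaultdict

# Hypothetical per-signature weights: each signal is weak on its own.
WEIGHTS = {
    "LAB PowerShell download cradle": 2,
    "LAB DNS query length anomaly": 1,
    "LAB SMB admin share access": 2,
    "LAB LDAP enumeration burst": 1,
}
THRESHOLD = 5  # cumulative per-host score that triggers escalation


def score_hosts(alerts):
    """Sum rule weights per source host; return hosts at or over threshold.

    `alerts` is an iterable of dicts shaped like Suricata eve.json
    alert events: {"src_ip": ..., "alert": {"signature": ...}}.
    """
    scores = defaultdict(int)
    for ev in alerts:
        sig = ev["alert"]["signature"]
        scores[ev["src_ip"]] += WEIGHTS.get(sig, 0)
    return {ip: s for ip, s in scores.items() if s >= THRESHOLD}


alerts = [
    {"src_ip": "10.0.0.5", "alert": {"signature": "LAB PowerShell download cradle"}},
    {"src_ip": "10.0.0.5", "alert": {"signature": "LAB DNS query length anomaly"}},
    {"src_ip": "10.0.0.5", "alert": {"signature": "LAB SMB admin share access"}},
    {"src_ip": "10.0.0.9", "alert": {"signature": "LAB DNS query length anomaly"}},
]
print(score_hosts(alerts))  # only 10.0.0.5 crosses the threshold
```

The trade-off discussed in the empirical chapter lives in exactly these knobs: lowering the threshold raises sensitivity and false-positive volume together, which is why the tuning loop never really ends.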
Why this topic
I picked it because the Ukrainian threat landscape has been a forced lab for behavioural-detection thinking since 2014 and especially since 2022 — CERT-UA’s bulletins on UAC-0010 / Gamaredon, UAC-0057 / GhostWriter, and similar campaigns are full of the kind of “quietly persistent” lateral movement that single-rule detection misses. The thesis tries to take that kind of public-bulletin TTP description and ask: can you actually build a detection that catches this pattern, in a way that’s tunable for a real SOC?
The answer is “sometimes, with continuous work” — which is the boring correct answer, but it changed how I think about detection engineering as a discipline rather than a one-off rule-writing exercise.
Public artefacts
- Repository: github.com/palianytsia-200/suricata-elk-lab — the docker-compose lab, ~30 MITRE-tagged Suricata rules, and the eve.json parser used in the empirical chapter.
- CTF write-ups at github.com/palianytsia-200/ctf-writeups — supporting practice for the rule-tuning instinct.
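For context on the eve.json parser mentioned above: Suricata writes eve.json as newline-delimited JSON, so a parser in this setting mostly just filters for alert events. A minimal sketch (not the repository's actual code; the field selection is an assumption about what the analysis needs):

```python
import json


def iter_alerts(lines):
    """Yield (timestamp, src_ip, signature) from eve.json alert events.

    eve.json is newline-delimited JSON; non-alert event types
    (flow, dns, stats, ...) are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        ev = json.loads(line)
        if ev.get("event_type") == "alert":
            yield ev["timestamp"], ev["src_ip"], ev["alert"]["signature"]


sample = [
    '{"timestamp": "2025-01-01T12:00:00", "event_type": "flow", "src_ip": "10.0.0.5"}',
    '{"timestamp": "2025-01-01T12:00:01", "event_type": "alert", "src_ip": "10.0.0.5", '
    '"alert": {"signature": "LAB DNS query length anomaly"}}',
]
print(list(iter_alerts(sample)))
```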
The full thesis text is held by the KPI ФТІ archive; if you’re a KPI student or a cybersec professional with a teaching context, contact the department for academic-use copies.
Keywords
intrusion detection · Suricata · ELK Stack · MITRE ATT&CK · behavioural anomaly detection · APT lateral movement · detection engineering · PowerShell credential harvesting · DNS tunnelling · Active Directory enumeration · UAC-0010 · Gamaredon
What I’d do differently in retrospect
A short list of things I’d change if I were starting the thesis today:
- Start earlier on the false-positive tuning chapter. I underestimated how much wall-clock time the tuning loop takes — a single rule might need 5–10 iteration cycles against the home-lab traffic before it stops firing on Chocolatey installs and Auth0 magic links. Allocate a real month for tuning, not the optimistic two weeks I planned.
- Use a wider replay corpus. I leaned heavily on Malware-Traffic-Analysis pcaps because they’re well-documented, but they’re also curated for clarity — real captures from a noisy enterprise are messier. If there’s a v2, I’d add CERT-UA-published indicators (when they include pcaps, which is rare) and try to capture genuine home-lab “user” traffic over a longer window.
- Add a comparison baseline against an out-of-the-box vendor ruleset. It would have been useful to show “here’s how a stock commercial ruleset for Suricata (ET Pro, say), or the default Snort / CrowdSec detections, perform on the same scenarios, and here’s where my behavioural ruleset diverges.”
- Include a Bro/Zeek-based baseline. Suricata is a great IDS, but Zeek’s logging-first model is conceptually different. A side-by-side on the same pcaps would be a fair comparison, and I think Zeek would have done better on the lateral-movement scenarios specifically.
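On the tuning point above: in my experience most of that iteration time ends up as threshold.conf entries rather than rule rewrites, because suppressing a known-benign source is cheaper than re-engineering the rule. An illustrative fragment — the sids and networks here are hypothetical:

```
# Suppress a noisy signature for the software-update subnet entirely.
suppress gen_id 1, sig_id 1000102, track by_src, ip 10.0.20.0/24

# Rate-limit another signature to one alert per source per hour.
threshold gen_id 1, sig_id 1000103, type limit, track by_src, count 1, seconds 3600
```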
If a future MSc happens, those are the directions.