<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://palianytsia-200.pp.ua/feed.xml" rel="self" type="application/atom+xml" /><link href="https://palianytsia-200.pp.ua/" rel="alternate" type="text/html" /><updated>2026-05-10T21:31:40-07:00</updated><id>https://palianytsia-200.pp.ua/feed.xml</id><title type="html">Sofia Dubchak</title><subtitle>Junior cybersecurity at Brights (Kyiv). KPI ФТІ &apos;25, BSc Cybersecurity. Notes on detection engineering, ISO 27001 / 42001, and Russian APT activity (mostly UAC-0010 / Gamaredon).</subtitle><entry><title type="html">SANS AI Cybersecurity Summit 2026 — three things I’m bringing back to Brights</title><link href="https://palianytsia-200.pp.ua/2026/04/25/sans-ai-summit-arlington-takeaways/" rel="alternate" type="text/html" title="SANS AI Cybersecurity Summit 2026 — three things I’m bringing back to Brights" /><published>2026-04-25T07:30:00-07:00</published><updated>2026-04-25T07:30:00-07:00</updated><id>https://palianytsia-200.pp.ua/2026/04/25/sans-ai-summit-arlington-takeaways</id><content type="html" xml:base="https://palianytsia-200.pp.ua/2026/04/25/sans-ai-summit-arlington-takeaways/"><![CDATA[<p>I’m just back in Kyiv from the SANS AI Cybersecurity Summit (Arlington
VA, April 20–21). Brights sponsored me as part of the company’s
ISO 42001 readiness investment, and it was — to my own surprise —
worth every hryvnia of the budget they put behind me.</p>

<p>This is my “what I’m actually going to do differently on Monday” post.
Three takeaways that map directly onto the AI-risk work I’ve been
running since September.</p>

<h2 id="quick-orientation">Quick orientation</h2>

<p>SANS AI Cyber Summit ran two days at the Hilton Arlington Rosslyn,
plus pre-summit training options I didn’t take (Brights paid for the
summit only, which was the right scope for a junior). The speaker
list for the summit-proper included some heavyweights:</p>

<ul>
  <li><strong>Bruce Schneier — “Integrous AI”</strong> (Day 1 keynote)</li>
  <li><strong>Jacob Klein, Anthropic — “This Is Not a Forecast”</strong></li>
  <li><strong>Sounil Yu — “Claw and Order”</strong> (riff on “Cyber Defense Matrix”
applied to AI systems)</li>
  <li><strong>Anne Neuberger — “Our Machine-Speed Mandate”</strong> (former Deputy
National Security Advisor; the policy / national-security framing)</li>
  <li><strong>Julie Davila — “The Boring Seams”</strong> (Day 2 — practical AI
security at scale; one of the best non-keynote talks of the summit)</li>
  <li><strong>Diana Kelley — “Cram It Up Your Cramhole, LaFleur”</strong> (provocatively
titled; the actual content was about adversarial prompt injection
taxonomy)</li>
  <li><strong>BG Reid J. Novotny — “Beyond the Hype”</strong> (military-cyber framing)</li>
  <li><strong>Pliny the Liberator — “Sailing Towards Vesuvius”</strong> (Day 2 closer —
mostly red-team-on-LLMs theatre, lower information density than I’d
hoped)</li>
  <li><strong>Yevhen Pervushyn (Red Asgard) — solo session</strong> (Day 2 — the
Ukrainian-perspective slot. We chatted briefly after; he was
generous with time for a junior asking basic questions.)</li>
</ul>

<p>I’ll write up notes on individual talks separately if I get to them
(unlikely — I’m behind on threat-intel digests). For this post I want
to focus on the three meta-lessons I want to carry back into Brights’
ISO 42001 work.</p>

<h2 id="1-the-ai-risk-taxonomy-is-settling-slowly">1. The AI-risk taxonomy is settling, slowly</h2>

<p>Schneier’s keynote made a point I’ve been circling around in the
Brights crosswalk work without articulating clearly: the AI-security
field has spent ~3 years arguing about <em>which framework</em> (NIST AI RMF,
ISO 42001, EU AI Act, OECD AI Principles, the various corporate
“responsible AI” white papers) and is now starting to converge on a
shared taxonomy underneath. The frameworks differ in form but are
substantively quite similar:</p>

<ul>
  <li><strong>Inputs</strong>: data lineage, training-set governance, IP/PII filtering
before training, adversarial-input detection at runtime.</li>
  <li><strong>Models</strong>: alignment, evaluation, explainability, vulnerability
surface (prompt injection, data poisoning, model extraction).</li>
  <li><strong>Outputs</strong>: harmful-content filtering, decision-explainability,
audit logging, human-in-the-loop where stakes warrant.</li>
  <li><strong>Lifecycle</strong>: deprecation, retraining, drift monitoring,
post-deployment incident response.</li>
</ul>

<p>ISO 42001’s 38 controls fit that 4-bucket structure cleanly once you
read past the language. So does NIST AI RMF 1.0 (Govern / Map /
Measure / Manage). So does the EU AI Act (categorised by risk-tier).</p>

<p><strong>What I’m doing differently:</strong> restructuring the Brights crosswalk
matrix from “ISO 42001 control → existing engineering practice” into
“shared 4-bucket category → ISO 42001 control + NIST AI RMF
function + EU AI Act risk-tier mapping → existing engineering
practice”. More work upfront; much less work when the next AI
governance standard ships and we have to add another column.</p>
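<p>To make the restructure concrete, here’s a minimal sketch of what one row of the new matrix could look like as data. The control IDs and mappings below are placeholders I made up for illustration, not the real crosswalk:</p>

```python
# Hypothetical sketch of the restructured crosswalk: one record per shared
# category, with per-framework mappings attached. All IDs are placeholders.
from dataclasses import dataclass

@dataclass
class CrosswalkRow:
    bucket: str                    # one of the four shared categories
    iso_42001_controls: list[str]  # Annex A control IDs (placeholder values)
    nist_ai_rmf: list[str]         # Govern / Map / Measure / Manage
    eu_ai_act_tier: str            # risk tier the obligation attaches to
    existing_practice: str         # what the org already does, if anything

rows = [
    CrosswalkRow(
        bucket="Inputs",
        iso_42001_controls=["A.7.4"],  # placeholder, not a real mapping
        nist_ai_rmf=["Map"],
        eu_ai_act_tier="high-risk",
        existing_practice="PII filtering in the data-ingest pipeline",
    ),
]

# Adding the next framework later means adding a field, not rebuilding rows.
by_bucket = {row.bucket: row for row in rows}
print(by_bucket["Inputs"].nist_ai_rmf)
```

The point of the shape: the bucket is the primary key, and each framework is just another column hanging off it.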

<h2 id="2-boring-seams-is-exactly-the-right-framing-for-production-ai-security">2. “Boring seams” is exactly the right framing for production AI security</h2>

<p>Julie Davila’s talk was the operationally richest one of the summit.
The thesis: most AI-security disasters aren’t the spectacular
prompt-injection-as-RCE we red-team about — they’re the boring seams
between AI systems and the rest of the production environment.
Specifically:</p>

<ul>
  <li><strong>Auth boundaries</strong>: an LLM agent that has its OWN auth context
to a downstream API can do things the human-on-behalf-of can’t.
This is auth-as-confused-deputy at AI scale.</li>
  <li><strong>Data exfiltration via context</strong>: an LLM that has access to
customer A’s data AND is exposed to customer B’s prompt is one
jailbreak away from cross-customer leakage. The fix is per-customer
data siloing at the prompt-construction layer.</li>
  <li><strong>Logging gaps</strong>: most logging stacks weren’t designed to handle
multi-thousand-token prompt + response pairs at scale. Storing all
the inputs/outputs is expensive; not storing them is an
audit-evidence gap.</li>
  <li><strong>Cost-runaway</strong>: an unguarded LLM endpoint is a denial-of-wallet
vulnerability. Rate-limiting is mandatory but often forgotten in
internal-only deployments.</li>
</ul>
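<p>The cost-runaway seam in particular is cheap to guard against. Here’s a minimal token-bucket sketch in front of an LLM call (rates and names are illustrative, not anything Davila showed and not Brights’ actual setup):</p>

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter for an LLM endpoint (illustrative)."""
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec          # refill rate
        self.capacity = burst             # max stored tokens
        self.tokens = float(burst)        # start full
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate_per_sec=2, burst=5)
results = [bucket.allow() for _ in range(10)]
# The first `burst` calls pass; the rest are throttled until tokens refill.
print(results.count(True))
```

A real deployment would key one bucket per caller and count model tokens rather than requests, since cost scales with tokens, not calls.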

<p><strong>What I’m doing differently:</strong> adding a “boring seams checklist”
section to the Brights AI-risk pre-deployment review template.
The four items above plus the three from Anne Neuberger’s keynote
(machine-speed-incident-response readiness, AI-system-incident
runbook, supply-chain governance for model artifacts). Going to
test it against the AI-services product the dev team is building
this quarter.</p>

<h2 id="3-ukrainian-context-matters-here">3. Ukrainian context matters here</h2>

<p>Yevhen’s session was the only one that explicitly addressed the
Ukrainian situation. The shape of his argument: the Russian APT
operators (UAC-0010, UAC-0050, etc.) are already using LLM-assisted
content generation for spear-phish landing pages and HTML lures —
he showed concrete examples of generated lure content where the
language tells (specific Ukrainian-vs-Russian word choices, missing
post-2022 referent shifts) point at LLM generation. The defensive
implication: detection rules that key on linguistic markers
(“дякую за ваш звіт” vs the more colloquial “дякую за репорт” in
context of a tech professional) may help fingerprint LLM-generated
phish vs human-written.</p>
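<p>As a toy sketch of that idea (the two marker phrases are the ones from my notes above; a real detector would need a vetted corpus, not two strings):</p>

```python
# Toy sketch: score a lure body against linguistic markers that, per the
# talk, lean LLM-generated vs. human-written. The marker lists here are
# illustrative only -- two phrases are nowhere near a usable detector.
LLM_LEANING = ["дякую за ваш звіт"]    # formal register an LLM tends to pick
HUMAN_LEANING = ["дякую за репорт"]    # colloquial tech-professional usage

def marker_score(text: str) -> int:
    """Positive leans LLM-generated, negative leans human-written, 0 unknown."""
    t = text.lower()
    score = sum(1 for m in LLM_LEANING if m in t)
    score -= sum(1 for m in HUMAN_LEANING if m in t)
    return score

print(marker_score("Дякую за ваш звіт, колего."))  # 1
print(marker_score("дякую за репорт!"))            # -1
```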

<p>This is downstream of detection-engineering work I’m already doing
at Brights but it’s a good additional dimension.</p>

<p>I cornered Yevhen briefly after his session to ask one specific
question — what does CERT-UA’s working relationship with EU CERTs
look like in 2026, post-EUCC-rollout? His answer was nuanced and
I’m not going to half-paraphrase it here, but the takeaway for me
was: there are real research / collaboration opportunities for
junior researchers in UA who are willing to write English-language
content on UA-side threat observations. Nothing to act on
immediately, but a thread to keep tugging.</p>

<h2 id="logistical-notes">Logistical notes</h2>

<ul>
  <li>Hilton Arlington Rosslyn was a good venue. The walk to Foggy
Bottom Metro was maybe ~15 minutes; a few of us went into DC for
dinner on Day 1.</li>
  <li>US visa for UA citizens — got mine via Warsaw consulate in
Q1 2026 ahead of this trip. Plan ~3 months out from departure if
you’re going through Warsaw.</li>
  <li>Worth combining BSides SF (San Francisco, March) and SANS AI
Summit (Arlington, April) in a single US trip if you can get the
visa to support it. I did this and it kept the per-event cost
reasonable for Brights.</li>
</ul>

<h2 id="honest-junior-perspective-caveat">Honest junior-perspective caveat</h2>

<p>A SANS summit is somewhere between a conference and a recruiting
event. The keynotes are excellent; the breakout sessions are
variable; the “networking” is important if you’re senior and
optional-but-useful if you’re junior. As a junior I got a lot
out of it because I went into specific sessions with specific
questions; I imagine if I’d shown up without that I’d have come
back with a worse report.</p>

<p>The Schneier “Integrous AI” keynote alone is going to be referenced
in every cybersec conversation I have at Brights for the next year.
Worth the whole trip just for that. I’ll re-watch the recording
when SANS releases it.</p>

<p>Слава Україні. 🇺🇦</p>]]></content><author><name>Sofia Dubchak</name></author><category term="sans" /><category term="ai-risk" /><category term="iso-42001" /><category term="conference" /><category term="brights" /><category term="nist-ai-rmf" /><summary type="html"><![CDATA[Just back from SANS AI Cybersecurity Summit in Arlington VA. The Schneier and Anne Neuberger keynotes were both better than expected. Three things I'm taking back to the ISO 42001 readiness work at Brights.]]></summary></entry><entry><title type="html">First month at Brights — what an ISO 27001 ISMS function actually does day-to-day</title><link href="https://palianytsia-200.pp.ua/2025/09/15/first-month-at-brights-iso-27001/" rel="alternate" type="text/html" title="First month at Brights — what an ISO 27001 ISMS function actually does day-to-day" /><published>2025-09-15T12:10:00-07:00</published><updated>2025-09-15T12:10:00-07:00</updated><id>https://palianytsia-200.pp.ua/2025/09/15/first-month-at-brights-iso-27001</id><content type="html" xml:base="https://palianytsia-200.pp.ua/2025/09/15/first-month-at-brights-iso-27001/"><![CDATA[<p>I joined Brights as a Junior Cybersecurity Specialist in early September —
their first dedicated cybersec hire after they got ISO/IEC 27001:2022
certified earlier in the year. I’d been here as a summer intern in 2024
doing OWASP Top 10 review on three internal apps, so this isn’t a
greenfield onboarding. But the full-time cybersec role is its own
distinct shape, and I want to write down what I’ve actually been doing
this month before it starts feeling normal and I forget how new it all
felt.</p>

<h2 id="the-thing-i-had-wrong-about-isms-work">The thing I had wrong about ISMS work</h2>

<p>I expected ISO 27001 ISMS work to be <strong>document-shuffling and policy
maintenance</strong>. Bureaucracy. Update the risk register, file the
audit-evidence folders, write a new policy when something breaks, repeat.</p>

<p>It’s not zero of that. But the actual day-to-day is much more
<strong>cross-team negotiation</strong> — translating between what the standard
requires, what the team’s existing practices actually do, and what’s
realistic to change without breaking working processes.</p>

<p>A concrete example from this month: ISO 27001 Annex A.5.7 (Threat
intelligence) says the org should “collect and analyse information
relating to information security threats”. Brights had nothing formal
in place — informal Slack channels where senior engineers occasionally
shared CVE alerts, but no structured intake / triage / action loop.</p>

<p>The ISO-checkbox version of “fix this gap” would be: write a
threat-intel policy, create a SharePoint page, declare done.</p>

<p>The actually-useful version is: figure out what threat intel a 50-person
shop can realistically consume (we’re not staffing a 24/7 SOC team), how
to triage it with junior-level effort (i.e. mine), what specific
CVE/IOC subscriptions matter for our stack (Node, .NET, AWS, Azure, the
specific OSS dependencies the dev teams use), and how to push something
actionable to dev leads when something matters for our actual code.</p>
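<p>The triage core of that loop is small. A sketch of matching a KEV-style feed against a dependency inventory, with invented data rather than real advisories:</p>

```python
# Invented example data -- not real advisories and not our real dependency tree.
kev_entries = [
    {"cve": "CVE-2025-0001", "product": "node.js"},
    {"cve": "CVE-2025-0002", "product": "left-pad"},
    {"cve": "CVE-2025-0003", "product": "struts"},
]
our_stack = {"node.js", "dotnet", "left-pad", "express"}

# Keep only entries that touch something we actually run.
actionable = [e for e in kev_entries if e["product"] in our_stack]
for entry in actionable:
    print(f"{entry['cve']}: affects {entry['product']} -> notify dev leads")
```

The hard part isn’t the filter, it’s keeping <code>our_stack</code> honest, which is why the digest is matched to the real dependency tree rather than a hand-maintained list.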

<p>The ISO checkbox is downstream of doing the practical thing. If you do
the practical thing, the documentation writes itself. If you do the
documentation thing first, you have a binder full of policies that nobody
references. (Senior cybersec friends have been telling me this for years
and I half-believed them. Now I believe them.)</p>

<h2 id="the-split">The split</h2>

<p>Roughly:</p>

<ul>
  <li><strong>~30% SOC / detection / vuln management.</strong> Internal log review (we
have Suricata on the dev/staging perimeter, ELK ingest pipelines I
built up from the intern-summer Filebeat work), vulnerability scanning
on internal/customer-facing assets, triaging CVE alerts against our
dependency tree.</li>
  <li><strong>~30% ISMS / compliance / audit support.</strong> Risk register
maintenance, gap-tracking against Annex A controls, prep for the
external surveillance audit in November, vendor security-questionnaire
review.</li>
  <li><strong>~25% AI risk / ISO 42001 readiness.</strong> Building a NIST AI RMF
crosswalk against our existing 27001 controls, mapping which AI-risk
controls Brights already implicitly handles via existing engineering
practice and which need new process.</li>
  <li><strong>~15% awareness training / human stuff.</strong> Phishing-test rollout
prep (we run them quarterly), responding to “is this email
suspicious?” tickets from sales and customer-support team members.</li>
</ul>

<p>The 25% on AI risk is what justifies the conference-travel sponsorship I
got — Brights is positioning to add AI-services to their offering and
the founders correctly identified that going from “27001 cert” to
“42001 cert” requires real work, not just an additional binder. I’m
the person doing that work, which is fortunate because it’s the most
intellectually interesting part of the role.</p>

<h2 id="practical-things-i-did-this-month">Practical things I did this month</h2>

<p>A few specific items that landed:</p>

<ol>
  <li>
    <p><strong>Migrated the Suricata + ELK lab</strong> from my BSc thesis docker-compose
into the Brights staging environment, on a dedicated VLAN. The
ruleset is ~30 thesis rules + ~10 Brights-specific (we have a few
Node + .NET specific ones now). Filebeat ships eve.json into the
Elastic cluster. Nothing fancy, working baseline.</p>
  </li>
  <li>
    <p><strong>Wrote a one-page “ISO 27001 control → engineering practice”
crosswalk</strong> for the dev leads. Most engineering teams don’t have
time to read the full standard. I read it, summarised the 93
Annex A controls into “what does this actually mean for someone
writing Node code at a 50-person agency”, attached examples. Got
useful feedback (“we already do this implicitly”, “we don’t do this
at all”, “we do a worse version of this”) that’s now driving the
gap-track.</p>
  </li>
  <li>
    <p><strong>Built a quarterly threat-intel digest template.</strong> Sources:
CERT-UA bulletins (UA threat angle), CISA KEV catalog (urgent CVE), our
AWS/Azure security advisories, npm/NuGet security advisories
(matched to our actual dependency tree). Triaged outputs go to
dev leads in a 1-page Friday email. First issue ships next Friday.</p>
  </li>
  <li>
    <p><strong>Started the ISO 42001 readiness gap analysis.</strong> This is going
to take months. I’ve spent the September window reading the
standard cover-to-cover, mapping the 38 controls, and starting the
AI-RMF crosswalk. Not done. Probably won’t be done by Christmas.</p>
  </li>
</ol>

<h2 id="what-im-still-bad-at">What I’m still bad at</h2>

<p>A short and humbling list:</p>

<ul>
  <li><strong>Saying “I don’t know”</strong> in cross-team meetings. I have an
instinct to bluff because I’m the only cybersec person in the room.
My manager has caught me doing it and pushed back gently — “if
you’re not sure, say so and look it up later”. She’s right, I’m
trying.</li>
  <li><strong>Estimating effort.</strong> I told the founders the AI-42001 readiness
work would take “two months”. Actual answer is “six to nine months,
depending on how much engineering time we get from the AI-services
team”. I underestimated by 3-4×.</li>
  <li><strong>The audit narrative.</strong> ISO audits aren’t pass/fail tests; they’re
guided tours where the auditor decides whether you understand your
own ISMS well enough to be trusted to operate it. The November
surveillance audit will be my first time presenting the ISMS to an
external auditor and I’m noticeably nervous about it.</li>
</ul>

<h2 id="small-joys">Small joys</h2>

<ul>
  <li>The team is genuinely interested in security as a topic, not just
as a checkbox. The dev leads ask good questions in security
reviews. The CTO has a personal home-lab CTF habit. The CEO read
Schneier’s <em>Liars and Outliers</em> over the summer and references it.
This is a much better starting point than I expected.</li>
  <li>Free coffee is unlimited.</li>
  <li>Hybrid setup (3 days in office, 2 remote) is genuinely the right
pace for me right now.</li>
</ul>

<p>I’ll write more as I have more to say. Probably the next post is
either the November surveillance-audit retrospective, or whatever I
take away from BSides SF in March.</p>

<p>(I picked Brights specifically because the cybersec function was
nascent enough that I’d get to <em>build</em> parts of it, not just <em>operate</em>
parts of it. That call has held up after one month — first-junior-hire
roles trade ambiguity for ownership, and I’m getting both in
appropriate doses.)</p>

<p>Слава Україні. 🇺🇦</p>]]></content><author><name>Sofia Dubchak</name></author><category term="brights" /><category term="iso-27001" /><category term="isms" /><category term="grc" /><category term="junior" /><category term="career" /><summary type="html"><![CDATA[I've been at Brights for a month as their first cybersec hire. The ISO 27001 ISMS work is way less bureaucracy and way more cross-team negotiation than I expected. Notes for future-junior-me.]]></summary></entry><entry><title type="html">Thesis defended — what worked, what I’d change, what comes next</title><link href="https://palianytsia-200.pp.ua/2025/06/10/thesis-defended-and-what-comes-next/" rel="alternate" type="text/html" title="Thesis defended — what worked, what I’d change, what comes next" /><published>2025-06-10T04:20:00-07:00</published><updated>2025-06-10T04:20:00-07:00</updated><id>https://palianytsia-200.pp.ua/2025/06/10/thesis-defended-and-what-comes-next</id><content type="html" xml:base="https://palianytsia-200.pp.ua/2025/06/10/thesis-defended-and-what-comes-next/"><![CDATA[<p>I defended my BSc thesis last Friday. Title’s a long one:</p>

<blockquote>
  <p><em>Behavioural Anomaly Detection in Network Traffic via Suricata IDS and ELK
Stack: A Sandbox-Based Approach to Identifying APT Lateral Movement
Patterns.</em></p>
</blockquote>

<p>The full abstract sits on <a href="/thesis/">the thesis page</a> and
the supporting code is at
<a href="https://github.com/palianytsia-200/suricata-elk-lab">github.com/palianytsia-200/suricata-elk-lab</a>.
This post is the slightly more honest “what really happened during the
year” supplement that doesn’t go in the academic version.</p>

<p>Final grade: 92/100 (A). Solid but not top-of-cohort — there are 4–5
ФТІ ‘25 graduates who scored higher and they all earned it. I’m okay
with where I landed.</p>

<h2 id="what-worked">What worked</h2>

<p><strong>Picking a topic where I could actually replay traffic.</strong> Half the
behavioural-detection literature feels theoretical because the
authors don’t have access to attacker traffic to test against. By
deliberately constraining the empirical chapter to public
Malware-Traffic-Analysis pcaps + a few synthetic captures from my home lab,
I bounded the project to something I could finish. The temptation to
expand to “and also test against real enterprise traffic” was strong;
resisting it kept the deadline reachable.</p>

<p><strong>Docker-Compose for the lab.</strong> Reproducibility is the boring word
that academic supervisors keep saying and I sort of nodded along
without internalising. Then around chapter 4 my lab broke, I had to
rebuild it on a different laptop, and I lost a week. After that, the
Docker-Compose was non-negotiable. The whole lab now spins up in 90
seconds on a fresh machine.</p>

<p><strong>Dropping a chapter early.</strong> My original plan had a chapter on
Active Directory enumeration detection that turned out to require an
AD environment I didn’t have a clean way to build. My supervisor
suggested cutting it in chapter 3 review. I argued for keeping it.
She was right; I should have cut earlier. The trimmed thesis is more
focused and the AD chapter would have been weak.</p>

<p><strong>Starting the empirical chapter early.</strong> I ran rule-tuning loops in
parallel with chapter 2 writing. By the time I needed to write up
chapter 4 (empirical), I had three months of tuning iteration
already done.</p>

<h2 id="what-id-change">What I’d change</h2>

<p><strong>Wider replay corpus.</strong> I leaned heavily on Malware-Traffic-Analysis
pcaps because they’re well-documented, but they’re also
curated for clarity. A real enterprise capture would be messier and
the ruleset’s false-positive rate against it would have been more
illustrative. If there’s an MSc continuation, that’s the v2 angle.</p>

<p><strong>More time on the “why” of behavioural cumulative scoring.</strong> I
described the technique well enough but didn’t go deep enough on the
theoretical scaffolding (Bayesian belief updating? hidden Markov
model framing?). My committee asked exactly this question during
defence and I had a competent but not great answer.</p>
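<p>For the record, one framing I could have offered (my after-the-fact reconstruction, not something the thesis formalised): treat each low-severity event as a piece of evidence and accumulate log-likelihood ratios, naive-Bayes style:</p>

```latex
% Score after events e_1, ..., e_n from one source, naive-Bayes framing:
\log\frac{P(\mathrm{attack}\mid e_1,\dots,e_n)}{P(\mathrm{benign}\mid e_1,\dots,e_n)}
  = \log\frac{P(\mathrm{attack})}{P(\mathrm{benign})}
  + \sum_{i=1}^{n}\log\frac{P(e_i\mid\mathrm{attack})}{P(e_i\mid\mathrm{benign})}
% Assumes events are conditionally independent given the class; alert when
% the accumulated sum crosses a tuned threshold.
```

The “N events in a window” meta-rule is then just the special case where every event carries equal weight and the threshold is a count.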

<p><strong>A baseline against Zeek.</strong> Suricata is a great IDS but Zeek’s
logging-first model is conceptually different and would probably
have been a fairer comparison than a vanilla-Suricata baseline. Did
not have time. Future work.</p>

<p><strong>Earlier supervisor review on the empirical chapter.</strong> I waited too
long to show drafts. When I finally sent the empirical chapter four
weeks before defence, my supervisor flagged a methodology issue (I
was conflating “rule did fire” with “rule fired correctly” without
clean ground-truth labels) that took two weeks to fix. Lesson:
supervisor review every two weeks, not every two months.</p>

<h2 id="lessons-that-stuck">Lessons that stuck</h2>

<p>A few things that I think will follow me into actual SOC work:</p>

<ul>
  <li>
    <p><strong>Detection engineering is a tuning loop, not a writing
exercise.</strong> Writing a rule takes 30 minutes; tuning it to
acceptable false-positive levels in a target environment takes
weeks. A SOC that ships rules without budgeting for the tuning is
going to drown in alerts. (Brights doesn’t have a SOC yet — I’m
walking in to build one — and I keep this as a principle.)</p>
  </li>
  <li>
    <p><strong>Cumulative-scoring approaches outperform single-rule detection,
with constant tuning.</strong> This is the conclusion of my empirical
chapter and I believe it: the meta-rule “five low-severity events
in a 60-second window from the same source” caught attacker
patterns that any individual rule missed. But it requires
continuous environment-specific tuning — turn that knob too tight
and you false-positive on benign automation; too loose and you
miss real attacks. The tuning loop is the work.</p>
  </li>
  <li>
    <p><strong>Replay-pcap-driven empirical work is wildly underutilised in
academic cyber programmes.</strong> Most thesis-level work I’ve seen at
KPI is theoretical or simulation-based. There’s huge room for
pure-replay-pcap experimental work, and the public corpus
(MTA, the various .pcap libraries on GitHub) is generous. If
you’re a future ФТІ student looking for a thesis topic, this is a
mineable seam.</p>
  </li>
</ul>
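<p>That meta-rule is only a few lines of stream logic. A sketch using the window and threshold from the description above, with everything else illustrative:</p>

```python
from collections import defaultdict, deque

WINDOW_SEC = 60   # window size from the meta-rule described above
THRESHOLD = 5     # low-severity events from one source before alerting

events_by_src: dict[str, deque] = defaultdict(deque)

def observe(src_ip: str, ts: float) -> bool:
    """Record a low-severity event; return True when the meta-rule fires."""
    q = events_by_src[src_ip]
    q.append(ts)
    while q and ts - q[0] > WINDOW_SEC:  # evict events outside the window
        q.popleft()
    return len(q) >= THRESHOLD

# Five events from one source within 60 seconds trips the rule on the fifth.
fired = [observe("10.0.0.7", t) for t in (0, 10, 20, 30, 40)]
print(fired[-1])  # True
```

The knobs the post talks about tuning are exactly <code>WINDOW_SEC</code> and <code>THRESHOLD</code>: too tight and benign automation fires it, too loose and slow attackers slip under it.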

<h2 id="whats-next">What’s next</h2>

<p>Returning to Brights as a full-time Junior Cybersecurity Specialist
in September. They’ve been an absurdly generous host of an
academic-research-flavoured intern (me, 2024 summer) and seem
genuinely interested in growing the cybersec function around the
ISO 27001 / 42001 work. I’ll get to actually deploy some of the
detection rules from the thesis against a real (small) production
environment, which is exactly the kind of practical follow-on the
thesis lacked.</p>

<p>The summer between defence and start date is for: catching up on
sleep, reading the books I procrastinated on for six months
(<em>Secure by Design</em>, <em>Building Secure &amp; Reliable Systems</em>, the
Anton Chuvakin newsletter archive), and a small Carpathian hiking
trip.</p>

<p>Слава Україні. 🇺🇦</p>]]></content><author><name>Sofia Dubchak</name></author><category term="thesis" /><category term="kpi" /><category term="detection-engineering" /><category term="career" /><summary type="html"><![CDATA[Defended my BSc on June 6th. A few notes on what worked methodologically, what I'd change in retrospect, and the small career decisions on the other side.]]></summary></entry><entry><title type="html">Writing a Suricata rule for the double-letter alliterative C2 URL pattern</title><link href="https://palianytsia-200.pp.ua/2025/01/08/suricata-rule-double-letter-alliterative-c2/" rel="alternate" type="text/html" title="Writing a Suricata rule for the double-letter alliterative C2 URL pattern" /><published>2025-01-08T09:45:00-08:00</published><updated>2025-01-08T09:45:00-08:00</updated><id>https://palianytsia-200.pp.ua/2025/01/08/suricata-rule-double-letter-alliterative-c2</id><content type="html" xml:base="https://palianytsia-200.pp.ua/2025/01/08/suricata-rule-double-letter-alliterative-c2/"><![CDATA[<p>Following up on <a href="/2024/11/12/cert-ua-uac-0010-pterodo-update/">the November post</a>
about UAC-0010 / Gamaredon’s bare-IP plain-HTTP beacon URL pattern, I
spent a couple of January evenings tightening up a Suricata rule that
catches the <em>shape</em> of the path without hard-coding the operator’s
specific verb list. Here’s the writeup.</p>

<h2 id="the-pattern-restated">The pattern, restated</h2>

<p>Pterodo HTML / VBS implants from late-2024 onwards beacon to URLs of
the form:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>http://&lt;bare-IPv4&gt;/&lt;verb&gt;&lt;suffix&gt;?-&lt;DD&gt;-&lt;MM&gt;
</code></pre></div></div>

<p>Where:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">&lt;bare-IPv4&gt;</code> = a literal IP, no Host header pointing to a hostname.
This is the high-fidelity bit — bare-IP HTTP traffic from a workstation
to an external IP is almost never legitimate in 2025. (Cloud control
planes, CRL distribution, and a few niche update mechanisms aside.)</li>
  <li><code class="language-plaintext highlighter-rouge">&lt;verb&gt;</code> = a short alpha string. Observed: <code class="language-plaintext highlighter-rouge">Svvr</code>, <code class="language-plaintext highlighter-rouge">SSsr</code>, <code class="language-plaintext highlighter-rouge">Akad</code>,
<code class="language-plaintext highlighter-rouge">Akk</code>, <code class="language-plaintext highlighter-rouge">Gpps</code>, <code class="language-plaintext highlighter-rouge">Mouuds</code>. Five of six are double-letter alliterative
(<code class="language-plaintext highlighter-rouge">vv</code>, <code class="language-plaintext highlighter-rouge">Ss</code>, <code class="language-plaintext highlighter-rouge">kk</code>, <code class="language-plaintext highlighter-rouge">pp</code>, <code class="language-plaintext highlighter-rouge">uu</code>). The doubled-letter is consistent with
Gamaredon’s well-documented alliterative-naming TTP — the same
pattern produces <code class="language-plaintext highlighter-rouge">riontos.ru</code>-style apex names.</li>
  <li><code class="language-plaintext highlighter-rouge">&lt;suffix&gt;</code> = <code class="language-plaintext highlighter-rouge">Htm</code>, <code class="language-plaintext highlighter-rouge">Ua</code>, <code class="language-plaintext highlighter-rouge">U</code>, or empty. Sometimes encodes campaign
variant or victim cohort.</li>
  <li><code class="language-plaintext highlighter-rouge">&lt;DD&gt;-&lt;MM&gt;</code> = the campaign date, baked into the URL path at
generation time. Two-digit zero-padded.</li>
</ul>

<h2 id="the-rule">The rule</h2>

<pre><code class="language-suricata">alert http any any -&gt; any any (
    msg:"GAMAREDON Pterodo bare-IP beacon URL (alliterative path + DD-MM date)";
    flow:established,to_server;
    http.method; content:"GET";
    http.host; pcre:"/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}(:80)?$/";
    http.uri; pcre:"/^\/[A-Z][A-Za-z]{2,7}-\d{2}-\d{2}\/?$/";
    classtype:trojan-activity;
    sid:9001502;
    rev:2;
    metadata:mitre T1071.001 T1568.002;
    reference:url,cert.gov.ua/article/...;
)
</code></pre>

<p>The two <code class="language-plaintext highlighter-rouge">pcre</code> clauses are the meaty bit:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">http.host</code> matches a bare IPv4 (with optional explicit <code class="language-plaintext highlighter-rouge">:80</code>). This
is the hard-to-evade signal — Pterodo’s HTML/VBS landers are
hard-coded to fetch bare-IP URLs, and changing that requires
re-tooling.</li>
  <li><code class="language-plaintext highlighter-rouge">http.uri</code> matches a path of: leading slash, capital letter, 2–7
more letters (mixed case), dash, two digits, dash, two digits,
optional trailing slash. That covers the observed verbs and
suffixes without enumerating them.</li>
</ul>

<p>I deliberately did NOT enumerate the specific verb list (<code class="language-plaintext highlighter-rouge">Svvr</code>,
<code class="language-plaintext highlighter-rouge">SSsr</code>, etc.) — operators rotate verbs across campaigns, and a rule
keyed on a fixed list would fail closed on the next rotation. Better
to match the path <em>shape</em> and accept some false-positive volume.</p>
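<p>The two regex shapes can be sanity-checked outside Suricata. Here they are ported to Python, exercised against invented URIs in the observed shape plus the false-positive candidates discussed below:</p>

```python
import re

# The two pcre shapes from the rule above, ported verbatim for offline testing.
HOST_RE = re.compile(r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}(:80)?$")
URI_RE = re.compile(r"^/[A-Z][A-Za-z]{2,7}-\d{2}-\d{2}/?$")

# Shape-matching beacon paths (invented examples in the observed style):
assert URI_RE.match("/SvvrHtm-08-01")
assert URI_RE.match("/Akk-15-01/")
# The internal-health-check FP candidate really does match, as noted below:
assert URI_RE.match("/Health-01-15")
# Traffic the URI regex filters out:
assert not URI_RE.match("/crl/root.crl")       # CRL distribution
assert not URI_RE.match("/latest/meta-data/")  # cloud metadata agent
# Host-buffer checks: bare IPv4 only, hostnames rejected:
assert HOST_RE.match("203.0.113.7:80")
assert not HOST_RE.match("riontos.ru")
print("all shape checks pass")
```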

<h2 id="false-positive-notes">False-positive notes</h2>

<p>Things this rule will fire on that aren’t Pterodo:</p>

<ul>
  <li><strong>Cloud metadata endpoint requests</strong> that some misconfigured agents
make to literal <code class="language-plaintext highlighter-rouge">169.254.169.254</code> (AWS, GCP) — the IP-only Host
header is right, but those URIs are typically not in the
<code class="language-plaintext highlighter-rouge">&lt;verb&gt;-DD-MM</code> shape, so the URI regex saves us. <strong>Negligible FP.</strong></li>
  <li><strong>Internal IP-addressed health checks</strong> — <code class="language-plaintext highlighter-rouge">http://10.0.0.5/Health-01-15</code>
could match. Add a <code class="language-plaintext highlighter-rouge">! $HOME_NET</code> filter on <code class="language-plaintext highlighter-rouge">http.host</code>. (My version
scopes the destination as <code class="language-plaintext highlighter-rouge">any</code> because it’s a home lab; for
enterprise use, scope it.)</li>
  <li><strong>CRL / OCSP distribution</strong> — some old CA setups serve CRLs from
bare IPs. URIs are typically <code class="language-plaintext highlighter-rouge">/crl/&lt;name&gt;.crl</code>, not the
<code class="language-plaintext highlighter-rouge">&lt;verb&gt;-DD-MM</code> shape, so the URI regex filters this out too. <strong>Low.</strong></li>
  <li><strong>Misconfigured monitoring</strong> — Nagios / Icinga / Zabbix sometimes
emit traffic to bare IPs with whatever URI the operator typed in. If
you get a fire from this, it’s exactly because the agent is
misconfigured and worth investigating regardless. <strong>Possible.</strong></li>
</ul>
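<p>The enterprise-scoping point from the health-check bullet can be
prototyped with Python’s standard-library <code class="language-plaintext highlighter-rouge">ipaddress</code> module: a
sketch of what excluding internal destinations buys you, with made-up
addresses.</p>

```python
import ipaddress

# Rough stand-in for a !$HOME_NET destination scope: treat private and
# link-local Host-header IPs as internal and suppress those matches.
def is_internal(host: str) -> bool:
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # hostname, not a bare IP; the host regex wouldn't fire
    return ip.is_private or ip.is_link_local

assert is_internal("10.0.0.5")         # internal health check: suppress
assert is_internal("169.254.169.254")  # cloud metadata endpoint: suppress
assert not is_internal("8.8.8.8")      # routable bare IP: keep the alert
```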

<h2 id="tuning">Tuning</h2>

<p>In my home-lab (the Suricata-elk-lab Docker-Compose I shipped with my
<a href="https://github.com/palianytsia-200/suricata-elk-lab">BSc thesis support repo</a>),
the rule fires on:</p>

<ul>
  <li>Replayed Pterodo pcaps from MTA — yes, every time, as expected.</li>
  <li>A week of my own home-network traffic — zero hits.</li>
  <li><code class="language-plaintext highlighter-rouge">nmap</code> scans I run as part of CTF practice — zero hits (nmap doesn’t
do <code class="language-plaintext highlighter-rouge">&lt;verb&gt;-DD-MM</code> URIs).</li>
  <li><code class="language-plaintext highlighter-rouge">curl</code> to bare-IP API endpoints I happen to use — zero hits.</li>
</ul>

<p>So in a low-noise environment, false positives look like zero. In a
real SOC the actual answer depends on your environment — please tune
before deploying.</p>

<h2 id="coverage-gaps">Coverage gaps</h2>

<p>Things this rule misses:</p>

<ul>
  <li><strong>Beacon URLs that don’t have the <code class="language-plaintext highlighter-rouge">-DD-MM</code> suffix</strong> — some Pterodo
variants (older, e.g. early-2024) used path forms without the date.
My rule won’t catch those. Different rule.</li>
  <li><strong>Variants with dotted-DD.MM or ISO date in the path</strong> — possible
operator rotation. Adapt the regex.</li>
  <li><strong>HTTPS-wrapped variants</strong> — if Gamaredon wraps the beacon in TLS,
Suricata sees only the SNI/handshake, not the URI. Pterodo’s
late-2024 wave is plain-HTTP-only, but that could change. The
TLS-handshake-side detection is a separate rule and a harder
problem.</li>
</ul>
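<p>For the date-rotation gap, a widened tail regex is one cheap
adaptation. This is a hypothetical sketch of the “adapt the regex” idea,
not an observed variant:</p>

```python
import re

# Accept the observed -DD-MM tail plus two hypothetical rotations:
# dotted .DD.MM and an ISO-style -YYYY-MM-DD.
DATE_TAILS = re.compile(r"(?:[-.]\d{2}[-.]\d{2}|-\d{4}-\d{2}-\d{2})/?$")

assert DATE_TAILS.search("/Svvr-15-01")       # observed dashed form
assert DATE_TAILS.search("/Svvr.15.01")       # hypothetical dotted form
assert DATE_TAILS.search("/Svvr-2025-01-15")  # hypothetical ISO form
assert DATE_TAILS.search("/Svvr") is None     # no date tail at all
```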

<h2 id="usefulness-in-context">Usefulness in context</h2>

<p>For a small SOC (Brights-sized, ~50 staff) the rule is essentially
free to run. The bare-IP HTTP outbound is rare enough in modern
enterprise that it’s a high-confidence alert on its own; combining
with the URI regex makes it specific to the Pterodo shape. Cumulative
behavioural scoring makes this stronger — if a workstation also
recently downloaded an HTML attachment with a <code class="language-plaintext highlighter-rouge">Scan_X_Y_Z_NNNN</code> name,
you have a high-confidence Gamaredon detonation.</p>

<p>(Caveat: I’m a junior. This is rule-writing as a learning exercise,
not as production guidance from someone with a decade of detection
engineering. Run any rule through your own tuning loop before trusting
it in alerts.)</p>

<h2 id="thanks">Thanks</h2>

<p>To Florian Roth’s <a href="https://github.com/Neo23x0/signature-base">signature-base</a>
project for being the canonical example of “how to structure a
public-facing detection rule”. Most of the formatting choices here
are imitations of his.</p>]]></content><author><name>Sofia Dubchak</name></author><category term="suricata" /><category term="detection-engineering" /><category term="gamaredon" /><category term="uac-0010" /><category term="blue-team" /><summary type="html"><![CDATA[Following up on the Gamaredon URL-pattern observation from November — turning it into an actually-shippable Suricata rule, with false-positive notes.]]></summary></entry><entry><title type="html">Reading CERT-UA’s UAC-0010 / Gamaredon update — what jumped out</title><link href="https://palianytsia-200.pp.ua/2024/11/12/cert-ua-uac-0010-pterodo-update/" rel="alternate" type="text/html" title="Reading CERT-UA’s UAC-0010 / Gamaredon update — what jumped out" /><published>2024-11-12T11:30:00-08:00</published><updated>2024-11-12T11:30:00-08:00</updated><id>https://palianytsia-200.pp.ua/2024/11/12/cert-ua-uac-0010-pterodo-update</id><content type="html" xml:base="https://palianytsia-200.pp.ua/2024/11/12/cert-ua-uac-0010-pterodo-update/"><![CDATA[<p>I spent an evening this week going carefully through the latest CERT-UA
bulletin on UAC-0010 (also known as Gamaredon, Primitive Bear, Trident
Ursa). It’s part of my thesis-prep reading — I’m writing on
behavioural anomaly detection in network traffic and Gamaredon is the
most loudly active APT in the public CERT-UA corpus, so it’s a natural
case study.</p>

<p>A few things jumped out that are worth turning into actual detection
content for someone running a small-shop SOC. None of these are novel
research; they’re me reading carefully and trying to translate “the
bulletin says X” into “here’s what you’d watch for”.</p>

<h2 id="pterodo-file-naming-schemas-keep-getting-reused">Pterodo file-naming schemas keep getting reused</h2>

<p>Across the recent CERT-UA dispatches the HTML / VBS / SFX / RAR
filenames have a bunch of recurring shapes:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Scan_&lt;digit&gt;_&lt;digit&gt;_&lt;digit&gt;_&lt;NNNN&gt;_&lt;DD&gt;.&lt;MM&gt;.&lt;YYYY&gt;.htm[l]</code> — the
“Scan” prefix is recent (added late-Apr 2024 per CERT-UA’s wave
comparison). Most bulletins from the prior 12 months use the bare
<code class="language-plaintext highlighter-rouge">&lt;digit&gt;_&lt;digit&gt;_&lt;digit&gt;_&lt;NNNN&gt;_&lt;DD&gt;.&lt;MM&gt;.&lt;YYYY&gt;.&lt;ext&gt;</code> form.</li>
  <li>The <code class="language-plaintext highlighter-rouge">&lt;DD&gt;.&lt;MM&gt;.&lt;YYYY&gt;</code> date convention is <strong>Russian / European</strong>
format (dot separators, day-first). Operator-side automation is
emitting filenames in Russian-locale date format, which is itself a
low-fidelity but real signal — most Western scanner software would emit
an ISO date, not dotted DD.MM.YYYY.</li>
  <li>File sizes for the HTML landers cluster in the 250–260 KB range —
consistent with HTML files plus embedded resources.</li>
</ul>
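<p>As a self-contained check, the filename schema from the bullets above
translates to a regex like the following. I’m covering only the
HTML-lander case here, and the sample filenames are invented in the
published shape, not real IOCs:</p>

```python
import re

# Filename schema from the bullets above, with the "Scan_" prefix made
# optional so the pre-April-2024 bare form also matches (HTML case only).
FNAME = re.compile(
    r"^(?:Scan_)?\d+_\d+_\d+_\d{4}_\d{2}\.\d{2}\.20\d{2}\.html?$", re.I
)

assert FNAME.match("Scan_3_1_2_0457_12.11.2024.html")  # recent wave shape
assert FNAME.match("3_1_2_0457_12.11.2024.htm")        # older bare form
assert FNAME.match("scan_12.11.2024.html") is None     # wrong field count
```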

<p>These are CERT-UA’s published indicators, not me re-deriving them from
samples. But the regex shape is stable enough across bulletins that
it’s worth a static-content rule. Suricata sketch (untested at scale —
this is a thinking-out-loud rule, not a production-deploy one):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>alert http $HOME_NET any -&gt; any any (msg:"UAC-0010 Pterodo Scan_X_Y_Z filename pattern";
  flow:established,to_server; http.uri;
  pcre:"/Scan_\d+_\d+_\d+_\d{4}_\d{2}\.\d{2}\.20\d{2}\.html?/i";
  classtype:trojan-activity; sid:9001500; rev:1; reference:url,cert.gov.ua/article/...;)
</code></pre></div></div>

<p>Note the direction: the victim <em>requests</em> the lander, so the rule
inspects <code class="language-plaintext highlighter-rouge">http.uri</code> on the <code class="language-plaintext highlighter-rouge">to_server</code> side; a request
buffer paired with <code class="language-plaintext highlighter-rouge">to_client</code> flow would never match. The
<code class="language-plaintext highlighter-rouge">20</code> in <code class="language-plaintext highlighter-rouge">\.20\d{2}</code> is a poor hack to anchor on 20XX years.
<code class="language-plaintext highlighter-rouge">\.202[4-9]\.</code> would be tighter. As written the rule will also fire
on benign file transfers named with similar conventions: it needs a
response-side check on the actual download (filename /
Content-Disposition matching), attachment content-type checks, and some
host filter to be useful in production. Take it as a sketch.</p>

<h2 id="beacon-url-pattern-is-the-more-useful-tell">Beacon URL pattern is the more useful tell</h2>

<p>Per CERT-UA’s IOCs the recent waves carry beacon URLs like
<code class="language-plaintext highlighter-rouge">http://&lt;bare-IP&gt;/&lt;verb&gt;&lt;suffix&gt;?-&lt;DD&gt;-&lt;MM&gt;</code> where:</p>

<ul>
  <li>Verb is one of a small set of operator codenames (<code class="language-plaintext highlighter-rouge">Svvr</code>, <code class="language-plaintext highlighter-rouge">SSsr</code>,
<code class="language-plaintext highlighter-rouge">Akad</code>, <code class="language-plaintext highlighter-rouge">Akk</code>, <code class="language-plaintext highlighter-rouge">Gpps</code>, <code class="language-plaintext highlighter-rouge">Mouuds</code>)</li>
  <li>Suffix is <code class="language-plaintext highlighter-rouge">Htm</code>, <code class="language-plaintext highlighter-rouge">Ua</code>, or empty</li>
  <li><code class="language-plaintext highlighter-rouge">&lt;DD&gt;-&lt;MM&gt;</code> is the campaign date</li>
</ul>

<p>Five of those six verbs feature <strong>double-letter alliteration</strong> (<code class="language-plaintext highlighter-rouge">vv</code>,
<code class="language-plaintext highlighter-rouge">SS</code>, <code class="language-plaintext highlighter-rouge">kk</code>, <code class="language-plaintext highlighter-rouge">pp</code>, <code class="language-plaintext highlighter-rouge">uu</code>) — that lines up with Gamaredon’s
well-documented alliterative-naming TTP, the same one that produces
<code class="language-plaintext highlighter-rouge">riontos.ru</code>-style apex names with a recurring first letter. Whoever
designs the operator-side URL generator is consistent.</p>
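<p>The alliteration claim is easy to verify mechanically. A
case-sensitive doubled-letter check over the six verbs quoted from the
CERT-UA IOCs:</p>

```python
import re

# Same letter twice in a row, case-sensitive, so "SS", "vv", "kk", "pp"
# and "uu" all count as doubles.
doubled = re.compile(r"([A-Za-z])\1")

verbs = ["Svvr", "SSsr", "Akad", "Akk", "Gpps", "Mouuds"]
has_double = [v for v in verbs if doubled.search(v)]

# Five of the six carry a doubled letter; Akad is the odd one out.
assert has_double == ["Svvr", "SSsr", "Akk", "Gpps", "Mouuds"]
```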

<p>The detection-engineering useful bit isn’t the specific verb list (the
operators rotate them), it’s the <strong>bare-IP plain-HTTP beacon to a
double-letter URL path</strong>, which is a very high-signal shape for
“something Pterodo-flavoured is calling home”. A Suricata sketch that
matches the <em>shape</em>, not the specific verbs:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>alert http any any -&gt; any any (msg:"UAC-0010 plain-HTTP bare-IP beacon (alliterative path)";
  flow:established,to_server;
  http.host; pcre:"/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/";
  http.uri; pcre:"/^\/[A-Za-z]{0,3}([A-Za-z])\1[A-Za-z]{0,5}-\d{2}-\d{2}/";
  classtype:trojan-activity; sid:9001501; rev:1;)
</code></pre></div></div>

<p>That regex matches paths whose first few letters contain a doubled
letter (the alliteration): up to three leading letters, the same letter
twice (case-sensitive, so <code class="language-plaintext highlighter-rouge">SS</code> and <code class="language-plaintext highlighter-rouge">vv</code> both count), up to
five more letters to absorb the <code class="language-plaintext highlighter-rouge">Htm</code> / <code class="language-plaintext highlighter-rouge">Ua</code> suffixes, then
<code class="language-plaintext highlighter-rouge">-DD-MM</code>. Imperfect: it won’t catch <code class="language-plaintext highlighter-rouge">Akad</code>, which has no doubled
letter, and any short doubled-letter path will match, not just the
observed verbs. A tighter alternative is an explicit verb-list of the
form <code class="language-plaintext highlighter-rouge">(Svvr|SSsr|Akad|Akk|Gpps|Mouuds)</code> followed by suffix and date,
at the cost of going stale when the operators rotate. Tune
to your environment’s noise.</p>

<h2 id="what-im-going-to-put-in-the-thesis">What I’m going to put in the thesis</h2>

<p>Both shapes (the filename schema, and the beacon URL pattern) are
candidate examples for the empirical chapter — they’re
cumulative-behavioural signals where a single hit isn’t conclusive, but a
combination (“workstation downloaded a <code class="language-plaintext highlighter-rouge">Scan_X_Y_Z_NNNN_DD.MM.YYYY.htm</code>
attachment AND beaconed to a bare-IP URL with alliterative path
within 60s”) is high-confidence.</p>
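<p>A toy version of that correlation: pair an attachment-name hit with a
bare-IP beacon from the same host inside a 60-second window. Event
shapes and field names here are invented for illustration, not a SIEM
schema:</p>

```python
from datetime import datetime, timedelta

# Correlate an attachment-name detection with a subsequent bare-IP
# beacon from the same host within the window. Purely illustrative.
WINDOW = timedelta(seconds=60)

def correlate(events):
    """Yield (attachment_event, beacon_event) pairs within the window."""
    for a in events:
        if a["type"] != "attachment_name":
            continue
        for b in events:
            if (b["type"] == "bare_ip_beacon" and b["host"] == a["host"]
                    and timedelta(0) <= b["ts"] - a["ts"] <= WINDOW):
                yield a, b

t0 = datetime(2024, 11, 12, 9, 0, 0)
events = [
    {"type": "attachment_name", "host": "ws-17", "ts": t0},
    {"type": "bare_ip_beacon", "host": "ws-17", "ts": t0 + timedelta(seconds=42)},
    {"type": "bare_ip_beacon", "host": "ws-03", "ts": t0 + timedelta(seconds=10)},
]
hits = list(correlate(events))
assert len(hits) == 1 and hits[0][1]["host"] == "ws-17"
```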

<p>CERT-UA’s bulletins are extraordinarily generous publications and the
quality is high. If you’re a UA cybersec student or a junior SOC
analyst, the
<a href="https://cert.gov.ua/articles/all">CERT-UA bulletin archive</a>
is one of the highest-density learning resources I know of for
practising “TTP description → detection rule” translation.</p>

<p>(I’ve published the Suricata-elk-lab Docker-Compose I used for the
thesis empirical chapter at
<a href="https://github.com/palianytsia-200/suricata-elk-lab">github.com/palianytsia-200/suricata-elk-lab</a>.
Not production-ready — student work. But useful as a reproducible
bench.)</p>]]></content><author><name>Sofia Dubchak</name></author><category term="gamaredon" /><category term="uac-0010" /><category term="cert-ua" /><category term="detection-engineering" /><category term="suricata" /><category term="pterodo" /><summary type="html"><![CDATA[A close re-read of the late-2024 CERT-UA bulletin on UAC-0010 / Gamaredon and what's worth turning into Suricata rules for a small-shop SOC.]]></summary></entry></feed>