Vulnerability management, the platonic ideal

I've never seen an ideal vulnerability management (vuln mgmt) program, but this is what I imagine one looks like.

Vuln mgmt is a machine that converts known risks into fixes. It's an unglamorous but essential part of a security program, and I have a soft spot for it. The business value in finding a vuln is only captured when we fix the vuln.

As Kobe says, “job's not finished.”

Vuln mgmt is entirely downstream of organizational priorities and incentives.

Problem shape

Vuln mgmt is fixing security bugs in:

Other people's code (you apply the fix)
Your code (you develop and apply the fix)

Winning is:

Urgent stuff gets fixed quickly
Less urgent stuff gets fixed eventually (and isn't forgotten)
You get better over time

The components

Vuln mgmt steps: Inventory -> vuln inputs -> prioritize -> fix (or accept risk) -> verify -> extract lessons -> repeat.

Inventory

You need an inventory across:

Corporate (phones, laptops, software on those things)
Infra (cloud, containers, configs)
Product (your code + dependencies)

This has to stay up to date, ideally via trigger events, e.g.:

Signed new contract
Deployed software to fleet
Checked new 3rd party library into source control

The more automatic this is, the more likely it will stay accurate.
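A minimal sketch of trigger-driven inventory updates, assuming each trigger arrives as a small event dict. The event kinds, field names, and the `asset_retired` event are hypothetical, made up for illustration; in practice these would come from whatever procurement, deploy, and source control hooks you have.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Asset:
    name: str        # e.g. a package, a host, or a vendor product
    version: str
    category: str    # "corporate", "infra", or "product"
    last_seen: datetime


@dataclass
class Inventory:
    assets: dict[str, Asset] = field(default_factory=dict)

    def handle_event(self, event: dict) -> None:
        """Upsert or remove an asset in response to a trigger event."""
        kind = event["kind"]
        # Hypothetical event kinds mirroring the triggers listed above.
        if kind in ("contract_signed", "software_deployed", "dependency_added"):
            self.assets[event["name"]] = Asset(
                name=event["name"],
                version=event.get("version", "unknown"),
                category=event["category"],
                last_seen=datetime.now(timezone.utc),
            )
        elif kind == "asset_retired":  # assumed extra event so stale entries get dropped
            self.assets.pop(event["name"], None)
        else:
            raise ValueError(f"unknown event kind: {kind}")


inventory = Inventory()
inventory.handle_event(
    {"kind": "dependency_added", "name": "log4j-core", "version": "2.14.1", "category": "product"}
)
inventory.handle_event(
    {"kind": "software_deployed", "name": "vpn-client", "version": "5.2", "category": "corporate"}
)
print(sorted(inventory.assets))
```

The point of the design is that nobody edits the inventory by hand; every change is a side effect of something that was already happening.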

Ownership

Everything in the inventory needs an owner, ideally a team/oncall, but there should always be a clear path to a specific person. Ownership also has to be kept up to date as people/teams change.
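One way to keep ownership honest is a periodic check for gaps. This is a rough sketch, assuming ownership is stored as an asset-to-team mapping and each team maps to whoever is currently on call; the asset, team, and person names are all made up.

```python
def find_ownership_gaps(asset_owners, team_oncall):
    """Return assets with no owning team, or whose team has no reachable person."""
    gaps = []
    for asset, team in asset_owners.items():
        if team is None or not team_oncall.get(team):
            gaps.append(asset)
    return sorted(gaps)


# Hypothetical data: one asset never had an owner, another belongs to a
# team that no longer exists after a reorg.
asset_owners = {
    "log4j-core": "payments",
    "vpn-client": "it-ops",
    "legacy-cron": None,
    "billing-db": "data-platform",
}
team_oncall = {"payments": "alice", "it-ops": "bob"}

gaps = find_ownership_gaps(asset_owners, team_oncall)
print(gaps)
```

Anything this returns is a vuln you won't be able to route when it shows up, so it's worth fixing before you need it.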

Vuln discovery

This is a function that produces vetted vulns to go fix, matched against your inventory.

There are many ways to find vulns (people, scanners, bug bounty, vuln feeds), but the output has to be trusted by the person who will go fix it. Sometimes a single vuln (ex: Log4j) exists in multiple places (ex: the Log4j library itself, and some network appliance that bundles Log4j), so recognizing that should happen here.
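Here is a sketch of matching one vuln against the inventory, including the case where the vulnerable library is bundled inside something else. It assumes the inventory records bundled components per item, which many inventories don't; the item names, owners, and the naive version parser are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class InventoryItem:
    name: str
    version: str
    owner: str
    # Components shipped inside this item, e.g. a library inside an appliance.
    bundled: dict[str, str] = field(default_factory=dict)


def parse_version(version: str) -> tuple[int, ...]:
    """Naive dotted-number parser; real version schemes need more care."""
    return tuple(int(part) for part in version.split("."))


def affected_items(items, package, fixed_in):
    """Return (item name, owner, how it is affected) for versions below the fix."""
    fixed = parse_version(fixed_in)
    hits = []
    for item in items:
        if item.name == package and parse_version(item.version) < fixed:
            hits.append((item.name, item.owner, "direct"))
        bundled_version = item.bundled.get(package)
        if bundled_version is not None and parse_version(bundled_version) < fixed:
            hits.append((item.name, item.owner, "bundled"))
    return hits


items = [
    InventoryItem("log4j-core", "2.14.1", owner="payments"),
    InventoryItem("edge-appliance", "7.3", owner="network", bundled={"log4j-core": "2.13.0"}),
    InventoryItem("vpn-client", "5.2", owner="it-ops"),
]

hits = affected_items(items, package="log4j-core", fixed_in="2.15.0")
for name, owner, how in hits:
    print(f"{name} ({how}) -> {owner}")
```

The output is one vuln with two places to fix and two different owners, which is the recognition step described above.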

Prioritization

An organization-specific score for how bad the vuln is, driving the urgency and completeness of the fix. Start with CVSS but then add in a blend of:

Reachability (internet-facing vs internal)
Your org's threat model / what org-specific crown jewels the system touches

Example: A CVSS 9.8 in an internal admin tool behind SSO is less urgent than a CVSS 5 in your unauthenticated public API.

This is useful, but it's also a credibility test. It shows the org you were thoughtful about what actually matters vs mindlessly shoveling work at your coworkers.
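To make the blend concrete, here is a toy scoring function. The multipliers and the crown-jewel bonus are numbers I made up for illustration, not a standard; the only thing it's meant to show is CVSS as a starting point that reachability and crown jewels can push up or down.

```python
from dataclasses import dataclass

# Hypothetical multipliers; tune these to your own threat model.
EXPOSURE_WEIGHT = {
    "public_unauthenticated": 1.5,
    "public_authenticated": 1.0,
    "internal": 0.4,
}
CROWN_JEWEL_BONUS = 2.0


@dataclass
class Finding:
    title: str
    cvss: float
    exposure: str
    touches_crown_jewels: bool = False


def priority(finding: Finding) -> float:
    """Blend the CVSS base score with org-specific context, capped at 10."""
    score = finding.cvss * EXPOSURE_WEIGHT[finding.exposure]
    if finding.touches_crown_jewels:
        score += CROWN_JEWEL_BONUS
    return round(min(score, 10.0), 1)


admin_tool = Finding("Bug in internal admin tool behind SSO", 9.8, "internal")
public_api = Finding("Bug in unauthenticated public API", 5.0, "public_unauthenticated")

ranked = sorted([admin_tool, public_api], key=priority, reverse=True)
for f in ranked:
    print(f"{priority(f):>4}  {f.title}")
```

With these made-up weights the public API bug outranks the internal one despite the lower CVSS, matching the example above. The exact numbers matter less than being able to explain them to the person you're handing the work to.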

Fixing

There are short-term fixes (apply patch quickly) and long-term fixes (move off a library, redesign something). A good program can do both, which again returns to organizational priorities/incentives.

Sometimes the right answer is to simply accept the risk. An accepted risk should be recorded, with an owner and some way to revisit it later. The decision itself should be quick to make, ex: a 10 min discussion in a chat room.
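A risk acceptance record doesn't need to be fancy. A minimal sketch, assuming a 90-day default revisit window (my assumption, pick your own) and made-up field names:

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RiskAcceptance:
    vuln_id: str
    asset: str
    owner: str
    reason: str
    accepted_on: date
    revisit_after_days: int = 90  # assumed default, not a standard

    @property
    def revisit_on(self) -> date:
        return self.accepted_on + timedelta(days=self.revisit_after_days)


def due_for_review(acceptances, today):
    """Return accepted risks whose revisit date has arrived or passed."""
    return [a for a in acceptances if a.revisit_on <= today]


accepted = [
    RiskAcceptance(
        vuln_id="EXAMPLE-0001",  # placeholder ID
        asset="internal-admin-tool",
        owner="alice",
        reason="Behind SSO, tool is scheduled for retirement",
        accepted_on=date(2024, 1, 10),
    ),
]

for a in due_for_review(accepted, today=date(2024, 4, 9)):
    print(f"Revisit {a.vuln_id} on {a.asset} with {a.owner}")
```

The revisit date is what keeps "accepted" from quietly turning into "forgotten".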

Verification

99% of the time, applying the patch means you fixed the vuln, but it's good to verify. Sometimes patches break other things. A rich test suite helps.
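Verification boils down to two questions: is the vuln gone, and did anything that used to work stop working? A sketch, assuming you can get a set of finding IDs from a rescan and pass/fail results from your test suite before and after the patch (the input shapes and test names here are hypothetical):

```python
def verify_fix(vuln_id, findings_after, tests_before, tests_after):
    """Check the vuln is gone from the rescan and no passing test started failing."""
    regressions = sorted(
        name
        for name, passed in tests_before.items()
        if passed and not tests_after.get(name, False)
    )
    fixed = vuln_id not in findings_after
    return {"fixed": fixed, "regressions": regressions, "ok": fixed and not regressions}


result = verify_fix(
    vuln_id="CVE-2021-44228",
    findings_after={"EXAMPLE-0001"},  # placeholder for an unrelated open finding
    tests_before={"test_login": True, "test_export": True, "test_flaky": False},
    tests_after={"test_login": True, "test_export": False, "test_flaky": False},
)
print(result)
```

Here the vuln is gone but `test_export` regressed, so the fix isn't done yet. Tests that were already failing before the patch are ignored on purpose.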

Priorities / Incentives / People

Vuln mgmt is entirely downstream of organizational priorities and incentives.

Ideal is:

No one says "security is asking us to do stuff"; they say "we fix our bugs"
Reinforcement via performance reviews, bonuses, etc. I vaguely recall Microsoft's Secure Future Initiative goaled VPs on their security bugs; stuff like that works.
Vuln mgmt is just a thing the org does, like brushing your teeth.
Metrics/graphs/SLAs/scorecards are mostly useful to apply and maintain organizational pressure, but they are a weak additional form of incentive
My favorite anecdote: Italian tax collectors A/B tested the wording they used and increased their rate of lawful tax payment. At scale (asking 2,000 people to do a thing), details like these matter.

Conclusion

If anyone is doing even half of the above, let me know! This is my experience in big (30k+) companies, and I imagine it's a lot different at other places, hopefully simpler and easier.