Saturday, February 7, 2026

Zero Trust Architecture: Analysis of a Paper on Implementation Challenges

For one of the assigned readings, I read the paper "A systematic literature review on the implementation and challenges of Zero Trust Architecture across domains" by Mushtaq et al. (2025), which reviews 74 studies published from 2016 to 2025. It explores how Zero Trust Architecture (ZTA) has been used across various technical and organizational scenarios and highlights the challenges encountered in these implementations.

In a previous course I took at Harvard (CSCIE-155, Networks & Security), I read Project Zero Trust by George Finney, which presents Zero Trust Architecture as a story. I recommend it because it narrates the problem beautifully, in the style of a novel: a hacker holds a fitness company hostage by stealing PII and threatening to make it public, and the company's responders begin implementing Zero Trust Architecture to combat the attack and prevent future intrusions. The Mushtaq et al. (2025) paper spotlights the same core principles in its introduction. According to Finney (2022), ZTA is a strategy, not just a tool, yet the industry often treats it as a tool or framework. This distinction is important and a key theme in Mushtaq et al.'s (2025) literature review.

Finney (2022) outlines six fundamental principles that define ZTA as a strategy rather than a product:
The first principle is to identify and define the protected surfaces. Instead of securing the whole network at once, organizations should focus on what needs protection most: the "Crown Jewels." Finney (2022) calls these DAAS: Data, Assets, Applications, and Services, which are the most important resources the organization must protect.

The second principle is to map transaction flows. This entails understanding how data flows within the organization, documenting how users and systems interact with protected surfaces, and defining normal traffic patterns (Finney, 2022).

Third, the focus moves to designing the network. This requires a careful, robust approach to micro-segmentation, creating a custom environment for each protected surface with micro-perimeters. Finney (2022) explains this as the "Kipling Method" Gateway, which is a specific entry point designed for each surface.

Fourth, organizations need to set a Zero Trust policy. The "Kipling Method," as Finney (2022) describes, involves formulating detailed rules about Who, What, When, Where, Why, and How traffic is allowed. Each access decision adheres to clear, context-based criteria.

The fifth and sixth principles go together: monitor and keep visibility using analytics, and keep improving over time. A security team that checks and logs all traffic in real time can spot problems and find ways to improve. This monitoring leads to regular reviews of protected surfaces and policies, helping organizations adjust to new threats and changing business needs (Finney, 2022).
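
To make the fourth and fifth principles concrete, here is a minimal sketch of how a Kipling-style rule might be written down and checked, with every decision logged for later review. This is my own illustration, not code from Finney (2022) or Mushtaq et al. (2025): the class names, field meanings, and sample values (billing-clerk, office-lan, and so on) are all hypothetical, and a real policy engine would be far richer.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Request:
    who: str     # authenticated user or role (hypothetical values)
    what: str    # application being used
    when: time   # time of day of the request
    where: str   # network zone the request comes from
    why: str     # classification of the data being requested
    how: str     # access method, e.g. a managed device


@dataclass(frozen=True)
class KiplingRule:
    who: frozenset
    what: frozenset
    start: time
    end: time
    where: frozenset
    why: frozenset
    how: frozenset

    def allows(self, req: Request) -> bool:
        # Every one of the six questions must have an acceptable answer.
        return (
            req.who in self.who
            and req.what in self.what
            and self.start <= req.when <= self.end
            and req.where in self.where
            and req.why in self.why
            and req.how in self.how
        )


def evaluate(rules, req, audit_log):
    """Default deny: allow only if some rule matches, and log every decision."""
    allowed = any(rule.allows(req) for rule in rules)
    audit_log.append((req, allowed))
    return allowed


# Illustrative rule for one protected surface (all values are made up).
billing_rule = KiplingRule(
    who=frozenset({"billing-clerk"}),
    what=frozenset({"billing-app"}),
    start=time(8, 0),
    end=time(18, 0),
    where=frozenset({"office-lan"}),
    why=frozenset({"customer-billing-data"}),
    how=frozenset({"managed-laptop"}),
)

audit_log = []
daytime = Request("billing-clerk", "billing-app", time(10, 30),
                  "office-lan", "customer-billing-data", "managed-laptop")
night = Request("billing-clerk", "billing-app", time(23, 0),
                "office-lan", "customer-billing-data", "managed-laptop")

print(evaluate([billing_rule], daytime, audit_log))  # inside the allowed window
print(evaluate([billing_rule], night, audit_log))    # outside the allowed window
```

The point of the sketch is the shape, not the details: access is denied unless a rule explicitly answers all six questions, and the audit log gives the monitoring principle something to review.
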

With Finney's (2022) strategy in mind, the results of this literature review stand out. Across the 74 studies they reviewed, Mushtaq et al. (2025) show that although the industry uses the language of Zero Trust, it has only partially put its ideas into practice.

The Gap Between Principle and Practice

Mushtaq et al. (2025) found that most ZTA implementations in all areas focus mainly on the basics: authentication, authorization, and access control. These are important, but they are only part of what a real Zero Trust setup needs.

What's consistently missing? Continuous auditing and monitoring, automated policy orchestration, and environmental or context-aware perception (Mushtaq et al., 2025). Mapped against Finney's (2022) principles, this means the industry has made progress on establishing policies and verifying identity (principles three and four), but has largely neglected the transaction flow mapping, real-time monitoring, and continuous improvement cycles (principles two, five, and six) that make Zero Trust a living strategy rather than a static configuration.

In other words, most organizations have locked the front door but have not installed cameras, alarms, or systems to detect when something is wrong inside. They have treated Zero Trust as just a tool, which is exactly the mistake Finney (2022) warns about.

Where It's Working — and Where It Isn't

Cloud and enterprise environments have made the most progress toward mature ZTA implementations (Mushtaq et al., 2025). This makes sense: these domains have mature tooling, well-defined architectural patterns, and the resources to commit to comprehensive security redesigns. Remote work acceleration during and after the pandemic pushed many enterprises to adopt Zero Trust principles out of necessity, and cloud providers have built native support into their platforms.

The story changes dramatically when you look at other domains. Two stand out as particularly challenging.


IoT: Too Constrained for Full Zero Trust

The Internet of Things represents one of the most difficult frontiers for Zero Trust adoption. Mushtaq et al. (2025) identified 11 IoT-focused studies, making it the second-most-studied domain — a reflection of both its importance and its complexity. The fundamental problem is resource constraints. IoT devices — sensors, embedded controllers, industrial monitors — often run on minimal processing power, limited memory, and constrained battery life. The cryptographic operations that Zero Trust demands (continuous authentication, encrypted communications, and token validation) can overburden these devices or drain them faster than they can be maintained (Mushtaq et al., 2025).

The authors identify a major gap: the lack of lightweight cryptographic solutions customized to these environments (Mushtaq et al., 2025). Standard enterprise-grade security protocols simply don't translate to a temperature sensor running on a microcontroller. Until the security community develops cryptographic approaches that are simultaneously robust and resource-efficient, IoT Zero Trust implementations will remain experimental and incomplete.

There's also the scale problem. An enterprise might manage thousands of user accounts. An IoT deployment may include tens of thousands of devices, each requiring its own identity and generating its own trust signals. The orchestration challenge alone is staggering, and most current solutions don't handle it well. Through Finney's (2022) lens, identifying and defining protected surfaces in an IoT environment — where DAAS elements are distributed across thousands of constrained devices — becomes exponentially more complex than in a traditional enterprise setting.

Healthcare: Where Compliance and Architecture Collide

Healthcare was the third-most-studied domain, with 7 studies, and it presents a distinct yet equally instructive set of challenges (Mushtaq et al., 2025). Here, the issue isn't primarily about device constraints. It's about the collision between Zero Trust principles and regulatory reality.

Healthcare organizations operate under strict frameworks such as HIPAA in the United States and the GDPR in Europe. These regulations have specific requirements around data access, audit trails, patient consent, and breach notification. Mushtaq et al. (2025) found that most ZTA implementations in healthcare struggle to fully conform to these frameworks, particularly in data administration, continuous auditing, and the explainability of automated access decisions.

Consider the tension: Zero Trust calls for dynamic, context-aware access decisions. A system might grant or revoke access to patient records based on real-time signals such as device health, location, or behavioral patterns. But HIPAA requires clear, auditable justification for every access decision. When an AI-driven trust engine denies a clinician access to a patient's records during a critical moment, the organization needs to explain exactly why, and the current generation of context-aware trust engines often can't provide that level of transparency.
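
One way to picture what "explainable" could mean here is a decision that carries its own reasons. The sketch below is a toy, rule-based stand-in I wrote for illustration; it is not how any real trust engine works, it is not taken from Mushtaq et al. (2025), and the signal names (device_healthy, location, role) and allowed values are assumptions. Real engines weigh far more signals, often with models that are much harder to explain.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    allowed: bool
    reasons: list  # human-readable justification, suitable for an audit trail


# Each check: (signal name, predicate, message recorded when the check fails).
# All signal names and allowed values here are hypothetical.
CHECKS = [
    ("device_healthy", lambda v: v is True,
     "device failed its health check"),
    ("location", lambda v: v in {"hospital-network", "vpn"},
     "request came from an unapproved location"),
    ("role", lambda v: v in {"clinician"},
     "role is not permitted to view patient records"),
]


def decide(signals: dict) -> Decision:
    """Grant access only if every check passes; always say why."""
    failures = []
    for name, passes, message in CHECKS:
        if name not in signals:
            failures.append(f"missing signal: {name}")
        elif not passes(signals[name]):
            failures.append(message)
    if failures:
        return Decision(False, failures)
    return Decision(True, ["all checks passed"])


denied = decide({"device_healthy": False, "location": "vpn", "role": "clinician"})
print(denied.allowed, denied.reasons)
```

Because every denial lists the specific checks that failed, the organization can answer the "why" question after the fact. The hard part, which this toy avoids entirely, is getting the same property out of engines that score behavior statistically.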

This is where Finney's (2022) Kipling Method becomes both essential and difficult to execute. Writing granular rules based on Who, What, When, Where, Why, and How is precisely what healthcare regulators demand — but doing so dynamically, at scale, across a hospital's sprawling ecosystem of electronic health records, medical devices, telemedicine platforms, pharmacy systems, and insurance integrations is still a largely unsolved challenge.

What Needs to Happen Next

Mushtaq et al. (2025) don't just catalog problems. They point to a clear set of priorities for the field, many of which correspond directly to the strategic vision Finney (2022) articulated.

First, lightweight cryptography needs to move from research curiosity to production reality. Without it, Zero Trust will remain impractical for the fastest-growing categories of connected devices (Mushtaq et al., 2025).

Second, context-aware trust engines need to become more sophisticated and more transparent. Dynamic access decisions are powerful, but only if they can be audited, explained, and consistent with the regulatory contexts where they operate (Mushtaq et al., 2025).

Third, orchestration cannot be an afterthought. The hardest part of Zero Trust is not checking a single request, but rather maintaining a clear, enforceable policy across multiple systems simultaneously (Mushtaq et al., 2025).

Finally, regulatory integration should be planned from the beginning, not added after the system is built. The difference between what Zero Trust systems do and what regulations require them to record is a major barrier to adoption (Mushtaq et al., 2025).


The Bottom Line


Zero Trust is the right approach. The idea of "never trust, always verify" makes sense in a world devoid of clear boundaries and rife with threats. However, this review shows that the industry is still in its early stages. Most implementations focus solely on access control and overlook the monitoring, orchestration, and compliance features that enable Zero Trust to function effectively (Mushtaq et al., 2025).

As Finney (2022) reminds us, Zero Trust is a strategy. It is a cycle of identifying what matters most, understanding how it is accessed, building protections, writing explicit policies, monitoring everything, and continually improving. The 74 studies reviewed by Mushtaq et al. (2025) show that the industry has started this journey, but still has a long way to go. The areas where this is most important—IoT, healthcare, and industrial systems—are also where the risks are highest. Getting Zero Trust right in these fields is not only a technical task. It is essential.

References

Finney, G. (2022). Project Zero Trust: A story about a strategy for aligning security and the business. Wiley.

Mushtaq, S., Mohsin, M., & Mushtaq, M. M. (2025). A systematic literature review on the implementation and challenges of Zero Trust Architecture across domains. Sensors, 25(19), 6118.





Monday, February 2, 2026

On Neumann's Paper "Toward Total-System Trustworthiness": We're Building Houses of Cards

As one of my New Year's goals, I have committed to writing a few blog posts a month, and this is a good opportunity to use class work to express my rants and ramblings while getting some learning done in the process.  

Peter G. Neumann is a legend in computer security for good reason. His article "Toward Total-System Trustworthiness" names something most of us in technology leadership sense but rarely articulate: we're playing a losing game. Every patch, every wrapper, every clever workaround adds another card to a structure that was never designed to bear the weight we're placing on it.

The Southwest Airlines meltdown brought this into sharp relief. A classmate, in a discussion post I replied to earlier today, pointed out that Southwest's catastrophic failure during the winter storm wasn't a technology problem; it was an archaeology problem. Southwest had essentially wrapped an old 1990s-era scheduling system called SkySolver in newer interfaces, hoping the wrapper would compensate for foundations that were never updated for modern scale. When the storm hit, the sheer volume of data overwhelmed the underlying logic, and no amount of clever interfacing could save it.

Neumann calls this the "patch-on-patch" approach. I call it technical debt, a term from my days in software engineering and leading development and product teams. It comes with haunting reminders of bug fixing and facing the music from unhappy customers, all coming due with compound interest.

Why Total-System Trustworthiness Remains Elusive

After twenty-plus years leading technology operations across global organizations, I've come to believe there are four fundamental reasons why achieving true system trustworthiness remains aspirational at best—especially when you're simultaneously responsible for keeping the lights on.

The "Less Untrustworthy" Objective: People tend to evaluate systems in binary terms, as if a system could only be secure or insecure, trustworthy or broken. Neumann reframes trustworthiness as a gradient, and that reframing changes how we think about the entire system. The priority should be to make systems less untrustworthy, because complete trustworthiness is out of reach. In medicine, this looks like screening for drug interactions rather than promising that a treatment will always succeed. In systems, it means assuming your components will fail and engineering the resilience to absorb it.

The Legacy Trap: Neumann favors a clean-slate approach, and the case for it becomes clear when organizations try to bolt security features onto existing, outdated systems. It is like building a dormitory: once the foundation is poured, it sets the physical boundaries that determine the building's shape. Much of today's computing rests on the x86 architecture and the C programming language, both developed before cyber warfare as we know it today existed. A sinking foundation cannot be made perfect through retroactive fixes. We can only shore it up and resolve to build the next one differently.

Anticipating the "Space Aliens": I am reminded of a computer game I used to play :) The idea that security professionals must defend against space aliens sounds ridiculous at first, but the joke illustrates a basic threat-modeling principle. Neumann explains that we cannot predict every environmental event, including floods, earthquakes, and zero-day exploits, but we should design systems that degrade gracefully and keep functioning. A dependable system produces clear, limited failures rather than complete system breakdowns.

Designing with Humility: The most crucial element runs counter to the industry's conventional values: it asks organizations to slow down, even though speed and self-assurance are typically prized. Accepting that complex systems can never be fully understood makes us more likely to build observability that detects failures, to compartmentalize so that a crack in a single floor slab does not bring down the entire roof, and to use formal methods to verify the specific components we can control.
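
For the engineers reading, here is a tiny sketch of what "clear, limited failures" and compartmentalization can look like in code. It is my own illustration, not something from Neumann's article, and the component names are hypothetical stand-ins loosely inspired by the airline example above. Each component runs in its own compartment, a failure is recorded rather than hidden, and the rest of the system keeps working, with a fallback where one exists.

```python
def run_compartmentalized(components, fallbacks=None):
    """Run each component in isolation so one failure cannot take down the rest.

    Returns (results, failures): results holds the output of every component
    that succeeded or had a fallback; failures records what went wrong.
    """
    fallbacks = fallbacks or {}
    results, failures = {}, {}
    for name, component in components.items():
        try:
            results[name] = component()
        except Exception as exc:
            # The failure is contained and made visible, not swallowed.
            failures[name] = str(exc)
            if name in fallbacks:
                results[name] = fallbacks[name]
    return results, failures


def crew_scheduler():
    # Hypothetical component that cannot cope with the load.
    raise RuntimeError("input volume exceeded capacity")


components = {
    "booking": lambda: "ok",
    "crew_scheduling": crew_scheduler,
    "baggage": lambda: "ok",
}

results, failures = run_compartmentalized(
    components, fallbacks={"crew_scheduling": "manual process"}
)
print(results)
print(failures)
```

Real systems need much more than a try/except (timeouts, process or network isolation, alerting), but the design choice is the same: decide in advance what a contained failure looks like instead of discovering it during the storm.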

The Leadership Paradox

What keeps me up at night is that technology leaders inherit existing systems rather than designing their own. That leaves us with dual responsibilities: keeping existing systems running and making them trustworthy. The problem is organizational and philosophical rather than purely technical. Neumann shows that we need to stop relying on short-term security fixes and instead openly discuss fundamental system vulnerabilities. Protecting our organizations against future failures starts with humility and with being transparent about how our systems actually operate.

The question is not whether our systems will fail; they will. The question is whether we can build systems that can collapse and still let us survive.