AI is Not a Tool, it is Cognitive Infrastructure!

Posted in Artificial Intelligence, Business Analysis, Change Management, Product Development on 12th April 2026

There is a quiet ‘category error’ sitting at the centre of much of the AI conversation right now, and it is shaping decisions that will outlive the people making them.

We keep calling AI a tool.

Tools are picked up and put down. Tools sit in a drawer until needed. A hammer does not change what a wall is, what a house means, or how a neighbourhood feels. The framing is comfortable because it puts us in charge: we choose, we wield, we set down. I would argue, though, that it is the wrong frame. It is wrong in the way that calling electricity “a better candle” was wrong in 1890. The artefact is recognisable; the substrate it creates is not.

What we are actually building, and stitching into the daily fabric of work and life, is starting to behave more like cognitive infrastructure. Something that increasingly mediates how we perceive, decide, coordinate, and remember. You do not “use” infrastructure the way you use a tool. You live inside it. The question of who designs it, who maintains it, and who gets to question it becomes a different kind of question entirely. This distinction is not academic. It changes the strategy.

This article on artificial intelligence as a cognitive infrastructure framework is written by Rohit Mahadevu, AI and Digital Transformation Consultant at Texavi Innovative Solutions. Read on...

Where the framing is breaking

Watch how organisations are deploying AI right now, and a pattern emerges that should worry anyone paying attention. A legal team buys a contract review assistant. A marketing team buys a content generator. A support team buys a triage layer. Each procurement is treated as a discrete software purchase, with a discrete ROI, slotted into existing workflows. The implicit theory is: same business, faster execution. The McKinsey reports get written. The board gets a chart. A productivity figure appears, though it often captures less than leaders think. Beneath the surface, something else can be going on. The contract assistant may quietly reshape what counts as a “reviewed” contract. The content generator may shift the floor of what a brand voice actually is. The triage layer may change what kinds of customer pain ever reach a human. None of these shifts show up on the procurement form. None of them is owned by anyone. They are second-order effects of treating a substrate change as a feature upgrade.

The deeper observation is this: we evaluate AI on tasks, but it increasingly influences judgment by shaping what is surfaced, summarised, and prioritised. A spreadsheet does not influence what you choose to measure. A model can. Once a system starts shaping the option space, it has moved out of the tool category and into something that, more honestly, resembles part of the operating environment.

The Stanford AI Index Report 2024 captures the scale of this shift, noting that “AI’s influence on society has never been more pronounced” while flagging that responsible-AI evaluations remain inconsistent across major developers (Stanford HAI, 2024). The 2025 edition adds further detail: AI-related incidents are rising, standardised responsible-AI assessments remain rare among major industrial developers, and a meaningful gap persists between recognising risk and acting on it (Stanford HAI, 2025). Governance is still trying to find the right room to meet in. That gap looks less like a coordination problem and more like a category problem. It is hard to govern what has not been correctly named.

The substrate shift

It helps, briefly, to look at what serious infrastructure shifts have actually looked like. Electricity did not arrive as a better lamp. It arrived as a redefinition of what a building could contain (refrigeration, elevators, assembly lines) and then of the night itself as productive time. The economic historian Paul David showed that the productivity gains from electrification took shape over a period measured in decades, because factories had to be physically and organisationally rebuilt around the new substrate before the old layouts stopped wasting it (David, 1990). The internet followed its own version of this pattern.

The pattern is familiar. A general-purpose technology does not simply make the old world faster. It eventually makes a different world legible.

What is unusual about AI, and what the tool framing obscures, is that the substrate it touches is not energy or information transmission. It is increasingly entangled with cognition itself. The thing being augmented, redistributed, and partly automated is the act of thinking: noticing, framing, deciding, narrating. Andrew Ng’s older observation that “AI is the new electricity” only lands if we take seriously what electricity actually did, not just that it was useful (Ng, 2017).

If cognition is part of the substrate, then “AI strategy” cannot live entirely inside an IT roadmap. It has to be considered at the level of how an organisation, a profession, or a society thinks.

The Cognitive Infrastructure Model

I want to offer a frame I have been working with, partly to test it, partly because I suspect that the absence of a shared vocabulary in this space may be contributing to misalignment between deployment and design.

I call it the Cognitive Infrastructure Model (CIM). It is a working scaffold, shaped by direct experience across organisational and system-level AI deployments, rather than a claimed industry standard, and I offer it here to open a conversation the tool framing tends to close. It is a stack of five layers, each operating at a different scale, each with its own characteristic failure modes and design responsibilities. Any deployment of AI tends to land somewhere on this stack, whether the deployers have named the layer or not.

[Figure: Cognitive Infrastructure of AI, Rohit Mahadevu, Texavi]

Layer 1: Cognitive Augmentation

This is the layer closest to the individual. It is where AI changes how a single person thinks: their first draft, their first hypothesis, their first read of a situation. The change here is subtle and behavioural. People do not always notice when their starting point has shifted, only that they are getting to “good enough” faster. Over time, this can compound into something more consequential: a slow erosion of the muscle of forming an independent first take.

Recent research begins to make the mechanism more concrete, though the evidence base is still emerging and should be interpreted carefully. A 2025 MIT Media Lab working paper by Kosmyna and colleagues used EEG measurements on participants writing essays and reported lower neural connectivity and engagement among those who relied on a large language model from the outset compared with those writing unaided (Kosmyna et al., 2025); as a non-peer-reviewed preprint with a limited sample (n=54), the findings are suggestive rather than settled. A CHI 2025 paper by researchers at Microsoft Research and Carnegie Mellon, based on a survey of 319 knowledge workers, found that higher self-reported confidence in generative AI was associated with reduced critical thinking effort, while higher confidence in one’s own expertise was associated with more of it (Lee et al., 2025). Pointing in a similar direction, Gerlich (2025) reports a negative correlation between frequent use of AI tools and self-reported critical thinking, mediated by cognitive offloading, though this finding should be interpreted cautiously, as it relies on self-reported measures and is published in a lower-tier open-access journal.

The design question at this layer is therefore worth asking explicitly: not “how do we make people more productive?” but “what cognitive capacities do we want to preserve, and which are we willing to atrophy in exchange for speed?” That question is rarely asked, which is partly why it rarely gets a good answer.

Layer 2: Interaction

This is the layer of interface and conversation. It is where most product design effort currently lives, and where most of the visible novelty sits: chat, voice, agents, copilots. The interesting shift here is not the interface itself. It is the redefinition of what an interface is for. Software used to be a place where you executed an intent you already had. Conversational systems are increasingly a place where people can also discover what their intent actually was. That is a different relationship with a machine, and it changes what good design even means. The old heuristics (clarity, efficiency, minimal clicks) are not wrong, but they are insufficient on their own. Increasingly, we are designing for sense-making, not just task completion.

Layer 3: Operational Intelligence

Move up from the individual and the interface, and you reach the layer where AI starts running parts of an organisation’s nervous system: forecasting, scheduling, routing, anomaly detection, and internal knowledge retrieval. This is the layer most “AI transformation” decks stop at, because it offers the cleanest ROI story. It is also where the category error can do the most damage. When a model becomes the default lens through which a company sees its own operations, its blind spots can become organisational ones. The risk is not bad predictions. It is a slow narrowing of what the organisation is even capable of noticing.

Layer 4: System Orchestration

Above operations sits the layer of coordination across systems: supply chains, healthcare networks, energy grids, and financial markets. Here, AI is not augmenting a single workflow; it is mediating the handoffs between them. This is where agentic systems and inter-organisational AI start to matter. It is also the layer with the least mature governance, because the failure modes are emergent. No single actor owns the cascade. Long-term AI risk research suggests that some of the most consequential AI risks may come not from individual models being wrong, but from coupled systems behaving in ways that produce collectively bad outcomes (Critch and Krueger, 2020).

Layer 5: Societal and Policy

The outermost layer is the one where the substrate becomes civic: information ecosystems, education, labour markets, and the basic question of what counts as authored, authorised, or true. The World Economic Forum’s Future of Jobs Report 2025, based on a survey of more than 1,000 employers, projects that 170 million new roles could be created and 92 million displaced by 2030, with 39% of workers’ core skills expected to change in the same period (World Economic Forum, 2025). The deeper shift at this layer is epistemic. When a meaningful share of public reasoning is mediated by systems whose internals are opaque even to their creators, the social contract around knowledge starts to wobble. Geoffrey Hinton, on leaving Google in 2023, told the New York Times that “it is hard to see how you can prevent the bad actors from using it for bad things” and warned of an internet flooded with content people could no longer reliably distinguish as real (Metz, 2023). Not collapse. Wobble. Many institutions are still responding to this as a tool-use problem rather than a knowledge-system problem.

The point of the CIM is not that each layer needs its own task force. It is that decisions at one layer often have consequences at another, and a great deal of current practice involves making layer-1 and layer-3 decisions while assuming layer-5 is somebody else’s problem.

What it looks like on the ground

A few composite pictures, drawn from the kinds of deployments I have watched up close. The details are stylised; the patterns are recognisable to most people working in the field.

In one governance framework I worked on, the initial focus was predictable: acceptable use, risk classification, and approval flows. The structure was sound, and it mapped cleanly onto how organisations typically think about new technology. What became more interesting over time was not what the policy prohibited, but what it quietly enabled. Teams began relying on AI-generated summaries as their starting point for decision-making. The policy was written to regulate outputs. The system was already reshaping inputs.

The shift was subtle. Nothing in the policy explicitly instructed teams to defer to the model. But the presence of a fast, coherent first draft changed the cognitive baseline. Review replaced generation. Verification replaced formation. The organisation had not formally changed its thinking. In practice, it already had.

A different example. A design studio starts using generative tools across early-stage concepting. The work gets faster. Over time, the work also starts to converge, not toward bad work, but toward a kind of average sophistication. The studio’s distinctive voice, built over a decade through friction and disagreement, begins to be smoothed by a substrate that, by its training, tends toward the centre of the distribution. This is consistent with the experimental finding by Doshi and Hauser (2024) in Science Advances that generative AI raised the creativity of individual short stories on average but reduced the collective diversity of ideas across groups. The studio’s response is not to ban the tools. It is to redesign the creative process so that the divergent phase happens before the AI is introduced, not after. A small change in sequencing, a meaningful change in output.

A third. A community health team in a low-resource setting uses an AI triage layer to extend the reach of a small clinical staff. The narrow metrics improve: more patients seen, faster routing. Over time, though, the team notices that rare, ambiguous cases are where things start to slip. The model is confident in the common; the humans had been the ones holding the long tail. This pattern is recognised in the literature on clinical AI evaluation. The systematic review by Liu et al. (2019) in The Lancet Digital Health found that reported performance comparisons between deep learning systems and clinicians often have limitations related to study design, benchmark conditions, and generalisability. The independent validation of the widely deployed Epic Sepsis Model by Wong et al. (2021) in JAMA Internal Medicine reported substantially worse real-world performance than the vendor’s published metrics, including missed cases. The team’s response is to deliberately route a fraction of “high-confidence” cases back to human review, treating it as a kind of cognitive immune system. Productivity drops on paper. Clinical safety improves.
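The routing rule in that last example can be sketched in a few lines. This is a minimal illustration, not the team's actual system: the function name `route_case`, the `threshold` of 0.9, and the `audit_rate` of 0.1 are all assumptions chosen for the example.

```python
import random


def route_case(confidence, threshold=0.9, audit_rate=0.1, rng=random):
    """Decide who handles one triage case: 'human' or 'model'.

    confidence: the model's self-reported confidence, between 0 and 1.
    threshold:  hypothetical cut-off below which a human always reviews.
    audit_rate: hypothetical share of high-confidence cases that are
                sent to a human anyway.
    """
    if confidence < threshold:
        # Low-confidence cases are always escalated.
        return "human"
    if rng.random() < audit_rate:
        # A random sample of high-confidence cases is also escalated,
        # so humans keep seeing the cases the model feels sure about.
        return "human"
    return "model"


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so the demo is repeatable
    for confidence in [0.55, 0.92, 0.97, 0.99, 0.80, 0.95]:
        print(confidence, route_case(confidence, rng=rng))
```

The sampling is random rather than rule-based on purpose: if the audited cases were picked by the same signals the model uses, the humans would inherit the model's blind spots instead of checking them.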

The thread across all three is the same. The shift is not from manual to automated. It is from a system in which judgment was distributed across humans to one in which judgment is partly held by a substrate. The teams that did well were the ones that noticed early and designed for it.

Where this breaks

If the tool framing persists, a few specific failure modes are likely to become more common.

The first is monoculture. When most organisations build on a small number of foundation models, the diversity of organisational thinking can begin to track the diversity of the underlying training distributions. Kleinberg and Raghavan (2021) show formally, in PNAS, that shared algorithmic decision-making can reduce collective accuracy even when it improves individual accuracy. Bommasani et al. (2022) extend the analysis to foundation models and find that data-sharing reliably exacerbates outcome homogenisation, while the effects of foundation model-sharing are more mixed. Deloitte’s 2024 State of Generative AI in the Enterprise survey series documents widespread enterprise experimentation with generative AI (Deloitte, 2024). The survey was not designed to test algorithmic monoculture directly, but the adoption pattern it describes is consistent with the conditions under which monoculture effects can emerge. A market in which most players operate on similar substrates may simultaneously become more efficient and more fragile.

The second is invisible governance debt. Every untracked deployment at layers 1–3 is a small, unrecorded change to how an organisation thinks. Audit and compliance functions, designed for an era of discrete software, do not always see it. The Stanford 2025 AI Index documents a sharp rise in AI-related incidents alongside continued underuse of standardised responsible-AI evaluations among major industrial developers (Stanford HAI, 2025). Debt of this kind tends to accumulate quietly until it surfaces as a regulatory event, a reputational event, or a decision-quality event that is difficult to trace back to a single cause.

The third, and the one that worries me most, is the decoupling of confidence from competence. Cognitive infrastructure can make it easier to produce work that looks well-formed without the underlying understanding that was once required. Lee et al. (2025) describe knowledge workers shifting from generating their own analyses to verifying AI output, with self-reported critical thinking effort declining as confidence in the AI increased. At scale, this can lead to organisations where the appearance of analysis is abundant, and the underlying capacity for analysis is thinner than it looks. The problem is rarely visible until it is needed.

None of these is a reason to slow down. They are reasons to design more deliberately than the current pace encourages.

The institutional lag

Companies are at least feeling the pressure to figure this out. The institutional layer (regulators, universities, professional bodies) is moving on a different clock entirely.

Education is the clearest case. The curriculum debate around AI is mostly happening at the level of “should students be allowed to use it?” That is a layer-1 question. The layer-5 question, what cognitive capacities a society needs to preserve when a non-trivial fraction of its reasoning is being substrate-mediated, is barely being asked at scale. Programmes such as MIT’s Responsible AI for Social Empowerment and Education (RAISE) initiative treat AI literacy as a civic competency rather than a purely technical skill, but they remain the exception (MIT RAISE, 2024). In England, the Department for Education’s 2024 policy paper on generative AI in education is mostly framed around safe use, data protection, and assessment integrity, rather than longer-run questions about cognitive development (DfE, 2024).

Policy has the opposite shape of failure. It is moving, sometimes quickly, but at the wrong layer. The EU AI Act classifies systems into four risk categories, from prohibited to minimal, and concentrates obligations at the high-risk end based on intended purpose and use case (European Parliament and Council, 2024). This can be read, at least partly, as a tool-oriented regulatory framing: AI is treated as something deployed in defined contexts, rather than as a substrate that reshapes the systems it enters. Useful as a floor. Insufficient on its own as a ceiling.

Professional bodies in medicine, law, accounting, and design are arguably the quietly important actors here because they set the standards that define competent practice in their fields. Most are still working out their position. Some, in medicine, are moving faster. The General Medical Council’s 2024 guidance applies existing professional standards to AI-enabled tools, emphasising that clinicians must use such systems safely, remain responsible for their own professional judgement, and raise concerns and report adverse incidents where patient safety may be affected (GMC, 2024). That is not yet a settled framework for AI-mediated clinical reasoning, but it begins to shift the question from “is the model accurate?” toward “what does it mean to be a good clinician in a system where the model is part of the cognition?”

The mismatch between the speed at which the substrate changes and the speed at which institutions change is itself a design problem. It will not be solved by any single actor. It does, though, get harder to address the longer the underlying category is misread.

The shift in thinking

The contrast, stripped down, looks like this. The old frame asks: how do we use AI to do what we already do, faster? The new frame asks: what becomes possible, and what becomes fragile, when cognition is partly held by a substrate we share?

The old frame measures productivity, cost, and time saved. The new frame also measures judgment quality, cognitive diversity, second-order effects, and what gets quietly displaced from the loop.

The old frame deploys AI. The new frame designs with it, around it, and for the human capacities that need to remain load-bearing on the other side.

Neither frame is romantic. Both involve hard decisions. The second one, in my view, is more honest about what is actually happening.

A note on where this comes from

I have spent enough time at the seams between strategy, design, AI systems, and the messier human side of how organisations actually behave to be suspicious of clean stories. The Cognitive Infrastructure Model is offered here as a working scaffold rather than a final answer. I think it is useful because it forces a conversation that the tool framing tends to prevent. The teams I have seen handle this best are the ones that stopped treating AI as a procurement question and started treating it as a question about what kind of thinking they want their organisation, and the people in it, to be capable of in five years.

That is a strategy question. It is also a design question. And, increasingly, it is a civic one.

Closing

The next decade of AI will probably not be decided by who has the largest model or the slickest interface. It is more likely to be decided by who correctly names what is happening and designs for it at the layer where it is actually happening. The IMF’s January 2024 staff discussion note on generative AI and the future of work makes a related point in the language of policy: the gains can be substantial, but the distributional and institutional design choices around AI will matter at least as much for outcomes as the raw capability curve (Cazzaniga et al., 2024). The tool framing is comfortable, and on present evidence, it is going to cost us. Cognitive infrastructure is a less comfortable frame, but it has the advantage of being closer to what we are actually building.

In many contexts, AI is already beginning to function as infrastructure, not because anyone formally decided it should, but because the deployment patterns of the last few years have moved it there. We do still get to choose whether we design it deliberately or inherit it by default. The difference between those two paths is, I suspect, one of the more consequential choices this generation of leaders, designers, and institutions will make.

It is worth making it on purpose.

References

Bommasani, R., Creel, K. A., Kumar, A., Jurafsky, D., and Liang, P. (2022). Picking on the Same Person: Does Algorithmic Monoculture Lead to Outcome Homogenization? Advances in Neural Information Processing Systems 35 (NeurIPS 2022).

Cazzaniga, M., Jaumotte, F., Li, L., Melina, G., Panton, A. J., Pizzinelli, C., Rockall, E., and Tavares, M. M. (2024). Gen-AI: Artificial Intelligence and the Future of Work. IMF Staff Discussion Note SDN/2024/001. Washington, DC: International Monetary Fund.

Critch, A. and Krueger, D. (2020). AI Research Considerations for Human Existential Safety (ARCHES). arXiv preprint, arXiv:2006.04948.

David, P. A. (1990). The Dynamo and the Computer: An Historical Perspective on the Modern Productivity Paradox. American Economic Review, 80(2), Papers and Proceedings, 355–361.

Deloitte AI Institute (2024). The State of Generative AI in the Enterprise: Quarterly Survey Series, Q1–Q4 2024. Deloitte Touche Tohmatsu Limited.

Department for Education (2024). Generative Artificial Intelligence (AI) in Education: Departmental Policy Paper. London: GOV.UK.

Doshi, A. R. and Hauser, O. P. (2024). Generative Artificial Intelligence Enhances Individual Creativity but Reduces the Collective Diversity of Novel Content. Science Advances, 10(28), eadn5290.

European Parliament and Council of the European Union (2024). Regulation (EU) 2024/1689 of 13 June 2024 Laying Down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act). Official Journal of the European Union.

General Medical Council (2024). Good Medical Practice 2024, together with the GMC guidance on Artificial Intelligence and Innovative Technologies. London: General Medical Council.

Gerlich, M. (2025). AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking. Societies, 15(1), 6.

Kleinberg, J. and Raghavan, M. (2021). Algorithmic Monoculture and Social Welfare. Proceedings of the National Academy of Sciences, 118(22), e2018340118.

Kosmyna, N. et al. (2025). Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Tasks. MIT Media Lab working paper (preprint, not peer-reviewed at time of writing).

Lee, H.-P. (Hank), Sarkar, A., Tankelevitch, L., Drosos, I., Rintel, S., Banks, R., and Wilson, N. (2025). The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects from a Survey of Knowledge Workers. Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery.

Liu, X., Faes, L., Kale, A. U. et al. (2019). A Comparison of Deep Learning Performance against Health-Care Professionals in Detecting Diseases from Medical Imaging: A Systematic Review and Meta-Analysis. The Lancet Digital Health, 1(6), e271–e297.

Metz, C. (2023). ‘The Godfather of A.I.’ Leaves Google and Warns of Danger Ahead. The New York Times, 1 May 2023.

MIT RAISE (2024). Responsible AI for Social Empowerment and Education Initiative. Cambridge, MA: Massachusetts Institute of Technology.

Ng, A. (2017). ‘AI Is the New Electricity.’ Talks at GSB, Stanford Graduate School of Business, 2 February 2017.

Stanford Institute for Human-Centered Artificial Intelligence (2024). Artificial Intelligence Index Report 2024. Stanford, CA: Stanford University.

Stanford Institute for Human-Centered Artificial Intelligence (2025). Artificial Intelligence Index Report 2025. Stanford, CA: Stanford University.

Wong, A., Otles, E., Donnelly, J. P. et al. (2021). External Validation of a Widely Implemented Proprietary Sepsis Prediction Model in Hospitalized Patients. JAMA Internal Medicine, 181(8), 1065–1070.

World Economic Forum (2025). The Future of Jobs Report 2025. WEF Insight Report, January 2025. Geneva: World Economic Forum.