This paper argues that the discipline of architecture currently lacks a robust, critical methodology for engaging with generative AI, predominantly treating the technology as a superficial "vending machine" where textual inputs yield discrete, decoupled visual outputs. To move beyond this paradigm in which architects act as "epistemic coroners" evaluating inert, post-mortem imagery, the author proposes a shift toward "ontological navigation," urging designers to engage with AI as a dynamic, emergent process.
Keywords: Generative AI, Process Philosophy, Generative AI in Architecture, Design Methodology, Ontology of Technology, Computational Emergence
Generative AI is here. Not as a future prospect or a speculative tool, but as a present condition that is reshaping our world. The architectural engagement with these systems has already begun, and it will only deepen. At present, however, architects lack a robust, critical methodology for AI. We prompt and receive, treating generative AI as a vending machine: inputs go in and outputs come out. These outputs are then interpreted as discrete objects, decoupled from the system that produced them.
This framing misses something: for the first time, architects have access to a technical system that does not merely represent emergence, but operates as it. There is a philosophical tradition concerned with this topic of emergence, with processes, hybridized logics, discovered forms, and dynamic systems. This school of philosophy is generally known as Process Philosophy. Architecture has had a long-standing crush on the concepts this philosophy offers. Hadid, Eisenman, Oxman, and Menges are among the architects who built architectures that looked like these processes. Deleuze, Leibniz, De Landa, and Whitehead, among others, have become conceptual wellsprings for architectural discourse and practice because they formalize these processes in a way architects can understand. AI does not look like these processes; it is these processes. It is a working technical system whose operational logic structurally mirrors process philosophy.
I trace a lineage of these ideas and overlay them onto the structural mechanics of AI systems. The result shows what AI actually is: a field to navigate, a flow to enter, a process to design. This paper does not offer a methodology. Instead, following Venturi's example in Complexity and Contradiction, it notices something, names it, draws parallels, and then poses provocations. This is a soft manifesto that prompts architecture to reconsider AI.
This paper does not claim that AI is conscious, nor that AI = ontology. It recognizes that AI exists within a larger framework of computation. My claim is that architects are not thinking deeply enough about how they use AI. To support this claim I reframe AI as a process condition instead of a tool. This will be accomplished in five parts. One, an outline of the current way architecture engages with AI. Two, an in-depth explanation of how AI works on a mechanical level. Three, an illustration of a series of mirrorings between AI systems and process philosophy. Four, cultural examples offered as heuristic devices that make these mirrorings tangible. And five, provocations that prompt architectural discourse to reframe the way it engages with AI systems.
Before proceeding, we must establish the current conditions of AI in architectural practice and discourse. The most visible manifestation is the proliferation of commercial AI education. Social media feeds are flooded with "AI masterclasses" promising magic prompts that will "unlock AI's full potential." These commodified tutorials represent the shallowest engagement with generative AI systems. This is the AI-as-vending-machine paradigm, where the right textual input produces the right visual output. This logic is embedded in most consumer-facing AI interfaces. Adobe's implementation, along with platforms like Midjourney and DALL-E, collapses the entire generative system into a text prompt box. There is limited access to the underlying system. Architects, then, have no choice but to endlessly prompt and select from vast quantities of outputs until they find something approximating their initial vision. Language models operate similarly, with control limited almost entirely to prompt engineering. Recent developments in custom agents and specialized chatbots allow users to define system prompts for specific tasks. This is a modest expansion of control, but it remains fundamentally within the vending machine paradigm.
A different mode of engagement emerges with open-source platforms like Automatic1111 and ComfyUI. These interfaces allow access to the inner workings of AI systems. In these environments architects can swap encoders, modify sampling processes, load different models, select different weights, and train custom parameters that directly alter how the model generates. Architects can combine pre-trained models, train their own models, fine-tune them on specific datasets, and build custom processing nodes that reshape the probability distributions the system navigates. This is the AI-as-navigable-field paradigm, where design intelligence resides in constructing the conditions of generation rather than evaluating its outputs.
Current architectural practice engages AI along this gradient, though heavily weighted toward the vending machine end. Many prominent offices use AI primarily for rapid visualization and iteration, generating 16,000 images and choosing one. This approach treats AI as a tool for aesthetic exploration, leaving the generative process itself unexamined. A commercial ecosystem has emerged around this type of engagement, with practitioners pivoting from design to AI education and startups developing AI-assisted tools for cost estimation, code compliance, and masterplanning. These applications are useful but they treat AI as a problem-solver for discrete tasks rather than as a system whose operational logic might fundamentally reshape design thinking itself.
Matias del Campo's work represents the most rigorous theoretical engagement with generative AI in architectural discourse to date1. His concept of latent space as territory and his insistence on architectural agency in relation to machine outputs have established essential foundations for thinking beyond naive tool-use paradigms. His framework, grounded in Wittgenstein and semiotics, approaches AI fundamentally as a linguistic and aesthetic problem. For del Campo AI is a system accessed through prompts and understood through the normative interpretations of its outputs. He theorizes what latent space contains and what architects can discover there. This lens allows del Campo to offer critical insights about how architects understand, engage with, and refine their inputs to receive better outputs.
The epistemological, linguistic framing of del Campo's work, however, leaves the ontological nature of generative AI processes largely unaddressed. I approach AI systems from this level. Where del Campo theorizes how architects understand and engage with outputs from the interface layer, I explore how architects engage with the system itself from the operational layer. Where his framework asks what AI outputs mean for architecture, I ask how AI generation operates as an ontological act of creation, and what understanding that process reveals about how we design.
These are not opposing projects but complementary ones. Matias del Campo has theorized the architect as curator selecting from the outputs of a system. I propose the architect as navigator engaging with the system itself. The epistemology of del Campo needs an ontology to support it; I am showing that ontology.
This theoretical gap is hard to ignore. The field has sophisticated output-focused theorization on one end and widespread but largely unreflective tool use on the other, with almost no critical insights into the operational mechanics of generation and what understanding those mechanics might demand of architects who engage with them. This is precisely where a linguistic approach cannot go, because as we will see, the structural mechanics of AI systems are inherently non-linguistic. Latent space can be described with language, and its outputs can be understood linguistically according to human normative justifications, but latent space is not made of language.
Neil Leach's work represents a third position in this landscape. Neither a naive tool-user nor aesthetic theorist, Leach approaches AI through the lens of professional survival, asking how it displaces architects, and raising concerns about who controls it. His frame is economic, sociological, and political: the collapse of the large firm, the democratization of design intelligence, the question of who owns the training data that shapes the latent space. As stated above, I am focused on the ontology of AI systems. I do not theorize at the same level as Leach. However, his insistence that AI represents an "alien intelligence" — a system that does not think the way architects think, and that cannot be domesticated into existing cognitive frameworks — is a point of genuine convergence. He arrives at this conclusion through professional practice. I arrive at it through operational mechanics.
1 Key works by Matias del Campo relevant to the claims made here include: Matias del Campo and Sandra Manninger, "The Generation of Architecture Through Artificial Intelligence: A Method for Semiotic Architecture," Proceedings of the 39th eCAADe Conference (2021); Matias del Campo, ed., Architectural Intelligence: Selected Papers from the 1st International Workshop on Intelligent Architecture (Singapore: Springer, 2022). Del Campo's framework grounds architectural AI engagement in Wittgenstein's language games and Peircean semiotics, treating the prompt as a semiotic act and the output as a sign to be interpreted normatively.
Understanding generative AI at the level of its operation is the next task; without this, no philosophical structural similarities can be illustrated. This section does that by defining five terms and processes that adequately outline how these systems function. First, how training occurs and what it produces. Second, what latent space is, carefully defined in three distinct senses that are, at the moment, inappropriately conflated in architectural AI theory. Third, what generation is and how it works across the two current paradigms: diffusion and flow matching. Fourth, an explanation of how these systems produce novelty. Fifth, what a prompt is, and how it works.
Training: A diffusion model is trained by being shown vast quantities of structured data in the form of images, text, and 3D geometry. The model trains by corrupting this data: it adds noise in steps until all original, recognizable structure from the input data has been completely dissolved into statistical randomness. This is called the forward process2. The model's task during training is to learn to reverse this corruption, that is, to learn the statistical patterns that connect structured data to unstructured data. The model does not memorize how to de-corrupt specific data. Instead, it learns the general patterns that distinguish coherent, structured data from incoherent data. By doing this millions of times, the model slowly learns which configurations of data occur in structured situations and which do not; crucially, it also learns how probable any given configuration is relative to any other. Put simply, it learns coherence as a statistical condition. The result of training is a learned map of what is probable, based on the patterns within the training data. Training is the production of probability mappings.
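The forward process described above can be sketched numerically. The following toy, written with NumPy, assumes a simple linear beta schedule; the schedule values, shapes, and function name are illustrative, not drawn from any specific model. Structure in a small array survives an early noising step and is dissolved by the final one.

```python
import numpy as np

def forward_noise(x0, t, T=1000, seed=0):
    """Corrupt structured data x0 to step t of a T-step forward process,
    using the closed form x_t = sqrt(a_bar_t)*x0 + sqrt(1 - a_bar_t)*noise
    with an illustrative linear beta schedule."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)     # per-step noise amounts
    alpha_bar = np.cumprod(1.0 - betas)    # cumulative signal retention
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

x0 = np.linspace(-1.0, 1.0, 64).reshape(8, 8)  # stand-in for structured data
early = forward_noise(x0, t=10)    # early step: structure largely intact
late = forward_noise(x0, t=999)    # final step: statistical randomness

print(np.corrcoef(early.ravel(), x0.ravel())[0, 1])  # high correlation
print(np.corrcoef(late.ravel(), x0.ravel())[0, 1])   # correlation collapses
```

The asymmetry note 2 describes is visible here: the corruption is a fixed formula, while learning to undo it is the model's entire training objective.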
Latent Space: latent space is an elusive concept because there are different kinds of it. In current architectural theory latent space is given one oversimplified definition: it is thought of as a space, a terrain, or a typology. All of these are misleading. I will carefully lay out three definitions here. First, latent space is a general term used within AI systems to describe a high-dimensional space onto which data has been encoded. It is a cryptic representation of data, not actual data. Specifically, it is a high-dimensional representation of the statistical probability relationships of the data encoded into it. In real life, grammar is a latent space of language: it contains relationships between letters and words that form patterns. Following these patterns produces structured language; breaking them produces gibberish. The latent space of grammar conditions human speech in the same way that the latent space of an AI model conditions what that model produces.
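The grammar analogy can be made concrete with a toy sketch: character-bigram statistics learned from a tiny invented corpus form a miniature "latent space" that stores no sentences, only relationship probabilities, yet sampling from it produces word-like structure. The corpus and all names here are invented for illustration.

```python
import random
from collections import Counter, defaultdict

# A toy 'latent space': character-bigram statistics learned from a tiny
# invented corpus. The table stores no sentences, only the probability
# relationships between adjacent characters.
corpus = "the house and the wall and the brick house and the tall wall "
nxt = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    nxt[a][b] += 1

def sample(n=40, seed=3):
    """Follow the learned relationship patterns to produce new structure."""
    rng = random.Random(seed)
    out = "t"
    for _ in range(n):
        counts = nxt[out[-1]]
        out += rng.choices(list(counts), weights=list(counts.values()))[0]
    return out

print(sample())  # word-like strings emerge from pure statistics
```

Sampling uniformly at random from the alphabet instead would produce gibberish; the difference between the two is exactly the conditioning the paragraph above describes.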
Second, a Variational Autoencoder (VAE) produces a specific type of latent space utilized in generative AI models. VAEs compress encoded data. They were developed to allow models to produce outputs at higher resolutions, which is otherwise hard to achieve due to computational limitations. The compressed nature of a VAE latent space directly affects what the model can generate. If the entire grammatical latent space of human language were encoded onto a postage stamp in the form of symbols, it would resemble a VAE: the structural patterns of the symbols on the stamp must be decoded into the expanded grammatical space before they can be applied to human language. Likewise, the data in a VAE space must be decoded and expanded before it can produce an image.3
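The scale of this compression can be made tangible with the figures from note 3 (the Stable Diffusion family): a 512x512 RGB image is encoded to a 64x64x4 latent, roughly 48 times fewer numbers.

```python
import numpy as np

# Figures from the Stable Diffusion family (see note 3): generation happens
# not in pixel space but in a VAE-compressed latent space.
image_shape = (512, 512, 3)   # a 512x512 RGB image in pixel space
latent_shape = (64, 64, 4)    # its VAE-encoded latent representation

pixels = int(np.prod(image_shape))    # 786,432 values
latents = int(np.prod(latent_shape))  # 16,384 values
print(pixels / latents)               # the ~48x compression of note 3
```

Every navigational step discussed below happens among those 16,384 values, which is why the compression is a fundamental shaping of what the model can produce rather than a convenience.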
Third, Contrastive Language-Image Pre-training (CLIP) space is also a latent space. This is where images and their corresponding descriptive texts are encoded together. In CLIP space the structural relationships of the data are aligned semantically4. The actual space is not made of language; it is a high-dimensional representation of linguistic relationships. Distance in this space is measured by semantic similarity between the represented data. Users access CLIP space through the prompt window, where textual input is encoded into it and used to influence the model as it navigates the VAE latent space5.
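This semantic geometry can be sketched with hand-made three-dimensional vectors standing in for CLIP's learned, high-dimensional embeddings; the vectors are invented solely for illustration. Aligned concepts sit close under cosine similarity, unrelated ones sit far apart.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: the distance measure of CLIP-style spaces."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Invented stand-ins for CLIP embeddings; real CLIP vectors are learned
# and high-dimensional. These exist only to illustrate the geometry.
text_brick_house = np.array([0.9, 0.1, 0.2])
image_brick_house = np.array([0.8, 0.2, 0.1])  # same concept: aligned
image_glass_tower = np.array([0.1, 0.9, 0.3])  # different concept: distant

print(cosine(text_brick_house, image_brick_house))  # close to 1
print(cosine(text_brick_house, image_glass_tower))  # much lower
```

The contrastive training described in note 4 is precisely what pulls the first pair together and pushes the second apart.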
Latent space, then, is a cryptic realm of statistical relationship patterns that emerge from a data set; different types of latent space are used in AI systems toward different ends. Latent space is not physical or semantic, and it is not a set terrain learned in training. A model does not traverse it like a terrain. The latent space is an emergent property of the repeated approximations made by the model as it calculates probability densities. It does not persist as a space; only the potential of its existence persists. The criteria to generate it are encoded into the model's weights (the probability mappings learned during training) and stored in what are called tensors.6 During generation, latent space materializes and evaporates repeatedly; it is a phantom byproduct of the navigation process.
It is clear that the term latent space must be treated with a higher degree of precision than it currently enjoys in theoretical architectural discourse. It can be better described as an ontology of the digital, or a hauntology of data. It is real but it does not exist; models learn to summon it. It has different uses and manifestations for different purposes, and each layer of the definition pertains to different theoretical claims, as we will see.
Generation: The generative process, then, can be defined as activating the learned map of probabilities at a statistically random point within VAE latent space, and then allowing the model to navigate that space according to the learned probability gradients that structure it. There are currently two main generation paradigms that accomplish this: diffusion and flow matching. We can describe these as different methods for navigating the same VAE latent space.
The diffusion navigation method uses what is called a score function to navigate7. A score function is a learned approximation of the direction in which probability increases most steeply at any given point in the latent space. It looks at the level of structure currently present in latent space and points in a direction that contains more structure. Diffusion models repeat this process for every sample in the latent space across a number of steps, moving samples from noisy regions toward regions with less noise. Flow matching models navigate in a different way: they use what is called a vector field, a mathematical expression that assigns both a direction and an amplitude to every point in the latent space8. The model follows this vector field using time as a path parameter. In both flow matching and diffusion we see a coarse-to-fine progression, with large blocks and shapes coming first, followed by finer details; this is a direct result of the shape of the probability map of the latent space.
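The score-following mechanic can be sketched in one dimension: an analytic score function for a two-mode probability landscape stands in for the learned score, and a sample dropped at random follows the gradient uphill, step by local step, with no view of the whole map. Everything here (the modes, step size, and seed) is invented for illustration.

```python
import numpy as np

def score(x, means=(-2.0, 3.0), sigma=0.5):
    """Analytic score (gradient of the log-density) of a two-mode 1-D
    Gaussian mixture, standing in for a learned score function."""
    w = np.array([np.exp(-(x - m) ** 2 / (2 * sigma ** 2)) for m in means])
    dw = np.array([(m - x) / sigma ** 2 * wi for m, wi in zip(means, w)])
    return float(dw.sum() / w.sum())

rng = np.random.default_rng(7)
x = rng.uniform(-5.0, 5.0)   # start from statistical randomness
for _ in range(200):         # repeated local steps, never a global view
    x += 0.05 * score(x)     # move in the direction of increasing probability

print(round(x, 2))           # the sample settles near one mode, -2 or 3
```

Note that nothing in the loop encodes a destination; the sample simply climbs the local gradient until navigation is arrested, which anticipates the non-teleological point made below.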
Four things are important to clarify about generation. First, both diffusion and flow matching navigate the same latent space, but in different ways. Second, generation is not teleological: at no point does the model have a meta view of the latent space, nor does it understand an end goal to accomplish. Third, user inputs coming from CLIP are directional pressures, not targets. Fourth, an output is simply a human-imposed, arbitrary arrest of the navigation process. Models do not know what an output is; they only know endless local navigation.
Novelty: The generation process does not recombine the data it trained on. The generation process results in the actual production of novelty. This novelty operates within the bounds of the learned distribution — the system cannot produce what its training data made statistically inconceivable — but within those bounds, every trajectory is genuinely unrepeatable. Again the grammar analogy is helpful for understanding how this works. Actual novel language is created from within grammar space.
In AI, this claim about the production of novelty is substantiated by a concept called symmetry breaking9. In the symmetric state, no single pattern from the training data dominates the geometry of the latent space. At a point determined by the specific noise sample used to start that generation, the model undergoes a bifurcation. It excludes many possible paths in favor of one path. The bifurcation, the point where symmetry is broken, commits the model to traverse a specific "region" of latent space. Different starting noise samples produce different bifurcations, which in turn produce genuinely different traversals. Many generations traverse the same latent space, each one pursuing a singular trajectory. This symmetry breaking phenomenon establishes that the generative process is a non-reversible computation. The model must make a decision that cannot be unmade. The image itself can be traced back and reproduced, but the creation computation is a non-reversible decision.
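Symmetry breaking can be illustrated with a toy deterministic system: the same update rule (a double-well drift standing in for the model) is fed different starting noise, and each trajectory commits to one of two basins that the other seeds may never visit. The rule and parameters are invented for illustration.

```python
import numpy as np

def generate(seed, steps=300):
    """Toy 'generation': one deterministic update rule (the model), many
    possible starting noises (the perturbation). The drift x + 0.1*(x - x**3)
    has stable points at -1 and +1; each run commits to one of them."""
    rng = np.random.default_rng(seed)
    x = float(np.clip(rng.standard_normal(), -2.0, 2.0))  # starting noise
    for _ in range(steps):
        x += 0.1 * (x - x ** 3)  # purely local drift; no view of an end goal
    return x

# Same system, many seeds: the bifurcation lands in different basins.
basins = {round(generate(s)) for s in range(32)}
print(sorted(basins))  # both stable points are reached across the seeds
```

Reproducibility and non-reversibility coexist here exactly as described above: rerunning a seed reproduces its trajectory, but within any single run the commitment to one basin forecloses the other.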
Prompt: Generation takes place in a latent space that is non-linguistic and non-semantic, yet natural language text prompts are used to influence the generation process. This seemingly unintuitive connection is possible because of CLIP space. When the user types a prompt, it is embedded in CLIP space and converted into a vector. This vector results from the probability map that formed when the CLIP space was produced via training. A model can generate images without a text input because, as we have seen, the VAE latent space is already loaded with directional potential. The mechanism that joins these two spaces is called cross-attention; this is the precise moment when language, encoded into a vector, becomes a directional pressure on the VAE latent space. Via cross-attention, CLIP is projected into the generation process. The strength of this influence is adjusted via a parameter called Classifier-Free Guidance (CFG): the higher the value, the more the prompt constrains the navigation; the lower the value, the more freely the system follows its own intrinsic geometry10. Grammar is a latent space that can be used to produce human language; CLIP can be imagined as a parameter that influences grammar, an instruction to "use iambic pentameter" or "use free verse" that determines which areas of language grammar explores in the creation process.
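The CFG adjustment itself is a simple linear extrapolation between the model's unconditional prediction and its prompt-conditioned one. A minimal sketch, with invented stand-in vectors in place of real denoising outputs:

```python
import numpy as np

def cfg_combine(pred_uncond, pred_cond, guidance_scale):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward (and past) the prompt-conditioned one. A scale of 0
    ignores the prompt; 1 follows the conditioned prediction; higher
    values amplify the prompt's directional pressure."""
    return pred_uncond + guidance_scale * (pred_cond - pred_uncond)

# Invented stand-in vectors; real predictions are full denoising outputs.
uncond = np.array([0.2, 0.0])  # the field's intrinsic direction
cond = np.array([0.2, 1.0])    # the direction under the prompt's pressure

print(cfg_combine(uncond, cond, 0.0))   # prompt ignored
print(cfg_combine(uncond, cond, 1.0))   # conditioned prediction followed
print(cfg_combine(uncond, cond, 7.5))   # prompt strongly amplified
```

The formula makes the paper's language literal: the prompt is a directional pressure whose magnitude the designer tunes, not a target the system aims at.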
Conclusion: This account of how AI systems work is not exhaustive, but it provides us with the vocabulary needed to start treating AI as a field to navigate instead of a vending machine. These systems do not contain outputs waiting to be retrieved. They contain a structured field of probability. Multiple latent spaces built from multiple trainings, each charged with the statistical relationships stored in vast quantities of data, interacting with each other. Generation is a navigation of this field in a constrained but genuinely open way. Form emerges through a sequence of local decisions under probabilistic constraint, initiated by noise, directed by learned statistical probabilities, committed to at a bifurcation point that no instruction determined, and biased by but not controlled by the architect's prompt. The output is an arbitrary freezing of a process that has no conception of what an output is.
This understanding of generative AI systems presents architects with a familiar vocabulary of fields, geometries, gradients, bifurcation points, vectors, flows, charges, mappings, and pressures to engage with. It is this engagement that the remainder of this paper explores.
2 The mathematical formalization of the forward process as a fixed Markov chain that gradually adds Gaussian noise to data is established in Jonathan Ho, Ajay Jain, and Pieter Abbeel, "Denoising Diffusion Probabilistic Models," Advances in Neural Information Processing Systems 33 (2020): 6840–6851. The reverse process — learning to denoise — is the model's training objective. The forward process destroys structure deterministically; the reverse process reconstructs it probabilistically. These are asymmetric operations, which is why the creation computation described later in this paper is non-reversible even though the output can be reproduced.
3 The use of a Variational Autoencoder to compress the diffusion process into a lower-dimensional latent space — enabling high-resolution image generation at manageable computational cost — was introduced in Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer, "High-Resolution Image Synthesis with Latent Diffusion Models," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2022): 10684–10695. This paper introduced Stable Diffusion's foundational architecture. The VAE encodes images into a latent space approximately 48 times smaller than pixel space, which is why the compression described in the main text is architecturally significant — it is not merely a technical convenience but a fundamental shaping of what the model can produce.
4 CLIP is trained on approximately 400 million image-text pairs using a contrastive objective: image and text representations of the same concept are pulled together in the shared embedding space while representations of different concepts are pushed apart.
5 CLIP was introduced by Alec Radford et al., "Learning Transferable Visual Models From Natural Language Supervision," Proceedings of the International Conference on Machine Learning (2021): 8748–8763.
6 A Tensor is a container for data that supports linear algebraic operations across specific axes; in a deep learning context, tensors represent the weights and activations that the model manipulates to map inputs to outputs.
The AI mechanics outlined above are structurally equivalent to concepts found in process philosophy, a tradition that offers the notion of becoming as an alternative to the traditional Western notion of being. It originates in the pre-Socratic ideas of flux and the apeiron described by Heraclitus and Anaximander, and has developed over the past 2,500 years through Lucretius, Baruch Spinoza, William James, Henri Bergson, and Alfred North Whitehead, to Gilbert Simondon, Michel Serres, Gilles Deleuze, and finally Manuel De Landa11. In this lineage, architecture finds already familiar ground from which to approach AI. I will focus on Whitehead, Simondon, Serres, and De Landa. I define structural mirroring as the identification of the same operational logic appearing independently in two entirely different systems.
I will proceed by outlining the mirrorings across three categories: the field, the trigger, and the navigation. I will maintain these categories as clearly as the material allows, though the boundaries will occasionally blur.
The Field: In his book Individuation in Light of Notions of Form and Information, Gilbert Simondon inverts the traditional Western philosophical privileging of the individual as the primary category, arguing instead that the process of individuation is primary12. He asks us to stop trying to understand the individual by looking at the individual, and instead to seek understanding by looking at the process of individuation that creates it. He uses the creation of a brick to explain. On the traditional view, the clay is inanimate matter, the mold is form imposed on it, and creating a brick is simply pressing the matter into the form. According to Simondon this is a profound conceptual error. The clay (matter) used to make the brick is not passive; it has implicit form, a specific chemical makeup and moisture level that make it adequate brick-making clay. Furthermore, the clay must be prepared: dug up, cleared of stones, uniformly hydrated, and mixed to reach a specific state of plasticity before it can become a brick. The mold (form) is also not abstract; it has an implicit form of its own, with wood grain, joints, imperfections, and physical limits, and it must be dusted or oiled before receiving the clay. The operation in which a craftsman presses the clay into the mold and the two push against each other is the individuation process for a brick. The precise moment of individuation happens when the clay's expansive pressure and the mold's restraining pressure equalize.
To further explain this brick example, Simondon introduces the terms preindividual and metastability. The preindividual is what something is before it becomes an individual. The prepared clay is in a preindividual state before it individuates to become a brick. The prepared clay is metastable. A metastable state contains more potential individuals than the individuation of any one individual can exhaust. He borrows the term supersaturation from chemistry to further explain this. A solution is supersaturated when it suspends more material than it normally should be able to at a given temperature; the solution will remain in liquid form until perturbed, at which point it transforms. Salt dissolved in water before it crystallizes is supersaturated.
Simondon's metastability is latent structured potential. Every possible variation of salt crystals exists in the supersaturated water, and perturbing the water will cause one specific form of those crystals to materialize. Metastable clay can be pressed into many forms, the creation of one brick in one shape does not exhaust the plurality of forms possible to the clay, it simply becomes one form among many possible forms. The preindividual state is metastable, supersaturated with individuals that are real as potential, but do not exist as a specific form.
This ontological understanding of becoming allows Simondon to redefine information as an action. Information only exists when individuation is underway — it is a sizing up of the resolution of potential in a preindividual field. Supersaturated water and salt crystals are the same thing expressed at different levels of information13. A VAE is a metastable latent space. The image the VAE outputs is the VAE at a higher resolution. This observation is the bridge between Simondon's framework and the structural mirroring that follows.
The Mirroring: We can see in Simondon a robust philosophical scaffolding for understanding generative AI as an ontological act of emergence from a metastable field. The VAE latent space is the preindividual field. As we have seen, the VAE is not a database of stored images, nor is it random noise. It is a structured field of statistical tensions whose geometry is shaped by training but whose specific outputs are not predetermined by it. It is prepared like brick-making-clay. VAE latent space is supersaturated with image-potential. Every possible outcome it can produce exists as real without being actual.
The generation process is the individuation process. The random noise sample is the perturbation that triggers individuation without predetermining the outcome, just as tapping the glass of salt water starts the crystallization process without determining the shape of the crystals. The coarse-to-fine emergence of structure during generation is individuation proceeding from the general to the specific, from dominant tensions resolving first to finer incompatibilities resolving within the constraints already established. Just as with bricks: the clay fills major voids first, and the further into the mold it is pressed, the finer the details it picks up.
If VAE latent space is the preindividual, and generation is the individuation process, then the generated image is the individual: a temporary, specific form emerging from the preindividual field. The output from a generative AI is one specific enactment of its individuation process. Understood through this lens, the output and the latent space it comes from can be seen as the same thing at two different resolutions — the output has the same information as the VAE but at a higher degree of human legibility.
The Architectural Meaning: Manuel DeLanda's reading of Simondon gives this mirroring its architectural consequence. DeLanda draws a distinction between the virtual and the possible. The possible is simply the actual before it happens. The virtual is different: it is a real field of potential that does not resemble any of the specific forms it can produce. The latent space is virtual in this sense14. It does not contain possible images. It is a real field of potential from which actual images are individuated. This distinction reframes what architectural design means in the context of generative AI. Architects like Lynn, Hadid, and Spuybroek engaged with process philosophy at the level of the actual. They produced forms that looked like emergence, that referenced metastability, that allegorized individuation. They worked with the concepts and expressed them through form. The VAE latent space offers something different: direct access to the virtual field itself.
DeLanda urged architects to become hackers in order to engage with these processes in a robust way15. The architect is now able to do so without needing to learn to write code because AI systems offer unprecedented access to the process itself. Architects can work directly inside the process. Designing with generative AI, understood this way, is not about selecting outputs, it's not about prompt engineering. It is about shaping the conditions of individuation.
The Trigger: Lucretius articulated the clinamen as the minimal disruptive force needed to ensure the formation of the world16. Without this deviation, he posits that atoms would fall in parallel through the void, never touching, therefore never combining to form anything. The clinamen is the minimum swerve needed to break the symmetry of a system and enable it to generate form. This ancient hypothesis, once firmly locked within the humanities, was rescued by Michel Serres in 197717.
In his work The Birth of Physics, Serres shows the clinamen to be a rigorously accurate treatise on fluid dynamics, thermodynamics, and chaos theory, created nearly 2,000 years before those fields even existed. Starting with the base condition, atoms falling in parallel lines is equivalent to laminar flow in fluids. A state of zero interaction where nothing happens, nothing is created, and there are no events. Interestingly, Serres notes that in such a state there is also no information. The clinamen, or swerve, is the infinitesimally small deviation that disrupts this state. One swerving atom bumps into another atom, setting off a chain reaction that creates a vortex. Serres describes this turbulence as the birthplace of all matter, life, and form. He argues that reality is a series of higher level forms born from the disruptive act of the clinamen. People, architecture, language, bodies, all these things are stabilized vortices. The creation of these vortices is also the creation of information. From this, two things are evident: form comes as a result of disruption, and, the disruption itself does not determine a specific outcome but instead unleashes the potential for an outcome to emerge.
The Mirroring: An AI system prior to the symmetry-breaking bifurcation point is in a state similar to laminar flow: it is energetically balanced, so nothing happens, nothing interacts, and nothing is produced. The noise sample is the clinamen of this system. It is the mathematically tiny disruption that breaks the symmetry and provokes the restabilization of higher-order forms.
Another structural parallel is the Lucretian claim that the clinamen occurs at a random place and a random time. The same mechanic plays out in AI systems: the starting seed produces bifurcation points in the model, and different seeds induce those bifurcations at different, random moments. The output of the model is a temporarily stabilized form of the latent space; it maintains its coherence as an individual of the field that created it. This is how Serres describes the vortex: a temporary but real structure that keeps its shape in spite of the flowing patterns that produced it. Where Simondon gives us the metastable field, Serres gives us the event that disturbs it. Together they account for the two most ontologically significant moments in the generative process.
The Architectural Meaning: If we accept the Serres mirror, the architect's use of AI must be seen as designing the flow, applying the prompt as pressure, and then initiating the swerve to see what vortex emerges. The vortex, in turn, is not a result but a high-resolution expression of the design process itself. The architect who understands this uses the random seed as the gateway to higher-order structure. This is the critical path beyond the AI-as-vending-machine paradigm. Crucially, this mechanic is not linguistic. It is not a text prompt or a language input; it is a random chance that allows architects to actively take part in the preconditioning of higher-order structure. Mechanisms already present within AI systems make this possible: noise scheduling influences the strength and timing of the swerve, and seed interpolation allows one swerve to be offset against another. Beyond these, architects can discover and design their own methods that treat noise as a design tool instead of a slot machine lever. The field and the trigger that activates it are now clear; next we discuss the navigation process itself.
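The seed-interpolation mechanic can be made concrete. The sketch below is a minimal, hypothetical illustration in NumPy, not a fragment of any particular pipeline: two seeds produce two distinct initial noise fields (two swerves), and spherical interpolation (slerp), a standard way to blend Gaussian noise samples without collapsing their magnitude, yields a family of intermediate swerves between them. The latent dimensions are placeholders.

```python
import numpy as np

def slerp(t, z0, z1):
    """Spherical interpolation between two noise samples.

    A plain linear blend of two Gaussian samples shrinks the norm and
    drifts off the distribution a model was trained to denoise; slerp
    keeps each intermediate sample statistically plausible.
    """
    a, b = z0.ravel(), z1.ravel()
    cos_omega = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    so = np.sin(omega)
    if so < 1e-8:  # nearly parallel samples: fall back to linear blend
        return (1.0 - t) * z0 + t * z1
    return (np.sin((1.0 - t) * omega) / so) * z0 + (np.sin(t * omega) / so) * z1

# Two "swerves": distinct seeds yield distinct noise fields.
rng_a, rng_b = np.random.default_rng(7), np.random.default_rng(42)
z_a = rng_a.standard_normal((4, 64, 64))  # hypothetical latent shape
z_b = rng_b.standard_normal((4, 64, 64))

# A family of intermediate swerves offset between the two seeds.
trajectory = [slerp(t, z_a, z_b) for t in np.linspace(0.0, 1.0, 5)]
```

Fed one at a time into the same denoising process, each member of `trajectory` would individuate a different but continuously related output: the swerve designed rather than rolled.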
The Navigation: In Process and Reality, Alfred North Whitehead provides the vocabulary and conceptual framework to understand the generation process as what happens after the metastable field is instigated by a clinamen. Much like Simondon, Whitehead rejects being in favor of becoming. He is reacting to a Western philosophical tradition that stipulates that reality is made of persistent beings to which events happen. The billiard ball is a discrete thing; rolling it across a table makes rolling happen to it. Whitehead inverts this paradigm, describing reality as a series of rapid sequential events called actual occasions.
In Whitehead’s universe, an actual occasion is the basic building block of reality. Actual occasions exist through a process called concrescence, an act he describes as “the many become one, and are increased by one18.” Concrescence, from the Latin concrescere, means to grow together. Using this term, he explains an actual event as an understanding of the immediate past, followed by the application of that understanding toward a subjective aim, followed by the completed execution of that application, at which point the occasion completes itself and ceases to exist. Understanding the past he calls prehension. Completing the application of the prehension to the subjective aim he calls satisfaction. He also describes negative prehension, the selective exclusion of past data. Prehension, whether positive or negative, need not be conscious; to prehend is to feel, sense, and subjectively know. Reality, in this paradigm, is an endless succession of actual occasions that are informed by their immediate past conditions, oriented toward a subjective aim, and that then reach satisfaction and perish.
For Whitehead, it is metaphysically impossible for an occasion not to reach satisfaction. Every occasion completes the process of becoming itself; if it does not, that occasion does not exist. Once an occasion completes concrescence, it stops becoming what it is because it is what it is, and this cessation of becoming is what causes it to vanish. It then joins the many data points of the past waiting to be prehended by the next occasion. To ensure these perished occasions are not simply lost to the void, Whitehead introduces the consequent nature: the persistent, accumulating reality that saves every perished occasion, weaving their objective immortality into an ever-growing, unified whole.
The Mirroring: Whitehead mirrors the generation process in AI by exposing it as a series of quantized events. AI systems use the same epochal structure to generate their outputs as Whitehead's actual occasions. At any given step, a model prehends the data structure from the previous step and unifies it into a direction for the next step; once it moves on, the previously prehended information vanishes. The subjective aim takes the form of either the VAE itself or the CLIP embedding. These spaces act as the orienting condition, present from the beginning of the process, that biases each generation toward a region of the probability landscape. AI models are local navigators, not teleological truth seekers. The final decoded output is the consequent nature of the navigation; it contains and preserves all the previous steps19. The output image contains the accumulation of every satisfied occasion that preceded it.
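The epochal structure can be sketched as code. The toy loop below is illustrative only, not a trained diffusion model: each iteration reads only the state left by the previous step (prehension), unifies it with a fixed orienting aim (the subjective aim), and that intermediate state then perishes, surviving only as a record in the accumulated history (its objective immortality). The aim vector and step count are arbitrary placeholders.

```python
import numpy as np

def toy_epochal_loop(z, aim, steps=50, rng=None):
    """A toy epochal generation loop (not a trained model).

    Each step prehends only the state handed to it by the previous
    step, pulls it toward a fixed orienting aim, and perishes; nothing
    but the new state is carried forward, while the history list plays
    the role of the accumulated past.
    """
    rng = rng or np.random.default_rng(0)
    history = []  # every perished intermediate state, preserved
    for i in range(steps):
        pull = (aim - z) / (steps - i)                    # orientation toward the aim
        noise = rng.standard_normal(z.shape) * (1.0 - i / steps) * 0.1
        z = z + pull + noise                              # the new occasion
        history.append(z.copy())                          # its objective immortality
    return z, history

aim = np.zeros(16)  # a hypothetical target region of the field
z0 = np.random.default_rng(3).standard_normal(16)
out, history = toy_epochal_loop(z0, aim)
```

The final `out` is the consequent nature of this little navigation: it is reachable only through the full sequence of perished steps, each of which saw nothing but its immediate predecessor.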
The Architectural Meaning: If we accept Whitehead’s ontology, the "vending machine" paradigm does not just misunderstand AI; it fundamentally misses how AI creates. By focusing on the text prompt alone, architects have fetishized the output and ignored the process. If, instead, we view generative AI as a Whiteheadian process of sequential occasions locally prehending their environment, unifying, and then joining the accumulated past that the next occasion will prehend, we can see that an output is nothing more than the frozen corpse of a dead mathematical process. Architecture is obligated to participate in the process of generation if it wants to engage thoughtfully and critically with AI systems. Otherwise design will remain at the surface level of prompting and output interpretation: a design methodology carried out post-mortem, on exhausted, inert objects that carry no vitality or productive potential. Architects who understand this will learn to nudge, override, influence, hijack, hack into, or corrupt the process itself. Architects must leave behind the epistemic coroners they are and become the ontological navigators they are meant to be.
The structural mirrorings identified above are philosophically precise but conceptually demanding, especially when linked to technically dense AI terminology. Three works from literature and film offer more intuitive ways to see these relationships. They are cultural moments where the operational reality of generative AI and process philosophy has already been intuited without being explicitly named. Each maps onto one of the three mirrorings in sequence.
T.S. Eliot's East Coker, the second of the Four Quartets, opens with a line that describes the forward and reverse process with uncanny precision: "In my beginning is my end." The destruction of structured data in the forward process is the precondition for generation to occur. The end is already present in the beginning, and the poem's closing inversion, "In my end is my beginning," completes the thought: the beginning is already implicit in the end. The poem extends this further: "Houses rise and fall, crumble, are extended, are removed, destroyed, restored." Eliot is describing the cyclical logic of dissolution and emergence that governs the full lifecycle of a generative model, which will eventually consume its own outputs in training. If a model is trained on its own outputs, the Eliot cycle completes itself entirely. The beginning and the end renew the process together.
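The forward process invoked here has a standard closed form. The sketch below assumes the DDPM-style formulation, in which a noised state is sampled directly as x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε with a linear noise schedule; the "image" is a stand-in signal, and the schedule values are the common defaults rather than any specific model's. It shows how the beginning (x_0) remains present, progressively attenuated, inside every noised state until it is effectively indistinguishable from pure noise.

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = np.sin(np.linspace(0, 4 * np.pi, 256))  # a structured stand-in "image"

T = 1000
betas = np.linspace(1e-4, 0.02, T)       # linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention

def noised(x0, t, rng):
    """Sample x_t in one shot: the beginning (x0) is still present,
    scaled down by sqrt(alpha_bar[t]), inside every noised state."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x_mid = noised(x0, 200, rng)    # structure still faintly legible
x_end = noised(x0, T - 1, rng)  # effectively pure Gaussian noise
```

By the final step, `alpha_bar` has decayed to nearly zero: the end state carries almost no trace of the signal, yet the reverse process exists only because this destruction was performed first. In my beginning is my end.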
Goethe's Faust opens with its protagonist in a state of learned paralysis. At the beginning of the play Faust is metastable. He is supersaturated with knowledge. But Faust is stuck. He has spent his life acquiring knowledge and wisdom, only to be paralyzed by the dread of realizing it was a waste of time. Mephistopheles appears not as pure evil but as the destabilizer. When he famously says “I am part of that power which would do evil evermore, and yet creates the good,” he is exercising his power of negation as a means to generate movement. Mephistopheles has no creativity of his own, but he can unlock Faust's creative force. Just like a random seed, he intrudes from outside the system. His infiltration does not determine where Faust will go. He does not care where Faust will go. He simply breaks the symmetry of the stasis and releases the potential for a trajectory to form. Different Mephistopheleses would have produced different Fausts. What Goethe understood intuitively is what Simondon, Lucretius, and Serres formalized: the trigger does not contain the outcome. It simply makes the outcome possible. In Faust, the “spirit that negates” is the catalytic spirit.
Darren Aronofsky's Black Swan is a film about concrescence. In the film, Nina's technical purity, repressed aggression, sensuality, ambition, and fragility all exist within her in an unintegrated way. Throughout the film these parts of her psyche are fractured. The mirrors, doubles, hallucinations, and her changing body are all signs that her many parts have not become one. The film's entire narrative arc is the process of Nina becoming what she is through the violent integration of these parts. The final performance is satisfaction. "I was perfect" is not a declaration of quality. It is the recognition of her unity. It is followed immediately by her death. Once she reaches the fullness of her becoming she cannot exist anymore. What remains is the objective immortality of her performance: the frozen record of the process that produced it, available to be prehended by whatever comes next.
These three convergences are not coincidental20. They suggest that the operational logic this paper identifies in generative AI is not a projection onto these systems but a rediscovery of something human culture has intuited repeatedly, in different registers and different centuries, without ever having the technical vocabulary to name it with precision. Generative AI does not illustrate these intuitions. It enacts them. This quality holds vast latent potential for architectural discourse if it is able to recognize and engage with it.
11 This lineage is not exhaustive, but these thinkers have exerted great influence on architectural discourse historically and their concepts continue to do so in the present day.
12 Gilbert Simondon, L'individuation à la lumière des notions de forme et d'information (Grenoble: Millon, 1995). The English translation by Taylor Adkins — Individuation in Light of Notions of Form and Information (Minneapolis: University of Minnesota Press, 2020) — is the edition referenced here. Simondon's account of the hylomorphic schema and its critique remains the most sustained philosophical treatment of the relationship between matter, form, and individuation in twentieth-century continental philosophy.
13 Simondon's redefinition of information is developed most fully in the second part of Individuation in Light of Notions of Form and Information, where he argues that information is not a signal transmitted between a sender and receiver but the event of individuation itself — the resolution of preindividual potential into structured form. This reframing has direct consequences for communication theory, cybernetics, and — as this paper argues — for understanding the output of a generative AI system.
14 The virtual/possible distinction, as used here, follows DeLanda's reading in Intensive Science and Virtual Philosophy, which itself develops Gilles Deleuze's formulation in Difference and Repetition, trans. Paul Patton (New York: Columbia University Press, 1994), pp. 208–214.
15 Manuel DeLanda made this claim in lectures and seminars at Columbia University's Graduate School of Architecture during the early 2000s, arguing that architects needed to develop the technical literacy to intervene in computational processes directly rather than accepting the interfaces provided by commercial software.
16 Titus Lucretius Carus, De Rerum Natura [On the Nature of Things], trans. W.H.D. Rouse, rev. Martin Ferguson Smith, Loeb Classical Library (Cambridge: Harvard University Press, 1992). The clinamen is described in Book II, lines 216–293. Lucretius writes: "When atoms move straight down through the void by their own weight, at quite indeterminate times and places they swerve a little from their course, just enough that you could call it a change of direction. If it were not for this swerve, everything would fall downwards like raindrops through the abyss of space. No collision would take place and no blow would be created among the atoms: nature would never have created anything."
17 Michel Serres, La Naissance de la physique dans le texte de Lucrèce: Fleuves et turbulences (Paris: Minuit, 1977). Published in English as The Birth of Physics, trans. Jack Hawkes, ed. David Webb (Manchester: Clinamen Press, 2000). Serres's central argument is that Lucretius was not writing mythology or poetry but physics — that De Rerum Natura is a rigorous account of fluid dynamics, turbulence, and the emergence of higher-order structure from minimal deviation, produced two millennia before the mathematical apparatus to formalize these phenomena existed.
18 Alfred North Whitehead, Process and Reality: An Essay in Cosmology, corrected edition, ed. David Ray Griffin and Donald W. Sherburne (New York: Free Press, 1978). Originally published 1929. The quoted formulation — "the many become one, and are increased by one" — appears on p. 21 and is Whitehead's central description of the creative advance: each actual occasion unifies the many data of its past into a novel unity, and in doing so adds itself to the multiplicity available for the next occasion to prehend.
19 This is a functional rather than strictly metaphysical claim. Whitehead's consequent nature refers to God's preservation of all perished occasions in an everlasting unified experience — a theological concept that this paper deliberately brackets. The functional parallel being drawn is more limited: the decoded output image contains within its structure the accumulated decisions of every generation step that preceded it, in the same way that the consequent nature preserves the objective immortality of every perished occasion.
20 Eastern philosophical traditions arrive at structurally equivalent intuitions through entirely independent paths. Lao Tzu's concept of wu wei — action that works with the natural tendency of things rather than against them — maps onto the navigational relationship with generative AI with genuine precision: the architect who works with the intrinsic geometry of the latent field rather than forcing it toward a predetermined output is practicing something structurally equivalent to wu wei. The Tao Te Ching's observation in Chapter 17 that "when the work is completed, it is forgotten" mirrors both the perishing of Whitehead's occasions and the arbitrary arrest of the generative process — the output is not the work, the work is the process. Dogen's concept of uji — being-time, the inseparability of existence and temporal becoming — developed in the Shōbōgenzō (c. 1231–1253), maps onto Whitehead's actual occasions with striking structural equivalence: every being is a moment of time, and every moment of time is a being. Dogen also writes that the present moment does not simply pass into the past but persists as the present of that moment permanently — a claim that mirrors Whitehead's objective immortality and the consequent nature. That these structural parallels emerge independently across traditions developed in complete isolation from one another strengthens rather than merely coincidentally supports this paper's central claim: the operational logic identified here is not constructed. It is discovered.
In this paper I lay out the current condition of the relationship between architectural discourse and AI. I give the mechanical systems of AI the in-depth analysis architectural discourse owes them, and I explore how these systems directly parallel the concepts of process philosophy that have a long, rich history of influence on the field of architecture as a whole. I provide three cultural examples to make these points more tangible. Now I will offer a series of provocations.
I. Prompt Fetish: The prompt is the least interesting component of generative AI. It is a thin interface of control that offers the illusion of agency while masking the systemic depths below. The architect who seeks to optimize their prompt is merely polishing the glass of the vending machine; the architect who seeks meaningful engagement must move beyond the interface and into the field.
II. Design the Swerve: The noise sample is currently treated as a slot machine lever—a random roll for aesthetic variety. It is, in fact, the most architecturally significant parameter in the system. It determines which world will individuate from the metastable field. Why are we rolling for luck when we should be designing the swerve?
III. Output Necrosis: The output is a corpse. Why are architects focused on interpreting dead pixels? We must stop being epistemic coroners and start being ontological navigators.
IV. Curation vs. Navigation: Latent space is not a territory to be "discovered" through curation; it is a preindividual field to be designed. Dataset curation is far more important than output curation. Is dataset curation the most fundamental act of design available to the contemporary architect? To design the data is to architect the possibilities of the future, is it not?
V. Iteration Needs to Die: Even before generative AI, architectural practice was built on, and remains addicted to, the "brute force" method of excessive iteration, a paradigm built on the underpaid labor of a precarious workforce. If we can design the navigation process itself, can we trust the system to produce a definitive result on the first shot? Can we finally kill the paradigm of "more is better" in favor of "the process is correct"?
VI. Token Drift and Vitality: AI is a condition, not a tool. We see this in the "sweet spots" and the "context drift" of LLMs. Sometimes the system is vital, sometimes it slides into incoherence or repetition. Can architects learn to design with these rhythmic drifts? Can we treat the "exhaustion" of a model’s context window as a site of architectural opportunity rather than a technical limit?
VII. Force Multiplication: The "Starchitect" model relies on the myth of the lone genius holding a monopoly on novelty. When a system can produce 3,000 viable provocations in fifteen minutes, the "black cape" loses its magic. Will AI act as the force multiplier that allows small, agile firms to dethrone the calcified mega-firms? Or will the industry’s titans simply monopolize AI and pull the ladder up behind them?
VIII. Agency of Data Flows: It is naive to believe that "prompting the right way" can counteract the structural biases of a dataset containing billions of points. Agency does not lie in asking AI to "be fair"; it lies in the deep entrenchment within the system to alter its biases from the operational layer. Architecture must move from performative prompting to the technical hijacking of the model's internal weights.
IX. Complex Navigation Over Complex Problems: If we can master the complex dynamics of AI—its flows, pressures, and bifurcations—can we finally gain the literacy required to address other hyper-complex systems? Climate change, migration, and urbanization are not "problems" to be solved; they are metastable fields to be navigated. AI is our training ground for systems that have until now remained beyond the reach of practice. Complex problems are not problems; they are complex navigations.
X. The Lab as Site of Resistance: An AI lab is not a software training facility. It is a site of resistance against the commodification of design intelligence. It is the only place where theory and practice can iteratively inform one another to ensure that architecture sees AI for what it is before the window for critical engagement closes forever.
We possess, for the first time, a technical system that does not merely represent emergence, but operates as it. The urgency of this realization cannot be overstated. As commercial interfaces continue to collapse the generative process into a single, shallow prompt box, the university must assert its role as the site where this box is pried open, or exploded from within. This transition requires more than new discourse; it requires a new type of space: a dedicated environment for "ontological navigation" where the technical mechanics of AI are interrogated, bent, explored, corrupted, designed. Establishing such a laboratory within the university will enable the field of architecture to produce critical, research-led practices that keep architecture relevant.
The lab produces navigators — architects who understand the preindividual field they are working with, who know the geometry of its latent space, who can read a bifurcation and understand what it means for the trajectory that follows, who treat the noise sample as a design decision and the output as a high-resolution expression of a process they understand and have shaped. This lab creates a theorized practice of field navigation. The images that emerge are the evidence that it works. What the lab produces is the practice itself — repeatable, documentable, improvable, and transferable to students who will carry it into the profession as a genuinely new form of design intelligence.