{"id":607,"date":"2026-06-04T13:54:30","date_gmt":"2026-06-04T13:54:30","guid":{"rendered":"https:\/\/florentinudrea.net\/?p=607"},"modified":"2026-06-14T13:03:04","modified_gmt":"2026-06-14T13:03:04","slug":"meet-the-diffusion-policy-generative-ais-leap-into-the-physical-world","status":"publish","type":"post","link":"https:\/\/florentinudrea.net\/index.php\/2026\/06\/04\/meet-the-diffusion-policy-generative-ais-leap-into-the-physical-world\/","title":{"rendered":"Diffusion Policy: Generative AI\u2019s Leap into the Physical World"},"content":{"rendered":"\n<p>In <a href=\"https:\/\/florentinudrea.net\/index.php\/2026\/04\/17\/diffusion-models-the-secret-sauce-of-generative-ai\/\">the previous post<\/a>, we explored the rise of diffusion models in generative AI. The main takeaway was that instead of acting as simple &#8220;expected value&#8221; predictors, like classic neural networks do, diffusion models can map out complex, arbitrary distributions. This mathematical property is exactly why they are able to generate crisp, detailed images rather than averaging everything into a blurry blob.<\/p>\n\n\n\n<p>The robot learning community is no stranger to the &#8220;average the correct answers&#8221; kind of behavior either, especially in imitation learning paradigms (our way of teaching robots from expert data). Think of an autonomous car going straight towards a pole: <strong>its training dataset has seen expert drivers avoid the pole by going right or by going left<\/strong>. Both methods would be completely valid. A standard neural network, however, doesn&#8217;t see two distinct options. <strong>It tries to find a mathematical compromise between the two, averages the steering angles, and drives straight into the pole<\/strong> (<em>not a fun ride, if you ask me<\/em>).<\/p>\n\n\n\n<p>Recently, the robot learning community realized that the core strength of diffusion models could fix this. 
<strong>By allowing the AI to evaluate multiple valid possibilities, without averaging them into a disastrous middle ground, a diffusion policy can confidently commit to a single, high-quality path<\/strong>. Going from generating digital art to steering a physical robot might sound like a wild leap, but it actually solves a massive headache in traditional AI control.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"409\" src=\"https:\/\/florentinudrea.net\/wp-content\/uploads\/2026\/05\/image.png\" alt=\"\" class=\"wp-image-608\" srcset=\"https:\/\/florentinudrea.net\/wp-content\/uploads\/2026\/05\/image.png 600w, https:\/\/florentinudrea.net\/wp-content\/uploads\/2026\/05\/image-300x205.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><figcaption class=\"wp-element-caption\">A solution to the autonomous car example\u2026 maybe.<\/figcaption><\/figure>\n<\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_to_leverage_Diffusion_Models_for_Robot_Learning\"><\/span>How to leverage Diffusion Models for Robot Learning<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Up until now we\u2019ve been talking about the world of pixels and image generation. But robots don&#8217;t output pixels; they execute physical trajectories. So, how do we take an architecture built for generating DALL-E images and use it to control a robotic arm worth thousands of dollars?<\/p>\n\n\n\n<p>The conceptual leap, popularized in 2023 by researchers at Columbia and MIT (Chi et al.) and demonstrated in breakthrough systems like Stanford&#8217;s Mobile ALOHA, is actually surprisingly elegant: we simply <strong>swap out the \u201cimage\u201d for an \u201caction trajectory\u201d<\/strong>. 
When watching these robots fluidly cook shrimp or fold laundry, the underlying mechanics are the same as in DALL-E. Instead of denoising a 2D grid of RGB colors, the diffusion model denoises a sequence of motor commands over time, like joint angles, velocities, or the XYZ coordinates of a robotic gripper. The model still starts with pure, random noise, but as it iteratively removes that noise, it converges on a smooth, physically viable path for the robot to follow. The mathematical process of &#8220;drawing a picture&#8221; becomes the exact same process as &#8220;planning a complex movement.&#8221;<\/p>\n\n\n\n<p>While Diffusion Policy is incredibly powerful as a pure Imitation Learning tool (learning from human demonstrations), humans are imperfect, and our datasets don&#8217;t cover every edge case. To push a robot from &#8216;pretty good&#8217; to &#8216;superhuman consistency&#8217;, researchers are increasingly taking these base Diffusion Policies and fine-tuning them using reinforcement learning (RL). 
<strong>The imitation-learned diffusion model provides a highly capable, safe starting point, and RL provides the final polish to optimize for speed and error recovery<\/strong>.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"567\" src=\"https:\/\/florentinudrea.net\/wp-content\/uploads\/2026\/05\/image-1-1024x567.png\" alt=\"\" class=\"wp-image-615\" style=\"aspect-ratio:1.8060874823624269;width:765px;height:auto\" srcset=\"https:\/\/florentinudrea.net\/wp-content\/uploads\/2026\/05\/image-1-1024x567.png 1024w, https:\/\/florentinudrea.net\/wp-content\/uploads\/2026\/05\/image-1-300x166.png 300w, https:\/\/florentinudrea.net\/wp-content\/uploads\/2026\/05\/image-1-768x426.png 768w, https:\/\/florentinudrea.net\/wp-content\/uploads\/2026\/05\/image-1.png 1440w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Diffusion Policy showing multimodal behavior, compared with other techniques, when pushing a T-shaped block with an end effector (blue dot).<br>Results from <em>&#8220;Diffusion Policy: Visuomotor Policy Learning via Action Diffusion&#8221;<\/em>, Chi et al.<\/figcaption><\/figure>\n<\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Imitation_Learning_for_Diffusion_Policies\"><\/span>Imitation Learning for Diffusion Policies<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_Secret_Sauce_Action_Chunking\"><\/span>The Secret Sauce: Action Chunking<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>At this point, you might be spotting a glaring problem: latency. Diffusion takes multiple iterative denoising steps. 
If a robot needs to update its motors every 10 milliseconds, running a full diffusion loop, which requires tens or hundreds of network evaluations, for every single micro-movement is computationally impractical.<\/p>\n\n\n\n<p>Enter <strong>Action Chunking<\/strong>. Instead of predicting a single action for the current millisecond (which assumes a strict Markovian state), the diffusion model predicts a &#8216;chunk&#8217; of future actions, say, the next 64 steps or 2 seconds of movement, all at once. The robot executes the first few steps of this chunk while the model simultaneously works on denoising the next overlapping chunk. This buys the model time to &#8220;think&#8221; while keeping the robot&#8217;s movement fluid.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Thinking_in_Sequences\"><\/span>Thinking in Sequences<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>When driving a car, you don&#8217;t consciously tell your hands to &#8220;turn the wheel 0.5 degrees for 10 milliseconds, then re-evaluate.&#8221; You plan sequences, like turning into the left lane over three seconds.<\/p>\n\n\n\n<p>Traditional AI control did the opposite. Policies were typically <strong>Markovian<\/strong>: the AI observes the current state and predicts a single action for that exact millisecond. Action chunking abandons this one-step-at-a-time convention. The neural network predicts a &#8220;chunk&#8221; of future actions all at once, say, the next 10 to 50 timesteps.<\/p>\n\n\n\n<p>In the real world, however, the robot doesn&#8217;t execute the entire sequence blindly. It uses a <strong>receding-horizon<\/strong> approach: it starts executing the first few steps, but before finishing, it re-evaluates the environment and predicts a new chunk. 
This provides the benefit of a longer-term plan while remaining reactive to changes.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Why_Predict_the_Future\"><\/span>Why Predict the Future?<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Predicting an entire trajectory at once solves several practical deployment issues.<\/p>\n\n\n\n<p><strong>First, it buys time<\/strong>. Vision-Language-Action (VLA) models are powerful, but computationally heavy. Forcing a large model to compute commands a hundred times a second leads to lag. Chunking allows the robot to execute the current sequence (or only part of it, if a receding-horizon approach is applied) while the AI computes the next one in the background.<\/p>\n\n\n\n<p>Second, <strong>chunking mitigates covariate shift<\/strong>. When a model predicts actions every millisecond, tiny errors compound. A slight miscalculation puts the robot in an unfamiliar position, causing a larger mistake on the next step. Committing to a sequence and checking the environment less frequently reduces these compounding feedback loops.<\/p>\n\n\n\n<p>Finally, chunking <strong>forces the AI to learn meaningful representations<\/strong>. If an AI only predicts one millisecond ahead, it can cheat. Since physical motion is continuous, the AI can often just repeat its last motor command and minimize its loss. Predicting further into the future breaks this shortcut, requiring the model to understand the scene&#8217;s geometry and plan a coherent path.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Reinforcement_Learning_Finetuning\"><\/span>Reinforcement Learning Finetuning<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Imitation learning easily adopts chunking, but <strong>RL traditionally assumes that optimal policies only need to look one step ahead<\/strong>. 
Naively plugging action chunking into standard RL algorithms, which typically rely on Gaussian policies, often causes performance to collapse.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Why_RL_Finetuning_is_beneficial\"><\/span>Why RL Finetuning is beneficial<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Imitation is a great start, but it has a few problems. First, the <strong>performance will only be as good as the data it is provided<\/strong>. Second, counterintuitively, if the data is taken from an expert who is never wrong, then <strong>the policy we are training will never know how to recover from its own mistakes<\/strong> (which are all but guaranteed in the world of statistical learners, such as neural networks).<\/p>\n\n\n\n<p><strong>Reinforcement learning overcomes these problems by actively exploring.<\/strong> Because RL is closed-loop and trial-and-error based, the robot inevitably makes sub-optimal moves during training. This forces the model into unfamiliar territory, directly addressing the <strong>covariate shift<\/strong> problem we mentioned earlier. By actively experiencing its own mistakes, the AI maps out recovery trajectories, learning how to physically guide itself back to a successful path rather than just failing when things deviate from the human expert&#8217;s script.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Why_RL_Finetuning_is_hard\"><\/span>Why RL Finetuning is hard<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>If action chunking and diffusion models are such a dream team for imitation learning, why can&#8217;t we just plug them into standard RL algorithms and call it a day? 
As it turns out, forcing a generative sequence model to learn via trial-and-error introduces two massive roadblocks: a fundamental misalignment in optimization goals and a clash between long-term sequence planning and RL&#8217;s traditional reactive nature.<\/p>\n\n\n\n<p>The first problem is fundamental: <strong>diffusion models are inherently built to minimize a reconstruction loss<\/strong>. They are trained, in essence via (approximate) maximum likelihood, to make their generated outputs look like samples from the training data. <strong>Reinforcement Learning<\/strong>, on the other hand, <strong>doesn&#8217;t care about mimicking data (it doesn&#8217;t even have target data)<\/strong>: it cares about maximizing a reward, and the optimal behavior to achieve that reward might not even exist in any fixed dataset.<\/p>\n\n\n\n<p>The second hurdle is the <strong>clash between sequence planning and real-time reactivity<\/strong>. Standard RL algorithms are inherently Markovian: they operate on the assumption that the agent observes the current state, takes <em>one<\/em> immediate action, and repeats. Forcing RL to optimize an entire multi-step <em>chunk<\/em> of actions at once breaks this mathematical foundation. If the environment changes dynamically mid-chunk, a traditional RL critic doesn&#8217;t know how to evaluate or guide a sequence that is already halfway executed, causing the robot&#8217;s behavior to become fragile and slow to adapt.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_Diffusion-RL_Mental_Shift\"><\/span>The Diffusion-RL Mental Shift<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>To fully grasp how to apply RL to diffusion policies, a mental shift helps. We typically view Reinforcement Learning strictly as reward optimization. 
While true, <strong>RL can more broadly be seen as a mathematical framework for optimizing objectives that need not be differentiable<\/strong>.<\/p>\n\n\n\n<p>Standard deep learning requires smooth, continuous math to calculate gradients and update weights. RL bypasses this constraint. It observes final outcomes and increases the probability of the network outputting the actions that led to good ones. We usually define a good outcome as task success, but the target can be anything you can score. For example, you could use RL to minimize a robot arm&#8217;s total energy consumption or maximize the physical smoothness of a trajectory.<\/p>\n\n\n\n<p><strong>With diffusion models, the optimization changes. We are not just updating the network to output a specific final action. We are optimizing the intermediate denoising steps the network takes to build that action.<\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_to_Train_Diffusion_Policies_with_RL\"><\/span>How to Train Diffusion Policies with RL?<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>If Imitation Learning is about mimicking a human expert, Reinforcement Learning is about maximizing a mathematical reward. But merging these two paradigms presents a massive architectural challenge. Diffusion models were originally designed purely to reconstruct data distributions (like generating an image from noise), not to maximize a reward signal. Injecting RL into a diffusion model requires clever algorithmic engineering. Over the past few years, researchers have developed three primary ways to bridge this gap.<\/p>\n\n\n\n<p>The first major breakthrough was <strong>Diffusion Q-Learning (DQL)<\/strong>, introduced by <a target=\"_blank\" rel=\"noreferrer noopener\" href=\"https:\/\/arxiv.org\/abs\/2208.06193\">Wang et al. (2022)<\/a>. In standard RL, a &#8220;Q-network&#8221; estimates how good an action is in a given state. DQL turns the diffusion model into the actor and the Q-network into the critic. 
During inference, the diffusion model generates candidate action trajectories, and the Q-network scores them. During training, the gradients from the Q-network are backpropagated directly into the diffusion model&#8217;s denoising steps. Essentially, the diffusion model learns to warp its original data distribution to favor actions that the Q-network deems highly rewarding.<\/p>\n\n\n\n<p>Another elegant approach is <strong>Reward-Guided Diffusion<\/strong>, which borrows directly from text-to-image models. In systems like <a href=\"https:\/\/arxiv.org\/abs\/2211.15657\" target=\"_blank\" rel=\"noreferrer noopener\">Decision Diffuser<\/a> or <a href=\"https:\/\/arxiv.org\/abs\/2205.09991\" target=\"_blank\" rel=\"noreferrer noopener\">Planning with Diffusion<\/a>, researchers bypass the need for a separate Q-network entirely. Instead, they treat the desired &#8220;reward&#8221; as a conditioning variable, much like a text prompt in DALL-E. You essentially prompt the model with a target return: <em>&#8220;Generate a trajectory that achieves a high reward.&#8221;<\/em> The denoising process is mathematically steered (using techniques like classifier guidance or classifier-free guidance) toward the specific subspace of trajectories that historically led to success.<\/p>\n\n\n\n<p>Most recently, the field has pushed toward online, actor-critic finetuning with algorithms like <strong>Diffusion-PPO (DPPO)<\/strong> (e.g., <a href=\"https:\/\/arxiv.org\/abs\/2309.07930\" target=\"_blank\" rel=\"noreferrer noopener\">Zhuang et al., 2023<\/a> and related works). Proximal Policy Optimization (PPO) is a widely used, comparatively stable policy-gradient algorithm. DPPO adapts this for robotics by treating the entire, multi-step denoising chain as a single stochastic policy. By calculating the log-probabilities of the generated action sequences, DPPO can update the diffusion weights based on the rewards the robot actually receives in simulation. 
This allows the robot to actively explore and learn mistake-recovery behaviors in real-time, all while preserving the multi-modal expressiveness that makes diffusion models so powerful in the first place.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"p-rc_c5ba560ff95aaf8a-89\"><span class=\"ez-toc-section\" id=\"Predicting_the_Future_with_RL\"><\/span>Predicting the Future with RL<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Before Diffusion Policies arrived, the reinforcement learning community spent years trying to force AI agents to think in continuous sequences rather than single, isolated steps. Traditional RL is fundamentally Markovian: it looks at the current state, chooses one action, and waits for the next state. But physical robots require fluid, multi-step foresight. The journey to achieve this has been a constant game of trade-offs.<\/p>\n\n\n\n<p>Early attempts focused on <strong>Hierarchical Reinforcement Learning (HRL)<\/strong>, as formalized in Sutton\u2019s <a target=\"_blank\" rel=\"noreferrer noopener\" href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S0004370299000521\">Options framework<\/a>. The idea was intuitive: split the brain. A high-level &#8220;manager&#8221; policy sets long-term goals (e.g., &#8220;pick up the cup&#8221;), while a low-level &#8220;worker&#8221; policy computes the immediate joint torques to execute it. However, training both levels simultaneously proved highly unstable; if the worker failed, the manager couldn&#8217;t learn effectively, creating a crippling chicken-and-egg training dynamic.<\/p>\n\n\n\n<p>To bypass this instability, roboticists frequently turned to <strong>Movement Primitives<\/strong> (<a target=\"_blank\" rel=\"noreferrer noopener\" href=\"https:\/\/www.google.com\/search?q=https:\/\/journals.sagepub.com\/doi\/10.1177\/0278364913488827\">Schaal, 2006<\/a>). 
Instead of predicting raw motor commands step-by-step, the AI outputs parameters for a mathematical curve that represents a full physical trajectory. While this guaranteed beautifully smooth robotic motions, these primitives were inherently rigid. They struggled to adapt in real-time to messy, unstructured environments where sudden corrections were necessary.<\/p>\n\n\n\n<p>Later breakthroughs shifted toward <strong>Latent Space Planning<\/strong>, famously demonstrated by DeepMind&#8217;s <a target=\"_blank\" rel=\"noreferrer noopener\" href=\"https:\/\/www.nature.com\/articles\/s41586-020-03051-4\">MuZero<\/a>. Instead of planning raw actions, the agent rolls out &#8220;mental simulations&#8221; of the future inside a compressed, learned representation of the world. While incredibly powerful for board games and discrete environments, mapping these abstract latent simulations back to the high-dimensional, continuous physics of a robotic arm remained computationally massive and prone to compounding errors.<\/p>\n\n\n\n<p>This historical struggle highlights exactly why Action Chunking and Diffusion Policies have emerged as the potential grand unifiers. They allow the AI to directly output raw, continuous sequences of action (chunks) in one go, avoiding the unstable hierarchies of HRL, the mathematical rigidity of movement primitives, and the heavy mental simulations of latent planning.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_Bottlenecks_and_the_Future\"><\/span>The Bottlenecks and the Future<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>We have established that merging diffusion models, reinforcement learning, and action chunking creates a highly capable robotic brain. 
But we are not quite ready to put humanoid robots in every home.<\/p>\n\n\n\n<p id=\"p-rc_9b56ed931f0125f0-210\">The biggest roadblock right now is computational <strong>speed<\/strong>. Diffusion models are inherently slow because turning random noise into a crisp action plan requires dozens of sequential denoising steps. Earlier, we mentioned that action chunking &#8216;buys time.&#8217; To be precise, chunking <strong>masks inference latency<\/strong> at the execution level, but the sheer computational weight (FLOPs) required to iteratively denoise trajectories remains a hard ceiling for physical robots.<\/p>\n\n\n\n<p id=\"p-rc_9b56ed931f0125f0-210\">This leads to the <strong>reaction problem<\/strong>. Chunking is fantastic for smooth, predictable tasks like folding clothes, but terrible for sudden, chaotic changes, like catching a falling glass. While dynamic chunk boundaries (where the AI learns to shift between long and short predictions) will help, the reality of modern deployment is a <strong>hybrid architecture<\/strong>. To survive high-frequency chaos, Diffusion Policies are increasingly deployed strictly as high-level planners (running at slower rates, like ~10 Hz), while battle-tested classical robotics techniques, such as low-level impedance controllers operating at 1000 Hz, run underneath the AI to handle millisecond-to-millisecond physical stabilization.<\/p>\n\n\n\n<p id=\"p-rc_9b56ed931f0125f0-211\">I can&#8217;t predict the future (unlike dynamic latent models), but my guess is that the community will be tackling these exact problems for the foreseeable future. To solve the speed issue, researchers are pushing towards faster sampling, exploring techniques like <strong>Consistency Models<\/strong> that aim to compress the long denoising process down to one or a few steps. 
To fix the reaction problem, a major milestone will be creating models that <strong>automatically determine their own chunk boundaries<\/strong> rather than relying on manual guessing and tuning. The AI will learn to predict long sequences when the coast is clear, and instantly switch to short, rapid predictions when the environment gets chaotic.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Conclusion\"><\/span>Conclusion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>To wrap things up, we are looking at a fundamental shift in how we build brains for robots. For years, robot learning was stuck trying to squeeze messy, unpredictable real-world data into strict, deterministic, one-step boxes. The result was often an AI that compromised, averaged out the best solutions, and ended up crashing the car. Diffusion policies, paired with action chunking, flip that entire script.<\/p>\n\n\n\n<p>By moving from rigid, single-step control to generative, sequence-based behavior, we are teaching robots to navigate the physical world the same way image generators navigate pixels. They no longer average their options or stumble forward millisecond by millisecond. Instead, they confidently pick one valid path and commit to a cohesive, multi-step plan, smoothly filtering out the noise along the way. We still have to iron out the latency and compute issues, but the resulting leap in physical dexterity is undeniable. The generative AI revolution has finally grown hands, and it is going to be exciting to see what it builds next.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the previous post, we explored the rise of diffusion models in generative AI. The main takeaway was that instead of acting as simple &#8220;expected value&#8221; predictors, like classic neural networks do, diffusion models can map out complex, arbitrary distributions. 
This mathematical property is exactly why they are able to generate crisp, detailed images rather [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[13],"tags":[20,22,21],"class_list":["post-607","post","type-post","status-publish","format-standard","hentry","category-blog","tag-diffusion-models","tag-generative-ai","tag-neural-networks"],"_links":{"self":[{"href":"https:\/\/florentinudrea.net\/index.php\/wp-json\/wp\/v2\/posts\/607","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/florentinudrea.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/florentinudrea.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/florentinudrea.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/florentinudrea.net\/index.php\/wp-json\/wp\/v2\/comments?post=607"}],"version-history":[{"count":36,"href":"https:\/\/florentinudrea.net\/index.php\/wp-json\/wp\/v2\/posts\/607\/revisions"}],"predecessor-version":[{"id":818,"href":"https:\/\/florentinudrea.net\/index.php\/wp-json\/wp\/v2\/posts\/607\/revisions\/818"}],"wp:attachment":[{"href":"https:\/\/florentinudrea.net\/index.php\/wp-json\/wp\/v2\/media?parent=607"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/florentinudrea.net\/index.php\/wp-json\/wp\/v2\/categories?post=607"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/florentinudrea.net\/index.php\/wp-json\/wp\/v2\/tags?post=607"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}